FileStoreProvider: An Important Oracle UCM Component


File Store Provider



Date: April 27, 2007

Product and Version: Content Server 10gR3

 

Prerequisites and Recommendations:

The FileStoreProvider component extends the standard file store provider that ships with the 10gR3 core. The standard default provider is the mechanism used to access all the files managed by the content server; in the out-of-the-box version, the files are stored by the usual means.

 

The FileStoreProvider is recommended for very large systems that need to place files onto storage devices. Once the provider has been installed and configured, removing it will cause the system to lose knowledge about the location of files. Similarly, reconfiguring a file store provider in an active system may cause the system to report files as missing.

Background Information:

In the Content Server, data management requires the management of files and their associated metadata. File management consists of the ability for users to store and access their checked-in files as well as the files that may have been generated as a result. Initially a single storage location is all that is needed, but as the volume of content items and files increases, it becomes necessary to disperse the storage. This can be done by adding more storage devices and/or by creating a sparser directory structure. The former option allows for greater storage space, while the latter increases file access performance.

 

The second half of data management is storing the metadata, which in the case of the Content Server is done in a relational database involving primarily three database tables. The metadata allows users to catalogue the files and provides a means of creating file descriptors for retrieval. For end users, retrieval is done by the Content Server, and how and where the file is stored may be completely hidden. For component and feature writers, who may need to generate or manipulate files, the metadata provides a means of accessing the desired files.

 

The location of files in the Content Server has remained static over the years. Using the revision information specified by the doc type, security group and account, the files and their associated renditions are placed into particular directories. For example, the vault (or native) files are the files that the user has checked in. Their location has traditionally been defined to be

 

~/vault/<dDocType>/<account>/<dID>.<dExtension>

 

where dDocType is the content type provided by the user, dID is the system-generated ID that uniquely identifies this revision, and dExtension is the extension of the file checked in. In the standard model, the system uses the dDocType metadata field to disperse the files across the vault directory. This is a rather straightforward calculation and is consequently quite transparent to component and feature writers, giving them knowledge about where files are located and how to manipulate them. However, it has also had the effect of limiting storage management. Without careful management of the location metadata mentioned above, directories can become saturated, causing the system to slow down. Also, under the standard configuration, it is difficult to use extra storage devices or to opt out of the creation of, for example, the web renditions.

 

As a consequence of dealing with large systems, the following features became highly desirable and have been addressed by the more advanced features of the FileStoreProvider component:

- The ability to relocate files

- The ability to partition files across multiple storage devices

- The ability to have the web-viewable file be optional (that is, to choose whether a browser-viewable rendition is generated)

- The ability to manage and control directory saturation (for example, switching to another directory once a directory holds a certain number of files)

- The ability to store files in the database

- An API for extending the provider to different storage paradigms

Installation:

Warning: Stellent recommends that you deploy and confirm this component in your development environment before using it in a production environment.

 

1) Download the file FileStoreProvider.zip.

2) FileStoreProvider.zip is a Stellent component. Use the Component Wizard or the Admin Server Component Manager to install and enable the component.

3) Restart the Content Server.

4) Configure the file store provider.

Renditions and Storage:

In most cases, a content item consists of metadata, a primary file and potentially an alternate file. The primary file is stored in the vault and any web-viewable files are stored in the web. If there is no refinery on the system, the web file is a copy of the primary file or, if it exists, of the alternate file. If a conversion engine or refinery is available, the primary file may be sent to it to create a web-viewable rendition as well as additional renditions, e.g. thumbnails. Similarly, other components may create auxiliary renditions of the file in the vault and/or the web.

 

From the web browser, a file can be accessed dynamically via a Content Server service request or statically. The static weburl is only used when there is a guarantee that the file is on the file system. Otherwise, the dynamic delivery of the file is used. On the UI, the file store provider only allows the configuration of the static delivery. However, the administrator may decide that the ‘static’ delivery be done as a Content Server service request and in essence be dynamic. By definition, we refer to the dynamic access to the file as weburl and the static access as weburl.file.

 

This brings us to the terms we will be using for the rest of the discussion. When we say rendition, we mean the primary file, web-viewable file, alternate file or any of the additional renditions. When we say storage class, we are referring to the vault, web or weburl. So, a rendition is a version of the file, while a storage class is a grouping of renditions by either where they are stored or how they are accessed.

 

The rendition and storage class are tied together via the storage rule. The storage rule is how the system determines where a content item’s renditions are stored across the various devices, e.g. file system or database. Note that each content item is assigned a storage rule; or rather, given a content item, its storage rule can easily be deduced.

 

One way of understanding the relationship between rendition, storage class and storage rule is to walk through a few simple examples. For all examples below, a content item consisting of only a primary file is added to the system.

 

  1. A storage rule is defined to be of type FileStorage.

In this scenario, the system makes a copy of the primary file into the web directory.

  2. A storage rule is defined to be of type FileStorage and as a webless storage.

This is similar to (1) above, except the web file does not exist. When there is a request for the web-viewable file, the system returns the vault file, i.e. the primary file.

  3. A storage rule is defined to be of type JdbcStorage.

Both the vault and web files are stored in the database. However, one should note that the JDBC storage is built on top of the file storage and, when necessary, a file can be forced onto the file system. This generally occurs during indexing or conversion.

  4. A storage rule is defined to be of type JdbcStorage with all renditions on the web stored on the file system.

This is similar to (3) above, except that the web-viewable renditions are on the file system.
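The four scenarios above can be summarized as a small decision table. The following Java sketch is illustrative only: the class, enum and method names are hypothetical and are not part of the Content Server API, and it models only the four cases listed (how a webless flag interacts with JdbcStorage is not covered by the examples, so the sketch ignores it in that branch).

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical model of the four storage rule examples; not Content Server code.
final class StorageRuleSketch {
    enum StorageType { FILE_STORAGE, JDBC_STORAGE }
    enum StorageClass { VAULT, WEB }
    enum Location { FILE_SYSTEM, DATABASE, NOT_STORED }

    /** Returns where the vault and web files of a primary-file-only item end up. */
    static Map<StorageClass, Location> place(StorageType type,
                                             boolean isWeblessStore,
                                             boolean webRenditionsOnFileSystem) {
        Map<StorageClass, Location> placement = new EnumMap<>(StorageClass.class);
        if (type == StorageType.FILE_STORAGE) {
            // Examples 1 and 2: everything lives on the file system,
            // and the web copy is skipped for a webless rule.
            placement.put(StorageClass.VAULT, Location.FILE_SYSTEM);
            placement.put(StorageClass.WEB,
                    isWeblessStore ? Location.NOT_STORED : Location.FILE_SYSTEM);
        } else {
            // Examples 3 and 4: the database by default,
            // optionally with the web renditions kept on the file system.
            placement.put(StorageClass.VAULT, Location.DATABASE);
            placement.put(StorageClass.WEB,
                    webRenditionsOnFileSystem ? Location.FILE_SYSTEM : Location.DATABASE);
        }
        return placement;
    }

    public static void main(String[] args) {
        System.out.println("1: " + place(StorageType.FILE_STORAGE, false, false));
        System.out.println("2: " + place(StorageType.FILE_STORAGE, true, false));
        System.out.println("3: " + place(StorageType.JDBC_STORAGE, false, false));
        System.out.println("4: " + place(StorageType.JDBC_STORAGE, false, true));
    }
}
```

In example 2 a request for the web-viewable file is answered with the vault file, which is why the web entry is modeled as not stored rather than as an error.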

Configuration:

On a successful install, the providers page gives the administrator the ability to update the default file store provider to be a file system provider. Edit the default file store provider and click Update. From here, you can change the web, vault and weburl path expressions.

 

Before the file system provider is fully functional, partitions need to be configured. The partitions are used to define the root path of a rendition’s location.

 

Also, on installation, the component adds three metadata fields: xPartitionId, xWebFlag and xStorageRule. These metadata fields are used as follows:

 

xPartitionId – This metadata field is used in conjunction with the PartitionList table to determine the root location of the content item’s files. It is recommended that this field be hidden on the UI, since the partition selection algorithm provides a value.

 

xWebFlag – This metadata field is used to determine whether a content item has a web-viewable file. Consequently, if the system has content items that have only vault files, removing this metadata field will cause the system to expect the presence of a web-viewable file and may cause harm to the system. The metadata field may be specified by the configuration value WebFlagColumn.

 

xStorageRule – This metadata field is used to track the rule that was used to determine how the file is to be stored. The metadata field may be specified by the configuration value StorageRuleField.

 

The above metadata fields are added by the component on startup. If the metadata fields are deleted and should remain absent from the system, use the configuration flag FsAddExtraMetaFields to stop them from being added.

 

Also, on installation the component adds the database tables FileStorage and FileCache. These tables are used exclusively by the JdbcStorage file store provider. The FileStorage table contains the contents of the files; it uses the dID of the content item together with the rendition to uniquely identify which renditions belong to which content item. The FileCache table is used to remember which files have been downloaded to the system’s file system. These are files that the system, for one reason or another, required on the file system. They are for the most part temporary and the system deletes them as part of a scheduled event.

 

Note that the system only supports one primary file store provider, and by default it has been named the ‘DefaultFileStore’ provider.

 

Configuration Resource tables:

There are four main tables used to configure and handle variations in file path locations. (These are configuration resource tables, not database tables.) The PartitionList table is initially empty and has a UI allowing a user to add, edit or delete rows. The PathMetaData and PathConstruction tables are used for path locations, and the provided defaults cover most scenarios. Finally, enhancing the FileSystemFileStoreAlgorithmFilters table requires a component along with Java code.

 

Note that the PathMetaData, PathConstruction and FileSystemFileStoreAlgorithmFilters tables are defined in the provider definition file (provider.hda) and are provider specific, while the PartitionList table is defined in the ~/data/filestore/config/fsconfig.hda file and has a more global use.

 

FileSystemFileStoreAlgorithmFilters:

This table is used to map an algorithm name to an implementation of the FilterImplementor interface. The algorithm can be referenced in the PathMetaData table and is used to calculate the desired path field. The class implementing the algorithm must return the required metadata fields it uses for calculation when the file parameters object is null. Via the ExecutionContext, the doFilter method is passed information about the field, content item, and file store provider that initiated the call. In particular, for the file system provider, the algorithm will be passed the following information via the ExecutionContext. Bear in mind that other file store providers may choose to pass in more or possibly different information.

 

Properties fieldProperties = (Properties) context.getCachedObject("FieldProperties");

Parameters data = (Parameters) context.getCachedObject("FileParameters");

Map localData = (Map) context.getCachedObject("LocalProperties");

String algorithm = (String) context.getCachedObject("AlgorithmName");
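The idea of mapping an algorithm name to an implementation can be sketched without the Content Server classes. The Java below is a self-contained illustration under stated assumptions: PathAlgorithm is a hypothetical stand-in and not the real FilterImplementor interface, the file parameters are reduced to a plain map, and docNameBucket is an invented dispersion algorithm. Only the behavior of extensionSeparator (returning ‘.’ by default) is taken from this document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name-to-algorithm registry; it does not use the intradoc API.
final class AlgorithmRegistrySketch {
    /** Stand-in for an algorithm implementation (NOT the real FilterImplementor). */
    interface PathAlgorithm {
        String compute(Map<String, String> fileParameters);
    }

    private final Map<String, PathAlgorithm> byName = new HashMap<>();

    void register(String algorithmName, PathAlgorithm implementation) {
        byName.put(algorithmName, implementation);
    }

    /** Looks up the algorithm by name and computes the path field value. */
    String compute(String algorithmName, Map<String, String> fileParameters) {
        PathAlgorithm implementation = byName.get(algorithmName);
        if (implementation == null) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithmName);
        }
        return implementation.compute(fileParameters);
    }

    static AlgorithmRegistrySketch withDefaults() {
        AlgorithmRegistrySketch registry = new AlgorithmRegistrySketch();
        // Per the document, extensionSeparator returns "." by default.
        registry.register("extensionSeparator", params -> ".");
        // Invented example: disperse by the first two characters of dDocName.
        registry.register("docNameBucket", params -> {
            String name = params.getOrDefault("dDocName", "");
            return name.length() <= 2 ? name : name.substring(0, 2);
        });
        return registry;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("dDocName", "report_2007");
        AlgorithmRegistrySketch registry = withDefaults();
        System.out.println(registry.compute("extensionSeparator", params));
        System.out.println(registry.compute("docNameBucket", params));
    }
}
```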

 

PathMetaData:

This table determines what metadata is used to compute the location of a file. The metadata may come directly from the content item’s metadata or be calculated via an algorithm.

 

The columns are defined and used as follows:

 

  • FieldName - name of the field as it appears in the path expression.

  • GenerationAlgorithm - if defined, specifies the algorithm used to resolve or compute the value for the field.

  • RequiredForStorage - defines for which storage class this metadata is required. Possible values are #all, web, vault. The field is optional for all renditions not specified. Consequently, if this column is empty, the metadata field is optional for all renditions or storage classes. Note that if an algorithm has been specified, this value is empty; the algorithm uses the value specified in the ArgumentFields column to dictate which fields are required.

  • OverrideClientValue - false by default. When set to true, the file store provider will override the value even if one is provided by the user. This value is only used if a GenerationAlgorithm is specified.

  • Arguments - optional arguments passed into the filter.

 

  • ArgumentFields – comma-separated list of fields required by the arguments and consequently required by the algorithm.

 

PartitionList:

This table describes the partitions that are available to the partitionSelection algorithm, which uses them to decide which root path a file is stored under. The columns of the table are used as follows:

 

  • PartitionName – specifies the name of the partition. This name is referenced in the path expression.

 

  • PartitionRoot – argument passed into the partitionSelection algorithm.

 

  • IsActive – determines if the partition is currently active and accepts new files.

 

  • CapacityCheckInterval – specifies the interval in seconds used in determining the available disk space. This may not work on all platforms.

 

  • SlackBytes – determines if there is sufficient space. If the available space is lower than the slack bytes, the partition is no longer used for contribution.

 

  • DuplicationMethods – available methods are link and copy. Note not all methods are available on all platforms and the ‘copy’ method is recommended by default.
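How the IsActive and SlackBytes columns gate a partition can be sketched as follows. This is a minimal illustration, not the component’s partitionSelection algorithm: the Partition class is hypothetical, the available space is supplied as a plain map instead of being measured (java.io.File.getUsableSpace() is one way a real check could obtain it), and picking the first eligible partition is an assumption, since the document does not say how ties between eligible partitions are resolved.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of partition eligibility; not the real partitionSelection algorithm.
final class PartitionSelectionSketch {
    static final class Partition {
        final String name;
        final boolean isActive;
        final long slackBytes;

        Partition(String name, boolean isActive, long slackBytes) {
            this.name = name;
            this.isActive = isActive;
            this.slackBytes = slackBytes;
        }
    }

    /** A partition takes new files only if it is active and not below its slack bytes. */
    static boolean acceptsNewFiles(Partition partition, long availableBytes) {
        return partition.isActive && availableBytes >= partition.slackBytes;
    }

    /** Assumption: the first eligible partition in table order is chosen. */
    static Optional<Partition> select(List<Partition> partitions,
                                      Map<String, Long> availableBytesByName) {
        for (Partition partition : partitions) {
            long available = availableBytesByName.getOrDefault(partition.name, 0L);
            if (acceptsNewFiles(partition, available)) {
                return Optional.of(partition);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Partition> partitions = Arrays.asList(
                new Partition("partition1", false, 1_000_000L),  // inactive
                new Partition("partition2", true, 1_000_000L),   // too little space
                new Partition("partition3", true, 1_000_000L));  // eligible
        Map<String, Long> available = new HashMap<>();
        available.put("partition1", 5_000_000L);
        available.put("partition2", 500_000L);
        available.put("partition3", 5_000_000L);
        System.out.println(select(partitions, available)
                .map(p -> p.name).orElse("no partition accepts new files"));
    }
}
```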

 

 

PathConstruction:

This table provides a mapping of the file to a path. The path is made up of components, where a component may be calculated via an algorithm, an IdocScript variable, an environment variable or a metadata lookup.

 

The columns of the table are defined as follows:

 

  • FileStore - specifies the storage that is being calculated. Possible values are web, vault, weburl, weburl.file.

 

  • PathExpression – defines the path. This path is parsed into components, which are resolved via the PathMetaData field definitions as described below.

 

  • AutoCreateLimit - specifies the depth of the directories that may be created.

 

  • IsWritable - specifies if the storage location is writable.

 

  • StorageRule – specifies the rule this path construction belongs to.

 

The most interesting and important column of the PathConstruction table is the PathExpression column. As mentioned, it defines the path or location of the file and consists of components. A path is broken into its constituent pieces, or components, by slashes. Each component can be made of a static string or a sequence of dynamic parts. A dynamic part is enclosed in ‘$’ characters. If the part is dynamic, it can have the following interpretations:

 

  • It may be a field defined in the PathMetaData table. If it is defined in the PathMetaData table, it may be mapped to an algorithm, e.g. $dDocType$

  • If it has the prefix #env., it is an environment variable, e.g. $#env.VaultDir$

  • It may be an IdocScript variable, e.g. $HttpWebRoot$

 

 

 

For example, the standard vault location is defined as

 

$PartitionRoot$vault/$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$

 

When parsed, this turns into 5 components, which are interpreted according to the rules specified in the PathMetaData table as follows:

 

1. $PartitionRoot$ – this is mapped to the partitionSelection algorithm, which uses xPartitionId as a lookup into the PartitionList table to determine the root.

2. /vault/ - a string, i.e. no calculation or substitution.

3. $dDocType$ – by the PathMetaData table, this is a lookup in the file parameters.

4. $dDocAccount$ - this is mapped to the documentAccount algorithm, which takes dDocAccount and parses it into the standard content server account presentation with all the appropriate delimiters.

5. $dID$$ExtensionSeparator$$dExtension$ – this component has three parts:

a) $dID$ - similar to dDocType, this is defined in the file parameters and is a required field.

b) $ExtensionSeparator$ – determined by an algorithm.

c) $dExtension$ – similar to dDocType.
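The substitution of ‘$’-delimited parts can be sketched in a few lines of Java. This is an illustration under stated assumptions rather than the provider’s parser: the class and method names are hypothetical, every dynamic part is assumed to have been resolved to a string beforehand (whether by metadata lookup, environment variable or algorithm), and the sample values, including the partition root and the account directory, are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver for $name$ path expressions; not the Content Server parser.
final class PathExpressionSketch {
    // A dynamic part is a run of non-'$' characters enclosed in '$'.
    private static final Pattern DYNAMIC_PART = Pattern.compile("\\$([^$]+)\\$");

    /** Replaces each $name$ part with its value; static text is copied through unchanged. */
    static String resolve(String expression, Map<String, String> values) {
        Matcher matcher = DYNAMIC_PART.matcher(expression);
        StringBuilder path = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            path.append(expression, last, matcher.start());
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for path field: " + name);
            }
            path.append(value);
            last = matcher.end();
        }
        path.append(expression.substring(last));
        return path.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("PartitionRoot", "/mnt/store1/");  // invented partition root
        values.put("dDocType", "ADACCT");
        values.put("dDocAccount", "finance");         // invented, already formatted
        values.put("dID", "1234");
        values.put("ExtensionSeparator", ".");
        values.put("dExtension", "doc");
        String expression =
                "$PartitionRoot$vault/$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$";
        System.out.println(resolve(expression, values));
    }
}
```

A missing value raises an error here instead of producing a partial path, which mirrors the idea that some fields are required for a storage class.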

 

StorageRules:

This table describes the rules used for storing a content item’s files. A rule specifies which path expression to use for which storage class, and it also determines how the item is to be stored and by what mechanism.

 

  • StorageRule – name of the rule. The rule’s name is computed via the dynamic include and stored in the content item’s metadata field xStorageRule.

  • StorageType – determines the storage implementation. Currently accepted values are FileStorage and JdbcStorage. With FileStorage, the files are only stored on the file system. With JdbcStorage, the files are by default stored in the database.

  • IsWeblessStore – used to specify whether this is a system that allows webless files. When set to true, it is assumed by default that a newly created content item does not have a web-viewable file. In certain circumstances it is, however, desirable or necessary to insist on a web-viewable file. Consequently, an argument in the calling code can be used to specify that a web file needs to be created. Whether the item has a web file or not is stored in the xWebFlag metadata field.

 

  • RenditionsOnFileSystem – Used by JdbcStorage to determine if any files are to be stored on the file system instead of the database.

 

 

On upgrading the default file store provider, a default rule is created. Also, note that deleting or editing a storage rule may result in the system misplacing files.

URL parsing guidelines:

In the standard configuration, the URL contains security and dDocType information as well as the dDocName and extension. The URL and the web location are constructed as follows:

 

…/groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$.$dWebExtension$

 

The ‘groups’ separator is an indication to the system that the directories that follow are the name of the security group the content item belongs to and the accounts. Note that the accounts are optional and consequently computed by an algorithm. After the security information, we have the ‘documents’ separator, which is immediately followed by the dDocType, i.e. content type. The last part of the URL above is the dDocName and its format.

 

Since the URL is expected in this format, the system can successfully extract metadata from it. More importantly, it can determine the security information for the content item and derive the access privileges for a particular user.

 

The parsing guidelines have been expanded to allow for dispersion in the web directory. We keep the ‘groups’ separator, but replace the ‘documents’ separator with ‘sg’. When the parser encounters the ‘sg’ separator, it no longer assumes that the remaining part of the URL is /$dDocType$/$dDocName$.$dWebExtension$. Instead, the parser looks for the dispersion end marker ‘d’. Once the ‘d’ is encountered, the system assumes that the following information contains the dDocName and dWebExtension as before. This means that the system can now successfully parse URLs of the form

 

…/groups/$dSecurityGroup$/$dDocAccount$/sg/<dispersion>/<dispersion>…/d/$dDocName$.$dWebExtension$
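The two URL shapes can be parsed with one routine. The Java below is a sketch under stated assumptions, not the Content Server’s parser: the class name is hypothetical, account directories are returned as the raw path segments between the security group and the separator, the first ‘d’ segment after ‘sg’ is treated as the dispersion end marker, and the file name is split at its last dot without interpreting rendition specifiers or revision labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the 'documents' and 'sg' URL forms; not Content Server code.
final class WebUrlParserSketch {
    static Map<String, String> parse(String url) {
        List<String> parts = Arrays.asList(url.split("/"));
        int groups = parts.indexOf("groups");
        if (groups < 0 || groups + 1 >= parts.size()) {
            throw new IllegalArgumentException("No 'groups' separator in: " + url);
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("dSecurityGroup", parts.get(groups + 1));

        // Everything up to 'documents' or 'sg' is the (optional) account.
        int i = groups + 2;
        List<String> account = new ArrayList<>();
        while (i < parts.size()
                && !parts.get(i).equals("documents") && !parts.get(i).equals("sg")) {
            account.add(parts.get(i));
            i++;
        }
        if (i >= parts.size()) {
            throw new IllegalArgumentException("No 'documents' or 'sg' separator in: " + url);
        }
        metadata.put("dDocAccount", String.join("/", account));

        int fileIndex;
        if (parts.get(i).equals("sg")) {
            // Dispersed form: skip dispersion directories up to the 'd' end marker.
            int marker = -1;
            for (int j = i + 1; j < parts.size(); j++) {
                if (parts.get(j).equals("d")) {
                    marker = j;
                    break;
                }
            }
            if (marker < 0) {
                throw new IllegalArgumentException("No 'd' end marker in: " + url);
            }
            fileIndex = marker + 1;
        } else {
            // Standard form: 'documents' is followed by dDocType, then the file.
            fileIndex = i + 2;
        }
        if (fileIndex >= parts.size()) {
            throw new IllegalArgumentException("No file name in: " + url);
        }
        if (parts.get(i).equals("documents")) {
            metadata.put("dDocType", parts.get(i + 1));
        }

        String file = parts.get(fileIndex);
        int dot = file.lastIndexOf('.');
        metadata.put("dDocName", dot < 0 ? file : file.substring(0, dot));
        metadata.put("dWebExtension", dot < 0 ? "" : file.substring(dot + 1));
        return metadata;
    }

    public static void main(String[] args) {
        // Both URLs are invented examples.
        System.out.println(parse("/idc/groups/public/finance/documents/adacct/report.pdf"));
        System.out.println(parse("/idc/groups/public/sg/ab/cd/d/report.pdf"));
    }
}
```

Note that the dispersed form carries no dDocType, so the sketch simply omits that key for ‘sg’ URLs.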

 

Database Configuration:

The following configuration values are used to control when the file cache is to be cleaned up. Note that the system only cleans up files that have an entry in the FileCache table.

 

FsCacheThreshold – The threshold of when the system starts deleting files that are older than the minimum age, as specified by the FsMinimumFileCacheAge parameter. The default unit is megabytes and is set to 100.

 

FsMaximumFileCacheAge – All files older than this are to be deleted. Default is 1 year. The default increment is days and is set to 365.

 

FsMinimumFileCacheAge – This parameter is used in conjunction with the FsCacheThreshold parameter to delete files.
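Read together, the three values suggest a two-part deletion test, sketched below in Java. This is an interpretation and not the component’s code: the class and method names are hypothetical, the minimum age is assumed to be measured in days like the maximum age, and the value 7 used for it in the example is invented because the document gives no default.

```java
// Hypothetical reading of the cache cleanup parameters; not Content Server code.
final class FileCacheCleanupSketch {
    /**
     * A cached file is deleted if it exceeds the maximum age, or if the cache
     * has grown past the threshold and the file is older than the minimum age.
     */
    static boolean shouldDelete(long fileAgeDays, long cacheSizeMb,
                                long thresholdMb, long minimumAgeDays, long maximumAgeDays) {
        if (fileAgeDays > maximumAgeDays) {
            return true;
        }
        return cacheSizeMb > thresholdMb && fileAgeDays > minimumAgeDays;
    }

    public static void main(String[] args) {
        long thresholdMb = 100;    // FsCacheThreshold default from the document
        long maximumAgeDays = 365; // FsMaximumFileCacheAge default from the document
        long minimumAgeDays = 7;   // invented value; no default is documented
        System.out.println(shouldDelete(400, 10, thresholdMb, minimumAgeDays, maximumAgeDays));
        System.out.println(shouldDelete(30, 150, thresholdMb, minimumAgeDays, maximumAgeDays));
        System.out.println(shouldDelete(30, 50, thresholdMb, minimumAgeDays, maximumAgeDays));
    }
}
```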

 

Configuration Parameters:

We briefly discuss some of the configuration parameters and their locations.

 

In the intradoc.cfg, the following parameter may be specified:

 

StorageDir – set to a root directory to be used for all partitions where the PartitionRoot column value has not been specified. In this case, the storage directory plus the partition name is used to create the PartitionRoot parameter.
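The StorageDir fallback can be sketched as a one-line rule. The Java below is illustrative only: the class and method names are hypothetical, and joining the storage directory and the partition name with a single ‘/’ is an assumption, since the document only says that the two are combined.

```java
// Hypothetical sketch of the StorageDir fallback; not Content Server code.
final class PartitionRootSketch {
    /** Uses the configured PartitionRoot if present, else StorageDir plus the partition name. */
    static String partitionRoot(String storageDir, String partitionName, String configuredRoot) {
        if (configuredRoot != null && !configuredRoot.isEmpty()) {
            return configuredRoot;
        }
        // Assumption: the two parts are joined with exactly one '/'.
        String base = storageDir.endsWith("/") ? storageDir : storageDir + "/";
        return base + partitionName;
    }

    public static void main(String[] args) {
        // All paths and partition names are invented examples.
        System.out.println(partitionRoot("/data/storage", "partition1", ""));
        System.out.println(partitionRoot("/data/storage", "partition2", "/mnt/fast/"));
    }
}
```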

 

In the provider definition file, provider.hda, the following parameters and classes are standard for a file system store provider.

 

ProviderType=FileStore

ProviderClass=intradoc.filestore.BaseFileStore

IsPrimaryFileStore=true

 

# Configuration information specific to a filesystem store provider.

ProviderConfig=intradoc.filestore.filesystem.FileSystemProviderConfig

EventImplementor=intradoc.filestore.filesystem.FileSystemEventImplementor

DescriptorImplementor=intradoc.filestore.filesystem.FileSystemDescriptorImplementor

AccessImplementor=intradoc.filestore.filesystem.FileSystemAccessImplementor

 

Usage Examples:

In this section, we explicitly list the contents of the tables contained in the provider definition file for each of the examples. This may give the misleading impression that the administrator is required to edit the provider definition file manually. However, configuring the file store provider does not require manual editing of this file, since the system creates all the necessary tables through the user interface and provides sufficient defaults for most scenarios.

 

In most of our examples below, we use the following PathMetaData table and its definitions. Note that the table has been trimmed of some of its columns to save space and to provide a better presentation.

 

@ResultSet PathMetaData
6
FieldName | GenerationAlgorithm | RequiredForStorage | …<trimmed columns>
dID |  | #all
dDocName |  | #all
dDocAccount | documentAccount |
dDocType |  | #all
dExtension |  | #all
dWebExtension |  | weburl
dSecurityGroup |  | #all
dRevisionID |  | #all
dReleaseState |  | #all
dStatus |  | web
xPartitionId | partitionSelection |
ExtensionSeparator | extensionSeparator |
xWebFlag |  |
RenditionId |  | #all
RevisionLabel | revisionLabel |
RenditionSpecifier | renditionSpecifier |
@end

 

 

How to configure the component to use the usual or standard file paths:

The file system store provider can be configured to place the files in the standard locations. The first step is to define the storage rule. In this case, the storage rule will be of type FileStorage, since all the files are to be stored on the file system. Next, the path construction for each of the storage classes needs to be defined for the rule. In general, the tail end of the path should be standard for all usage examples, unless you are willing to limit the system’s functionality. For example, with a non-standard filename, the system will not work well with hcs* files. However, the root path can be changed at will and should not affect functionality.

 

@ResultSet StorageRules
4
StorageRule | StorageType | IsWeblessStore | RenditionsOnFileSystem
default | FileStorage |  |
@end

 

@ResultSet PathConstruction
5
FileStore | PathExpression | AutoCreateLimit | IsWritable | StorageRule
vault | $#env.VaultDir$$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$ | 6 | true | default
weburl | $HttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$ | 3 | false | default
web | $#env.WeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$ | 3 | true | default
@end

 

In this configuration, the vault, web and weburl storage classes need to be defined in the PathConstruction table. The path expression for the ‘vault’ has already been discussed, so we will only look at the path expression for web, which differs from weburl only in its root: the ‘web’ path is an absolute path on the file system, while the weburl is (as its name implies) a URL served up by a web server.

 

The path expression of ‘web’ is defined tobe

 

$#env.WeblayoutDir$/groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$

 

This is parsed into its component piecesand they are as follows:

 

  1. $#env.WeblayoutDir$ - looked up in the shared environment under the key ‘WeblayoutDir’. The content server defines this to be the physical root path of the weblayout directory.

  1a. (alternate for weburl) $HttpWebRoot$ - an IdocScript variable.

  2. /groups/ - a literal string.

  3. $dSecurityGroup$ - according to the PathMetaData table this is a required field and must consequently be provided by the caller or descriptor creator. It is part of the content item's metadata.

  4. $dDocAccount$ - this is mapped to the documentAccount algorithm, which takes dDocAccount and parses it into the standard content server account presentation with all the appropriate delimiters.

  5. /documents/ - a literal string.

  6. $dDocType$ - same as dSecurityGroup above.

  7. $dDocName$ - same as dSecurityGroup above.

  8. $RenditionSpecifier$ - provided by the renditionSpecifier algorithm, which is only of interest if the system is creating additional renditions, e.g. thumbnails. Otherwise, it returns an empty string.

  9. $RevisionLabel$ - provided by the revisionLabel algorithm, which, depending on the status of the content item, adds a ‘~dRevLabel’ to the path.

  10. $ExtensionSeparator$ - provided by the extensionSeparator algorithm, which by default just returns ‘.’.

  11. $dWebExtension$ - a required field for the web and weburl storage classes, passed in via the file parameters.
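The parsing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the content server's actual parser: the function names `tokenize` and `resolve`, the plain dictionaries standing in for the shared environment and the metadata, and all sample values are assumptions made for this example. Algorithm-backed fields such as RenditionSpecifier are simply passed in as already computed strings.

```python
import re

# One $...$ token; everything between tokens is treated as a literal string.
TOKEN = re.compile(r"\$([^$]+)\$")


def tokenize(expression):
    """Split a path expression into ("var", name) and ("lit", text) pieces."""
    pieces = []
    # re.split with one capture group alternates: literal, token, literal, ...
    for index, part in enumerate(TOKEN.split(expression)):
        if not part:
            continue  # skip empty literals between adjacent tokens
        pieces.append(("var" if index % 2 else "lit", part))
    return pieces


def resolve(expression, fields, env):
    """Build a path from `fields` and `env` (hypothetical stand-in dicts)."""
    out = []
    for kind, value in tokenize(expression):
        if kind == "lit":
            out.append(value)
        elif value.startswith("#env."):
            # "#env.X" is looked up in the shared environment.
            out.append(env[value[len("#env."):]])
        else:
            # A missing key raises KeyError, mimicking a missing required field.
            out.append(fields[value])
    return "".join(out)


WEB_EXPRESSION = (
    "$#env.WeblayoutDir$/groups/$dSecurityGroup$/$dDocAccount$/documents/"
    "$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$"
    "$ExtensionSeparator$$dWebExtension$"
)

# Sample values invented for the illustration.
sample_env = {"WeblayoutDir": "/u01/weblayout"}
sample_fields = {
    "dSecurityGroup": "public",
    "dDocAccount": "sales",
    "dDocType": "report",
    "dDocName": "doc001",
    "RenditionSpecifier": "",   # no extra rendition
    "RevisionLabel": "",        # no revision label added
    "ExtensionSeparator": ".",
    "dWebExtension": "pdf",
}

web_path = resolve(WEB_EXPRESSION, sample_fields, sample_env)
print(web_path)
```

Swapping the first token for $HttpWebRoot$ and supplying that value in `fields` would give the weburl variant in the same way.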

 

 

How to have a webless or optional web store:

The storage rule from above is now configured to have IsWeblessStore set to true and consequently the web-viewable file will not be created by default. However, if the document is processed through the IBR or WebForms or any other component that requires a web-viewable, the web file will be created. The location of the files is as above in the ‘standard’ configuration. However, since a file may not have a web rendition, the weburl path needs to be adjusted. Also, note the use of weburl.file. This is used to compute the URL when the web-viewable actually exists. The metadata field xWebFlag is used to determine how the file is to be served up in the browser.

 

@ResultSet StorageRules
4
StorageRule
StorageType
IsWeblessStore
RenditionsOnFileSystem
default
FileStorage
true

@end
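To make the layout of these ResultSet blocks easier to read, here is a small Python sketch that turns one block into rows. It assumes the layout used in the listings of this note: a `@ResultSet <name>` line, a column count, the column names, then one value per line (an empty line meaning an empty value), closed by `@end`. The function name `parse_resultset` is made up for this example, and the sketch ignores anything else a real configuration file may contain.

```python
def parse_resultset(lines):
    """Parse one '@ResultSet' block into (name, columns, rows)."""
    header = lines[0].split()
    if len(header) != 2 or header[0] != "@ResultSet":
        raise ValueError("expected '@ResultSet <name>'")
    name = header[1]
    count = int(lines[1])              # number of columns
    columns = lines[2:2 + count]       # column names, one per line
    rows = []
    pos = 2 + count
    while pos < len(lines) and not lines[pos].startswith("@end"):
        values = lines[pos:pos + count]
        if len(values) < count or any(v.startswith("@end") for v in values):
            raise ValueError("row is shorter than the declared column count")
        rows.append(dict(zip(columns, values)))
        pos += count
    return name, columns, rows


# The webless StorageRules listing; the empty line is the empty
# RenditionsOnFileSystem value.
STORAGE_RULES = """@ResultSet StorageRules
4
StorageRule
StorageType
IsWeblessStore
RenditionsOnFileSystem
default
FileStorage
true

@end""".split("\n")

name, columns, rows = parse_resultset(STORAGE_RULES)
for row in rows:
    print(name, row)
```

Reading a block this way also makes it easy to spot a row that has more or fewer values than the declared column count.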

 

@ResultSet PathConstruction
5
FileStore
PathExpression
AutoCreateLimit
IsWritable
StorageRule
vault
$#env.VaultDir$$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$
6
true
default
weburl.file
$HttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$
3
false
default
web
$#env.WeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$
3
true
default
@end

 

How to configure the files to be stored in the database:

To store files in the database, we need a storage rule that is of type JdbcStorage. By default, all content items belonging to this rule have their files stored in the database. However, even though the files are stored in the database, there is the presumption of an underlying file system and the system may need to temporarily cache a file on the file system. In particular, this may happen for indexing or for some conversions.

Note: a rule can be configured to always store renditions belonging to a given storage class on the file system. This is probably most useful for systems that want to store vault files in the database, but web files on the file system.

In the ‘default’ rule below, all files are stored in the database, while the ‘filesInWeb’ rule stores the vault files in the database and the web files on the file system. The path construction is as before.

 

@ResultSet StorageRules
4
StorageRule
StorageType
IsWeblessStore
RenditionsOnFileSystem
default
JdbcStorage


filesInWeb
JdbcStorage

web
@end
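The effect of the two rules above can be sketched as a small decision function in Python. This is a reading of the description in this note, not the provider's code: the function name `storage_location` is invented, and treating RenditionsOnFileSystem as a comma-separated list of storage classes is an assumption (the listing only shows the single value ‘web’).

```python
def storage_location(rule, storage_class):
    """Return "database" or "filesystem" for a rendition of `storage_class`."""
    if rule["StorageType"] != "JdbcStorage":
        return "filesystem"
    # Assumption: the field may name several storage classes, comma separated.
    on_file_system = {
        item.strip()
        for item in rule["RenditionsOnFileSystem"].split(",")
        if item.strip()
    }
    return "filesystem" if storage_class in on_file_system else "database"


default_rule = {
    "StorageRule": "default",
    "StorageType": "JdbcStorage",
    "IsWeblessStore": "",
    "RenditionsOnFileSystem": "",
}
files_in_web_rule = {
    "StorageRule": "filesInWeb",
    "StorageType": "JdbcStorage",
    "IsWeblessStore": "",
    "RenditionsOnFileSystem": "web",
}

for rule in (default_rule, files_in_web_rule):
    for storage_class in ("vault", "web"):
        print(rule["StorageRule"], storage_class, storage_location(rule, storage_class))
```

Temporary caching on the file system (for indexing or conversions) is not modelled here; the sketch only covers where the file is stored permanently.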

 

@ResultSet PathConstruction
5
FileStore
PathExpression
AutoCreateLimit
IsWritable
StorageRule
vault
$#env.VaultDir$$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$
6
true
default
weburl.file
$HttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$
3
true
default
web
$#env.WeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$
3
true
default
@end

 

Altered paths and algorithms at work:

Up to now, the examples have kept the file paths consistent with the standard configuration. However, for very large systems this is likely to result in directory saturation. Below are some examples to aid in file dispersion.

How to use Partitioning:

The file store provider makes it easy to use partitions to create a sparser directory structure. By default, the xPartitionId metadata field is used and becomes part of the revision's metadata. It is recommended that this field be hidden from the UI so that the partition selection algorithm determines the partition to use. The algorithm looks at all the active partitions and, as new content enters the system, assigns partitions in round-robin order. Each partition has an entry in the PartitionList table and can be declared active. The PartitionRoot is calculated from the xPartitionId, whose value is a lookup key into the PartitionList table. If no xPartitionId is specified, the system finds the next available active partition and uses its value for the location calculation. The xPartitionId is then stored as part of the content item's metadata.
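The selection behaviour described above can be illustrated with a short Python sketch. It is not the provider's implementation: the class name `PartitionSelector` and the PartitionName column are hypothetical, while PartitionRoot and IsActive are the names used in this note. The sample roots are invented.

```python
class PartitionSelector:
    """Round-robin choice among active partitions (illustrative only)."""

    def __init__(self, partition_list):
        # Each row: PartitionName (hypothetical key), PartitionRoot, IsActive.
        self._partitions = list(partition_list)
        self._counter = 0

    def select(self, x_partition_id=None):
        """Return (xPartitionId, PartitionRoot) for a content item."""
        if x_partition_id:
            # An explicit xPartitionId is a lookup key into the list.
            for row in self._partitions:
                if row["PartitionName"] == x_partition_id:
                    return x_partition_id, row["PartitionRoot"]
            raise KeyError(x_partition_id)
        active = [row for row in self._partitions if row["IsActive"]]
        if not active:
            raise RuntimeError("no active partition available")
        chosen = active[self._counter % len(active)]
        self._counter += 1
        return chosen["PartitionName"], chosen["PartitionRoot"]


selector = PartitionSelector([
    {"PartitionName": "p1", "PartitionRoot": "/mnt/store1", "IsActive": True},
    {"PartitionName": "p2", "PartitionRoot": "/mnt/store2", "IsActive": False},
    {"PartitionName": "p3", "PartitionRoot": "/mnt/store3", "IsActive": True},
])

picks = [selector.select() for _ in range(3)]
print(picks)
```

The returned identifier is what would be stored in xPartitionId, so later lookups resolve to the same root even after the partition has been made inactive.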

 

To use partition selection, define the vault storage class in the PathConstruction table as follows:

vault
$PartitionRoot$/$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$
6
true

 

If at any point the administrator feels a particular partition should no longer be open to contribution, the partition should be edited (via the partition UI) so that it is no longer active, i.e. IsActive is false.

 

How to limit the number of files in a directory:

Another way of dispersing files is to alter the path so that files get partitioned out by the dID of the content item. In the example below, the directories are limited to 10,000 files plus extra files for additional renditions.

 

Note the dID[-12:-10:0] in the path expression. This is interpreted as follows: take the characters starting 12 back from the end of the string up to the character 10 back from the end of the string, then pad the resulting string with ‘0’ characters to length 2, which is 12 - 10.

 

For example, if your path expression contains:

 

$dID[-12:-10:0]$/$dID[-10:-8:0]$/$dID[-8:-4:0]$

 

and dID is 1234567890, the result is 00/12/3456.
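The slicing rule can be checked with a short Python sketch. The function names `slice_and_pad` and `expand` are invented for this illustration, and padding on the left is an assumption: the text only says the result is padded with 0 characters, and the example above cannot distinguish left from right padding.

```python
import re

# Matches tokens of the form $name[-start:-end:pad]$.
SLICE_TOKEN = re.compile(r"\$(\w+)\[(-\d+):(-\d+):(.)\]\$")


def slice_and_pad(value, start, end, pad):
    """Take value[start:end] counted from the end, padded to end - start."""
    length = len(value)
    low = max(length + start, 0)    # clamp when the string is too short
    high = max(length + end, 0)
    width = end - start             # e.g. -10 - (-12) = 2
    return value[low:high].rjust(width, pad)   # left padding is an assumption


def expand(expression, fields):
    """Replace every slice token in `expression` using values from `fields`."""
    def replace(match):
        name, start, end, pad = match.groups()
        return slice_and_pad(str(fields[name]), int(start), int(end), pad)
    return SLICE_TOKEN.sub(replace, expression)


EXPRESSION = "$dID[-12:-10:0]$/$dID[-10:-8:0]$/$dID[-8:-4:0]$"
print(expand(EXPRESSION, {"dID": "1234567890"}))
```

Because the last four characters of dID never appear in the directory part, at most 10,000 consecutive dID values share one directory, which is where the limit mentioned above comes from.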

 

 

Additional Keywords:

file, database, webless, vaultless, storage, configuration, install, jdbc, file system

For More Information:

 

 

-----------------------------

Rev: April 27, 2007
