[Translated from a MOS article] How To Index a Microsoft (Office) Word 2007 Document?

Source: Internet. Time: 2024/05/16 18:55


Original: How To Index a Microsoft (Office) Word Document 2007 ? (Doc ID 752710.1)

Applies to:
Oracle Text - Version: 11.1.0.7 to 11.2.0.3 - Release: 11.1 to 11.2
Information in this document applies to any platform.

Goal
This note explains how to index a BLOB column of a table that contains Microsoft Word 2007 documents (the new Microsoft format, DOCX).

Starting with Oracle Database 11.1.0.7, Oracle Text uses Oracle Outside In HTML Export technology for document filtering. This technology replaces the filtering technology that Autonomy Inc. licensed to Oracle. (Translator's note: Oracle Outside In HTML Export belongs to the following Oracle product line: Middleware > Content Management > Oracle Outside In Technology.)
As a result, Microsoft (Office) Word 2007 documents can be indexed in Oracle Database 11.1.0.7 and later.
Refer to Appendix B of the Oracle Text Reference for the complete list of document formats supported by the filter in 11.1.0.7.

Oracle Text Reference 11g Release 1 (11.1)
Part Number B28304-03
http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/afilsupt.htm#i634493
B.2 Supported Document Formats
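
Because DOCX filtering depends on the database release, it can help to confirm the version before going further. The queries below are a minimal sketch, not part of the original note; they assume a connection as a privileged user (for example SYSTEM), since DBA_REGISTRY is not visible to ordinary accounts.

```sql
-- Database release: expect 11.1.0.7 or later for DOCX filtering.
SELECT product, version
  FROM product_component_version
 WHERE product LIKE 'Oracle Database%';

-- Oracle Text is registered under comp_id 'CONTEXT'; status should be VALID.
SELECT comp_name, version, status
  FROM dba_registry
 WHERE comp_id = 'CONTEXT';
```

If the second query returns no rows, Oracle Text is probably not installed in the database and the steps in this note will not work as described.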


Solution:

Follow the steps below to make a Microsoft Word 2007 document searchable.

Step 1 - Within the /tmp directory place all the files to be used from this note.

docx1.sql
docx2.sql
test.txt
test.docx


-- Translator's note: the four files above have been uploaded to CSDN resources at the following address:
http://download.csdn.net/download/msdnchina/9480052


Step 2 - Create the necessary schema and privileges

connect system/manager       or as any privileged user...

create user testdocx identified by testdocx;
grant connect, resource, create any directory to testdocx;
connect testdocx/testdocx


Step 3 - Create the necessary objects (refer to the docx1.sql script)...

SQL> @/tmp/docx1.sql
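
The contents of docx1.sql are not reproduced in this note. As a rough illustration only, a script of this kind would typically create a directory object, load the files into a BLOB column, and build an Oracle Text CONTEXT index. In the sketch below, the names docx_dir, docx_tab and docx_idx are hypothetical, and loading both test.txt and test.docx is an assumption; the real script may differ.

```sql
-- Hypothetical sketch; the actual docx1.sql may use other names and steps.
-- Directory object pointing at the location used in Step 1.
CREATE OR REPLACE DIRECTORY docx_dir AS '/tmp';

CREATE TABLE docx_tab (
  id        NUMBER PRIMARY KEY,
  file_name VARCHAR2(100),
  doc       BLOB
);

DECLARE
  l_files SYS.ODCIVARCHAR2LIST := SYS.ODCIVARCHAR2LIST('test.txt', 'test.docx');
  l_bfile BFILE;
  l_blob  BLOB;
BEGIN
  FOR i IN 1 .. l_files.COUNT LOOP
    -- Insert an empty LOB and get its locator back for loading.
    INSERT INTO docx_tab (id, file_name, doc)
    VALUES (i, l_files(i), EMPTY_BLOB())
    RETURNING doc INTO l_blob;

    -- Copy the whole operating system file into the BLOB.
    l_bfile := BFILENAME('DOCX_DIR', l_files(i));
    DBMS_LOB.FILEOPEN(l_bfile, DBMS_LOB.FILE_READONLY);
    DBMS_LOB.LOADFROMFILE(l_blob, l_bfile, DBMS_LOB.GETLENGTH(l_bfile));
    DBMS_LOB.FILECLOSE(l_bfile);
  END LOOP;
  COMMIT;
END;
/

-- AUTO_FILTER detects the binary format (including DOCX) and extracts text.
CREATE INDEX docx_idx ON docx_tab (doc)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('FILTER CTXSYS.AUTO_FILTER');
```

The operating system account that runs the database must be able to read the files in /tmp, otherwise the file open call fails.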

 

Step 4 - Check a couple of terms inside the documents (refer to the docx2.sql script) ...

SQL> @/tmp/docx2.sql
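
The contents of docx2.sql are likewise not shown here. A check of this kind usually comes down to a CONTAINS query against the indexed column. The sketch below assumes a hypothetical table docx_tab(id, file_name, doc BLOB) with a CONTEXT index on doc, and uses 'oracle' as a placeholder search term; substitute a word known to appear in test.docx.

```sql
-- Rows here point to documents that failed filtering or indexing.
SELECT err_index_name, err_textkey, err_text
  FROM ctx_user_index_errors;

-- Search the indexed BLOB column; the label 1 ties SCORE to this CONTAINS call.
SELECT id, file_name, SCORE(1) AS relevance
  FROM docx_tab
 WHERE CONTAINS(doc, 'oracle', 1) > 0
 ORDER BY SCORE(1) DESC;
```

If the row for test.docx is returned, the DOCX file was filtered and indexed. If it is missing, the error view in the first query is the first place to look.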



 
