Learning to Use Lucene 5.3.1 (Part 1)


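Steps for creating an index

1. Create a Directory backed by a folder on disk. FSDirectory.open() now takes a java.nio.file.Path rather than a File, so the code needs one extra statement to convert the File to a Path. This folder is where the index lives: if it already exists it is opened, otherwise it is created.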

File file = new File("D:\\index"); // index directory; the search code opens the same path

Path path = file.toPath();

Directory dir = FSDirectory.open(path);
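The File-to-Path conversion in step 1 can also be skipped by building the Path directly. The sketch below uses only the JDK (no Lucene classes), and the class name IndexPathDemo and the location "index-dir" are placeholders chosen for this example. It shows that both routes yield the same Path, so either result could be handed to FSDirectory.open.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

class IndexPathDemo {
    // Converts a legacy java.io.File into a java.nio.file.Path.
    static Path fromFile(String location) {
        return new File(location).toPath();
    }

    // Builds the Path directly, without an intermediate File object.
    static Path direct(String location) {
        return Paths.get(location);
    }

    public static void main(String[] args) {
        // "index-dir" is a placeholder location for this example.
        Path a = fromFile("index-dir");
        Path b = direct("index-dir");
        System.out.println(a + " equals " + b + ": " + a.equals(b));
    }
}
```

If the rest of the program never needs the File object, Paths.get saves one line.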

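2. Create an IndexWriter on that index directory. Before doing so, build the configuration it needs, an IndexWriterConfig, and pass it the analyzer to use: StandardAnalyzer or another one. Note that OpenMode.CREATE creates a new index and overwrites any index that already exists in the directory.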

Analyzer luceneAnalyzer = new StandardAnalyzer();
IndexWriterConfig iwc = new IndexWriterConfig(luceneAnalyzer);
iwc.setOpenMode(OpenMode.CREATE);
IndexWriter iw = new IndexWriter(dir, iwc);

3. Create a Document and add fields (Field) to it.

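Document d = new Document();
Reader txtR = new FileReader(dataFiles[i]); // dataFiles[i] is the file currently being indexed
// The old Field.Index constants are not part of the Lucene 5.x API; StringField indexes the value as one untokenized term
Field fp = new StringField("path", dataFiles[i].getPath(), Field.Store.YES); // store the file path
// TextField tokenizes the reader's content with the analyzer; the content itself is not stored
Field fb = new TextField("content", txtR); // index the file content
d.add(fp);
d.add(fb);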
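The snippet in step 3 uses dataFiles[i] without showing where the array comes from. As a minimal sketch, assuming the files to index are .txt files sitting directly inside one folder, the array could be built as follows. The class name DataFilesDemo, the helper listTextFiles, and the .txt filter are all assumptions made for this example, not part of the original code.

```java
import java.io.File;
import java.util.Arrays;

class DataFilesDemo {
    // Returns the regular files ending in ".txt" directly inside dataDir, sorted by path.
    // Returns an empty array if dataDir does not exist or is not a directory.
    static File[] listTextFiles(File dataDir) {
        File[] found = dataDir.listFiles(f -> f.isFile() && f.getName().endsWith(".txt"));
        if (found == null) {
            return new File[0];
        }
        Arrays.sort(found);
        return found;
    }

    public static void main(String[] args) {
        // Uses the first argument as the data folder, or the current folder if none is given.
        File[] dataFiles = listTextFiles(new File(args.length > 0 ? args[0] : "."));
        for (int i = 0; i < dataFiles.length; i++) {
            System.out.println(dataFiles[i].getPath());
        }
    }
}
```

Steps 3 and 4 would then run once per element inside the loop over dataFiles.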

4. Finally, add the finished Document to the IndexWriter, which indexes it with the configured analyzer.

iw.addDocument(d);

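Last of all, close the IndexWriter. Why close it?

Reading and writing an index both consume significant resources, and memory use grows when the files being processed are very large. Closing the IndexWriter releases those resources; by default it also commits any pending changes and releases the write lock on the index directory. On the other hand, opening and closing are themselves expensive, so if the index will be read or written many times, keep the reader or writer open and reuse it. In that case, call commit() after updating the index so the changes are persisted. In short, tie the lifetime of the index to the lifetime of the application: open it when the application starts and close it when the application shuts down.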
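One way to make sure a writer is always closed, even when indexing fails halfway, is Java's try-with-resources statement. The sketch below demonstrates the pattern with a hypothetical stand-in class, BufferingWriter, which only records the calls it receives; it is not Lucene's IndexWriter. Because IndexWriter implements java.io.Closeable, the same statement shape should apply to it.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

class CloseDemo {
    // Hypothetical stand-in for a writer; it only logs the calls made on it.
    static class BufferingWriter implements Closeable {
        private final List<String> log;

        BufferingWriter(List<String> log) {
            this.log = log;
        }

        void add(String doc) {
            log.add("add " + doc);
        }

        void commit() {
            log.add("commit");
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Runs one indexing pass and returns the sequence of calls that happened.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (BufferingWriter writer = new BufferingWriter(log)) {
            writer.add("doc1");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            writer.commit();
        } catch (IllegalStateException e) {
            // The writer has already been closed by the time this handler runs.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

For a long-running application that keeps one writer open, the try block would wrap the application's whole lifetime rather than a single indexing pass.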

Steps for searching the index

1. Create a Directory that opens the index directory.

File file = new File("D:\\index");
Path path = file.toPath();
Directory dir = FSDirectory.open(path);

2. Create an IndexReader to read the directory. In practice the reader is obtained by calling DirectoryReader.open on the directory.

IndexReader ir = DirectoryReader.open(dir);

3. Create an IndexSearcher to run queries against the index.

IndexSearcher is = new IndexSearcher(ir);

4. Build the Query, specifying the field to search and the text to search for.

QueryParser queryparse = new QueryParser("content", new StandardAnalyzer());
Query query = queryparse.parse("学习");

5. Use the IndexSearcher to run the query.

TopDocs topDoc = is.search(query, 1); // the second argument is the maximum number of hits to return

6. The hits come back ranked by score, most relevant first. Load each hit into a Document and print the field you need with get().

ScoreDoc[] scoreDocs = topDoc.scoreDocs;
for (ScoreDoc scoreDoc : scoreDocs) {
    Document document = is.doc(scoreDoc.doc);
    System.out.println(document.get("path"));
}
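To make the role of the second argument to search() concrete, here is a small sketch of one common way to keep only the n best-scoring hits out of many candidates. It is plain Java and is not Lucene's code; the classes TopNDemo and Hit are hypothetical stand-ins for this example, with Hit playing a role similar to ScoreDoc (a document number plus a score).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopNDemo {
    // Hypothetical stand-in for a scored hit: a document number and its score.
    static class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Returns at most n hits with the highest scores, best first.
    static List<Hit> topN(List<Hit> hits, int n) {
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap on score: the weakest retained hit sits at the head.
        PriorityQueue<Hit> heap = new PriorityQueue<>(byScore);
        for (Hit h : hits) {
            heap.offer(h);
            if (heap.size() > n) {
                heap.poll(); // drop the lowest-scoring hit seen so far
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(byScore.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(0, 0.2f));
        hits.add(new Hit(1, 0.9f));
        hits.add(new Hit(2, 0.5f));
        for (Hit h : topN(hits, 2)) {
            System.out.println(h.doc + " " + h.score);
        }
    }
}
```

With n set to 1, as in step 5, only the single best hit would be returned.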

0 0