Lucene Study Notes (Excerpted and Organized)


Lucene in Action Reading Notes, Part 1

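　　I have recently been learning Lucene. With no tutorial at hand, all I had was an old book plus a downloaded source archive of Lucene-2.2.0. I tried working through the book's examples, but they kept failing, which was very frustrating. It took a whole morning of fiddling before I realized that the API in the source package is 2.2, while the book still covers 1.4. Much of what I found by searching was also written for Lucene-1.4.3, so I decided to excerpt and organize a set of notes, partly so that I do not forget, and partly in the hope that they help anyone who reads this post.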

Let's start with a comparison between Lucene 1.4.3 and Lucene 2.2.0.

　This is the book's code for building the index:

Document doc = new Document();
doc.add(Field.Text("contents", new FileReader(f)));         // <---- here
doc.add(Field.Keyword("filename", f.getCanonicalPath()));   // <---- here

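　　The indexing API had already changed by version 2.0. If you naively keep following the book, the two marked lines are bound to fail to compile.
　　Change them to this: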

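Document doc = new Document();
// Reader-based field: not stored, indexed, tokenized
doc.add(new Field("contents", new FileReader(f)));
// Replacement for Field.Keyword: stored, indexed, not tokenized
doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));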

                 Handling
Field type       Stored   Indexed   Tokenized
Keyword          Y        Y         N
UnIndexed        Y        N         N
UnStored         N        Y         Y
Text: String     Y        Y         Y
Text: Reader     N        Y         Y

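
To make the Stored, Indexed and Tokenized columns concrete, here is a minimal toy sketch in plain Java. It uses no Lucene classes at all: ToyIndex, addField, search and stored are hypothetical names made up for illustration, and the lower-case-and-split-on-whitespace step is only a stand-in for a real Analyzer. It just shows why an untokenized field matches only its whole value, and why an unstored field cannot be read back.

```java
import java.util.*;

/** Toy model of the Stored / Indexed / Tokenized flags; it uses no Lucene classes. */
class ToyIndex {
    // "field:term" -> ids of the documents containing that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // "docId/field" -> original value, kept only for stored fields
    private final Map<String, String> storedValues = new HashMap<>();

    void addField(int docId, String field, String value,
                  boolean stored, boolean indexed, boolean tokenized) {
        if (stored) {
            storedValues.put(docId + "/" + field, value);
        }
        if (!indexed) {
            return; // not searchable at all
        }
        // Stand-in for an Analyzer: lower-case and split on whitespace.
        // An untokenized field is indexed as one single, unmodified term.
        String[] terms = tokenized
                ? value.toLowerCase(Locale.ROOT).trim().split("\\s+")
                : new String[] { value };
        for (String term : terms) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
        }
    }

    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.<Integer>emptySet());
    }

    String stored(int docId, String field) {
        return storedValues.get(docId + "/" + field);
    }
}

class ToyIndexDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        // Keyword-like field: stored, indexed, not tokenized
        index.addField(1, "filename", "/docs/Lucene Notes.txt", true, true, false);
        // UnStored-like field: not stored, indexed, tokenized
        index.addField(1, "contents", "Lucene in Action", false, true, true);

        System.out.println(index.search("contents", "lucene"));                 // [1]
        System.out.println(index.search("filename", "lucene"));                 // []
        System.out.println(index.search("filename", "/docs/Lucene Notes.txt")); // [1]
        System.out.println(index.stored(1, "contents"));                        // null
    }
}
```

The design is deliberately naive (one map for postings, one for stored values); real Lucene indexes are organized very differently, so treat this only as a way to read the table.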
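Comparison of Field method changes between Lucene 1.4.3 and Lucene 2.0.0

1.4.3 method: Keyword(String name, String value)
2.2.0 replacement: Field(String name, String value, Field.Store.YES, Field.Index.UN_TOKENIZED)
Purpose: stored, indexed, not tokenized. Used for URI-like values (for example the date field of MSN chat logs, or the full path of an MP3 file).

1.4.3 method: UnIndexed(String name, String value)
2.2.0 replacement: Field(String name, String value, Field.Store.YES, Field.Index.NO)
Purpose: stored, not indexed, not tokenized. For example, the full path of a file.

1.4.3 method: UnStored(String name, String value)
2.2.0 replacement: Field(String name, String value, Field.Store.NO, Field.Index.TOKENIZED)
Purpose: not stored, indexed, tokenized. For example, the body of an HTML page or the content of a Word document. This content needs to be indexed, but because it is usually large there is no need to store it as well; it can be fetched again later through the URI. So this kind of field is only tokenized and indexed, not stored.

1.4.3 method: Text(String name, String value)
2.2.0 replacement: Field(String name, String value, Field.Store.YES, Field.Index.TOKENIZED)
Purpose: stored, indexed, tokenized. For example, the various attributes of a file, such as the artist or album of an MP3 file.

1.4.3 method: Text(String name, Reader value)
2.2.0 replacement: Field(String name, Reader reader)
Purpose: not stored, indexed, tokenized.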
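
The mapping in the comparison above can be summed up in a small plain-Java sketch. It does not depend on Lucene: the enum LegacyFieldType and its methods storeOption and indexOption are hypothetical names invented here, and the strings they return are simply the names of the Field.Store and Field.Index constants listed above. Note that for the Reader variant the 2.x constructor takes no Store or Index argument, so the values shown for it only describe its behavior.

```java
/** Plain-Java summary of the 1.4.3 -> 2.x Field mapping; it uses no Lucene classes. */
enum LegacyFieldType {
    //          stored indexed tokenized
    KEYWORD    (true,  true,  false),
    UNINDEXED  (true,  false, false),
    UNSTORED   (false, true,  true),
    TEXT_STRING(true,  true,  true),
    TEXT_READER(false, true,  true); // 2.x: Field(String name, Reader reader), no flags passed

    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    LegacyFieldType(boolean stored, boolean indexed, boolean tokenized) {
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }

    /** Name of the Field.Store constant matching the Stored column. */
    String storeOption() {
        return stored ? "Field.Store.YES" : "Field.Store.NO";
    }

    /** Name of the Field.Index constant matching the Indexed and Tokenized columns. */
    String indexOption() {
        if (!indexed) {
            return "Field.Index.NO";
        }
        return tokenized ? "Field.Index.TOKENIZED" : "Field.Index.UN_TOKENIZED";
    }
}

class FieldOptionsDemo {
    public static void main(String[] args) {
        // Print one line per legacy factory method with its 2.x options.
        for (LegacyFieldType type : LegacyFieldType.values()) {
            System.out.println(type + " -> " + type.storeOption() + ", " + type.indexOption());
        }
    }
}
```

Running FieldOptionsDemo prints one line per old factory method, which makes a handy checklist when porting the book's examples.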

For more details, see:
　　http://blog.chinaunix.net/u1/42750/showart_350905.html
　　That author explains it very clearly. Beyond that, consult the API documentation often.

　　The search code, as given in the book:

Directory fsDir = FSDirectory.getDirectory(indexDir, false);
IndexSearcher is = new IndexSearcher(fsDir);
Query query = new QueryParser("contents", new StandardAnalyzer()).parse(q);

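　　According to the API documentation, the FSDirectory.getDirectory(indexDir, false) call has long been deprecated. The newer approach looks even lazier: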

IndexSearcher is = new IndexSearcher(IndexReader.open(Const.INDEX_DIR));
Query query = new QueryParser("contents", new StandardAnalyzer()).parse(q);

In the next post I plan to study how Lucene works.