Java Tinkering Notes: Lucene 1, Simple Index Creation and Search

Source: Internet  Editor: 程序博客网  Time: 2024/05/29 04:46
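These notes cover the simplest possible Lucene index creation and search (I won't explain what Lucene is here).
First, create three txt files. Mine are in_the_end.txt, iridescent.txt and numb.txt (all Linkin Park songs; sadly, the lead singer passed away this year), each containing the song's lyrics.
With those ready, we can build the index. The flow is roughly: read each file, let the analyzer split the text into terms, build the index from those terms, and save the index.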
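Before the Lucene code, the flow above can be sketched in plain Java as a toy inverted index. This is only an illustration of the idea, not how Lucene is implemented: the class `ToyInvertedIndex`, its methods, and the sample texts are all made up for this sketch, and the "analyzer" here is just lowercasing plus splitting on non-alphanumeric characters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index (hypothetical, for illustration only; not Lucene's implementation).
class ToyInvertedIndex {
    // term -> names of the documents that contain it
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // "Analyzer" step: lowercase the text, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Indexing" step: record, for every term, which document it came from.
    void add(String name, String content) {
        for (String term : tokenize(content)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(name);
        }
    }

    // "Search" step: exact lookup of a single term; the query itself is not tokenized.
    Set<String> search(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("a.txt", "In the end, it matters");
        index.add("b.txt", "Nothing ends here");
        index.add("c.txt", "The END is near");
        System.out.println(index.search("end")); // [a.txt, c.txt]
        System.out.println(index.search("End")); // [] because indexed terms were lowercased
    }
}
```

The lookup only finds whole terms exactly as they were stored, which is why "ends" in b.txt does not match "end" and why the capitalized query "End" finds nothing.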

Creating the index


package cn;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Builds a full-text index (Lucene 4.0 API)
public class Create {

    public static void main(String[] args) throws Exception {
        // Directory where the index is stored; think of it as the database.
        // After indexing it will contain files such as .frq, .prx and .tim.
        Directory directory = FSDirectory.open(new File(System.getProperty("user.dir") + File.separator + "dir"));

        // Analyzer (splits text into terms); for Chinese text this one emits single characters
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // Index writer configuration
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        config.setOpenMode(OpenMode.CREATE); // overwrite any existing index

        // Index writer
        IndexWriter iw = new IndexWriter(directory, config);

        // Folder holding the files to index
        File needIndex = new File(System.getProperty("user.dir") + File.separator + "src" + File.separator + "txt");
        for (File f : needIndex.listFiles()) {
            Document d = new Document(); // like one row in a database table
            System.out.println(f.getName());
            // Each field is like a column of that row: a field name plus its value.
            // Store.YES keeps the original value in the index so it can be read back from search results.
            d.add(new TextField("name", f.getName(), Store.YES));    // field "name": the txt file name
            d.add(new TextField("content", readTxt(f), Store.YES)); // field "content": the full text of the file
            iw.addDocument(d);
        }
        iw.commit(); // one commit after all documents have been added
        iw.close();
    }

    // Reads the whole file as UTF-8 text; returns "" if reading fails.
    public static String readTxt(File file) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                // Lines are joined with no separator, so the last word of one line runs into
                // the first word of the next; append a space here if that matters.
                sb.append(line);
            }
            return sb.toString();
        } catch (Exception e) {
            e.printStackTrace();
            return "";
        }
    }
}
The index has now been created, and a number of files appear in the dir folder. Next, run a query: look up which txt files contain the word 'end'.
Search code:


package cn;

import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Searches the index
public class Search {

    public static void main(String[] args) throws Exception {
        // Path of the index directory; it must point at the index written by Create
        Directory directory = FSDirectory.open(new File("G:\\eclipseworkspace\\lucence\\dir"));

        // Index reader
        IndexReader read = DirectoryReader.open(directory);

        // Index searcher
        IndexSearcher searcher = new IndexSearcher(read);

        // Query: look for the term "end" in the field "content".
        // A TermQuery is not analyzed, so the term must match an indexed term exactly
        // (StandardAnalyzer lowercases terms when indexing).
        Query query = new TermQuery(new Term("content", "end"));

        // Run the query and return the top 3 hits
        TopDocs top = searcher.search(query, 3);
        // totalHits is the number of all matching documents, even though only the top 3 are returned
        System.out.println("得到" + top.totalHits + "条记录"); // prints "Found N records" in Chinese

        // The matching documents
        ScoreDoc[] scoreDocs = top.scoreDocs;
        for (ScoreDoc scoreDoc : scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc); // load the stored document
            System.out.println(doc.get("name") + "      " + doc.get("content")); // print its stored fields
        }

        read.close(); // close the reader
    }
}

OK, that's all there is to it; this is just a simple getting-started example.
Output:


得到2条记录
in_the_end.txt      it starts with one thingi don t know whyit doesn t even matterhow hard you trykeep that in mindi designed this rhymeto explain in due timeall i knowtime is a valuable thingwatch it fly byas the pendulum swingswatch it count downto the end of the daythe clock ticks life awayit s so unrealdidn t look out belowwatch the time goright out the windowtrying to hold on,but didn t even knowwasted it all justto watch you goi kept everything inside andeven though i tried,it all fell apartwhat it meant to me willeventually be amemory of a time wheni tried so hardand got so farbut in the end it doesn t even matteri had to fallto lose it allbut in the end it doesn t even matterone thing,i don t know whyit doesn’t even matterhow hard you try,keep that in mindi designed this rhyme,to remind myself howi tried so hardin spite of the wayyou were mocking meacting like i waspart of your propertyremembering all thetimes you fought with mei m surprised it got so (far)things aren t the waythey were beforeyou wouldn t evenrecognise me anymorenot that youknew me back thenbut it all comesback to me (in the end)you kept everything insideand even though i tried,it all fell apartwhat it meant to me willeventually be amemory of a time when ii tried so hardand got so farbut in the endit doesn t even matteri had to fallto lose it allbut in the endit doesn t even matteri ve put my trust in youpushed as far as i can gofor all thisthere s only one thing you should knowi ve put my trust in youpushed as far as i can gofor all thisthere s only one thing you should knowi tried so hardand got so farbut in the endit doesn t even matteri had to fallto lose it allbut in the endit doesn t even matter


numb.txt      i'm tired of being what you want me to be feeling so faithless lost under the surface don't know what you're expecting of me put under the pressure of walking in your shoes (caught in the undertow just caught in the undertow) every step i take is another mistake to you (caught in the undertow just caught in the undertow) i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do is be more like me and be less like you can't you see that you're smothering me holding too tightly afraid to lose control cause everything that you thought i would be has fallen apart right in front of you (caught in the undertow just caught in the undertow) every step that i take is another mistake to you (caught in the undertow just caught in the undertow) and every second i waste is more than i can take i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do is be more like me and be less like you and i know i may end up failing too but i know you were just like me with someone disappointed in you i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do  is be more like me and be less like you i've become so numb is everything what you want me to be i've become so numb is everything what you want me to be 
