Java Fooling-Around Notes: Lucene 1. Creating a Simple Index and Querying It
Source: Internet  Editor: 程序博客网  Time: 2024/05/29 04:46
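First, create three txt files. I created in_the_end.txt, iridescent.txt and numb.txt (all Linkin Park songs; sadly, the lead singer passed away this year...), each containing the song's lyrics.
With those ready, create the index. The flow is roughly: read the files, split the text into terms with an analyzer, build the index from those terms, and save the index. Here is the code.
Index creation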
package cn;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Create a full-text index
public class Create {

    public static void main(String[] args) throws Exception {
        // Open the storage directory (where the index lives once it is created).
        // It plays a role similar to a database.
        // This directory will later contain .frq, .prx, .tim and other files.
        Directory directory = FSDirectory.open(
                new File(System.getProperty("user.dir") + File.separator + "dir"));

        // Create the analyzer (splits text into terms).
        // For Chinese text, this analyzer splits into single characters.
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);

        // Create the index writer configuration.
        // OpenMode.CREATE overwrites any index already in the directory.
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
        config.setOpenMode(OpenMode.CREATE);

        // Create the index writer
        IndexWriter iw = new IndexWriter(directory, config);

        // The folder holding the files to index
        File needIndex = new File(System.getProperty("user.dir")
                + File.separator + "src" + File.separator + "txt");
        for (File f : needIndex.listFiles()) {
            Document d = new Document(); // comparable to one record in a database
            System.out.println(f.getName());
            // A field and its value, like a column in a record: the field "name" holds the file name.
            // Store.YES keeps the original value in the index so it can be read back from a hit.
            d.add(new TextField("name", f.getName(), Store.YES));
            // The field "content" holds the whole text of the file.
            d.add(new TextField("content", readTxt(f), Store.YES));
            iw.addDocument(d);
            iw.commit(); // committing per document is simple but slow for large batches
        }
        iw.close();
    }

    // Read a UTF-8 text file into a single string.
    // Note: readLine() drops the line terminator and nothing is added back,
    // so the last word of one line is fused with the first word of the next.
    public static String readTxt(File file) {
        try {
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream(file), "UTF-8"));
            String str = null;
            StringBuilder sb = new StringBuilder();
            while ((str = br.readLine()) != null) {
                sb.append(str);
            }
            br.close();
            return sb.toString();
        } catch (Exception e) {
            e.printStackTrace();
            return "";
        }
    }
}
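To make the flow above (files, terms, index) a bit more concrete, here is a rough sketch of the idea behind an inverted index: a map from each term to the documents that contain it. This is only a conceptual illustration in plain Java, not how Lucene actually stores things. The class `InvertedIndexSketch`, its `tokenize` rule (lowercase, split on non-letters/digits) and the sample file names are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Conceptual sketch only: NOT Lucene's implementation or on-disk format.
public class InvertedIndexSketch {

    // term -> sorted set of names of the documents containing that term
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Hypothetical tokenizer: lowercase, then split on anything
    // that is not a letter or a digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Indexing": record, for every term of the content, which document it came from.
    void addDocument(String name, String content) {
        for (String term : tokenize(content)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(name);
        }
    }

    // Exact term lookup: the query term is NOT tokenized or lowercased.
    Set<String> lookup(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument("a.txt", "In the End, it doesn't matter");
        index.addDocument("b.txt", "I may end up failing");
        index.addDocument("c.txt", "nothing relevant here");

        System.out.println(index.lookup("end")); // [a.txt, b.txt]
        System.out.println(index.lookup("End")); // []
    }
}
```

The second lookup comes back empty because the indexed terms were lowercased while the query term was not. Lucene's `TermQuery` behaves in a similar way: it matches the term exactly as given and does not run it through the analyzer, so with `StandardAnalyzer` (which lowercases) the query term should be lowercase too.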
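The index is now created, and a bunch of files will have appeared in the dir folder. Next, query it: find out which txt files contain the word 'end'. Query code: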
package cn;

import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Query the index
public class Search {

    public static void main(String[] args) throws Exception {
        // Path of the directory that holds the index (the "dir" folder written by Create)
        Directory directory = FSDirectory.open(new File("G:\\eclipseworkspace\\lucence\\dir"));

        // Index reader
        IndexReader read = DirectoryReader.open(directory);

        // Index searcher
        IndexSearcher searcher = new IndexSearcher(read);

        // The query: look for the term 'end' in the field "content".
        // A TermQuery matches the term exactly; it is not run through the analyzer.
        Query query = new TermQuery(new Term("content", "end"));

        // Run the query and return the top 3 hits
        TopDocs top = searcher.search(query, 3);

        // totalHits is the total number of matching documents,
        // even though only the top 3 are returned.
        // Prints e.g. "得到2条记录" ("got 2 records").
        System.out.println("得到" + top.totalHits + "条记录");

        // The hits that were actually returned
        ScoreDoc[] scoreDocs = top.scoreDocs;

        // Iterate over the hits
        for (ScoreDoc scoreDoc : scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc); // load the stored record
            System.out.println(doc.get("name") + " " + doc.get("content")); // print its stored fields
        }

        read.close(); // close the reader
    }
}
OK, it's that simple. This is just a basic getting-started example.
Output:
得到2条记录
in_the_end.txt it starts with one thingi don t know whyit doesn t even matterhow hard you trykeep that in mindi designed this rhymeto explain in due timeall i knowtime is a valuable thingwatch it fly byas the pendulum swingswatch it count downto the end of the daythe clock ticks life awayit s so unrealdidn t look out belowwatch the time goright out the windowtrying to hold on,but didn t even knowwasted it all justto watch you goi kept everything inside andeven though i tried,it all fell apartwhat it meant to me willeventually be amemory of a time wheni tried so hardand got so farbut in the end it doesn t even matteri had to fallto lose it allbut in the end it doesn t even matterone thing,i don t know whyit doesn’t even matterhow hard you try,keep that in mindi designed this rhyme,to remind myself howi tried so hardin spite of the wayyou were mocking meacting like i waspart of your propertyremembering all thetimes you fought with mei m surprised it got so (far)things aren t the waythey were beforeyou wouldn t evenrecognise me anymorenot that youknew me back thenbut it all comesback to me (in the end)you kept everything insideand even though i tried,it all fell apartwhat it meant to me willeventually be amemory of a time when ii tried so hardand got so farbut in the endit doesn t even matteri had to fallto lose it allbut in the endit doesn t even matteri ve put my trust in youpushed as far as i can gofor all thisthere s only one thing you should knowi ve put my trust in youpushed as far as i can gofor all thisthere s only one thing you should knowi tried so hardand got so farbut in the endit doesn t even matteri had to fallto lose it allbut in the endit doesn t even matter
numb.txt i'm tired of being what you want me to be feeling so faithless lost under the surface don't know what you're expecting of me put under the pressure of walking in your shoes (caught in the undertow just caught in the undertow) every step i take is another mistake to you (caught in the undertow just caught in the undertow) i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do is be more like me and be less like you can't you see that you're smothering me holding too tightly afraid to lose control cause everything that you thought i would be has fallen apart right in front of you (caught in the undertow just caught in the undertow) every step that i take is another mistake to you (caught in the undertow just caught in the undertow) and every second i waste is more than i can take i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do is be more like me and be less like you and i know i may end up failing too but i know you were just like me with someone disappointed in you i've become so numb i can't feel you there i've become so tired so much more aware i've becoming this all i want to do is be more like me and be less like you i've become so numb is everything what you want me to be i've become so numb is everything what you want me to be
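One detail visible in the first result: words at line boundaries are fused ("thingi", "whyit", "endit"). This most likely comes from the file-reading loop, which appends each line without putting a separator back. A small stand-alone sketch of the effect follows; the class `JoinLinesSketch` and its `join` helper are made up for illustration and only mimic that loop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative sketch: mimics a readLine()/append loop on an in-memory string.
public class JoinLinesSketch {

    // Read the text line by line and append each line followed by `separator`.
    // readLine() strips the line terminator, so with an empty separator
    // adjacent lines are glued together.
    static String join(String text, String separator) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(separator);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lyrics = "it starts with one thing\ni don t know why";

        // No separator: the last word of line 1 fuses with the first word of line 2.
        System.out.println(join(lyrics, ""));  // it starts with one thingi don t know why

        // A space after each line keeps the words apart (leaves one trailing space).
        System.out.println(join(lyrics, " "));
    }
}
```

This matters for searching as well as for display: a fused token such as "endit" is indexed as a single term, so a query for the exact term "end" will not match that occurrence. Appending a space or "\n" after each line in the reading loop avoids the problem.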