Lucene Learning Workflow

Source: Internet  Editor: 程序博客网  Date: 2024/05/10 00:02
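// Steps to create the index

1. Open the directory where the index will be stored and create the writer. OpenMode.CREATE overwrites any index that already exists there.

    IndexWriter indexWriter = new IndexWriter(
            FSDirectory.open(new File("C:\\suoyin")),
            new IndexWriterConfig(Version.LUCENE_36, new IKAnalyzer())
                    .setOpenMode(OpenMode.CREATE));

2. List the files in the local source folder.

    File[] textFiles = new File("C:\\source").listFiles();

3. Loop over the files and keep only regular files whose names end in .txt.

    if (textFiles[i].isFile() && textFiles[i].getName().endsWith(".txt"))

4. For each match, create a new Document and read the file's text, decoding it as GBK.

    Document document = new Document();
    String temp = FileReaderAll(textFiles[i].getCanonicalPath(), "GBK");

5. Create a Field that holds the text and add it to the document.

    Field FieldBody = new Field("body", temp, Field.Store.YES,
            Field.Index.ANALYZED,
            Field.TermVector.WITH_POSITIONS_OFFSETS);
    document.add(FieldBody);

6. Finally, write the document into the index folder.

    indexWriter.addDocument(document);

***** Note: once the loop has finished, the indexWriter must be closed *****

    indexWriter.close(); // without this the buffered documents are never committed, so the index is not built

// Method that reads a file's text content

    public static String FileReaderAll(String FileName, String charset)
            throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(FileName), charset));
        // StringBuilder avoids copying the whole text on every appended line
        StringBuilder temp = new StringBuilder();
        try {
            String line;
            // readLine() strips line terminators, so lines are joined without separators
            while ((line = reader.readLine()) != null) {
                temp.append(line);
            }
        } finally {
            reader.close(); // close the reader even if reading fails
        }
        return temp.toString();
    }

// Steps to search the index

1. Open the index for reading.

    IndexReader indexReader = IndexReader.open(FSDirectory.open(new File("C:\\suoyin")));

2. Create the searcher.

    IndexSearcher indexSearcher = new IndexSearcher(indexReader);

3. Create the query parser (the IK analyzer is used here, the same one used for indexing).

    QueryParser query = new QueryParser(Version.LUCENE_36, "body", new IKAnalyzer());

4. Parse the search keyword.

    Query parse = query.parse("秋香");

5. Run the search, returning at most 100 hits. The limit can be changed.

    TopDocs search = indexSearcher.search(parse, 100);

6. Get the result set.

    ScoreDoc[] scoreDocs = search.scoreDocs;

7. Loop over the hits and print them.

    for (int i = 0; i < scoreDocs.length; i++) {
        int doc = scoreDocs[i].doc;
        Document doc2 = indexSearcher.doc(doc);
        // Highlighted output; needs the highlighter from the highlight section and an Analyzer instance named analyzer
        // TokenStream tokenStream = analyzer.tokenStream("", new StringReader(doc2.get("body")));
        // String str = highlighter.getBestFragment(tokenStream, doc2.get("body"));
        // System.out.println("Content: " + str);
        System.out.println("Content: " + doc2.get("body"));
    }

8. Close the index reader.

    indexReader.close();

// How to highlight results
// Highlight setup

Set the highlight format, that is, the prefix and suffix wrapped around each highlighted term.

    SimpleHTMLFormatter simpleHtmlFormatter = new SimpleHTMLFormatter(
            "<font color='red'><strong>", "</strong></font>");
    Highlighter highlighter = new Highlighter(simpleHtmlFormatter, new QueryScorer(parse));

Set the number of characters in each returned fragment. Search engines do not display the full text of every hit, and here too only part of the text is shown.

    highlighter.setTextFragmenter(new SimpleFragmenter(150));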
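
To make the effect of the highlight settings concrete, here is a minimal plain-Java sketch of what the formatter and the fragmenter do together. It is not Lucene's Highlighter and involves no analyzer or relevance scoring: it cuts the text into fixed-size pieces (roughly what SimpleFragmenter does), picks the piece with the most literal occurrences of the term, and wraps each occurrence in the prefix and suffix. The class HighlightSketch and the method bestFragment are hypothetical names introduced only for this illustration.

```java
public class HighlightSketch {

    /**
     * Hypothetical helper, not part of Lucene. Splits text into pieces of
     * fragmentSize characters, picks the piece containing the most literal
     * occurrences of term, and wraps each occurrence in pre and post.
     * Returns null when the term does not occur in any piece.
     */
    public static String bestFragment(String text, String term,
                                      int fragmentSize, String pre, String post) {
        if (term.isEmpty() || fragmentSize <= 0) {
            throw new IllegalArgumentException(
                    "term must be non-empty and fragmentSize positive");
        }
        String best = null;
        int bestCount = 0;
        for (int start = 0; start < text.length(); start += fragmentSize) {
            int end = Math.min(start + fragmentSize, text.length());
            String fragment = text.substring(start, end);
            int count = countOccurrences(fragment, term);
            if (count > bestCount) { // keep the first fragment with the highest count
                bestCount = count;
                best = fragment;
            }
        }
        // String.replace substitutes every literal, non-overlapping occurrence
        return best == null ? null : best.replace(term, pre + term + post);
    }

    /** Counts non-overlapping literal occurrences of term in s. */
    private static int countOccurrences(String s, String term) {
        int count = 0;
        int from = s.indexOf(term);
        while (from >= 0) {
            count++;
            from = s.indexOf(term, from + term.length());
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Lucene builds the index; lucene queries are parsed, "
                + "and every lucene hit can be highlighted.";
        System.out.println(bestFragment(text, "lucene", 150,
                "<font color='red'><strong>", "</strong></font>"));
    }
}
```

A term that straddles two pieces is missed by this sketch, and matching is case-sensitive. The real Highlighter works on analyzed tokens rather than raw substrings, so treat this only as a picture of the output shape.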

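
Steps 2 to 4 of the indexing workflow, together with the FileReaderAll helper, can be sketched with the standard library alone, without Lucene. The sketch below assumes Java 7 or later (for try-with-resources); the class TextFileLoader and its methods listTextFiles and readAll are hypothetical names. Unlike the original helper, it keeps a newline between lines so that the end of one line does not fuse with the start of the next, and it guards against listFiles() returning null, which happens when the path is not a readable directory.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class TextFileLoader {

    /** Returns the regular files directly inside dir whose names end in ".txt". */
    public static List<File> listTextFiles(File dir) {
        List<File> result = new ArrayList<File>();
        File[] files = dir.listFiles(); // null if dir is not a directory or cannot be read
        if (files == null) {
            return result;
        }
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".txt")) {
                result.add(f);
            }
        }
        return result;
    }

    /** Reads the whole file, decoding it with the named charset, e.g. "GBK". */
    public static String readAll(File file, String charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even when reading fails
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() drops the line terminator, so put one back
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File f : listTextFiles(dir)) {
            System.out.println(f.getName() + ": "
                    + readAll(f, "GBK").length() + " chars");
        }
    }
}
```

In the indexing loop, the string returned by readAll would take the place of temp when the "body" field is built.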