Lucene 4.4: Building an index with the Jcseg analyzer and testing search
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 09:16
1. Building the index with the Jcseg analyzer
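import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.IntField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.apache.lucene.util.Version;

import com.webssky.jcseg.core.JcsegTaskConfig;
import com.webssky.jcseg.lucene.JcsegAnalyzer4X;

public class LuceneJcsegIndex {
    public static void main(String[] args) {
        // Sample data: four documents, one value per field
        String[] ids = new String[] { "1", "2", "3", "4" };
        String[] names = new String[] { "北京", "北京海淀", "南京", "shanghai" };
        String[] values = new String[] { "lang", "deg", "men", "context" };
        String[] bir = new String[] { "198108", "197906", "191111", "198710" };
        try {
            String dir = "D:\\user"; // index directory
            Directory directory = new SimpleFSDirectory(new File(dir));

            // Jcseg analyzer in complex mode; also append pinyin and synonyms for CJK words
            Analyzer analyzer = new JcsegAnalyzer4X(JcsegTaskConfig.COMPLEX_MODE);
            JcsegAnalyzer4X jcseg = (JcsegAnalyzer4X) analyzer;
            JcsegTaskConfig jcsegTaskConfig = jcseg.getTaskConfig();
            jcsegTaskConfig.setAppendCJKPinyin(true);
            jcsegTaskConfig.setAppendCJKSyn(true);

            IndexWriter indexWriter = new IndexWriter(directory,
                    new IndexWriterConfig(Version.LUCENE_44, analyzer));
            for (int i = 0; i < ids.length; i++) {
                Document document = new Document();
                document.add(new IntField("id", Integer.parseInt(ids[i]), Store.YES));
                // StringField: indexed as a single term, not tokenized
                document.add(new StringField("name", names[i], Store.YES));
                // TextField: tokenized by the analyzer
                document.add(new TextField("text", values[i], Store.YES));
                document.add(new StringField("datetime", bir[i], Store.YES));
                indexWriter.addDocument(document);
            }
            indexWriter.commit();
            indexWriter.close();
        } catch (NumberFormatException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}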
2. Querying by name
Term term = new Term("name", "北京");
Query result:
北京
lang
198108
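Only the document named 北京 comes back, and not 北京海淀, because the name field is a StringField: Lucene indexes the whole value as one term, so a term lookup is an exact match. The following is a minimal plain-Java model of that behavior, not Lucene code. The class ExactTermModel and its methods are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: each field value is stored as one untokenized term,
// which is how an exact-match lookup on the "name" field behaves.
public class ExactTermModel {

    /** Maps each whole field value to the 1-based ids of the documents holding it. */
    static Map<String, List<Integer>> indexWholeValues(String[] values) {
        Map<String, List<Integer>> postings = new HashMap<>();
        for (int i = 0; i < values.length; i++) {
            postings.computeIfAbsent(values[i], k -> new ArrayList<>()).add(i + 1);
        }
        return postings;
    }

    /** Returns the ids whose value equals the term exactly, or an empty list. */
    static List<Integer> lookup(Map<String, List<Integer>> postings, String term) {
        return postings.getOrDefault(term, new ArrayList<>());
    }

    public static void main(String[] args) {
        String[] names = { "北京", "北京海淀", "南京", "shanghai" };
        Map<String, List<Integer>> postings = indexWholeValues(names);
        System.out.println(lookup(postings, "北京")); // [1]
        System.out.println(lookup(postings, "海淀")); // []
    }
}
```

In this model a lookup for 海淀 finds nothing, since 北京海淀 was never split into smaller terms. Matching on parts of a name would require indexing it through a tokenized field instead.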
3. Testing QueryParser
package com.zsj.test;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

import com.webssky.jcseg.core.JcsegTaskConfig;
import com.webssky.jcseg.lucene.JcsegAnalyzer4X;

/**
 * Simple search test for Lucene 4.4.
 *
 * @author hadoop
 */
public class LuceneJcsegQueryParse {

    public static void main(String[] args) {
        String dir = "D:\\user";
        try {
            Directory directory = FSDirectory.open(new File(dir));
            // DirectoryReader.open replaces the deprecated IndexReader.open
            IndexReader reader = DirectoryReader.open(directory);
            IndexSearcher indexSearcher = new IndexSearcher(reader);

            // Use the same analyzer settings as at index time
            Analyzer analyzer = new JcsegAnalyzer4X(JcsegTaskConfig.COMPLEX_MODE);
            JcsegAnalyzer4X jcseg = (JcsegAnalyzer4X) analyzer;
            JcsegTaskConfig jcsegTaskConfig = jcseg.getTaskConfig();
            jcsegTaskConfig.setAppendCJKPinyin(true);
            jcsegTaskConfig.setAppendCJKSyn(true);

            // "name" is the default field for terms without an explicit field prefix
            QueryParser queryParser = new QueryParser(Version.LUCENE_44, "name", analyzer);

            // Test AND NOT
            Query query = queryParser.parse("text:men AND NOT name:南京");
            // Test AND
            // Query query = queryParser.parse("text:men AND name:南京");
            // Term term = new Term("name", "北京");
            // TermQuery termQuery = new TermQuery(term);

            TopDocs topDocs = indexSearcher.search(query, 4);
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            for (int i = 0; i < scoreDocs.length; i++) {
                Document document = indexSearcher.doc(scoreDocs[i].doc);
                System.out.println(document.get("id"));
                System.out.println(document.get("name"));
                System.out.println(document.get("text"));
                System.out.println(document.get("datetime"));
            }
            reader.close();
            directory.close();
        } catch (IOException | ParseException e) {
            e.printStackTrace();
        }
    }
}
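The two query strings in this test differ only in the boolean operator. As a rough way to reason about what each should return on the four sample documents, the sketch below models AND as set intersection and AND NOT as set difference over matching document ids. It is plain Java, not Lucene code, and it assumes the analyzer leaves the query terms men and 南京 unchanged so that they match the indexed values exactly. BooleanQueryModel and its methods are hypothetical names used only for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of boolean query evaluation over the sample data.
public class BooleanQueryModel {
    static final String[] NAMES = { "北京", "北京海淀", "南京", "shanghai" };
    static final String[] TEXTS = { "lang", "deg", "men", "context" };

    /** 1-based ids of the documents whose field value equals the term. */
    static Set<Integer> matching(String[] fieldValues, String term) {
        Set<Integer> ids = new TreeSet<>();
        for (int i = 0; i < fieldValues.length; i++) {
            if (fieldValues[i].equals(term)) {
                ids.add(i + 1);
            }
        }
        return ids;
    }

    /** a AND b: documents present in both sets. */
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** a AND NOT b: documents in a that are absent from b. */
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> textMen = matching(TEXTS, "men");
        Set<Integer> nameNanjing = matching(NAMES, "南京");
        System.out.println(and(textMen, nameNanjing));    // [3]
        System.out.println(andNot(textMen, nameNanjing)); // []
    }
}
```

Under these assumptions, the AND query selects document 3, the only document with text men, while the AND NOT query selects nothing, because that same document is also the one named 南京. If the real index returns something different, the analyzer output for the query terms is the first thing to inspect.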