Lucene Notes 4 -- The Various Query Types
Source: Internet  Editor: 程序博客网  Time: 2024/05/19 04:51
1. Query types
1.1. Overview
Call query.toString() to see the primitive (atomic) queries that a query is made of
1.2. Searching with a specific analyzer
IndexSearcher searcher = new IndexSearcher(path);
Hits hits = null;
Query query = null;
// The default field is "name"; StandardAnalyzer drops the stop words "a" and "and"
QueryParser parser = new QueryParser("name", new StandardAnalyzer());
query = parser.parse("11 a and hello");
hits = searcher.search(query); // prints: Found name:11 name:hello, 1 result(s)
System.out.println("Found " + query.toString() + ", " + hits.length() + " result(s)");
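1.3. Searching by term: TermQuery
Query query = null;
// A TermQuery is not analyzed: the whole string "word1 a and" is looked up as one single term
query = new TermQuery(new Term("name", "word1 a and"));
hits = searcher.search(query); // prints: Found name:word1 a and, 0 result(s)
System.out.println("Found " + query.toString() + ", " + hits.length() + " result(s)");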
1.4. Searching with AND/OR: BooleanQuery
1. AND: MUST with MUST
2. OR: SHOULD with SHOULD
3. A but not B, i.e. (A ∪ B) − B: MUST with MUST_NOT
Query query1=null;
Query query2=null;
BooleanQuery query=null;
query1=new TermQuery(new Term("name","word1"));
query2=new TermQuery(new Term("name","word2"));
query=new BooleanQuery();
query.add(query1,BooleanClause.Occur.MUST);
query.add(query2,BooleanClause.Occur.MUST_NOT);
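To make the three combinations listed above concrete, here is a minimal sketch that uses plain Java sets in place of Lucene. The sets stand in for the ids of the documents matched by each clause; the class name `OccurSetSketch`, its method names, and the sample ids are all made up for illustration and are not part of any Lucene API.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: sets of document ids stand in for the matches of two clauses.
final class OccurSetSketch {
    // MUST + MUST: documents matched by both clauses (intersection).
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // SHOULD + SHOULD: documents matched by either clause (union).
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // MUST + MUST_NOT: documents matched by the first clause but not the second.
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> word1 = new TreeSet<>(Arrays.asList(1, 2, 3)); // hypothetical matches for name:word1
        Set<Integer> word2 = new TreeSet<>(Arrays.asList(3, 4));    // hypothetical matches for name:word2
        System.out.println(and(word1, word2));    // [3]
        System.out.println(or(word1, word2));     // [1, 2, 3, 4]
        System.out.println(andNot(word1, word2)); // [1, 2]
    }
}
```

The query built in this section corresponds to the `andNot` case: documents containing word1 but not word2.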
1.5. Searching within a range: RangeQuery
Term beginTime = new Term("time", "200001");
Term endTime = new Term("time", "200005");
RangeQuery query = null;
query = new RangeQuery(beginTime, endTime, false); // false: the boundary values are excluded
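Terms in a range query are compared as strings, not as numbers, so the boundary flag and the term format both matter. The sketch below imitates that comparison with `String.compareTo`. It assumes plain lexicographic ordering, and `TermRangeSketch` and `inRange` are illustrative names, not Lucene classes.

```java
// Illustration only: lexicographic range check on term text, with an inclusive/exclusive flag.
final class TermRangeSketch {
    static boolean inRange(String term, String lower, String upper, boolean inclusive) {
        int cmpLower = term.compareTo(lower);
        int cmpUpper = term.compareTo(upper);
        return inclusive
                ? (cmpLower >= 0 && cmpUpper <= 0)
                : (cmpLower > 0 && cmpUpper < 0);
    }

    public static void main(String[] args) {
        System.out.println(inRange("200003", "200001", "200005", false));  // true
        System.out.println(inRange("200001", "200001", "200005", false));  // false: boundary excluded
        System.out.println(inRange("200001", "200001", "200005", true));   // true: boundary included
        // String order, not numeric order: a longer term can still fall inside the range.
        System.out.println(inRange("2000031", "200001", "200005", false)); // true
    }
}
```

The last case is why date-like and numeric values are usually indexed with a fixed width, as with the six-digit "time" values used here.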
1.6. Searching by prefix: PrefixQuery
Term pre1=new Term("name","wor");
PrefixQuery query=null;
query = new PrefixQuery(pre1);
1.7. Phrase search: PhraseQuery
a) The default slop is 0
PhraseQuery query = new PhraseQuery();
query.add(new Term("bookname", "钢"));
query.add(new Term("bookname", "铁"));
Hits hits = searcher.search(query); // matches the phrase "钢铁", not "钢" and "铁" occurring separately
b) Setting the slop (the default is 0)
PhraseQuery query = new PhraseQuery();
query.add(new Term("bookname", "钢"));
query.add(new Term("bookname", "铁"));
query.setSlop(1);
Hits hits = searcher.search(query); // matches "钢铁", or "钢*铁" with one character in between
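The effect of the slop value can be pictured as a limit on how many positions may sit between the two terms. The sketch below is a simplified, in-order-only model of that idea. It is not how Lucene implements slop (Lucene can also match reordered terms when the slop is large enough), and the class name, method name, and English tokens standing in for 钢 and 铁 are assumptions for illustration.

```java
// Illustration only: in-order two-term phrase match with a maximum number of intervening tokens.
final class SlopSketch {
    static boolean matchesInOrder(String[] tokens, String first, String second, int slop) {
        for (int p = 0; p < tokens.length; p++) {
            if (!tokens[p].equals(first)) {
                continue;
            }
            // q - p - 1 is the number of tokens sitting between the two terms.
            for (int q = p + 1; q < tokens.length && q - p - 1 <= slop; q++) {
                if (tokens[q].equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] adjacent = {"steel", "iron"};
        String[] oneBetween = {"steel", "and", "iron"};
        System.out.println(matchesInOrder(adjacent, "steel", "iron", 0));   // true
        System.out.println(matchesInOrder(oneBetween, "steel", "iron", 0)); // false
        System.out.println(matchesInOrder(oneBetween, "steel", "iron", 1)); // true
    }
}
```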
1.8. Multi-phrase search: MultiPhraseQuery
a)
MultiPhraseQuery query = new MultiPhraseQuery();
// First add the prefix of the phrases to look for
query.add(new Term("bookname", "钢"));
// Build 3 Terms to serve as the phrase suffixes
Term t1 = new Term("bookname", "铁");
Term t2 = new Term("bookname", "和");
Term t3 = new Term("bookname", "要");
// Then add all the suffixes to the query; together with the prefix they form 3 phrases
query.add(new Term[]{t1, t2, t3});
Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++)
System.out.println(hits.doc(i));
b)
MultiPhraseQuery query = new MultiPhraseQuery();
// Two alternative first terms followed by one shared suffix: matches "钢铁" or "和铁"
Term t1 = new Term("bookname", "钢");
Term t2 = new Term("bookname", "和");
query.add(new Term[]{t1, t2});
query.add(new Term("bookname", "铁"));
c)
MultiPhraseQuery query = new MultiPhraseQuery();
// Alternatives in the first and third positions: matches "钢铁是", "钢铁战", "和铁是" or "和铁战"
Term t1 = new Term("bookname", "钢");
Term t2 = new Term("bookname", "和");
query.add(new Term[]{t1, t2});
query.add(new Term("bookname", "铁"));
Term t3 = new Term("bookname", "是");
Term t4 = new Term("bookname", "战");
query.add(new Term[]{t3, t4});
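1.9. Fuzzy search: FuzzyQuery
The algorithm used is the Levenshtein (edit distance) algorithm. When comparing two strings it distinguishes three kinds of edit:
- inserting a letter
- deleting a letter
- changing a letter
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"));
The available constructors are:
public FuzzyQuery(Term term)
public FuzzyQuery(Term term, float minimumSimilarity) throws IllegalArgumentException
public FuzzyQuery(Term term, float minimumSimilarity, int prefixLength) throws IllegalArgumentException
minimumSimilarity is the minimum similarity a term must reach to match; the smaller the value, the more documents match. The default is 0.5, and the value must be less than 1.0.
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"), 0.1f);
prefixLength is the number of leading characters that must match exactly.
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"), 0.1f, 1);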
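The three edit operations named in this section (insert, delete, change) are exactly what the textbook Levenshtein distance counts. Below is a minimal sketch of that textbook algorithm using two rolling rows. It is not Lucene's own code, it does not compute the similarity score that minimumSimilarity is compared against, and the class name `LevenshteinSketch` is made up for illustration.

```java
// Illustration only: classic dynamic-programming Levenshtein distance.
final class LevenshteinSketch {
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty string into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int change = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int replace = prev[j - 1] + change;
                curr[j] = Math.min(Math.min(insert, delete), replace);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("work", "word"));      // 1 (change one letter)
        System.out.println(distance("work", "works"));     // 1 (insert one letter)
        System.out.println(distance("work", "wok"));       // 1 (delete one letter)
        System.out.println(distance("kitten", "sitting")); // 3
    }
}
```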
1.10. Wildcard search: WildcardQuery
* matches zero or more characters
? matches exactly one character
WildcardQuery query = new WildcardQuery(new Term("content", "?qq*"));
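The two wildcard rules above can be tried out without an index by translating the pattern into a `java.util.regex` pattern. This is only a sketch of the matching semantics, not how WildcardQuery works internally, and `WildcardSketch`, `toPattern`, and `matches` are hypothetical names.

```java
import java.util.regex.Pattern;

// Illustration only: emulate '*' and '?' wildcards with java.util.regex.
final class WildcardSketch {
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // zero or more characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    static boolean matches(String wildcard, String term) {
        return toPattern(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("?qq*", "aqq"));     // true
        System.out.println(matches("?qq*", "aqqmail")); // true
        System.out.println(matches("?qq*", "qq"));      // false: '?' needs one character
        System.out.println(matches("?qq*", "abqq"));    // false: '?' matches only one character
    }
}
```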
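1.11. Span queries
1.11.1. SpanTermQuery
Matches the same documents as TermQuery
SpanTermQuery query = new SpanTermQuery(new Term("content", "abc"));
1.11.2. SpanFirstQuery
Looks for the given term within a fixed width, measured from the start of the field's content
SpanFirstQuery query = new SpanFirstQuery(new SpanTermQuery(new Term("content", "abc")), 3); // the limit 3 is counted in words (term positions), not bytes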
1.11.3. SpanNearQuery
SpanNearQuery is comparable to PhraseQuery
SpanTermQuery people = new SpanTermQuery(new Term("content", "mary"));
SpanTermQuery how = new SpanTermQuery(new Term("content", "poor"));
SpanNearQuery query = new SpanNearQuery(new SpanQuery[]{people, how}, 3, false); // slop 3; false: the terms need not appear in the given order
1.11.4. SpanOrQuery
Combines the results of all the given SpanQuery objects
SpanTermQuery s1 = new SpanTermQuery(new Term("content", "aa"));
SpanTermQuery s2 = new SpanTermQuery(new Term("content", "cc"));
SpanTermQuery s3 = new SpanTermQuery(new Term("content", "gg"));
SpanTermQuery s4 = new SpanTermQuery(new Term("content", "kk"));
SpanNearQuery query1 = new SpanNearQuery(new SpanQuery[]{s1, s2}, 1, false);
SpanNearQuery query2 = new SpanNearQuery(new SpanQuery[]{s3, s4}, 3, false);
SpanOrQuery query = new SpanOrQuery(new SpanQuery[]{query1, query2});
1.11.5. SpanNotQuery
Removes the results of the second SpanQuery from the results of the first SpanQuery
SpanTermQuery s1 = new SpanTermQuery(new Term("content", "aa"));
SpanFirstQuery query1 = new SpanFirstQuery(s1, 3);
SpanTermQuery s3 = new SpanTermQuery(new Term("content", "gg"));
SpanTermQuery s4 = new SpanTermQuery(new Term("content", "kk"));
SpanNearQuery query2 = new SpanNearQuery(new SpanQuery[]{s3, s4}, 4, false);
SpanNotQuery query = new SpanNotQuery(query1, query2);
1.12. RegexQuery: searching with regular expressions
String regex = "http://[a-z]{1,3}\\.abc\\.com/.*"; // \\. is an escaped literal dot
RegexQuery query = new RegexQuery(new Term("url", regex));
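Before handing a pattern like this to RegexQuery, it can help to check it against a few sample values with `java.util.regex` directly. The sketch below assumes the query uses Java regular-expression syntax and that the literal dots are meant to be escaped as `\\.`. The sample URLs and the class name `UrlRegexSketch` are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustration only: try the URL pattern against sample terms with java.util.regex.
final class UrlRegexSketch {
    // One to three lowercase letters, then the literal ".abc.com/", then anything.
    static final String REGEX = "http://[a-z]{1,3}\\.abc\\.com/.*";

    static boolean matches(String url) {
        return Pattern.matches(REGEX, url);
    }

    public static void main(String[] args) {
        System.out.println(matches("http://www.abc.com/index.html")); // true
        System.out.println(matches("http://wwww.abc.com/"));          // false: four letters before ".abc"
        System.out.println(matches("http://wwwxabc.com/"));           // false: the dot must be a literal '.'
    }
}
```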