elasticsearch-java API: Search (Part 2), Aggregations

Source: Internet  Editor: 程序博客网  Time: 2024/05/19 14:52

The previous article covered basic Elasticsearch search usage (match, term, fuzzy, matchPhraseQuery, and so on). This article focuses on aggregation queries.

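Assume Elasticsearch holds the following data: player documents with the fields team, position, age, and salary, which the examples below query.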


1. group by / count:

select team,count(*) from table group by team;

1) Code:

public static void aggre1Query2(String indexName, String indexType) {
    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    // COUNT returns aggregation results only, without hits
    // (ES 2.x API; deprecated there in favor of setSize(0))
    srb.setSearchType(SearchType.COUNT);

    // Group by the "team" field
    TermsBuilder teamAgg = AggregationBuilders.terms("player_count").field("team");
    srb.addAggregation(teamAgg);
    System.out.println(srb.toString());

    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations();
    Map<String, Aggregation> asMap = aggregations.asMap();
    Terms terms = (Terms) asMap.get("player_count");

    // Bucket here is Terms.Bucket: one bucket per distinct team value
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());
    }
}

2) Generated DSL:

{
  "aggregations" : {
    "player_count" : {
      "terms" : {
        "field" : "team"
      }
    }
  }
}

3) Output:

[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:461] tim :: 1
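
To make the semantics of this terms aggregation concrete, here is a minimal plain-Java sketch that computes the same group-by count in memory, without Elasticsearch. The class name GroupByCountSketch, the method countByTeam, and the sample list of team values are hypothetical, chosen only so that the counts mirror the log output above. Buckets are sorted by count in descending order, which is the default order of a terms aggregation; keeping first-seen order for ties is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByCountSketch {

    // Counts occurrences of each team value, then returns the buckets
    // ordered by count descending (ties keep first-seen order).
    static Map<String, Long> countByTeam(List<String> teams) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String team : teams) {
            counts.merge(team, 1L, Long::sum);
        }

        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        Map<String, Long> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Hypothetical sample values; only the "team" field matters here.
        List<String> teams = Arrays.asList("war", "cav", "war", "tim", "cav", "war");
        countByTeam(teams).forEach((team, count) -> System.out.println(team + " :: " + count));
    }
}
```

Each map entry plays the role of one bucket: the key corresponds to getKeyAsString() and the value to getDocCount().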

2. group by multiple fields / count, min, avg:

1) Code:

public static void aggreQuery1(String indexName, String indexType) {
    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    srb.setSearchType(SearchType.COUNT);

    // Outer grouping on "team", inner grouping on "position"
    TermsBuilder teamAgg = AggregationBuilders.terms("team_count").field("team");
    TermsBuilder positionAgg = AggregationBuilders.terms("position_count").field("position");
    teamAgg.subAggregation(positionAgg);
    srb.addAggregation(teamAgg);
    System.out.println(srb.toString());

    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations();

    // team agg
    Terms terms = aggregations.get("team_count");
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());

        // position agg, nested inside each team bucket
        Aggregations aggregations2 = bt.getAggregations();
        Terms terms2 = aggregations2.get("position_count");
        List<Bucket> buckets2 = terms2.getBuckets();
        for (Bucket bt2 : buckets2) {
            logger.info("---" + bt2.getKeyAsString() + " :: " + bt2.getDocCount());
        }
    }
}

2) Generated DSL:

{
  "aggregations" : {
    "team_count" : {
      "terms" : {
        "field" : "team"
      },
      "aggregations" : {
        "position_count" : {
          "terms" : {
            "field" : "position"
          }
        }
      }
    }
  }
}

3) Output:

[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pf :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---sg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pg :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---sf :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:430] tim :: 1
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:436] ---pf :: 1

4) Notes:

Grouping by multiple fields is nested level by level, so the code uses two nested for loops to print the result.
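
The level-by-level structure can be sketched in plain Java as a map of maps, again without Elasticsearch. The names NestedGroupBySketch, Player, and countByTeamAndPosition are hypothetical, and the sample records are reconstructed from the team/position pairs visible in the log output above. This sketch sorts keys alphabetically (TreeMap) to keep the output deterministic, so the bucket order differs from what Elasticsearch returns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class NestedGroupBySketch {

    // Hypothetical record type: only the two fields used for grouping.
    static final class Player {
        final String team;
        final String position;

        Player(String team, String position) {
            this.team = team;
            this.position = position;
        }
    }

    // Outer key: team; inner key: position; value: number of matching records.
    static Map<String, Map<String, Long>> countByTeamAndPosition(List<Player> players) {
        Map<String, Map<String, Long>> result = new TreeMap<>();
        for (Player p : players) {
            result.computeIfAbsent(p.team, k -> new TreeMap<>())
                  .merge(p.position, 1L, Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Player> players = Arrays.asList(
            new Player("war", "pf"), new Player("war", "pg"), new Player("war", "sg"),
            new Player("cav", "pg"), new Player("cav", "sf"),
            new Player("tim", "pf"));

        // One loop per grouping level, mirroring the two nested bucket loops.
        for (Map.Entry<String, Map<String, Long>> team : countByTeamAndPosition(players).entrySet()) {
            long total = team.getValue().values().stream().mapToLong(Long::longValue).sum();
            System.out.println(team.getKey() + " :: " + total);
            for (Map.Entry<String, Long> pos : team.getValue().entrySet()) {
                System.out.println("---" + pos.getKey() + " :: " + pos.getValue());
            }
        }
    }
}
```

Adding a third grouping field would add one more map level and one more nested loop, in the same way a third subAggregation would add one more level of buckets.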


3. Query condition + max/min/sum/avg aggregation + sorting:

1) Code:

public static void aggre1Query3(String indexName, String indexType) {
    boolean min = true;
    boolean avg = true;
    String groupCol = "team";

    SearchRequestBuilder srb = transportClient.prepareSearch(indexName).setTypes(indexType);
    srb.setSearchType(SearchType.COUNT);

    // Query condition: 10 <= age <= 39
    QueryBuilder qb = QueryBuilders.rangeQuery("age").from(10).to(39);

    // Group by team and order the buckets by the "minSalary" sub-aggregation,
    // descending (asc = false). Order here is Terms.Order.
    AggregationBuilder aggBuilder = AggregationBuilders.terms("group_name")
            .field(groupCol)
            .order(Order.aggregation("minSalary", false));
    if (min) {
        aggBuilder.subAggregation(AggregationBuilders.min("minSalary").field("salary"));
    }
    if (avg) {
        aggBuilder.subAggregation(AggregationBuilders.avg("avgSalary").field("salary"));
    }
    srb.setQuery(qb).addAggregation(aggBuilder);
    System.out.println(srb.toString());

    SearchResponse searchResponse = srb.execute().actionGet();
    Aggregations aggregations = searchResponse.getAggregations();
    Map<String, Aggregation> asMap = aggregations.asMap();
    Terms terms = (Terms) asMap.get("group_name");
    List<Bucket> buckets = terms.getBuckets();
    for (Bucket bt : buckets) {
        logger.info(bt.getKeyAsString() + " :: " + bt.getDocCount());

        // Metric sub-aggregations attached to each bucket
        Aggregations aggregations2 = bt.getAggregations();
        Min minSalary = aggregations2.get("minSalary");
        Avg avgSalary = aggregations2.get("avgSalary");
        logger.info("min:" + minSalary.value() + ",avg:" + avgSalary.value());
    }
}

2) Generated DSL:

{
  "query" : {
    "range" : {
      "age" : {
        "from" : 10,
        "to" : 39,
        "include_lower" : true,
        "include_upper" : true
      }
    }
  },
  "aggregations" : {
    "group_name" : {
      "terms" : {
        "field" : "team",
        "order" : {
          "minSalary" : "desc"
        }
      },
      "aggregations" : {
        "minSalary" : {
          "min" : {
            "field" : "salary"
          }
        },
        "avgSalary" : {
          "avg" : {
            "field" : "salary"
          }
        }
      }
    }
  }
}

3) Output:

[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:499] cav :: 2
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:504] min:2000.0,avg:2500.0
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:499] war :: 3
[12-05 18:51:32] [INFO] [cn.edu.nuc.EsTest.EsDao4Search:504] min:1000.0,avg:1666.6666666666667

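4) Notes:

This example groups by a single field, so one for loop is enough. Inside the loop, each bucket also exposes its metric sub-aggregations, here min and avg.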
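
The combination of filter, per-group metrics, and ordering can be sketched in plain Java as follows, without Elasticsearch. The names FilteredStatsSketch, Player, and salaryStatsByTeam are hypothetical, and the ages and salaries in main are invented values chosen only so that the min and avg per team match the log output above. Both age bounds are inclusive, as in the generated DSL (include_lower and include_upper are true).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FilteredStatsSketch {

    // Hypothetical record type with only the fields the query touches.
    static final class Player {
        final String team;
        final int age;
        final double salary;

        Player(String team, int age, double salary) {
            this.team = team;
            this.age = age;
            this.salary = salary;
        }
    }

    // Keeps players with minAge <= age <= maxAge, collects salary statistics
    // per team, and orders the teams by minimum salary descending.
    static Map<String, DoubleSummaryStatistics> salaryStatsByTeam(
            List<Player> players, int minAge, int maxAge) {
        Map<String, DoubleSummaryStatistics> grouped = new LinkedHashMap<>();
        for (Player p : players) {
            if (p.age < minAge || p.age > maxAge) {
                continue; // filtered out by the query condition
            }
            grouped.computeIfAbsent(p.team, k -> new DoubleSummaryStatistics()).accept(p.salary);
        }

        List<Map.Entry<String, DoubleSummaryStatistics>> entries = new ArrayList<>(grouped.entrySet());
        entries.sort((a, b) -> Double.compare(b.getValue().getMin(), a.getValue().getMin()));

        Map<String, DoubleSummaryStatistics> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, DoubleSummaryStatistics> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Invented sample values; "tim" falls outside the age range and is dropped.
        List<Player> players = Arrays.asList(
            new Player("war", 25, 1000), new Player("war", 30, 2000), new Player("war", 28, 2000),
            new Player("cav", 31, 2000), new Player("cav", 33, 3000),
            new Player("tim", 40, 4000));

        salaryStatsByTeam(players, 10, 39).forEach((team, stats) -> {
            System.out.println(team + " :: " + stats.getCount());
            System.out.println("min:" + stats.getMin() + ",avg:" + stats.getAverage());
        });
    }
}
```

Because the filter runs before grouping, a team whose members all fall outside the age range produces no bucket at all, which is why only two teams appear in the output of the query above.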


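4. Number of buckets returned by an aggregation:

By default, a terms aggregation returns only the top 10 buckets. To return more, specify size when building the TermsBuilder:

TermsBuilder teamAgg = AggregationBuilders.terms("team").field("team").size(15);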
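
The effect of size can be illustrated with a small plain-Java sketch that keeps only the first size buckets after sorting by count. The names TopBucketsSketch and topBuckets are hypothetical, and the twelve teams in main are invented data; this models only the truncation, not how Elasticsearch computes the buckets across shards.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopBucketsSketch {

    // Sorts buckets by count descending and keeps at most `size` of them.
    static List<Map.Entry<String, Long>> topBuckets(Map<String, Long> counts, int size) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(entries.subList(0, Math.min(size, entries.size())));
    }

    public static void main(String[] args) {
        // Invented data: 12 distinct teams with counts 1..12.
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 1; i <= 12; i++) {
            counts.put("team" + i, (long) i);
        }

        // With a limit of 10, two of the 12 teams are cut off.
        System.out.println("size 10 -> " + topBuckets(counts, 10).size() + " buckets");
        // With a limit of 15, all 12 teams fit.
        System.out.println("size 15 -> " + topBuckets(counts, 15).size() + " buckets");
    }
}
```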



