Elasticsearch Study Notes: Aggregation Functions
Source: Internet  Published: windows 图标文件  Editor: 程序博客网  Time: 2024/06/07 06:39
Elasticsearch has a feature called aggregations, which lets us generate fine-grained analytics over our data. Aggregations are similar to GROUP BY in SQL, but more powerful.
First, let's look at the data currently stored under the employee type of my megacorp index by running the following statement:
Statement 1:
GET /megacorp/employee/_search
{
  "query": {
    "match_all": {}
  }
}
Result:
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [ "music" ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 1,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [ "sports", "music" ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "3",
        "_score": 1,
        "_source": {
          "first_name": "Douglas",
          "last_name": "Fir",
          "age": 35,
          "about": "I like to build cabinets",
          "interests": [ "forestry" ]
        }
      }
    ]
  }
}
Main text:
As an example, let's use the data above to find the most popular interests among the employees:
Statement 2:
GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
The query returns:
{
  ...
  "hits": { ... },
  "aggregations": {
    "all_interests": {
      "buckets": [
        { "key": "music", "doc_count": 2 },
        { "key": "forestry", "doc_count": 1 },
        { "key": "sports", "doc_count": 1 }
      ]
    }
  }
}
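Conclusion: the terms aggregation lists every distinct value of interests across all documents, together with the number of documents that contain each value.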
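To make the bucket counts concrete, here is a minimal local sketch in Python that reproduces them from the three employee documents shown in the first search result. It only illustrates what the buckets mean; it is not how Elasticsearch computes them. The helper name `terms_buckets` is hypothetical, and the tie-breaking order (by key) is an assumption chosen to match the output above.

```python
from collections import Counter

# Sample documents copied from the search result of statement 1
# (only the fields needed for this illustration).
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    """Hypothetical helper: count documents per distinct value of a list-valued field."""
    counts = Counter()
    for doc in docs:
        # set() so that a document is counted at most once per value
        for value in set(doc.get(field, [])):
            counts[value] += 1
    # Highest count first; ties ordered by key (an assumption for this sketch)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


buckets = terms_buckets(employees, "interests")
print(buckets)
```

Running it prints one bucket per interest, with music counted twice because both Jane and John list it.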
Note that before running statement 2 you first need to run the following statement (as for why, see my other blog post):
PUT megacorp/_mapping/employee/
{
  "properties": {
    "interests": {
      "type": "text",
      "fielddata": true
    }
  }
}
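The purpose of this statement is to allow the interests field of the employee type in the megacorp index to be used in aggregations (such as all_interests). Fielddata is disabled on text fields by default, so any other text field needs the same mapping before it can be aggregated on; numeric fields such as age do not. For example, to aggregate on last_name you must first run: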
PUT megacorp/_mapping/employee/
{
  "properties": {
    "last_name": {
      "type": "text",
      "fielddata": true
    }
  }
}
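The two mapping requests above differ only in the field name. As a small sketch of that pattern, the Python snippet below builds the request body for any text field. The function name `fielddata_mapping` is hypothetical, and the snippet only prints the request; it does not send anything to a cluster.

```python
import json


def fielddata_mapping(field_name):
    """Hypothetical helper: build the body that enables fielddata on one text field."""
    return {"properties": {field_name: {"type": "text", "fielddata": True}}}


for field in ("interests", "last_name"):
    # Same endpoint as in the statements above; only the body changes
    print("PUT megacorp/_mapping/employee/")
    print(json.dumps(fielddata_mapping(field), indent=2))
```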
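There are many kinds of aggregations. Besides terms (used above under the name all_interests, which is just a label we chose), there are metric aggregations such as avg.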
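If we want to know the most popular interests among employees whose last name is Smith, we can simply add the appropriate query and combine it with the aggregation: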
GET /megacorp/employee/_search
{
  "query": {
    "match": { "last_name": "smith" }
  },
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [ "music" ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [ "sports", "music" ]
        }
      }
    ]
  },
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "music", "doc_count": 2 },
        { "key": "sports", "doc_count": 1 }
      ]
    }
  }
}
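The key point of this result is that the aggregation runs only over the documents matched by the query. The Python sketch below mimics that on the three sample documents: filter first, then count. The helper `match_last_name` is hypothetical and uses a plain case-insensitive comparison as a stand-in for the match query's text analysis, which is a simplification.

```python
from collections import Counter

# Sample documents copied from the search result of statement 1
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def match_last_name(docs, text):
    """Hypothetical stand-in for the match query: case-insensitive equality."""
    return [d for d in docs if d["last_name"].lower() == text.lower()]


# Step 1: the query narrows the document set
smiths = match_last_name(employees, "smith")

# Step 2: the aggregation counts interests over the matched documents only
counts = Counter(i for d in smiths for i in set(d["interests"]))
buckets = [
    {"key": key, "doc_count": n}
    for key, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
]
print(len(smiths), buckets)
```

Douglas Fir is filtered out in step 1, which is why forestry no longer appears among the buckets.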
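Aggregations also support hierarchical rollups. For example, we can find the average age of the employees who share each interest: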
GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" },
      "aggs": {
        "avg_age": {
          "avg": { "field": "age" }
        }
      }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [ "music" ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 1,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [ "sports", "music" ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "3",
        "_score": 1,
        "_source": {
          "first_name": "Douglas",
          "last_name": "Fir",
          "age": 35,
          "about": "I like to build cabinets",
          "interests": [ "forestry" ]
        }
      }
    ]
  },
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "music", "doc_count": 2, "avg_age": { "value": 28.5 } },
        { "key": "forestry", "doc_count": 1, "avg_age": { "value": 35 } },
        { "key": "sports", "doc_count": 1, "avg_age": { "value": 25 } }
      ]
    }
  }
}
In other words, the statement above counts how many people share each interest and computes the average age of those people.
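The avg_age values can be checked by hand. The Python sketch below groups the ages of the three sample employees by interest and averages each group, which is the same arithmetic the nested aggregation reports. The function name `interests_with_avg_age` is hypothetical, and the flat `avg_age` number is a simplification of the `{ "value": ... }` object in the real response.

```python
from collections import defaultdict

# Sample documents copied from the search result of statement 1
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def interests_with_avg_age(docs):
    """Hypothetical helper: bucket documents by interest, then average age per bucket."""
    ages = defaultdict(list)
    for doc in docs:
        for interest in set(doc["interests"]):
            ages[interest].append(doc["age"])
    # Largest bucket first; ties ordered by key (an assumption for this sketch)
    ordered = sorted(ages.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {"key": key, "doc_count": len(vals), "avg_age": sum(vals) / len(vals)}
        for key, vals in ordered
    ]


result = interests_with_avg_age(employees)
for bucket in result:
    print(bucket)
```

For music the two ages are 32 and 25, so the average is (32 + 25) / 2 = 28.5, which agrees with the response above.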