Enterprise Search with Elasticsearch, Part 02: Elasticsearch Search


I. Searching with commands

  1》URI search (see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html)

     URI search means the query and its options are passed as URI parameters.
   To have something to search against, first prepare a test data file (/root/my.json):

{"index":{"_id":"1"}}
{"id":"1","country":"美国","provice":"加利福尼亚州","city":"旧金山","age":"30","name":"John","desc":"John is come from austrina  John,s Dad is Johh Super"}
{"index":{"_id":"2"}}
{"id":"2","country":"美国","provice":"加利福尼亚州","city":"好莱坞","age":"40","name":"Mike","desc":"Mike is come from austrina  Mike,s Dad  is Mike Super"}
{"index":{"_id":"3"}}
{"id":"3","country":"美国","provice":"加利福尼亚州","city":"圣地牙哥","age":"50","name":"Cherry","desc":"Cherry is come from austrina  Cherry,s Dad  is Cherry Super"}
{"index":{"_id":"4"}}
{"id":"4","country":"美国","provice":"德克萨斯州","city":"休斯顿","age":"60","name":"Miya","desc":"Miya is come from austrina  Miya,s Dad  is Miya Super"}
{"index":{"_id":"5"}}
{"id":"5","country":"美国","provice":"德克萨斯州","city":"大学城","age":"70","name":"fubos","desc":"fubos is come from austrina  fubos,s Dad  is fubos Super"}
{"index":{"_id":"6"}}
{"id":"6","country":"美国","provice":"德克萨斯州","city":"麦亚伦","age":"20","name":"marry","desc":"marry is come from austrina  marry,s Dad  is marry Super"}
{"index":{"_id":"7"}}
{"id":"7","country":"中国","provice":"湖南省","city":"长沙市","age":"18","name":"张三","desc":"张三来自长沙市 是公务员一名"}
{"index":{"_id":"8"}}
{"id":"8","country":"中国","provice":"湖南省","city":"岳阳市","age":"15","name":"李四","desc":"李四来自岳阳市 是一名清洁工"}
{"index":{"_id":"9"}}
{"id":"9","country":"中国","provice":"湖南省","city":"株洲市","age":"33","name":"李光四","desc":"李光四 老家岳阳市 来自株洲 是李四的侄子"}
{"index":{"_id":"10"}}
{"id":"10","country":"中国","provice":"广东省","city":"深圳市","age":"67","name":"王五","desc":"王五来自深圳市  是来自深圳的一名海关缉私精英"}
{"index":{"_id":"11"}}
{"id":"11","country":"中国","provice":"广东省","city":"广州市","age":"89","name":"王冠宇","desc":"王冠宇是王五的儿子"}
Import it with the bulk API (note that the bulk format requires each action line and each document line on a line of its own, with a trailing newline at the end of the file):
cd /root && curl -XPOST '192.168.58.147:9200/user/info/_bulk?pretty' --data-binary @my.json

Documents can then be searched with the _search API; the query string takes the form q=field:value. For example:

[root@node1 ~]# curl -XGET 'http://192.168.58.147:9200/_search?q=name:marry&pretty'
{
  "took" : 42,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.6127073,
    "hits" : [
      {
        "_index" : "user",
        "_type" : "info",
        "_id" : "6",
        "_score" : 1.6127073,
        "_source" : {
          "id" : "6",
          "country" : "美国",
          "provice" : "德克萨斯州",
          "city" : "麦亚伦",
          "age" : "20",
          "name" : "marry",
          "desc" : "marry is come from austrina  marry,s Dad  is marry Super"
        }
      }
    ]
  }
}
Find documents in the user index where age is 50:
curl -XGET 'http://192.168.58.147:9200/user/_search?q=age:50&pretty'
Query several types at once (here the user-info type and the user-assets type) for age 30; multiple indices or types are separated by commas:
curl -XGET 'http://192.168.58.147:9200/student/info,money/_search?q=age:30&pretty'
Query every index (_all) for documents of type info with age 30:
curl -XGET 'http://192.168.58.147:9200/_all/info/_search?q=age:30&pretty'
Use from for the starting offset and size for the number of results returned (defaults: from=0, size=10):
curl -XGET 'http://192.168.58.147:9200/user/_search?q=age:30&from=0&size=2&pretty'

Commonly used query parameters:

  • q: the query string
  • df: the default field to use when no field prefix is given in the query
  • analyzer: the name of the analyzer used to analyze the query string
  • default_operator: the default operator, AND or OR; defaults to OR
  • explain: for each hit, include an explanation of how its score was computed
  • _source: set to false to skip retrieving the _source field; _source_include and _source_exclude can retrieve parts of the document
  • fields: the fields to return for each hit
  • sort: sorting to perform, given as fieldName, fieldName:asc, or fieldName:desc; fieldName can be an actual field or _score; multiple sort parameters are allowed
  • track_scores: when sorting, set to true to also return relevance scores
  • timeout: no timeout by default
  • from: defaults to 0
  • size: defaults to 10
  • search_type: the type of search execution, one of dfs_query_then_fetch, dfs_query_and_fetch, query_then_fetch, query_and_fetch, count, scan; defaults to query_then_fetch
  • lowercase_expanded_terms: whether terms are automatically lowercased; defaults to true
  • analyze_wildcard: whether wildcard and prefix queries are analyzed; defaults to false
  • terminate_after: the maximum number of documents to collect per shard, after which query execution terminates early; if set, the response has a boolean terminated_early field; defaults to no terminate_after
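To see a few of these combined, here is an illustrative request against the English test data above (not from the original run): df makes desc the default field so q needs no prefix, default_operator=AND requires both terms to match, and explain attaches the scoring breakdown to each hit.

curl -XGET 'http://192.168.58.147:9200/user/_search?df=desc&q=Mike+Super&default_operator=AND&explain=true&pretty'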

Querying by a Chinese name returns no data either way; the analyzer's tokenization is the likely cause:

curl -XGET 'http://192.168.58.147:9200/user/info/_search?q=name:张&pretty'
curl -XGET 'http://192.168.58.147:9200/user/info/_search?q=name:张三&pretty'
Test how the default analyzer tokenizes:
[root@node1 ~]# curl -XPOST 'http://192.168.58.147:9200/_analyze?pretty' -d '
{
  "tokenizer": "standard",
  "text": "我是饺子"
}'
{
  "tokens" : [
    { "token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 0 },
    { "token" : "是", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 1 },
    { "token" : "饺", "start_offset" : 2, "end_offset" : 3, "type" : "<IDEOGRAPHIC>", "position" : 2 },
    { "token" : "子", "start_offset" : 3, "end_offset" : 4, "type" : "<IDEOGRAPHIC>", "position" : 3 }
  ]
}

 Since the standard analyzer indexes Chinese character by character, q=name:张 should have matched. Odd. The likely explanation: Chinese characters carried as URL parameters can get mangled by HTTP encoding. So skip the URL form and put the query in a request body:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "term" : { "name" : "三" }
  }
}'
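(As a side check on the encoding theory: the URI form does work once the UTF-8 bytes are percent-encoded by hand; %E5%BC%A0 is the UTF-8 encoding of 张, so the following should also return the document:)

curl -XGET 'http://192.168.58.147:9200/user/info/_search?q=name:%E5%BC%A0&pretty'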
 Searching through the request body, either 张 or 三 returns results, but 张三 returns nothing. That is indeed the standard analyzer at work: it indexes Chinese one character at a time. In reality 张三 is a single word, so the IK analyzer is recommended here; it splits text into tokens that carry real meaning (e.g. 我是中国人 is split into meaningful words such as 我, 是, 中国人, and 中国).

Setting up the IK analyzer (plugin home: https://github.com/medcl/elasticsearch-analysis-ik)

My Elasticsearch is 5.6.4, so download the matching IK 5.6.4:
cd /home/es/elasticsearch-5.6.4 && ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.4/elasticsearch-analysis-ik-5.6.4.zip
After installation, check the plugins directory:
[es@node1 elasticsearch-5.6.4]$ cd plugins/
[es@node1 plugins]$ ll
total 4
drwxr-xr-x 2 es es 4096 Dec  5 18:49 analysis-ik
Restart Elasticsearch:
./elasticsearch -Ecluster.name=my_cluster_name -Enode.name=my_node_name -Enetwork.host=192.168.58.147
IK ships with two analyzers:
ik_max_word: the finest-grained split; extracts as many words as possible
ik_smart: the coarsest-grained split; text already claimed by one token is not reused by another
Test both:
[es@node1 plugins]$ curl -XGET 'http://192.168.58.147:9200/_analyze?pretty&analyzer=ik_max_word' -d '我是中国人'
{
  "tokens" : [
    { "token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0 },
    { "token" : "是", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1 },
    { "token" : "中国人", "start_offset" : 2, "end_offset" : 5, "type" : "CN_WORD", "position" : 2 },
    { "token" : "中国", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 3 },
    { "token" : "国人", "start_offset" : 3, "end_offset" : 5, "type" : "CN_WORD", "position" : 4 }
  ]
}
[es@node1 plugins]$ curl -XGET 'http://192.168.58.147:9200/_analyze?pretty&analyzer=ik_smart' -d '我是中国人'
{
  "tokens" : [
    { "token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0 },
    { "token" : "是", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1 },
    { "token" : "中国人", "start_offset" : 2, "end_offset" : 5, "type" : "CN_WORD", "position" : 2 }
  ]
}
Delete the user index and recreate it with IK mappings:
curl -XDELETE 'http://192.168.58.147:9200/user?pretty'
curl -XPUT 'http://192.168.58.147:9200/user?pretty' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "ik" : {
          "tokenizer" : "ik_max_word"
        }
      }
    }
  },
  "mappings" : {
    "info" : {
      "dynamic" : true,
      "properties" : {
        "name" : {
          "type" : "string",
          "analyzer" : "ik_max_word"
        },
        "desc" : {
          "type" : "string",
          "analyzer" : "ik_max_word"
        }
      }
    }
  }
}'
This request registers an ik analyzer for the user index; the mappings block declares the type, each property's datatype, and the analyzer it uses.
The name and desc fields under the info type both use ik_max_word.
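To confirm the settings took effect, the index mapping can be inspected:

curl -XGET 'http://192.168.58.147:9200/user/_mapping?pretty'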
Go back to /root and re-import the earlier data file:
cd /root && curl -XPOST '192.168.58.147:9200/user/info/_bulk?pretty' --data-binary @my.json

Test the analyzer's effect: 张 no longer matches anything; only 张三 does:

[root@node1 ~]# curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "term" : { "name" : "张" }
  }
}'
{
  "took" : 22,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

2》Request body search

》》term query
 Chinese search only works through the request body; the URI form fails on it, so real searches generally use the request body and URI search is kept for quick tests. The body is written in the query DSL. For example:
Query all documents:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
  }
}'
Query for documents whose name field contains the term 三:

"query" : {
  "term" : { "name" : "三" }
}
At the same level as query, the following parameters are supported:

  • timeout: no timeout by default
  • from: defaults to 0
  • size: defaults to 10
  • search_type: the type of search execution, one of dfs_query_then_fetch, dfs_query_and_fetch, query_then_fetch, query_and_fetch, count, scan; defaults to query_then_fetch
  • query_cache: when search_type=count, whether the query result is cached
  • terminate_after: the maximum number of documents to collect per shard, after which query execution terminates early; if set, the response has a boolean terminated_early field; defaults to no terminate_after
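A minimal sketch of timeout, from, and size sitting alongside query in the body:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "timeout": "10s",
  "from": 0,
  "size": 5,
  "query" : {
    "term" : { "desc" : "来自" }
  }
}'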

from and size give paging, and sort gives ordering (careful: do not use tabs in place of spaces in the request body, or the request fails):

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "from": 0,
  "size": 5,
  "sort": [
    { "_score": { "order": "desc" } },
    { "age": { "order": "desc" } }
  ],
  "query" : {
    "term" : { "desc" : "来自" }
  }
}'
This sorts documents by score, descending, then breaks ties by age, descending. Because I imported age quoted ("") as a string, sorting on it normally throws an exception:
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on
meaning string fields cannot be sorted or aggregated by default. One fix is to set fielddata to true: amend the earlier mapping so it includes an age property with fielddata enabled:
"properties" : {
  "name" : {
    "type" : "string",
    "analyzer" : "ik_max_word"
  },
  "desc" : {
    "type" : "string",
    "analyzer" : "ik_max_word"
  },
  "age" : {
    "type" : "string",
    "fielddata" : true
  }
}
The other fix is to change the type to integer (the fielddata route can even be applied in place, without reindexing; see the sketch after the _source example below):
"age" : {
  "type" : "integer"
}
_source selects which fields to return:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "term" : { "desc" : "来自" }
  },
  "_source": ["name", "desc"]
}'
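As promised above, a sketch of the in-place fielddata toggle: ES 5.x allows enabling fielddata on an existing text field through the mapping API, with no delete/reindex cycle (this assumes age was dynamically mapped as text, which is what ES 5.x does with quoted numbers):

curl -XPUT 'http://192.168.58.147:9200/user/_mapping/info?pretty' -d '
{
  "properties": {
    "age": {
      "type": "text",
      "fielddata": true
    }
  }
}'

Changing the type to integer, by contrast, always requires deleting the index, recreating it, and re-importing the data.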

》》terms query

  terms works like term but searches several terms at once:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "terms" : {
      "desc" : ["来自", "com"]
    }
  }
}'

》》match query

  Compare term with match:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "term" : { "desc" : "李 来自" }
  }
}'
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "match" : { "desc" : "李 来自" }
  }
}'
term treats 李 来自 as one literal term to look up, so it finds nothing. match analyzes the input and returns the documents containing either token, 李 or 来自, merged. There is also match_phrase, which like term requires the input to match as a whole: its analyzed tokens must appear together, in order:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": { "match_phrase": { "desc" : "李 来自" } }
}'
》》bool query
A bool query composes other queries. must requires every clause to match; here both 李 and 来自 must be present:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "desc": "李" } },
        { "match": { "desc": "来自" } }
      ]
    }
  }
}'
should matches when any one clause matches:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "should": [
        { "match": { "desc": "李" } },
        { "match": { "desc": "来自" } }
      ]
    }
  }
}'
must_not excludes documents containing the given terms:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "desc": "李" } },
        { "match": { "desc": "来自" } }
      ]
    }
  }
}'
The clauses can also be combined: here the term 来自 must appear and the term 李 must not:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "desc": "李" } }
      ],
      "must": [
        { "match": { "desc": "来自" } }
      ]
    }
  }
}'

》》regexp query

 Regular-expression patterns are supported, for example:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "regexp" : { "desc" : "李.*" }
  }
}'
》》prefix query
 Matches terms beginning with the given prefix; the prefix applies to a single term:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query" : {
    "prefix" : { "desc" : "李" }
  }
}'

》》multi_match query

Searches several fields for the same value:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "multi_match" : {
      "query" : "李",
      "fields": ["name", "desc"]
    }
  }
}'
》》range query

The range operators are:

  • gt : greater than
  • gte: greater than or equal to
  • lt : less than
  • lte: less than or equal to
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "range" : {
      "age" : { "gt": 20 }
    }
  }
}'
Suppose we need age >= 20 and age <= 30.
With a single range:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "range" : {
      "age" : { "gte": 20, "lte": 30 }
    }
  }
}'
Or with bool (both ranges go into one must array; a JSON object cannot repeat the must key):
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "range": { "age": { "gte": 20 } } },
        { "range": { "age": { "lte": 30 } } }
      ]
    }
  }
}'
》》exists and missing filters

  The exists and missing filters find documents that do or do not contain a given field.

For example, all docs that have an age field:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "exists": { "field": "age" }
  }
}'
All docs without an age field: the missing filter is deprecated; the replacement is bool plus exists:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must_not": {
        "exists" : {
          "field": "age"
        }
      }
    }
  }
}'

3》Filters

      Every search result includes a _score field. This score is a relative measure of how well the document matches the query: the higher the score, the more relevant the document;
the lower the score, the less relevant.

Every query in Elasticsearch triggers this relevance-score calculation. For scenarios where the score is not needed, Elasticsearch provides filters as an alternative form of query. Filters are conceptually similar to queries, but they execute much faster, for two reasons:

    Filters compute no relevance score, so they are cheaper to evaluate.
    Filters can be cached in memory, which makes repeated searches far faster than the equivalent scored query.
For example, finding docs that have an age field can be written as a query (as above) or as a filter; the filter goes inside bool:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "query": {
    "bool": {
      "filter": {
        "exists" : {
          "field": "age"
        }
      }
    }
  }
}'
In the results every _score is 0, e.g.:
{
  "_index" : "user",
  "_type" : "info",
  "_id" : "7",
  "_score" : 0.0,
  "_source" : {
    "id" : "7",
    "country" : "中国",
    "provice" : "湖南省",
    "city" : "长沙市",
    "age" : "18",
    "name" : "张三",
    "desc" : "张三来自长沙市 是公务员一名"
  }
}
4》Aggregations

 Numeric types (integer, float, long, and so on) can be aggregated directly. Aggregating a string field requires fielddata=true on that field. Suppose we aggregate (GROUP BY) on country: first amend the mapping (delete the index, recreate it, re-import the data) to add:

"country" : {
  "type" : "string",
  "analyzer" : "ik_max_word",
  "fielddata": true
},
"provice" : {
  "type" : "string",
  "analyzer" : "ik_max_word",
  "fielddata": true
},
"city" : {
  "type" : "string",
  "analyzer" : "ik_max_word",
  "fielddata": true
}

Run a simple aggregation (group by a field and count how many documents land in each group). Mind the analyzer here; otherwise grouping happens token by token:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      }
    }
  }
}'
The aggregation part of the result:

"aggregations" : {
  "group_by_country" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "美国",
        "doc_count" : 6
      },
      {
        "key" : "中国",
        "doc_count" : 5
      }
    ]
  }
}
The equivalent SQL:
SELECT country AS key, COUNT(*) AS doc_count FROM user_info GROUP BY country ORDER BY COUNT(*) DESC
group_by_country above is just a name given to this grouping. Several groupings can be declared side by side; the result then contains both, independent of each other:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      }
    },
    "group_by_provice": {
      "terms": {
        "field": "provice"
      }
    }
  }
}'
size: 0 above means only the grouped results are returned, much like facets in Solr.

If size is set to a non-zero number, the response also carries the matching documents themselves, similar to Solr grouping; see the sketch below.
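For instance (an illustrative request, not from the original run), size 2 returns the first two hits in hits.hits alongside the country buckets in aggregations:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 2,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      }
    }
  }
}'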

ES aggregations can be nested: for example, group by country, then group each country's documents by city:

curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "group_by_city": {
          "terms": {
            "field": "city"
          }
        }
      }
    }
  }
}'
Nested aggregations can also apply metric functions to get averages, maxima, or minima. Here each group's average age is computed and the groups are ordered by average age, descending; max, min, sum and similar metrics work the same way:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country",
        "order": { "avg_age": "desc" }
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}'
Grouping by value ranges is supported too: for example, group by country, then bucket each country by age band:
curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "group_by_age_range": {
          "range": {
            "field": "age",
            "ranges": [
              { "from": 10, "to": 20 },
              { "from": 20, "to": 40 },
              { "from": 40, "to": 100 }
            ]
          }
        }
      }
    }
  }
}'

II. Searching with the Java API

The search API from Java, using the transport client:

package es;

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
import org.elasticsearch.search.aggregations.bucket.terms.StringTerms.Bucket;
import org.elasticsearch.search.aggregations.metrics.avg.Avg;
import org.elasticsearch.search.sort.SortOrder;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

/**
 * Search examples against the /user/info data indexed earlier.
 * @author jiaozi
 */
public class Search {

    /**
     * Query all docs; equivalent to:
     *   curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d '{ "query": {} }'
     * The JSON response nests a hits array inside the hits object; the Java API
     * mirrors that structure, and each hit exposes its _source.
     */
    public static void search() {
        SearchResponse searchResponse = client.prepareSearch("user")
                .setTypes("info")
                .addSort("age", SortOrder.ASC)
                .setFrom(0).setSize(5)
                .get();
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (int i = 0; i < hits.length; i++) {
            System.out.println(hits[i].getSource());
        }
    }

    /**
     * Equivalent to:
     *   curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d
     *     '{ "query": { "term": { "desc": "来自" } }, "_source": ["name", "desc"] }'
     */
    public static void searchTerms() {
        SearchResponse searchResponse = client.prepareSearch("user")
                .setTypes("info")
                .addSort("age", SortOrder.ASC)
                // QueryBuilders can also build the other query types, e.g. match,
                // match_phrase, regexp, prefix:
                .setQuery(QueryBuilders.termQuery("desc", "来自"))          // scored search
                //.setQuery(QueryBuilders.rangeQuery("age").lte(30))        // range query
                //.setQuery(QueryBuilders.regexpQuery("desc", "来自"))       // regexp
                //.setQuery(QueryBuilders.prefixQuery("desc", "张三"))       // prefix
                //.setQuery(QueryBuilders.matchQuery("desc", "张三 来自"))    // match
                //.setQuery(QueryBuilders.existsQuery("desc"))              // exists: is the field present
                //.setPostFilter(QueryBuilders.termQuery("desc", "来自"))    // filter: no scoring
                .setFrom(0).setSize(5)
                .get();
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (int i = 0; i < hits.length; i++) {
            System.out.println(hits[i].getSource());
        }
    }

    /**
     * Equivalent to the bool query:
     *   curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d
     *     '{ "query": { "bool": { "must": [ { "range": { "age": { "gte": 20 } } },
     *                                       { "range": { "age": { "lte": 40 } } } ] } } }'
     */
    public static void searchBool() {
        SearchResponse searchResponse = client.prepareSearch("user")
                .setTypes("info")
                .setQuery(QueryBuilders.boolQuery()
                        .must(QueryBuilders.rangeQuery("age").lte(40))
                        .must(QueryBuilders.rangeQuery("age").gte(20))
                        //.mustNot(queryBuilder)
                        //.should(queryBuilder)
                )
                .get();
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (int i = 0; i < hits.length; i++) {
            System.out.println(hits[i].getSource());
        }
    }

    /**
     * Aggregation; equivalent to:
     *   curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d
     *     '{ "size": 0, "aggs": { "group_by_country": { "terms": { "field": "country" } } } }'
     */
    public static void aggs() {
        SearchResponse searchResponse = client.prepareSearch("user")
                .addAggregation(AggregationBuilders.terms("group_by_country").field("country"))
                .setSize(4)
                .get();
        StringTerms terms = searchResponse.getAggregations().get("group_by_country");
        List<Bucket> buckets = terms.getBuckets();
        for (int i = 0; i < buckets.size(); i++) {
            Bucket bucket = buckets.get(i);
            System.out.println(bucket.getKey() + "----" + bucket.getDocCount());
        }
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (int i = 0; i < hits.length; i++) {
            System.out.println(hits[i].getSource());
        }
    }

    /**
     * Nested metric aggregation; equivalent to:
     *   curl -XPOST '192.168.58.147:9200/user/_search?pretty' -d
     *     '{ "size": 0, "aggs": { "group_by_country": { "terms": { "field": "country" },
     *        "aggs": { "avg_age": { "avg": { "field": "age" } } } } } }'
     */
    public static void aggsAvgs() {
        SearchResponse searchResponse = client.prepareSearch("user")
                .addAggregation(AggregationBuilders.terms("group_by_country").field("country")
                        .subAggregation(AggregationBuilders.avg("avg_age").field("age")))
                .get();
        StringTerms terms = searchResponse.getAggregations().get("group_by_country");
        List<Bucket> buckets = terms.getBuckets();
        for (int i = 0; i < buckets.size(); i++) {
            Bucket bucket = buckets.get(i);
            Avg st = bucket.getAggregations().get("avg_age");
            System.out.println(bucket.getKey() + "----" + bucket.getDocCount());
            System.out.println(st.getValue());
        }
    }

    public static void main(String[] args) {
        aggsAvgs();
    }

    static TransportClient client;
    static {
        try {
            client = getClient();
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
    }

    /**
     * Build the client used to talk to the cluster.
     * @return the transport client
     * @throws UnknownHostException if the host cannot be resolved
     */
    public static TransportClient getClient() throws UnknownHostException {
        Settings settings = Settings.builder()
                .put("cluster.name", "my_cluster_name")
                //.put("index.analysis.analyzer.default.type", "ik_max_word")
                .build();
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("192.168.58.147"), 9300));
        return client;
    }
}