ES Search APIs (continuously updated)

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 20:16

https://www.elastic.co/guide/en/elasticsearch/reference/2.1/search.html
Most search APIs are multi-index and multi-type (names are comma-separated), with the exception of the Explain API endpoints.

Routing
When a search is executed, it is broadcast to all shards of the target index or indices (round robin between replicas). The routing parameter controls which shards are searched.

..._search?routing=xxx
If documents were indexed with a routing value, pass the same routing value when searching to reduce the number of shards that are searched.
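As a rough picture of why routing narrows a search, the sketch below models shard selection as a hash of the routing value modulo the number of primary shards. It is a toy model only: `zlib.crc32` stands in for Elasticsearch's internal hash function, and `NUM_SHARDS`, `shard_for` and `shards_to_search` are names made up for this example.

```python
import zlib

NUM_SHARDS = 5  # hypothetical number of primary shards in the index


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing value to one shard (toy model, crc32 as stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def shards_to_search(routing=None, num_shards: int = NUM_SHARDS):
    """Without routing every shard is queried; with routing only one is."""
    if routing is None:
        return list(range(num_shards))
    return [shard_for(routing, num_shards)]


print(shards_to_search())          # every shard of the index
print(shards_to_search("user42"))  # the single shard that holds this routing value
```

The same routing value always maps to the same shard, which is why the value used at search time has to match the one used at index time.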

Stats Groups
A search can be associated with stats groups, which maintain a statistics aggregation per group.

GET index/type/_search
{
    "size": 1000,
    "stats" : ["c1", "c2"]
}

GET _stats/search?groups=c1,c2&level=shards    (shard level)
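The two requests above can be assembled programmatically. This is a minimal sketch using only the standard library; the helper names `tagged_search_body` and `stats_path` are invented for illustration, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def tagged_search_body(groups, size=10):
    """Search body whose statistics are recorded under the given groups."""
    return {"size": size, "stats": list(groups)}


def stats_path(groups, level="shards"):
    """Path of the stats request that reads those groups back."""
    # safe="," keeps the comma-separated group list readable in the URL
    query = urlencode({"groups": ",".join(groups), "level": level}, safe=",")
    return "_stats/search?" + query


print(json.dumps(tagged_search_body(["c1", "c2"], size=1000)))
print(stats_path(["c1", "c2"]))
```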

Global Search Timeout
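Cluster level, dynamically updatable: search.default_search_timeout=-1, where -1 means no timeout is set; the unit is milliseconds.
Request level: the timeout parameter.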

Search

query –> hits
Query methods:

URI Search
Request Body Search

..._search?q=field_name:field_value

Multi-Index, Multi-Type

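_search                    search the whole cluster
_all/type/_search          search the given type across all indices
index/_search              search a single index
index/type/_search         single index, single type
idx1,idx2/t1,t2/_search    multiple indices, multiple types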
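The path patterns above follow one rule: comma-join the index names, comma-join the type names, then append `_search`. A small sketch of that rule follows; `search_path` is a hypothetical helper, not part of any client library.

```python
def search_path(indices=(), types=()):
    """Build the path part of a search request from index and type names."""
    if types and not indices:
        indices = ["_all"]  # a type needs an index segment; _all means every index
    segments = [",".join(names) for names in (indices, types) if names]
    return "/".join(segments + ["_search"])


print(search_path())                                # _search
print(search_path(types=["tweet"]))                 # _all/tweet/_search
print(search_path(["idx1", "idx2"], ["t1", "t2"]))  # idx1,idx2/t1,t2/_search
```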

URI Search

Request parameters are passed in the URI. This mode suits simple queries; not all search options are available.
It is handy for quick "curl tests".

..._search?q=field_name:value

Parameters:

q: the query string (maps to the query_string query).
df: the default field to use when no field prefix is defined within the query.
analyzer: the analyzer used to analyze the query string.
lowercase_expanded_terms: default true. Whether terms are automatically lowercased. Analyzed fields are usually unaffected because the analyzer has already lowercased their terms; the setting mainly matters for not_analyzed fields.
analyze_wildcard: default false. Whether wildcard and prefix queries are analyzed. By default, wildcard terms in a query string are not analyzed, but wildcard queries still work.
https://github.com/elastic/elasticsearch/issues/18592
default_operator: OR (default) or AND.
lenient: default false. If set to true, format-based failures (like providing text to a numeric field) are ignored.
explain: for each hit, include an explanation of how its score was computed.
_source: set to false to disable returning the _source field. To return only some fields, use _source_include and _source_exclude.
```
_search?_source_include=store* 
```
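fields: a comma-separated list of fields to return. When this parameter is present, the _source field is no longer returned. If a field is not stored (store=no, the default), its value is extracted from _source.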
```
    _search?fields=rdcName,skuId,storeName
```
sort: the sort order, _score by default. Accepts fieldName, fieldName:asc or fieldName:desc. To sort on several fields, separate them with commas; the order of the fields matters.
track_scores: when sorting, set to true in order to still track scores and return them as part of each hit.
timeout: the search timeout, none by default. When the timeout is hit, the response may contain partial results or no results.
terminate_after: the maximum number of documents to collect for each shard, upon reaching which the query execution will terminate early. If set, the response will have a boolean field terminated_early to indicate whether the query execution has actually terminated_early.
from: default 0. The starting offset into the hits.
size: default 10. The number of hits to return.
search_type: dfs_query_then_fetch, query_then_fetch (default), scan (replaced by scroll), count (replaced by size=0).
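To see how several of these parameters combine into one URI search, here is a minimal sketch that URL-encodes them with the standard library. `uri_search` is a hypothetical helper, and the index, type and field names are borrowed from the twitter examples used elsewhere in these notes.

```python
from urllib.parse import urlencode


def uri_search(path, **params):
    """Append URI-search parameters to a _search path, skipping unset ones."""
    given = {k: v for k, v in params.items() if v is not None}
    return path + ("?" + urlencode(given) if given else "")


url = uri_search(
    "twitter/tweet/_search",
    q="user:kimchy",
    default_operator="AND",
    sort="post_date:desc",
    size=5,
)
print(url)
```

Because `from` is a Python keyword, it would have to be passed as `**{"from": 20}` with this helper.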

Request Body Search

$ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}'

Parameters:

timeout
from/size
search_type
terminate_after
request_cache: true | false (default). When size=0, the cache holds aggregations, suggestions and hits.total, but not the hits themselves. Note that queries using now or similar cannot be cached.
Cached results are invalidated automatically whenever the shard refreshes.
The longer the refresh interval, the longer that cached entries will remain valid.
If the cache is full, the least recently used cache keys will be evicted.
Manual clearing: POST _cache/clear?request_cache=true
Best suited to indices that change infrequently.
How to enable it:
1. At index creation (the setting is dynamic, so it can also be changed later):
  "index.requests.cache.enable": true
2. Per request:
  _search?request_cache=true
Note: if your query uses a script whose result is not deterministic (for example one that uses a random function or the current time), set request_cache to false.
Cache size (elasticsearch.yml):
indices.requests.cache.size: 1%    (the default, as a percentage of the heap)
Monitoring cache usage (in bytes):
```
GET _stats/request_cache?pretty&human                  (indices stats)
GET _nodes/stats/indices/request_cache?pretty&human    (nodes stats)
```
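The invalidation and eviction rules described for the request cache (entries dropped on refresh, least recently used keys evicted when full) can be pictured with a toy cache. This only models the described behaviour and is not Elasticsearch's implementation: the class, its method names and the `max_entries` limit are invented here, and the real cache is sized in bytes rather than entries.

```python
from collections import OrderedDict


class ToyRequestCache:
    """Toy LRU cache keyed by request; emptied when the shard refreshes."""

    def __init__(self, max_entries=2):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, result):
        self.entries[key] = result
        self.entries.move_to_end(key)
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used key

    def on_refresh(self):
        self.entries.clear()  # cached results are no longer valid


cache = ToyRequestCache(max_entries=2)
cache.put("agg-a", {"total": 10})
cache.put("agg-b", {"total": 20})
cache.get("agg-a")                 # agg-a becomes the most recently used key
cache.put("agg-c", {"total": 30})  # cache is full, so agg-b is evicted
print(list(cache.entries))         # ['agg-a', 'agg-c']
cache.on_refresh()
print(list(cache.entries))         # []
```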

search_type and request_cache must be passed as URL parameters; the other options go in the request body.
The request body can also be passed as the value of the source REST parameter.
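The split between URL parameters and body options can be sketched as a small routing step on the client side. The helper `split_options` and the sample option values are assumptions for illustration; only the rule that search_type and request_cache belong in the URL comes from the notes above.

```python
import json
from urllib.parse import urlencode

URL_ONLY = {"search_type", "request_cache"}  # options that must travel in the URL


def split_options(options):
    """Separate search options into a query string and a JSON request body."""
    url_params = {k: v for k, v in options.items() if k in URL_ONLY}
    body = {k: v for k, v in options.items() if k not in URL_ONLY}
    return urlencode(url_params), json.dumps(body)


query_string, body = split_options({
    "request_cache": "true",
    "size": 0,
    "timeout": "2s",
    "query": {"term": {"user": "kimchy"}},
})
print("_search?" + query_string)
print(body)
```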

Both HTTP GET and HTTP POST can be used to execute search with body. Since not all clients support GET with body, POST is allowed as well.

Query

{
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}

from/size(分页)

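from: the offset from the first result you want to fetch.
size: the maximum number of hits to be returned, i.e. the page size.

Both parameters can be set either as request parameters or within the search body.

Note: from + size cannot exceed index.max_result_window (default 10,000); to page deeper than that, use Scroll.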
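Page-based navigation maps onto from/size with simple arithmetic, and the window limit is easy to check before sending a request. A minimal sketch follows, assuming 1-based page numbers; `page_params` is a hypothetical helper and the limit is hard-coded to the default mentioned above.

```python
MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window


def page_params(page, page_size, max_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into from/size, or fail if too deep."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    if start + page_size > max_window:
        raise ValueError("from + size exceeds the result window; use Scroll")
    return {"from": start, "size": page_size}


print(page_params(1, 10))  # {'from': 0, 'size': 10}
print(page_params(3, 25))  # {'from': 50, 'size': 25}
```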

Sort

Sorts on one or more fields.
Points to watch: the field, the sort direction (order), and the order in which the fields are listed.
Sort options are defined at the per-field level.

{
    "sort" : [
        { "post_date" : {"order" : "asc"}},
        "user",
        { "name" : "desc" },
        { "age" : "desc" },
        "_score"
    ],
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}

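_score and _doc
_score: sort by relevance score.
_doc: sort by index order. This is the most efficient sort and the one to use when the order of the returned documents does not matter, for example when scrolling.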

sort values
When sort fields are specified, each hit in the response also includes its sort values.

"_source": { ... },
"sort": [ comma-separated list of sort values ]

sort order
asc: ascending
desc: descending
The default is desc when sorting on _score (highest score first) and asc for everything else.
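The shorthand forms and default orders can be made explicit with a small normalisation step. This is a client-side illustration of the defaults described above, not how Elasticsearch parses the sort clause internally; `normalize_sort` is an invented name.

```python
def normalize_sort(sort):
    """Expand shorthand sort entries into {field: {"order": ...}} form."""
    normalized = []
    for entry in sort:
        if isinstance(entry, str):
            # Bare field name: _score defaults to desc, everything else to asc.
            order = "desc" if entry == "_score" else "asc"
            normalized.append({entry: {"order": order}})
            continue
        (field, spec), = entry.items()  # each entry holds exactly one field
        if isinstance(spec, str):       # {"name": "desc"} shorthand
            spec = {"order": spec}
        normalized.append({field: dict(spec)})
    return normalized


sort = [{"post_date": {"order": "asc"}}, "user", {"name": "desc"}, "_score"]
for clause in normalize_sort(sort):
    print(clause)
```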

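Sort mode option
ES supports sorting by array or multi-valued fields. The mode option picks which value represents the document:

min: the lowest value
max: the highest value
sum: the sum of all values (numeric array fields only)
avg: the average of all values (numeric array fields only)
median: the median of all values (numeric array fields only)
Do not confuse avg (the mean) with median (the middle value).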
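The difference between the modes, and between avg and median in particular, is easiest to see on a concrete multi-valued field. The sketch below reduces a list of values the way each mode is described; the `prices` data and the helper name `sort_value` are made up for the example.

```python
from statistics import mean, median

SORT_MODES = {"min": min, "max": max, "sum": sum, "avg": mean, "median": median}


def sort_value(values, mode):
    """Reduce a multi-valued field to the single value used for sorting."""
    return SORT_MODES[mode](values)


prices = [1, 2, 9]  # hypothetical multi-valued price field of one document
for mode in SORT_MODES:
    print(mode, sort_value(prices, mode))
```

With these values the average is pulled up by the outlier 9 while the median stays at the middle value, so the two modes can rank the same documents differently.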

"sort" : [ {"price" : {"order" : "asc", "mode" : "avg"}} ]

Sorting within nested objects

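nested_path: defines which nested object to sort on. The sort field must be a direct field inside that nested object.
nested_filter: when the nested field holds multiple values, filters which inner objects are taken into account for sorting, so not every value takes part.
Nesting: object / nested object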

PUT /my_index/blogpost/2
{
  "title": "Investment secrets",
  "body":  "What they don't tell you ...",
  "tags":  [ "shares", "equities" ],
  "comments": [
    {
      "name":    "Mary Brown",
      "comment": "Lies, lies, lies",
      "age":     42,
      "stars":   1,
      "date":    "2014-10-18"
    },
    {
      "name":    "John Smith",
      "comment": "You're making it up!",
      "age":     28,
      "stars":   2,
      "date":    "2014-10-16"
    }
  ]
}

--------> Query

  "sort": {
    "comments.stars": {
      "order": "desc",
      "mode":  "min",
      "nested_filter": {
        "range": {
          "comments.date": {
            "gte": "2014-10-01",
            "lt":  "2014-11-01"
          }
        }
      }
    }
  }

Note: nested_filter only affects sorting; it does not change which documents match the search.
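What the nested sort computes for a single parent document can be mimicked in plain Python: keep only the comments inside the date range, then reduce their stars with the chosen mode. This is a simplified model that uses the blogpost comments from the example; `nested_sort_value` is a hypothetical helper and only the min and max modes are covered.

```python
comments = [
    {"name": "Mary Brown", "stars": 1, "date": "2014-10-18"},
    {"name": "John Smith", "stars": 2, "date": "2014-10-16"},
]


def nested_sort_value(nested_docs, field, mode, gte, lt):
    """Sort value for one parent: filter nested docs by date, then reduce."""
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    kept = [d[field] for d in nested_docs if gte <= d["date"] < lt]
    if not kept:
        return None  # nothing passed the filter, so the missing value would apply
    return {"min": min, "max": max}[mode](kept)


print(nested_sort_value(comments, "stars", "min", "2014-10-01", "2014-11-01"))
print(nested_sort_value(comments, "stars", "min", "2014-10-01", "2014-10-17"))
```

Narrowing the range changes the sort value of the parent document but would not change whether the document is returned.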

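Missing Values
How documents without a value for the sort field are handled: _last, _first, or a custom value (used as the sort value for those documents).

"sort" : [
    { "price" : {"missing" : "_last"} }
]

Note: when sorting on a nested field, the missing value is used if nested_filter matches no inner object.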
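The three options for missing can be compared on a tiny in-memory data set. The sketch below is a plain-Python model of the described behaviour, with invented documents and an invented helper name `sort_with_missing`.

```python
def sort_with_missing(docs, field, missing="_last", descending=False):
    """Order docs by `field`; `missing` decides where value-less docs go."""
    has_value = [d for d in docs if d.get(field) is not None]
    no_value = [d for d in docs if d.get(field) is None]
    ordered = sorted(has_value, key=lambda d: d[field], reverse=descending)
    if missing == "_last":
        return ordered + no_value
    if missing == "_first":
        return no_value + ordered
    # Custom value: value-less docs sort as if the field held `missing`.
    return sorted(
        docs,
        key=lambda d: d[field] if d.get(field) is not None else missing,
        reverse=descending,
    )


docs = [{"id": "a", "price": 30}, {"id": "b"}, {"id": "c", "price": 10}]
print([d["id"] for d in sort_with_missing(docs, "price")])            # ['c', 'a', 'b']
print([d["id"] for d in sort_with_missing(docs, "price", "_first")])  # ['b', 'c', 'a']
print([d["id"] for d in sort_with_missing(docs, "price", 20)])        # ['c', 'b', 'a']
```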

Ignoring Unmapped Fields
By default, the search request fails if a sort field has no mapping.
The unmapped_type option makes it possible to ignore fields that have no mapping and not sort by them.

"sort" : [
    { "price" : {"unmapped_type" : "long"} }
]

If the price field has no mapping in an index, ES behaves as if there was a mapping of type long, with all documents in this index having no value for this field.
