ES Search APIs (continuously updated)

Source: Internet, Editor: 程序博客网, Time: 2024/04/29 20:16

https://www.elastic.co/guide/en/elasticsearch/reference/2.1/search.html
Most search APIs are multi-index and multi-type (names are comma-separated), with the exception of the Explain API endpoints.

Routing
When executing a search, it will be broadcast to all the index/indices shards (round robin between replicas). Which shards will be searched on can be controlled by providing the routing parameter.

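..._search?routing=xxx
If documents were indexed with a routing value, pass the same routing value at search time to reduce the number of shards that are searched.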
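
As a minimal sketch of the routing parameter, the helper below only builds the search URL; it does not contact a cluster. The function name, the base URL, and the index name are assumptions made for illustration.

```python
from urllib.parse import urlencode


def routed_search_url(base, index, routing_values):
    """Build a _search URL limited to the shards selected by the routing values."""
    # Several routing values are sent as one comma-separated string.
    query = urlencode({"routing": ",".join(routing_values)}, safe=",")
    return f"{base}/{index}/_search?{query}"


# Hypothetical host and index, used only to show the resulting URL shape.
print(routed_search_url("http://localhost:9200", "twitter", ["kimchy"]))
```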

Stats Groups
A search can be associated with stats groups; Elasticsearch maintains a statistics aggregation per group.

    GET index/type/_search
    {
        "size": 1000,
        "stats" : ["c1", "c2"]
    }

    GET _stats/search?groups=c1,c2&level=shards    (shard level)

Global Search Timeout
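Cluster level, dynamically updatable: search.default_search_timeout. The default of -1 means no timeout is set; the unit is milliseconds.
Request level: the timeout parameter.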

Search

query –> hits
Ways to query:

  • URI Search
  • Request Body Search
..._search?q=fieldName:fieldValue

Multi-Index, Multi-Type

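  • _search: search across the whole cluster
  • _all/type/_search: search the given type across all indices
  • index/_search: search within a single index
  • index/type/_search: single index, single type
  • idx1,idx2/t1,t2/_search: multiple indices, multiple types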
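
The path patterns above can be sketched as a small helper. This is an illustration only; the function name search_path is an assumption and not part of any Elasticsearch client.

```python
def search_path(indices=None, types=None):
    """Return the _search path for the given index and type names."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        # A type without an index is addressed through the _all placeholder.
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/".join(parts)


print(search_path())                                # whole cluster
print(search_path(types=["tweet"]))                 # one type, all indices
print(search_path(["idx1", "idx2"], ["t1", "t2"]))  # several indices and types
```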

URI Search

Request parameters are supplied in the URI. This mode suits simple queries; some search options are not available.
It is well suited to quick "curl tests".

..._search?q=fieldName:value

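Parameters:

  • q: the query string (maps to the query_string query)
  • df: the default field to use when no field prefix is defined within the query
  • analyzer: the analyzer applied to the query string
  • lowercase_expanded_terms: default true. Whether expanded terms (those produced by wildcard, prefix, fuzzy, and range queries) should be automatically lowercased. Analyzed fields are generally unaffected, because the analyzer has already lowercased their terms; this mainly matters for not_analyzed fields.
  • analyze_wildcard: default false. Whether wildcard and prefix queries should be analyzed. By default, wildcard terms in a query string are not analyzed, but wildcard queries still work.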

    https://github.com/elastic/elasticsearch/issues/18592

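  • default_operator: OR (default) or AND.
  • lenient: default false. If set to true, format-based failures (such as providing text to a numeric field) are ignored.
  • explain: for each hit, include an explanation of how its score was computed.
  • _source: set to false to disable returning the _source field. To return only part of it, use _source_include and _source_exclude.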

    _search?_source_include=store* 
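  • fields: a comma-separated list of fields. When this parameter is present, the _source field is no longer returned. If a field is not stored (store=no, the default), its value is extracted from _source.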

        _search?fields=rdcName,skuId,storeName
  • sort: sorting; the default is by _score. Use fieldName, fieldName:asc, or fieldName:desc. To sort on several fields, separate them with commas; their order matters.
  • track_scores: when sorting, set to true to still track scores and return them as part of each hit.
  • timeout: the search timeout; there is none by default. When the timeout expires, the response may contain partial results or no results.
  • terminate_after: the maximum number of documents to collect for each shard; on reaching it, query execution terminates early. If set, the response has a boolean field terminated_early that indicates whether query execution actually terminated early.

  • from: default 0; the starting index of the hits to return.

  • size: default 10; the number of hits to return.
  • search_type: dfs_query_then_fetch, query_then_fetch (default), scan (deprecated; use scroll instead), count (deprecated; use size=0 instead)
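
A minimal sketch of how several of these parameters combine into one URI Search request. The helper name, host, and index are assumptions; the code only assembles the URL and sends nothing.

```python
from urllib.parse import urlencode


def uri_search_url(base, path, params):
    """Return a URI Search URL; booleans are rendered as lowercase true/false."""
    rendered = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    # Keep ':' ',' and '*' readable; other reserved characters are percent-encoded.
    return f"{base}/{path}?{urlencode(rendered, safe=':,*')}"


url = uri_search_url(
    "http://localhost:9200",   # hypothetical host
    "twitter/tweet/_search",   # hypothetical index and type
    {"q": "user:kimchy", "sort": "post_date:desc,_score", "from": 0, "size": 5, "explain": True},
)
print(url)
```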

Request Body Search

    $ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
        "query" : {
            "term" : { "user" : "kimchy" }
        }
    }'

Parameters:

  • timeout
  • from/size
  • search_type
  • terminate_after
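  • request_cache: true|false (default). With size=0, the cache stores aggregations, suggestions, and hits.total, but not the hits themselves. Avoid now and similar expressions in the query, because such requests cannot be cached.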

    Cached results are invalidated automatically whenever the shard refreshes.
    The longer the refresh interval, the longer that cached entries will remain valid.
    If the cache is full, the least recently used cache keys will be evicted.
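    Manual clearing: POST _cache/clear?request_cache=true
    Best suited to: indices that change infrequently.

    Configuration:

    1. Per index, at index creation (the setting can also be updated dynamically):
      "index.requests.cache.enable": true
    2. Per request:
      _search?request_cache=true

    Note: if your query uses a script whose result is not deterministic (for example, one that uses a random function or the current time), set request_cache to false.

    Cache size (elasticsearch.yml):
    indices.requests.cache.size: 1%    (the default, as a percentage of the heap)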

    Monitoring cache usage(bytes):

    GET _stats/request_cache?pretty&human                    (indices-stats)
    GET _nodes/stats/indices/request_cache?pretty&human      (nodes-stats)

search_type and request_cache must be passed as URL parameters; the other parameters go in the request body.
The request body can also be passed as the value of the source REST parameter.

Both HTTP GET and HTTP POST can be used to execute search with body. Since not all clients support GET with body, POST is allowed as well.
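
A minimal sketch of the split between URL parameters and the request body, assuming a flat dictionary of options as input. The helper name and the index path are illustrative; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Per the notes above, these two options travel in the URL, not in the body.
URL_ONLY_OPTIONS = {"search_type", "request_cache"}


def build_search_request(path, options):
    """Split options into a URL with query parameters and a JSON body string."""
    url_params = {k: v for k, v in options.items() if k in URL_ONLY_OPTIONS}
    body = {k: v for k, v in options.items() if k not in URL_ONLY_OPTIONS}
    url = path + ("?" + urlencode(url_params) if url_params else "")
    return url, json.dumps(body)


url, body = build_search_request(
    "twitter/tweet/_search",  # hypothetical index and type
    {"request_cache": "true", "size": 0, "query": {"term": {"user": "kimchy"}}},
)
print(url)
print(body)
```

The resulting URL and body can be sent with either GET or POST.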

Query

    {
        "query" : {
            "term" : { "user" : "kimchy" }
        }
    }

from/size (pagination)

from: the offset from the first result you want to fetch.
size: the maximum number of hits to be returned, that is, the page size.

Both parameters can be set either as request parameters or in the search body.

Note: from + size must not exceed index.max_result_window (default 10,000); for deeper paging, use Scroll.
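
A minimal sketch of turning a 1-based page number into from/size while respecting the window limit. The function name and the 1-based page convention are assumptions; the limit of 10,000 is the default quoted in the note above.

```python
MAX_RESULT_WINDOW = 10_000  # default index.max_result_window, as noted above


def page_params(page, page_size, max_result_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size > max_result_window:
        raise ValueError("from + size exceeds index.max_result_window; use Scroll")
    return {"from": offset, "size": page_size}


print(page_params(3, 20))
```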

Sort

Sort on one or more fields.
Points to watch: the field, the sort direction (order), and the order in which the fields are listed.
The sort is defined on a per-field level.

    {
        "sort" : [
            { "post_date" : {"order" : "asc"}},
            "user",
            { "name" : "desc" },
            { "age" : "desc" },
            "_score"
        ],
        "query" : {
            "term" : { "user" : "kimchy" }
        }
    }

_score and _doc
_score: sort by score.
_doc: sort by index order. Use it when the order of the returned documents does not matter; it is the most efficient sort, for example when scrolling.

sort values
When sort fields are specified, each hit in the response also includes its sort values.

    "_source": { ... },
    "sort": [ comma-separated list of sort values ]

sort order
asc: ascending
desc: descending
_score defaults to desc (highest score first); all other fields default to asc.

Sort mode option
ES supports sorting by array or multi-valued fields.

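  • min: the lowest value
  • max: the highest value
  • sum: the sum of all values; numeric arrays only
  • avg: the average of all values; numeric arrays only
  • median: the median of all values; numeric arrays only
    Note the difference between avg and median.

    "sort" : [ {"price" : {"order" : "asc", "mode" : "avg"}} ]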

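To make the difference between the sort modes concrete, here is a sketch that reduces a multi-valued field to one sort value using plain Python arithmetic. It illustrates the semantics only and is not Elasticsearch's implementation (rounding for integer fields, for instance, may differ); the function name and sample prices are assumptions.

```python
from statistics import median


def sort_value(values, mode):
    """Reduce a multi-valued field to a single sort value for the given mode."""
    reducers = {
        "min": min,
        "max": max,
        "sum": sum,
        "avg": lambda v: sum(v) / len(v),
        "median": median,
    }
    return reducers[mode](values)


prices = [10, 20, 90]  # hypothetical values of one document's price field
for mode in ("min", "max", "sum", "avg", "median"):
    print(mode, sort_value(prices, mode))
```

With these values avg is 40.0 while median is 20, which is why the two modes can order documents differently.
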
Sorting within nested objects

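  • nested_path: defines on which nested object to sort. The sort field must be a direct field inside that nested object.
  • nested_filter: when the nested field holds multiple objects, this filter selects the ones taken into account for sorting; not every value is considered.
  • Nesting: object / nested object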
    PUT /my_index/blogpost/2
    {
      "title": "Investment secrets",
      "body":  "What they don't tell you ...",
      "tags":  [ "shares", "equities" ],
      "comments": [
        {
          "name":    "Mary Brown",
          "comment": "Lies, lies, lies",
          "age":     42,
          "stars":   1,
          "date":    "2014-10-18"
        },
        {
          "name":    "John Smith",
          "comment": "You're making it up!",
          "age":     28,
          "stars":   2,
          "date":    "2014-10-16"
        }
      ]
    }

    --------> query

      "sort": {
        "comments.stars": {
          "order": "desc",
          "mode":  "min",
          "nested_filter": {
            "range": {
              "comments.date": {
                "gte": "2014-10-01",
                "lt":  "2014-11-01"
              }
            }
          }
        }
      }

Note: nested_filter affects only the sort; it does not change which documents the search returns.
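
The effect of nested_filter together with mode can be sketched in plain Python on the comments of the blog post example. This simulates the semantics described above and is not how Elasticsearch computes the value internally; the function and predicate names are assumptions.

```python
def nested_sort_value(nested_objects, field, mode, nested_filter=None):
    """Return one document's sort value, or None if the filter keeps no nested object."""
    kept = [o for o in nested_objects if nested_filter is None or nested_filter(o)]
    values = [o[field] for o in kept]
    if not values:
        return None  # the missing value would apply in this case
    return {"min": min, "max": max}[mode](values)


comments = [
    {"name": "Mary Brown", "stars": 1, "date": "2014-10-18"},
    {"name": "John Smith", "stars": 2, "date": "2014-10-16"},
]


def in_october_2014(comment):
    # ISO dates compare correctly as strings.
    return "2014-10-01" <= comment["date"] < "2014-11-01"


print(nested_sort_value(comments, "stars", "min", in_october_2014))
```

Both comments fall inside the range, so the document sorts by the lower of the two star ratings; a narrower date range would change the sort value while the document itself would still be returned.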

Missing Values
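The missing parameter controls how documents without a value for the sort field are handled: _last, _first, or a custom value.

    "sort" : [
        { "price" : {"missing" : "_last"} }
    ]

Note: when sorting on a nested field, the missing value is used if nested_filter matches no nested objects.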
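
A sketch of the three missing options, simulated on in-memory documents. It assumes that _first and _last place documents without a value at the start or end regardless of sort direction, and that a custom value is used as the sort value; the function name and sample documents are illustrative.

```python
def sort_with_missing(docs, field, missing="_last", reverse=False):
    """Order docs by field, placing documents without a value per `missing`."""
    has_value = [d for d in docs if d.get(field) is not None]
    no_value = [d for d in docs if d.get(field) is None]
    if missing in ("_first", "_last"):
        has_value.sort(key=lambda d: d[field], reverse=reverse)
        return no_value + has_value if missing == "_first" else has_value + no_value
    # Custom value: documents without the field sort as if they held `missing`.
    return sorted(
        docs,
        key=lambda d: missing if d.get(field) is None else d[field],
        reverse=reverse,
    )


docs = [{"id": 1, "price": 30}, {"id": 2}, {"id": 3, "price": 10}]
print([d["id"] for d in sort_with_missing(docs, "price", "_last")])
print([d["id"] for d in sort_with_missing(docs, "price", "_first")])
print([d["id"] for d in sort_with_missing(docs, "price", 20)])
```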

Ignoring Unmapped Fields
By default, the search request fails if the sort field has no mapping.
The unmapped_type option allows you to ignore fields that have no mapping and not sort by them.

    "sort" : [
        { "price" : {"unmapped_type" : "long"} }
    ]

If the price field has no mapping in an index, it is treated as if there were a mapping of type long, with all documents in that index having no value for this field.

Note: this avoids the search failure, but the sort has no effect for that index.
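
A minimal sketch of assembling a sort clause that sets unmapped_type alongside the other per-field options. The helper name is an assumption; it only builds the JSON body and does not validate option names.

```python
import json


def sort_clause(field, **options):
    """Return one sort entry; options left as None are dropped."""
    cleaned = {k: v for k, v in options.items() if v is not None}
    # A field without options is written as a bare string, as in the Sort example.
    return {field: cleaned} if cleaned else field


body = {
    "sort": [
        sort_clause("price", order="asc", missing="_last", unmapped_type="long"),
        sort_clause("_score"),
    ],
    "query": {"term": {"user": "kimchy"}},
}
print(json.dumps(body, indent=2))
```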

Source filtering

Fields

Script Fields

Field Data Fields

Post filter

Highlighting

Rescoring

Search Type

Scroll

Preference

Explain

Version

Index Boost

min_score

Named Queries

Inner hits

Search Template

Search Shards API

Suggesters

Multi Search API

Count API

Search Exists API

Validate API

Explain API

Percolator

Field stats API
