Elasticsearch: Multi-Value Full-Text Search on a Single Field

Source: Internet  Editor: 程序博客网  Time: 2024/04/30 02:15

Add the test data. The update actions assume that documents 1 to 5 already exist in forum/article:

POST /forum/article/_bulk
{ "update": { "_id": "1"} }
{ "doc" : {"title" : "this is java and elasticsearch blog"} }
{ "update": { "_id": "2"} }
{ "doc" : {"title" : "this is java blog"} }
{ "update": { "_id": "3"} }
{ "doc" : {"title" : "this is elasticsearch blog"} }
{ "update": { "_id": "4"} }
{ "doc" : {"title" : "this is java, elasticsearch, hadoop blog"} }
{ "update": { "_id": "5"} }
{ "doc" : {"title" : "this is spark blog"} }
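For readers who send this request from a script instead of Kibana, here is a minimal sketch that builds the same newline-delimited body in Python. The helper name `build_bulk_update_body` is hypothetical, and the code only builds the string; it does not contact a cluster. The `_bulk` API expects every action line and every document line on its own line, and the body must end with a newline.

```python
import json

# Test titles from the bulk request above, keyed by document _id.
TITLES = {
    "1": "this is java and elasticsearch blog",
    "2": "this is java blog",
    "3": "this is elasticsearch blog",
    "4": "this is java, elasticsearch, hadoop blog",
    "5": "this is spark blog",
}


def build_bulk_update_body(titles):
    """Return an NDJSON string: one action line plus one partial-doc line per id."""
    lines = []
    for doc_id, title in titles.items():
        lines.append(json.dumps({"update": {"_id": doc_id}}))
        lines.append(json.dumps({"doc": {"title": title}}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_update_body(TITLES), end="")
```

The resulting string would be sent as the body of `POST /forum/article/_bulk` with whatever HTTP client is at hand.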
  • Search for docs whose title contains java or elasticsearch. Unlike the earlier term searches, this is a full-text search:
GET /forum/article/_search
{
  "query": {
    "match": {
      "title": "java elasticsearch"
    }
  }
}
  • To find docs whose title contains both java and elasticsearch, set the operator to and:
GET /forum/article/_search
{
  "query": {
    "match": {
      "title": {
        "query": "java elasticsearch",
        "operator": "and"
      }
    }
  }
}
  • To find docs whose title contains at least 3 of the 4 terms java, elasticsearch, spark, and hadoop (75% of 4 terms is 3):
GET /forum/article/_search
{
  "query": {
    "match": {
      "title": {
        "query": "java elasticsearch hadoop spark",
        "minimum_should_match": "75%"
      }
    }
  }
}
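To see which test documents each of the three match queries above should return, the following sketch replays them in plain Python without a cluster. It is only an approximation under stated assumptions: `analyze` is a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), `required_matches` and `search` are hypothetical helper names, only positive percentages are handled, and relevance scoring is ignored, so hits come back in _id order.

```python
import re

# Titles of the five test documents, keyed by _id.
TITLES = {
    "1": "this is java and elasticsearch blog",
    "2": "this is java blog",
    "3": "this is elasticsearch blog",
    "4": "this is java, elasticsearch, hadoop blog",
    "5": "this is spark blog",
}


def analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real one)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def required_matches(term_count, operator="or", minimum_should_match=None):
    """Number of query terms a document must contain to count as a hit."""
    if operator == "and":
        return term_count
    if minimum_should_match is None:
        return 1
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # A positive percentage is rounded down: 75% of 4 terms gives 3.
        needed = term_count * percent // 100
    else:
        needed = int(minimum_should_match)
    # With only optional terms, at least one still has to match.
    return max(1, needed)


def search(query, operator="or", minimum_should_match=None):
    """Return the ids of titles containing enough of the analyzed query terms."""
    terms = set(analyze(query))
    needed = required_matches(len(terms), operator, minimum_should_match)
    return [
        doc_id
        for doc_id, title in TITLES.items()
        if len(terms & set(analyze(title))) >= needed
    ]


if __name__ == "__main__":
    print(search("java elasticsearch"))                  # ['1', '2', '3', '4']
    print(search("java elasticsearch", operator="and"))  # ['1', '4']
    print(search("java elasticsearch hadoop spark",
                 minimum_should_match="75%"))            # ['4']
```

Only document 4 contains three of the four terms, which is why the 75% query narrows the result to a single hit.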

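The same search can be written with bool. Note that should clauses are fully optional only when the bool query also has a must or filter clause; when should stands alone, at least one clause must match by default, and minimum_should_match raises that threshold:

GET /forum/article/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "java" } },
        { "match": { "title": "elasticsearch" } },
        { "match": { "title": "hadoop" } },
        { "match": { "title": "spark" } }
      ],
      "minimum_should_match": 3
    }
  }
}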
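When the list of terms comes from user input, writing one should clause per term by hand gets tedious. The sketch below generates the same request body from a list; `bool_should_query` is a hypothetical helper name, and the code only builds a dictionary, leaving the choice of client to the reader.

```python
import json


def bool_should_query(field, terms, minimum_should_match):
    """Build a bool query with one match clause per term."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: term}} for term in terms],
                "minimum_should_match": minimum_should_match,
            }
        }
    }


body = bool_should_query("title", ["java", "elasticsearch", "hadoop", "spark"], 3)

if __name__ == "__main__":
    # Pretty-print the body so it can be pasted after GET /forum/article/_search.
    print(json.dumps(body, indent=2))
```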

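In fact, when you run a multi-value search with a match query like the ones above, Elasticsearch automatically converts the match query into bool syntax under the hood.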

For example:

{
  "query": {
    "match": {
      "title": "java elasticsearch"
    }
  }
}

is converted to:

GET /forum/article/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title": { "value": "java" } } },
        { "term": { "title": { "value": "elasticsearch" } } }
      ]
    }
  }
}

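The key point is that the only difference between the two is analysis: match runs the search text through the analyzer and splits it into terms, whereas term uses the search input as-is, without analyzing it.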
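The conversion described above can be sketched in a few lines of Python. This is an illustration of the idea, not Elasticsearch's actual code: `rewrite_match` is a hypothetical name, and `analyze` only approximates the standard analyzer by lowercasing and splitting on non-alphanumeric characters. The analysis step happens before the term clauses are built, which is exactly what a hand-written term query skips.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real one)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def rewrite_match(field, query, operator="or", minimum_should_match=None):
    """Express a match query as a bool query over term clauses (illustrative only)."""
    # Analysis happens here; each resulting token becomes one term clause.
    clauses = [{"term": {field: {"value": tok}}} for tok in analyze(query)]
    if operator == "and":
        # Every term is required.
        return {"bool": {"must": clauses}}
    rewritten = {"bool": {"should": clauses}}
    if minimum_should_match is not None:
        rewritten["bool"]["minimum_should_match"] = minimum_should_match
    return rewritten


if __name__ == "__main__":
    print(rewrite_match("title", "Java Elasticsearch"))
    print(rewrite_match("title", "java elasticsearch", operator="and"))
```

Because the query text is lowercased and split first, `"Java Elasticsearch"` still yields the term clauses `java` and `elasticsearch`, while a single term query on the raw string `"Java Elasticsearch"` would look for that exact token and find nothing in an analyzed title field.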