Elasticsearch Query Statements

Source: Internet  Editor: 程序博客网  Time: 2024/05/29 03:09

1. Elasticsearch Basic Concepts

GitHub repository for the complete search client: https://github.com/cweeyii/elasticsearch-parent
For basic Elasticsearch concepts, see: https://es.xiaoleilu.com/010_Intro/05_What_is_it.html
Cluster-mode installation: http://blog.csdn.net/cweeyii/article/details/71055884

2. Key Concepts

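  • Search type (searchType)
    When you only need the number of documents that match a condition, you can set the search type to count, which returns only the number of hits (equivalent to MySQL: select count(1) from table where valid=0).
    PS: this search type is now deprecated. Setting from=0 and size=0 on the search request gives the same result with the same efficiency.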
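// Build the search request: no hits are fetched, only the total is computed
SearchRequestBuilder builder = client.prepareSearch(indexName)
        .setQuery(searchCondition.getQueryBuilder())
        .setFrom(0)
        .setSize(0);
// Execute the request and read the number of matching documents
SearchResponse searchResponse = builder.execute().actionGet();
SearchHits hits = searchResponse.getHits();
long total = hits.getTotalHits();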
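  • Default objects (metadata fields)
    An index built by Elasticsearch contains several metadata fields, each starting with an underscore, for example _type, _id, _index and _source.
    These fields are very useful. For example, you can store the primary key of a user record in _id, which lets you update an ES record by primary key, fetch a record by id, or include or exclude specific ids in a query (see IdsQueryBuilder). In addition, if no custom routing is set, the shard that holds a record is chosen from its _id.
    _index: index name; _type: index type; _id: document id; _source: the field Elasticsearch uses to store the document body as JSON.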
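The default routing rule mentioned above can be sketched as follows. Elasticsearch documents the rule as shard = hash(routing) % number_of_primary_shards, where routing defaults to the document's _id. The class name `RoutingSketch` and the use of `String.hashCode()` are assumptions made purely for illustration; Elasticsearch uses its own hash function, so the shard numbers printed here will not match a real cluster.

```java
final class RoutingSketch {
    /** Picks a shard for a routing value; by default the routing value is the document _id. */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        // String.hashCode() is only a stand-in for Elasticsearch's real hash function.
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        int shards = 5; // same value as number_of_shards in this post's mapping
        for (String id : new String[] {"1001", "1002", "1003"}) {
            System.out.println("_id=" + id + " -> shard " + shardFor(id, shards));
        }
    }
}
```

Because the result depends on the number of primary shards, the same _id would land on a different shard if that number changed, which is why the primary shard count is fixed when the index is created.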
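  • Dynamic mapping
    When Elasticsearch encounters an unknown field, it uses dynamic mapping to determine the field's data type and automatically adds the field to the type mapping. In other words, you do not have to create the mapping yourself first: ES maps each field according to the type of the value you index, so a string field, for example, is analyzed (tokenized) and stored. This is very handy when you add a field to an indexed object: there is no need to delete or modify the mapping; just reindex the data and the new field shows up in the index.
    Sometimes, however, this behavior is not what you want. A mapping such as the following cannot be produced by dynamic mapping: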
{
   "settings": {
      "index": {
         "number_of_replicas": "1",
         "number_of_shards": "5"
      }
   },
   "mappings": {
      "enterprise_basic_info": {
         "_all": {
            "enabled": false
         },
         "properties": {
            "id": {
               "type": "long"
            },
            "enterpriseName": {
               "type": "string",
               "analyzer": "ik_max_word"
            },
            "address": {
               "type": "string",
               "analyzer": "ik_max_word"
            },
            "latitude": {
               "type": "double"
            },
            "longitude": {
               "type": "double"
            },
            "phone": {
               "type": "string",
               "index": "not_analyzed"
            },
            "businessCategory": {
               "type": "string",
               "index": "not_analyzed"
            },
            "cityName": {
               "type": "string",
               "index": "not_analyzed"
            },
            "districtName": {
               "type": "string",
               "index": "not_analyzed"
            },
            "valid": {
               "type": "long"
            },
            "location": {
               "type": "geo_point",
               "geohash_prefix": true,
               "geohash_precision": 12
            }
         }
      }
   }
}

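Here the city name and district name are both strings, but they should not be analyzed (tokenized). For this reason I recommend configuring the mapping yourself when you create an index.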

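  • Index aliases
    An index alias works somewhat like a pointer: it stores no data and creates no new index; it simply points at an index. Aliases are commonly used to switch indices quickly. For example, if the alias my_index initially points to my_index1, you can repoint my_index to my_index2 without restarting the service or changing any code.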
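The switch described above is typically done with a single request to the `_aliases` endpoint, whose `remove` and `add` actions are applied together. The following is a minimal sketch that only builds the request body as a string; the class and method names are hypothetical, sending the request is left out, and the names are assumed to contain no characters that need JSON escaping.

```java
final class AliasSwapSketch {
    /** Builds the JSON body for POST /_aliases that moves an alias from one index to another. */
    static String swapBody(String alias, String oldIndex, String newIndex) {
        // Both actions travel in one request, so readers of the alias never see it missing.
        return "{\"actions\":["
                + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
                + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}"
                + "]}";
    }

    public static void main(String[] args) {
        System.out.println(swapBody("my_index", "my_index1", "my_index2"));
    }
}
```

Client code keeps searching my_index throughout; only the target of the alias changes.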
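  • Difference between QueryBuilder and FilterBuilder
    At search time a FilterBuilder acts as a filter: it first narrows all records down by the filter condition, and the QueryBuilder query then runs on the filtered result. A FilterBuilder therefore also selects the records that satisfy a given condition, and its result is cached, so the next request with the same filter does not have to recompute it. In addition, unlike a QueryBuilder, a FilterBuilder does not compute document relevance scores, so it is faster. [explanation from the official documentation]
    PS: in my own experiment I saw no real speedup. Perhaps QueryBuilder has been optimized, or my index is too small (10,000 records) for the difference to show:
    See ElasticSearchConditionTest in the GitHub code referenced later in this post.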
int times = 10000;
// Filter version: the conditions are set as a filter
SearchCondition filterCondition = new SearchCondition();
filterCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());
long beginTime1 = System.currentTimeMillis();
for (int i = 0; i < times; i++) {
    List<EnterpriseBasicInfoDTO> basicInfoDTOList = enterpriseSearchHandle.getListByCondition(filterCondition, null);
}
long beginTime2 = System.currentTimeMillis();
// Log message: "ran Filter {} times, took {} seconds"
LOGGER.info("运行Filter {} 花费时间{} 秒", times, (beginTime2 - beginTime1) / 1000);
// Query version: the same conditions are set as a query
SearchCondition queryCondition = new SearchCondition();
queryCondition.setQueryBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());
for (int i = 0; i < times; i++) {
    List<EnterpriseBasicInfoDTO> basicInfoDTOList = enterpriseSearchHandle.getListByCondition(queryCondition, null);
}
long beginTime3 = System.currentTimeMillis();
// Log message: "ran query {} times, took {} seconds"
LOGGER.info("运行query {} 花费时间{} 秒", times, (beginTime3 - beginTime2) / 1000);

Result: over 10,000 runs the filter version is only a few seconds faster, which does not feel like much of a difference.

18:10:51.718  INFO (ElasticSearchConditionTest.java:41) - 运行Filter 10000 花费时间27 秒
18:11:26.416  INFO (ElasticSearchConditionTest.java:49) - 运行query 10000 花费时间34 秒
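The measurement above reports whole seconds and has no warm-up phase, so small differences are hard to interpret. Below is a minimal sketch of a slightly more careful timing helper. It is not the author's test code: `TimingSketch` and `averageNanos` are hypothetical names, and the `Runnable` stands in for a call such as `enterpriseSearchHandle.getListByCondition(...)`.

```java
final class TimingSketch {
    /** Runs the task `warmup` times untimed, then returns the mean time per timed run in nanoseconds. */
    static double averageNanos(Runnable task, int warmup, int runs) {
        // Warm-up runs give the JIT compiler and any caches a chance to settle first.
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / runs;
    }

    public static void main(String[] args) {
        int[] counter = {0};
        // A trivial task is used here; replace it with the search call being measured.
        double mean = averageNanos(() -> counter[0]++, 1_000, 10_000);
        System.out.println("calls=" + counter[0] + ", mean ns per call=" + mean);
    }
}
```

Running the filter and query variants through the same helper, several times and in alternating order, makes it easier to tell a real difference from noise.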
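  • Fast distance-range search with GeoHash
    GeoHash is an algorithm for quickly finding other records in a neighborhood (for example, merchants within 500 m). The idea is to divide the earth's surface into 32 cells, each identified by one character, and then to subdivide every cell recursively in the same way. The longer the geohash you configure, the smaller the area each cell covers. To find the points within a range, you only need to fetch the cell containing the query point together with its neighboring cells, and then run the expensive exact distance calculation on this pre-filtered data.
    [Figure: precision of geohash cells for different geohash lengths]
    The figure above shows the precision that corresponds to each geohash length. For example, an 11-character geohash has cells roughly 15 cm on a side, so neighboring points can be located at that granularity.
    To enable geohash on coordinates you need to add a new field. For example, I have a set of POIs with latitude and longitude; to support geohash range search, I add a location field to the mapping.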
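# In the mapping
"latitude": {
   "type": "double"
},
"longitude": {
   "type": "double"
},
"location": {
   "type": "geo_point",
   "geohash_prefix": true,
   "geohash_precision": 12
}
# In the class of the indexed type, the following getter is all that is needed. (fastjson serialization decides whether a location property exists purely from the get and set methods, so no backing field is required; the full implementation is in the GitHub repository linked below.)
public String getLocation() {
    return latitude + "," + longitude;
}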
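The recursive subdivision described above can be sketched as a small encoder. This is a minimal illustration of the standard geohash scheme (longitude and latitude bits interleaved, five bits per base-32 character), not Elasticsearch's internal implementation; the class name `GeoHashSketch` is hypothetical.

```java
final class GeoHashSketch {
    // Standard geohash alphabet: digits and lowercase letters without a, i, l, o.
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair as a geohash with the given number of characters. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonTurn = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (lonTurn) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else { value = value << 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else { value = value << 1; latMax = mid; }
            }
            lonTurn = !lonTurn;
            if (++bits == 5) { // every 5 bits become one character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5)); // prints u4pru
    }
}
```

Two points that share a long common prefix lie in the same cell, which is what makes prefix lookups a cheap first filter. The converse does not hold: nearby points on opposite sides of a cell boundary can have different prefixes, which is why the neighboring cells are fetched as well.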
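  • Elasticsearch query operations
    term (no analysis), exact match: TermQueryBuilder
    queryString (with analysis), analyzed match: QueryStringQueryBuilder. The QueryStringQueryBuilder.Operator setting (AND or OR) decides whether a document must contain all of the analyzed terms or only one of them.
    prefix, exact prefix match: if the indexed field is analyzed, a document matches when one of its terms starts with the given prefix; if the field is not analyzed, the field value as a whole must start with the prefix. PrefixQueryBuilder
    range (greater than, less than, between ... and): matches when the field value lies within the given range. RangeQueryBuilder
    notInId or idIn: include or exclude documents by id. IdsQueryBuilder
    fuzzy: matches by the edit distance between strings. FuzzyQueryBuilder
    wildcard: matches strings against a wildcard pattern. WildcardQueryBuilder
    geoDistance: matches points within a distance range, with a choice of distance calculation methods. GeoDistanceQueryBuilder
    geoHash: approximate range match based on geohash cells. GeohashCellQuery.Builder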
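To make the edit distance behind the fuzzy match in the list above concrete, here is a minimal sketch of the plain Levenshtein distance (insertions, deletions and substitutions each cost 1). The class name is hypothetical, and how Elasticsearch treats transpositions and the maximum allowed distance depends on the fuzziness settings of the query, which this sketch does not model.

```java
final class EditDistanceSketch {
    /** Returns the Levenshtein distance between two strings using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```

A fuzzy query with a maximum distance of 1 would, under this definition, accept "elastic" for "elastik" but not "kitten" for "sitting".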
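  • A general-purpose Elasticsearch package I developed
    GitHub: https://github.com/cweeyii/elasticsearch-parent
    client package: wraps the construction of query operations and the execution of searches; very convenient to use
    Key classes:
    Condition builder class for queries and filters: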
package com.cweeyii.operation;

import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.*;
import org.springframework.util.CollectionUtils;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Created by wenyi on 17/5/9.
 * Email:caowenyi@meituan.com
 */
public class OperationBuilderFactory {
    public static Builder builder() {
        return new Builder();
    }

    public static class Builder {
        // Clauses collected so far, grouped by how they are combined (MUST, MUST_NOT, SHOULD)
        private Map<OperationType, List<QueryBuilder>> queryBuilderMap = new ConcurrentHashMap<>();

        private Builder() {}

        // Exact match on a term that is not analyzed
        public Builder term(String field, Object value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new TermQueryBuilder(field, value));
            return this;
        }

        // Analyzed match; operator decides whether all terms (AND) or any term (OR) must match
        public Builder queryString(String field, String value, OperationType operationType, QueryStringQueryBuilder.Operator operator) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new QueryStringQueryBuilder(value).field(field).defaultOperator(operator));
            return this;
        }

        // Analyzed match with the default OR operator
        public Builder queryString(String field, String value, OperationType operationType) {
            return queryString(field, value, operationType, QueryStringQueryBuilder.Operator.OR);
        }

        public Builder prefix(String field, String prefix, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new PrefixQueryBuilder(field, prefix));
            return this;
        }

        public Builder range(String field, Object from, Object to, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new RangeQueryBuilder(field).from(from).to(to));
            return this;
        }

        // Adds an ids clause; whether the ids are included or excluded depends on operationType
        public Builder notInId(List<String> ids, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new IdsQueryBuilder().ids(ids));
            return this;
        }

        public Builder fuzzy(String field, Object value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new FuzzyQueryBuilder(field, value));
            return this;
        }

        public Builder wildcard(String field, String value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new WildcardQueryBuilder(field, value));
            return this;
        }

        // Points within `distance` meters of (lat, lon)
        public Builder geoDistance(String field, double lat, double lon, double distance, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new GeoDistanceQueryBuilder(field).point(lat, lon).distance(distance, DistanceUnit.METERS));
            return this;
        }

        // Points in the geohash cell of (lat, lon) at the given precision, plus the neighboring cells
        public Builder geoHash(String field, double lat, double lon, int precisionLevel, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new GeohashCellQuery.Builder(field).point(lat, lon).precision(precisionLevel).neighbors(true));
            return this;
        }

        // Combines all collected clauses into a single bool query
        public QueryBuilder builder() {
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            List<QueryBuilder> mustBuilders = getQueryBuilders(OperationType.MUST);
            if (!CollectionUtils.isEmpty(mustBuilders)) {
                for (QueryBuilder queryBuilder : mustBuilders) {
                    boolQueryBuilder.must(queryBuilder);
                }
            }
            List<QueryBuilder> mustNotBuilders = getQueryBuilders(OperationType.MUST_NOT);
            if (!CollectionUtils.isEmpty(mustNotBuilders)) {
                for (QueryBuilder queryBuilder : mustNotBuilders) {
                    boolQueryBuilder.mustNot(queryBuilder);
                }
            }
            List<QueryBuilder> shouldBuilders = getQueryBuilders(OperationType.SHOULD);
            if (!CollectionUtils.isEmpty(shouldBuilders)) {
                for (QueryBuilder queryBuilder : shouldBuilders) {
                    boolQueryBuilder.should(queryBuilder);
                }
            }
            return boolQueryBuilder;
        }

        public List<QueryBuilder> getQueryBuilders(OperationType operationType) {
            // Atomically create the list for this operation type on first use (requires Java 8)
            return queryBuilderMap.computeIfAbsent(operationType, key -> new ArrayList<>());
        }
    }
}

Search condition wrapper class: encapsulates the search conditions, sorting and aggregations

// Imports are omitted here, as in the original post; see the GitHub repository for the full file.
public class SearchCondition {
    private QueryBuilder queryBuilder = null;
    private QueryBuilder filterBuilder = null;
    private List<SortBuilder> orders = new ArrayList<>();
    private List<AbstractAggregationBuilder> aggregationBuilders = new ArrayList<>();
    private SearchType searchType;
    private int limit = 20;
    private int offset = 0;
    private int total = 0;

    public List<AbstractAggregationBuilder> getAggregationBuilders() {
        return aggregationBuilders;
    }

    public void setAggregationBuilders(List<AbstractAggregationBuilder> aggregationBuilders) {
        this.aggregationBuilders = aggregationBuilders;
    }

    public List<SortBuilder> getOrders() {
        return orders;
    }

    public void setOrders(List<SortBuilder> orders) {
        this.orders = orders;
    }

    public int getTotal() {
        return total;
    }

    public SearchCondition setTotal(int total) {
        this.total = total;
        return this;
    }

    // Sort by distance from (lat, lon) using the given distance calculation
    public SearchCondition orderBy(String field, double lat, double lon, SortOrder order, GeoDistance geoDistance) {
        if (!StringUtils.isEmpty(field)) {
            orders.add(new GeoDistanceSortBuilder(field).order(order).point(lat, lon).geoDistance(geoDistance));
        }
        return this;
    }

    // Sort by distance from (lat, lon), nearest first
    public SearchCondition orderBy(String field, double lat, double lon) {
        return orderBy(field, lat, lon, SortOrder.ASC, GeoDistance.DEFAULT);
    }

    // Sort by a plain field
    public SearchCondition orderBy(String field, SortOrder order) {
        if (!StringUtils.isEmpty(field)) {
            orders.add(new FieldSortBuilder(field).order(order));
        }
        return this;
    }

    public SearchCondition orderBy(String field) {
        return orderBy(field, SortOrder.ASC);
    }

    // Falls back to match_all when no query has been set
    public QueryBuilder getQueryBuilder() {
        if (queryBuilder == null) {
            return QueryBuilders.matchAllQuery();
        }
        return queryBuilder;
    }

    public SearchCondition setQueryBuilder(QueryBuilder queryBuilder) {
        this.queryBuilder = queryBuilder;
        return this;
    }

    public QueryBuilder getFilterBuilder() {
        return filterBuilder;
    }

    public SearchCondition setFilterBuilder(QueryBuilder filterBuilder) {
        this.filterBuilder = filterBuilder;
        return this;
    }

    // Adds a geo-distance aggregation with one bucket per (from, to) range, in meters
    public SearchCondition setAggregation(String field, double lat, double lon, Pair<Double, Double>... rangePoints) {
        if (!StringUtils.isEmpty(field)) {
            GeoDistanceBuilder geoDistanceBuilder = new GeoDistanceBuilder(field).point(new GeoPoint(lat, lon)).unit(DistanceUnit.METERS);
            for (Pair<Double, Double> rangePoint : rangePoints) {
                geoDistanceBuilder.addRange(rangePoint.getFirst(), rangePoint.getSecond());
            }
            aggregationBuilders.add(geoDistanceBuilder);
        }
        return this;
    }

    public SearchType getSearchType() {
        return searchType;
    }

    public SearchCondition setSearchType(SearchType searchType) {
        this.searchType = searchType;
        return this;
    }

    public int getLimit() {
        return limit;
    }

    public SearchCondition setLimit(int limit) {
        this.limit = limit;
        return this;
    }

    public int getOffset() {
        return offset;
    }

    public SearchCondition setOffset(int offset) {
        this.offset = offset;
        return this;
    }
}

Usage:

SearchCondition searchCondition = new SearchCondition();
// Example 1: filter on an analyzed address match and valid = 1
searchCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());
// Example 2: geohash filter, sorted by arc distance and then by id.
// Note: calling setFilterBuilder again replaces the filter set in example 1.
searchCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .geoHash("location", lat, lon, 5, OperationType.MUST).builder())
        .orderBy("location", lat, lon, SortOrder.ASC, GeoDistance.ARC)
        .orderBy("id", SortOrder.ASC)
        .setOffset(0)
        .setLimit(100);
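The `GeoDistance.ARC` option used in the sort above selects a great-circle distance. As an illustration of what such a calculation involves, here is a minimal sketch of the haversine formula. The class name and the mean earth radius of 6,371,000 m are assumptions, and this is not claimed to be the exact formula Elasticsearch uses internally.

```java
final class HaversineSketch {
    private static final double EARTH_RADIUS_M = 6_371_000.0; // assumed mean earth radius

    /** Great-circle distance in meters between two points given in degrees. */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding slightly above the asin domain.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator
        System.out.println(distanceMeters(0, 0, 0, 1));
    }
}
```

Each call involves several trigonometric functions, which is why it pays to shrink the candidate set with a geohash filter before sorting or filtering by exact distance.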