Elasticsearch Geo-Location Aggregation
Source: Internet  Editor: 程序博客网  Time: 2024/05/19 22:57
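It has been a while since I last wrote anything. I have just come out of a busy stretch, so I am taking the time to summarize what I worked on earlier, mainly for my own review. If it also helps others, so much the better. First, a few things I have learned after using ES for a while.

1. When I started the project, I wanted to put every type it involves under a single index. As the project progressed, I found that the same field can have different data types in different types: string in one, int in another. Take the status field, which I use a lot (all the tables are synced from the original MySQL database). Across more than 30 tables, it is a string in 2 or 3 of them and an int in all the others. As a result, status had to be given an alias in those tables, and every time their data was used the alias had to be renamed back to status. That added a lot of unnecessary trouble.

2. Putting all types under one index also caused some unexpected problems, for example with analyzers. I needed to aggregate by name, but I did not know that in advance and had configured the name field with the ik analyzer. As you know, aggregating on an analyzed field does not give the result you want. And once a field's mapping is set in ES it cannot be changed, so the only option is to delete the index and rebuild it.

3. Deleting the index itself. With all the data in one index, a single wrongly defined field means backing everything up, deleting the index and recreating it. That cost is too high.

Now for the main topic. Query documents of type stang_cbid whose title contains "测试" and whose publish date is on or after 2017-05-01, aggregate them by pubtime, and sort by pubtime in descending order.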
{
  "query": {
    "bool": {
      "filter": [
        { "match_phrase": { "title": "测试" } },
        { "range": { "pubtime": { "gte": "2017-05-01" } } }
      ]
    }
  },
  "sort": [
    { "pubtime": { "order": "desc" } }
  ],
  "aggs": {
    "group": {
      "terms": { "field": "pubtime" }
    }
  }
}
Partial document update (replace index, type and _id with your own values):
POST /index/type/_id/_update
{
  "doc": {
    "field1": "value",
    "field2": "value"
  }
}
Filter with bool conditions and sort by distance from a coordinate point:
{
  "query": {
    "bool": {
      "must": [
        { "match_phrase": { "projectname": "隧道" } },
        { "match_phrase": { "area_id": "26" } }
      ]
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": "31.88,106.25",
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
Grouping query (note: the field used for grouping must have its index property set to "not_analyzed", so that it is not analyzed):
{
  "query": {
    "bool": {
      "must": [
        { "term": { "type": { "value": 2 } } },
        { "term": { "area_id": { "value": 26 } } }
      ]
    }
  },
  "aggs": {
    "top_tags": {
      "terms": { "field": "standards", "size": 20 },
      "aggs": {
        "too": {
          "top_hits": { "_source": { }, "size": 1 }
        }
      }
    }
  }
}
To do calculations on geographic locations, we first need to set the field type to geo_point, as follows:
"properties": {
  "location": {
    "type": "geo_point"
  }
}
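My ES version is 2.4.1. I could not find geo_point in the list of field types, but after setting it the field does end up with the correct type.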
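The value can be stored in three forms:
JSON object: "location": {"lat": 30.23422, "lon": 107.23151}
Array: [104.21543, 30.123]
String: "30.34,121.343654"
The object and string forms put latitude first and longitude second. The array form is the exception: it follows GeoJSON order, [lon, lat].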
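The ordering is easy to get wrong, so here is a minimal Java sketch that renders one coordinate in each of the three forms. The class and method names (GeoPointFormats, asObject, asArray, asString) are made up for illustration and are not part of any Elasticsearch API. The array form is written longitude-first, which is what Elasticsearch expects for geo_point arrays.

```java
class GeoPointFormats {
    // Object form: {"lat": ..., "lon": ...}
    static String asObject(double lat, double lon) {
        return "{\"lat\":" + lat + ",\"lon\":" + lon + "}";
    }

    // Array form: GeoJSON order, longitude first: [lon, lat]
    static String asArray(double lat, double lon) {
        return "[" + lon + "," + lat + "]";
    }

    // String form: "lat,lon"
    static String asString(double lat, double lon) {
        return "\"" + lat + "," + lon + "\"";
    }

    public static void main(String[] args) {
        double lat = 30.123;
        double lon = 104.21543;
        System.out.println(asObject(lat, lon)); // {"lat":30.123,"lon":104.21543}
        System.out.println(asArray(lat, lon));  // [104.21543,30.123]
        System.out.println(asString(lat, lon)); // "30.123,104.21543"
    }
}
```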
Below is the Java code.
1. Given a coordinate range, query all documents that fall inside it, and specify which fields to return.
public Map map_tunnel(Map map) {
    ElasticsearchUtil eu = new ElasticsearchUtil("123");
    String indexname = "test";
    int from = 0;
    List list = new ArrayList<>();
    // "name" selects which type to search; it stays null when the caller omits it
    String name = "";
    try {
        name = map.get("name").toString();
    } catch (Exception e) {
        name = null;
    }
    String[] fileds = new String[]{};
    SearchResponse searchResponse = null;
    // Box defined by the top-left (lat1, lon1) and bottom-right (lat2, lon2) corners
    QueryBuilder qb = QueryBuilders.geoBoundingBoxQuery("location")
            .topLeft(Double.parseDouble(map.get("lat1").toString()), Double.parseDouble(map.get("lon1").toString()))
            .bottomRight(Double.parseDouble(map.get("lat2").toString()), Double.parseDouble(map.get("lon2").toString()))
            .ignoreMalformed(true); // ignoreMalformed: skip malformed data
    if (null != name) {
        switch (name) {
            case "plan":
                name = "stang_plan_project";
                fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
                break;
            case "work":
                name = "stang_work_project";
                fileds = new String[]{"id", "latitude", "longitude", "projectaddress", "projectname"};
                break;
            case "tunnel":
                name = "stang_tunnel";
                fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type"};
                break;
            default:
                break;
        }
        if (name.contains("tunnel")) {
            // Tunnels: exclude type 2 and keep only status 3
            BoolQueryBuilder filterqb = QueryBuilders.boolQuery();
            filterqb.mustNot(QueryBuilders.matchQuery("type", 2));
            filterqb.filter(QueryBuilders.matchQuery("status", 3));
            QueryBuilder qbss = QueryBuilders.queryFilter(filterqb);
            searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
        } else {
            searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, fileds, from, maxSize);
        }
    } else {
        // No name given: default to tunnels, excluding type 2
        name = "stang_tunnel";
        fileds = new String[]{"id", "latitude", "longitude", "address", "name", "section", "status", "type", "forid"};
        QueryBuilder qbss = QueryBuilders.boolQuery().mustNot(QueryBuilders.matchQuery("type", 2));
        searchResponse = eu.searchCompanySetSourceFiled(indexname, name, qb, qbss, fileds, from, maxSize);
    }
    try {
        SearchHits hits = searchResponse.getHits();
        for (SearchHit hit : hits) {
            list.add(hit.getSource());
        }
        Map tmap = new HashMap<>();
        tmap.put("ext", list);
        tmap.put("state", true);
        tmap.put("message", "操作成功");
        return tmap;
    } catch (Exception e) {
        return OutData.softwareFormart();
    } finally {
        eu.close();
    }
}
This is the searchCompanySetSourceFiled method. (There is not much to say about it.)
public SearchResponse searchCompanySetSourceFiled(String indexname, String type, QueryBuilder queryBuilder, String[] fileds, int from, int pageSize) {
    SearchResponse searchResponse = null;
    try {
        searchResponse = client.prepareSearch(indexname)
                .setTypes(new String[]{type})
                .setQuery(queryBuilder)
                .setFetchSource(fileds, null)
                .setFrom(from)
                .setSize(pageSize)
                .execute()
                .actionGet();
        return searchResponse;
    } catch (Exception e) {
        return searchResponse;
    }
}
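A brief explanation of the parameters: lat1, lon1 are the coordinates of the top-left corner, and lat2, lon2 are those of the bottom-right corner.
geoBoundingBoxQuery: builds a box from the two corner points you pass in. Every document whose coordinate falls inside this box (rectangle) is returned.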
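To make the box idea concrete, here is a small standalone Java sketch of the containment check. It is only an illustration under simplifying assumptions: the BoundingBox class is hypothetical, it uses plain comparisons on latitude and longitude, and it does not handle boxes that cross the 180° meridian. It is not how Elasticsearch implements the query internally.

```java
class BoundingBox {
    private final double topLat;
    private final double leftLon;
    private final double bottomLat;
    private final double rightLon;

    // Corners are given the same way as in the query: top-left, then bottom-right
    BoundingBox(double topLat, double leftLon, double bottomLat, double rightLon) {
        this.topLat = topLat;
        this.leftLon = leftLon;
        this.bottomLat = bottomLat;
        this.rightLon = rightLon;
    }

    // A point is inside when its latitude lies between bottom and top
    // and its longitude lies between left and right (edges included)
    boolean contains(double lat, double lon) {
        return lat <= topLat && lat >= bottomLat
                && lon >= leftLon && lon <= rightLon;
    }

    public static void main(String[] args) {
        BoundingBox box = new BoundingBox(32.0, 106.0, 31.0, 107.0);
        System.out.println(box.contains(31.88, 106.25)); // true
        System.out.println(box.contains(33.0, 106.25));  // false
    }
}
```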
2. Compute the distance between two coordinate points (and sort the documents by that distance).
public Map datalist(Map<String, Object> map) {
    String lat = map.get("lat").toString();
    String lon = map.get("lon").toString();
    map.remove("lat");
    map.remove("lon");
    int from = 0;
    try {
        from = Integer.parseInt(map.get("from").toString());
    } catch (Exception e) {
        from = 0;
    }
    ElasticsearchUtil eu = new ElasticsearchUtil("123");
    BoolQueryBuilder bqb = QueryBuilders.boolQuery();
    try {
        if (!map.isEmpty()) {
            for (Entry<String, Object> vo : map.entrySet()) {
                switch (vo.getKey()) {
                    case "type":
                        QueryBuilder term = QueryBuilders.matchQuery("type", vo.getValue());
                        bqb.must(term);
                        break;
                    case "status":
                        QueryBuilder term1 = QueryBuilders.matchQuery("status", vo.getValue());
                        bqb.must(term1);
                        break;
                    case "roadnetwork":
                        QueryBuilder term2 = QueryBuilders.matchQuery("roadnetwork", vo.getValue());
                        bqb.must(term2);
                        break;
                    case "name":
                        QueryBuilder term3 = QueryBuilders.matchPhraseQuery("name", vo.getValue());
                        bqb.must(term3);
                        break;
                    case "area_id":
                        QueryBuilder term4 = QueryBuilders.matchQuery("area_id", vo.getValue());
                        bqb.must(term4);
                        break;
                    default:
                        break;
                }
            }
        } else {
            QueryBuilder term = QueryBuilders.matchAllQuery();
            bqb.should(term);
        }
        SortBuilder sb = SortBuilders.geoDistanceSort("location")
                .point(Double.parseDouble(lat), Double.parseDouble(lon))
                .ignoreMalformed(true)
                .unit(DistanceUnit.KILOMETERS)
                .order(SortOrder.ASC);
        String indexname = "test";
        String[] fileds = {"address", "area_id", "city_id", "id", "latitude", "longitude", "name", "pic_url", "roadnetwork", "section", "status", "type", "length"};
        SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, "stang_tunnel", bqb, fileds, sb, from, pageSize);
        SearchHits hits = searchResponse.getHits();
        List lists = new ArrayList<>();
        for (SearchHit hit : hits) {
            map = hit.getSource();
            // The first sort value is the computed distance in the requested unit
            map.put("distance", OutData.formartDouble((double) hit.getSortValues()[0]));
            lists.add(map);
        }
        map = new HashMap<>();
        map.put("count", hits.getTotalHits());
        lists.add(map);
        return OutData.software_Formart(lists, pageSize);
    } catch (Exception e) {
        return OutData.softwareFormart();
    } finally {
        eu.close();
    }
}
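location: the name of the geo_point field defined in the type's mapping.
SortBuilders.geoDistanceSort(): builds a sort on the distance between each document's location and the given point. The computed distance comes back as the hit's sort value.
unit(DistanceUnit.KILOMETERS): sets the unit in which the distance is expressed.
.order(SortOrder.ASC): the sort order (ascending by distance, so the nearest documents come first).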
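As a rough picture of what a distance sort does, the sketch below computes great-circle distances with the haversine formula and sorts points by distance from an origin. This is a stand-in for illustration only: the GeoDistance class is hypothetical, the Earth radius is an approximate mean value, and Elasticsearch may use a different distance calculation, so its numbers can differ slightly.

```java
import java.util.Arrays;
import java.util.Comparator;

class GeoDistance {
    // Approximate mean Earth radius in kilometres (an assumption of this sketch)
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two (lat, lon) points, in kilometres
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Sorts points (each a {lat, lon} pair) in place, nearest to the origin first
    static void sortByDistance(double originLat, double originLon, double[][] points) {
        Arrays.sort(points, Comparator.comparingDouble(
                (double[] p) -> haversineKm(originLat, originLon, p[0], p[1])));
    }

    public static void main(String[] args) {
        double[][] points = {{40.0, 116.0}, {31.88, 106.25}, {31.0, 107.0}};
        sortByDistance(31.88, 106.25, points);
        for (double[] p : points) {
            System.out.println(p[0] + "," + p[1] + " -> "
                    + haversineKm(31.88, 106.25, p[0], p[1]) + " km");
        }
    }
}
```

The ascending order here corresponds to SortOrder.ASC in the code above: the point closest to the origin ends up first.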
3. Query by conditions, compute the coordinate distance, then aggregate. The effect is similar to GROUP BY in SQL.
public Map datagroup(Map<String, Object> map) {
    ElasticsearchUtil eu = new ElasticsearchUtil("123");
    int from = 0;
    String type = "stang_tunnel";
    try {
        from = (Integer.parseInt(map.get("page").toString()) - 1) * pageSize;
    } catch (Exception e) {
        // no valid page number: keep from = 0
    }
    map.remove("page");
    String lat = map.get("lat").toString();
    String lon = map.get("lon").toString();
    map.remove("lat");
    map.remove("lon");
    String indexname = "test";
    BoolQueryBuilder bqb = QueryBuilders.boolQuery();
    try {
        if (!map.isEmpty()) {
            for (Entry<String, Object> vo : map.entrySet()) {
                switch (vo.getKey()) {
                    case "type":
                        QueryBuilder term1 = QueryBuilders.matchPhraseQuery("type", vo.getValue());
                        bqb.must(term1);
                        break;
                    case "area_id":
                        QueryBuilder term2 = QueryBuilders.matchPhraseQuery(vo.getKey(), vo.getValue());
                        bqb.must(term2);
                        break;
                    default:
                        break;
                }
            }
        } else {
            QueryBuilder term = QueryBuilders.matchAllQuery();
            bqb.must(term);
        }
        GeoPoint gp = GeoPoint.parseFromLatLon(lat + "," + lon);
        SortBuilder sbd = SortBuilders.geoDistanceSort("location")
                .points(gp)
                .unit(DistanceUnit.KILOMETERS)
                .order(SortOrder.ASC);
        String[] fields = {"id", "area_id", "city_id", "name", "roadnetwork", "status"};
        // Core part: terms aggregation on name, with a top_hits sub-aggregation returning one document per bucket
        AbstractAggregationBuilder aab = AggregationBuilders.terms("group").field("name").size(100)
                .subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1));
        SearchResponse searchResponse = eu.geoDistanceSortSearchAndSetSourceFileds(indexname, type, bqb, fields, sbd, aab, from, pageSize);
        SearchHits hits = searchResponse.getHits();
        List lists = new ArrayList<>();
        // Core part: read the buckets of the outer aggregation
        Terms terms = searchResponse.getAggregations().get("group");
        List<Terms.Bucket> buckets = terms.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            TopHits topHits = bucket.getAggregations().get("too"); // hits of the sub-aggregation
            for (SearchHit hit : topHits.getHits()) {
                map = hit.getSource();
                map.put("area", OutData.transArea(map.get("area_id").toString()));
                map.put("city", eu.outCity("jjt", "stang_area", Integer.parseInt(map.get("city_id").toString())));
                // Added inside the loop so that only real hits are collected (one per bucket, since size is 1)
                lists.add(map);
            }
        }
        map = new HashMap<>();
        map.put("ext", lists);
        map.put("state", true);
        map.put("message", "操作成功");
    } catch (Exception e) {
        System.out.println(e.getMessage());
        map = OutData.softwareFormart();
    } finally {
        eu.close();
    }
    return map;
}
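AggregationBuilders.terms("group").field("name").size(100): the outer aggregation.
.subAggregation(AggregationBuilders.topHits("too").setFetchSource(true).setFetchSource(fields, null).setSize(1)): the fields returned by this inner aggregation are the ones we actually want. The size is 1 because each category only needs to return one document; we are only listing the categories.
setFetchSource(): sets the fields returned by the inner aggregation.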
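For readers who think in SQL terms, the following plain-Java sketch mimics the shape of the result: group documents by one field and keep a single representative per group. It is only an analogy under stated assumptions. The GroupTopHit class and firstPerGroup method are hypothetical, and the sketch keeps the first document seen in input order, whereas Elasticsearch picks each bucket's hit according to the top_hits sort and orders buckets by document count by default.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupTopHit {
    // Groups docs by the given field and keeps the first document of each group
    static Map<String, Map<String, Object>> firstPerGroup(List<Map<String, Object>> docs, String field) {
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            String key = String.valueOf(doc.get(field));
            result.putIfAbsent(key, doc); // later documents with the same key are ignored
        }
        return result;
    }

    // Small helper to build a sample document
    static Map<String, Object> doc(int id, String name) {
        Map<String, Object> d = new HashMap<>();
        d.put("id", id);
        d.put("name", name);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc(1, "A"));
        docs.add(doc(2, "B"));
        docs.add(doc(3, "A"));
        Map<String, Map<String, Object>> groups = firstPerGroup(docs, "name");
        System.out.println(groups.keySet());          // [A, B]
        System.out.println(groups.get("A").get("id")); // 1
    }
}
```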
That is about all. Comments and corrections are welcome.