Geographic Data Queries in ElasticSearch (Part 2)
Source: Internet | Editor: 程序博客网 | Time: 2024/04/28 00:12
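In ElasticSearch, geographic locations are supported through the geo_point data type. Location data must carry latitude and longitude; when the coordinates are invalid, ES rejects the new document. This type supports distance calculation, range queries, and similar operations. Under the hood, the index is implemented with Geohash.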
1. Creating the index
Use PUT to create an index named cn_large_cities with a mapping named city:
{
  "mappings": {
    "city": {
      "properties": {
        "city": { "type": "string" },
        "state": { "type": "string" },
        "location": { "type": "geo_point" }
      }
    }
  }
}
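The geo_point type must be declared explicitly; ES cannot infer it from the data. In ES, location data can be written in three forms: a string, an object, or an array, as follows:
# string: "lat,lon"
"location": "40.715,-74.011"
# object
"location": { "lat": 40.715, "lon": -74.011 }
# array: [lon, lat]
"location": [-74.011, 40.715]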
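The three notations are easy to mix up because the array form puts longitude first. As a rough client-side sketch (not part of ES itself), the helper below normalizes each form to a (lat, lon) pair and checks the usual coordinate ranges. The names parse_geo_point and is_valid are hypothetical, and the range check is an assumption about what counts as "valid", not a description of ES internals.

```python
def parse_geo_point(value):
    """Return (lat, lon) as floats from one of the three notations above."""
    if isinstance(value, str):
        lat, lon = value.split(",")          # string form is "lat,lon"
    elif isinstance(value, dict):
        lat, lon = value["lat"], value["lon"]
    elif isinstance(value, (list, tuple)):
        lon, lat = value                     # array form is [lon, lat]
    else:
        raise TypeError("unsupported geo_point notation: %r" % (value,))
    return float(lat), float(lon)


def is_valid(lat, lon):
    """Assumed rule: latitude in [-90, 90], longitude in [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


samples = ["40.715,-74.011", {"lat": 40.715, "lon": -74.011}, [-74.011, 40.715]]
parsed = [parse_geo_point(s) for s in samples]
print(parsed)

# Swapping lat and lon for Beijing yields a latitude above 90, which fails the check.
print(is_valid(39.91667, 116.41667), is_valid(116.41667, 39.91667))
```

All three samples should normalize to the same pair, which makes a swapped array easier to spot before indexing.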
POST the following 5 test documents:
{"city": "Beijing", "state": "BJ", "location": {"lat": "39.91667", "lon": "116.41667"}}
{"city": "Shanghai", "state": "SH", "location": {"lat": "34.50000", "lon": "121.43333"}}
{"city": "Xiamen", "state": "FJ", "location": {"lat": "24.46667", "lon": "118.10000"}}
{"city": "Fuzhou", "state": "FJ", "location": {"lat": "26.08333", "lon": "119.30000"}}
{"city": "Guangzhou", "state": "GD", "location": {"lat": "23.16667", "lon": "113.23333"}}
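Instead of five separate POSTs, the documents could also be loaded in one request through the _bulk API, which expects newline-delimited JSON: an action line followed by the document source. The sketch below only builds that request body; build_bulk_body is a hypothetical helper name, and sending the body to the cluster (for example to http://localhost:9200/_bulk) is left out.

```python
import json

CITIES = [
    {"city": "Beijing", "state": "BJ", "location": {"lat": "39.91667", "lon": "116.41667"}},
    {"city": "Shanghai", "state": "SH", "location": {"lat": "34.50000", "lon": "121.43333"}},
    {"city": "Xiamen", "state": "FJ", "location": {"lat": "24.46667", "lon": "118.10000"}},
    {"city": "Fuzhou", "state": "FJ", "location": {"lat": "26.08333", "lon": "119.30000"}},
    {"city": "Guangzhou", "state": "GD", "location": {"lat": "23.16667", "lon": "113.23333"}},
]


def build_bulk_body(docs, index="cn_large_cities", doc_type="city"):
    """Build an NDJSON body: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(CITIES)
print(body)
```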
List all documents:
curl -X GET "http://localhost:9200/cn_large_cities/city/_search?pretty=true"
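All 5 documents are returned, each with a score of 1.
2. Location filters
ES has 4 location-related filters:
- geo_distance: finds locations within a given distance of a center point
- geo_bounding_box: finds locations inside a rectangular area
- geo_distance_range: finds locations whose distance from a center point lies between min and max
- geo_polygon: finds locations inside a polygon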
geo_distance
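This filter matches locations inside a circle of the given radius around the center point.
[Figure: search area of the geo_distance filter]
Here is an example query:
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "1km",
          "location": { "lat": 40.715, "lon": -73.988 }
        }
      }
    }
  }
}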
The following query finds the cities within 500 km of Xiamen:
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "500km",
          "location": { "lat": 24.46667, "lon": 118.10000 }
        }
      }
    }
  }
}
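To see which cities this filter should keep, the same check can be approximated offline with the haversine great-circle formula. This is only a sketch: the Earth radius of 6371 km and the helper names are assumptions, and ES may use a different distance calculation internally, so values can differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),   # coordinates as given in the test data
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within(center, radius_km, cities):
    """Names of the cities no farther than radius_km from center."""
    return [name for name, (lat, lon) in cities.items()
            if haversine_km(center[0], center[1], lat, lon) <= radius_km]


within_500 = within(CITIES["Xiamen"], 500, CITIES)
print(within_500)
```

With the test data above, only Xiamen itself and Fuzhou fall inside the 500 km circle; Guangzhou lies just outside it.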
geo_distance_range
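The following query finds locations at least 1 km and less than 2 km from the center point:
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance_range": {
          "gte": "1km",
          "lt": "2km",
          "location": { "lat": 40.715, "lon": -73.988 }
        }
      }
    }
  }
}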
geo_bounding_box
{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": { "lat": 40.8, "lon": -74.0 },
            "bottom_right": { "lat": 40.715, "lon": -73.0 }
          }
        }
      }
    }
  }
}
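Conceptually, a bounding-box filter reduces to two range comparisons, one on latitude and one on longitude. The sketch below illustrates that with the corners from the query above; in_bounding_box is a hypothetical helper, and it assumes the box does not cross the 180° meridian.

```python
TOP_LEFT = {"lat": 40.8, "lon": -74.0}
BOTTOM_RIGHT = {"lat": 40.715, "lon": -73.0}


def in_bounding_box(lat, lon, top_left, bottom_right):
    """True if (lat, lon) lies inside the box; assumes no antimeridian crossing."""
    lat_ok = bottom_right["lat"] <= lat <= top_left["lat"]
    lon_ok = top_left["lon"] <= lon <= bottom_right["lon"]
    return lat_ok and lon_ok


inside = in_bounding_box(40.75, -73.5, TOP_LEFT, BOTTOM_RIGHT)
too_far_south = in_bounding_box(40.70, -73.5, TOP_LEFT, BOTTOM_RIGHT)
too_far_west = in_bounding_box(40.75, -74.5, TOP_LEFT, BOTTOM_RIGHT)
print(inside, too_far_south, too_far_west)
```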
3. Sorting by distance
Next, we sort the cities by their distance from Xiamen:
{
  "sort": [
    {
      "_geo_distance": {
        "location": { "lat": 24.46667, "lon": 118.10000 },
        "order": "asc",
        "unit": "km"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": { "match_all": {} }
    }
  }
}
The results come back in the order Xiamen, Fuzhou, Guangzhou, and so on, which matches what we would expect:
{
  "took": 8,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "us_large_cities", "_type": "city", "_id": "AVaiSGXXjL0tfmRppc_p", "_score": null,
        "_source": { "city": "Xiamen", "state": "FJ", "location": { "lat": "24.46667", "lon": "118.10000" } },
        "sort": [ 0 ]
      },
      {
        "_index": "us_large_cities", "_type": "city", "_id": "AVaiSSuNjL0tfmRppc_r", "_score": null,
        "_source": { "city": "Fuzhou", "state": "FJ", "location": { "lat": "26.08333", "lon": "119.30000" } },
        "sort": [ 216.61105485607183 ]
      },
      {
        "_index": "us_large_cities", "_type": "city", "_id": "AVaiSd02jL0tfmRppc_s", "_score": null,
        "_source": { "city": "Guangzhou", "state": "GD", "location": { "lat": "23.16667", "lon": "113.23333" } },
        "sort": [ 515.9964950041397 ]
      },
      {
        "_index": "us_large_cities", "_type": "city", "_id": "AVaiR7_5jL0tfmRppc_o", "_score": null,
        "_source": { "city": "Shanghai", "state": "SH", "location": { "lat": "34.50000", "lon": "121.43333" } },
        "sort": [ 1161.512141925948 ]
      },
      {
        "_index": "us_large_cities", "_type": "city", "_id": "AVaiRwLUjL0tfmRppc_n", "_score": null,
        "_source": { "city": "Beijing", "state": "BJ", "location": { "lat": "39.91667", "lon": "116.41667" } },
        "sort": [ 1725.4543712286697 ]
      }
    ]
  }
}
The sort field in each hit is the distance in kilometers. Adding from and size limits the response to the single nearest city:
{
  "from": 0,
  "size": 1,
  "sort": [
    {
      "_geo_distance": {
        "location": { "lat": 24.46667, "lon": 118.10000 },
        "order": "asc",
        "unit": "km"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": { "match_all": {} }
    }
  }
}
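The combination of a distance sort with from and size can be mimicked offline by sorting the test cities on a great-circle distance and slicing the result. This is an illustrative sketch only: the haversine formula and the 6371 km radius are assumptions, not necessarily what ES computes, and nearest_cities is a hypothetical helper.

```python
import math

CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),   # coordinates as given in the test data
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in kilometres between two (lat, lon) tuples."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dlam = math.radians(q[1] - p[1])
    a = math.sin((phi2 - phi1) / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_cities(origin, cities, offset=0, size=None):
    """Sort city names by ascending distance, then apply from/size-style paging."""
    ordered = sorted(cities, key=lambda name: haversine_km(origin, cities[name]))
    end = None if size is None else offset + size
    return ordered[offset:end]


ranking = nearest_cities(CITIES["Xiamen"], CITIES)
closest = nearest_cities(CITIES["Xiamen"], CITIES, offset=0, size=1)
print(ranking)
print(closest)
```

The ordering should agree with the ES response above, since the five distances are far apart from one another.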
4. Geo aggregations
ES provides 3 geo aggregations:
- geo_distance: groups documents by their distance from a given center point
- geohash_grid: groups documents by Geohash cell
- geo_bounds: computes the bounding box that encloses the matching locations
4.1 geo_distance aggregation
The following query aggregates by distance from Xiamen and returns two buckets, 0-500 km and 500-8000 km:
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": { "lat": 24.46667, "lon": 118.10000 }
        }
      }
    }
  },
  "aggs": {
    "per_ring": {
      "geo_distance": {
        "field": "location",
        "unit": "km",
        "origin": { "lat": 24.46667, "lon": 118.10000 },
        "ranges": [
          { "from": 0, "to": 500 },
          { "from": 500, "to": 8000 }
        ]
      }
    }
  }
}
The aggregation result is:
"aggregations": {
  "per_ring": {
    "buckets": [
      {
        "key": "*-500.0",
        "from": 0, "from_as_string": "0.0",
        "to": 500, "to_as_string": "500.0",
        "doc_count": 2
      },
      {
        "key": "500.0-8000.0",
        "from": 500, "from_as_string": "500.0",
        "to": 8000, "to_as_string": "8000.0",
        "doc_count": 3
      }
    ]
  }
}
As the result shows, 2 cities lie within 0-500 km of Xiamen and 3 lie within 500-8000 km.
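These counts can be cross-checked against the distances returned by the sort query in section 3. The sketch below buckets those five sort values into the same rings, treating from as inclusive and to as exclusive, which is how range-style aggregations in ES behave; ring_counts is a hypothetical helper name.

```python
# Distances in km, copied from the "sort" values in the section 3 response.
DISTANCES_KM = {
    "Xiamen": 0.0,
    "Fuzhou": 216.61105485607183,
    "Guangzhou": 515.9964950041397,
    "Shanghai": 1161.512141925948,
    "Beijing": 1725.4543712286697,
}
RANGES = [(0, 500), (500, 8000)]


def ring_counts(distances, ranges):
    """Count how many distances fall into each [from, to) ring."""
    values = list(distances)
    return [sum(1 for d in values if lo <= d < hi) for lo, hi in ranges]


counts = ring_counts(DISTANCES_KM.values(), RANGES)
print(counts)
```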
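4.2 geohash_grid aggregation
This aggregation groups geo_point values by the geohash cell they fall into. The cell size is controlled by the precision property: the precision is the length of the geohash string, that is, the number of times the cell has been subdivided, so a higher value means smaller cells.
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": { "lat": 24.46667, "lon": 118.10000 }
        }
      }
    }
  },
  "aggs": {
    "grid_agg": {
      "geohash_grid": {
        "field": "location",
        "precision": 2
      }
    }
  }
}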
The aggregation result is:
"aggregations": {
  "grid_agg": {
    "buckets": [
      { "key": "ws", "doc_count": 3 },
      { "key": "wx", "doc_count": 1 },
      { "key": "ww", "doc_count": 1 }
    ]
  }
}
As the result shows, 3 cities fall into the cell with geohash ws. Raising the precision to 5 gives the following result:
"aggregations": {
  "grid_agg": {
    "buckets": [
      { "key": "wx4g1", "doc_count": 1 },
      { "key": "wwnk7", "doc_count": 1 },
      { "key": "wssu6", "doc_count": 1 },
      { "key": "ws7gp", "doc_count": 1 },
      { "key": "ws0eb", "doc_count": 1 }
    ]
  }
}
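To make the role of precision concrete, here is a minimal sketch of the standard geohash encoding: longitude and latitude intervals are halved alternately, each halving contributes one bit, and every 5 bits become one base-32 character. The function name geohash is a hypothetical helper written for illustration, not ES's implementation; at precision 2 it should reproduce the ws, wx, and ww cells shown above.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),   # coordinates as given in the test data
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def geohash(lat, lon, precision):
    """Encode (lat, lon) as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars, bits, value = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = value * 2 + 1
            rng[0] = mid      # keep the upper half
        else:
            value = value * 2
            rng[1] = mid      # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:         # 5 bits make one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


cells = {name: geohash(lat, lon, 2) for name, (lat, lon) in CITIES.items()}
buckets = Counter(cells.values())
print(cells)
print(buckets)
```

Each extra character subdivides a cell into 32 smaller ones, which is why the three cities sharing ws at precision 2 end up in separate cells at precision 5.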
4.3 geo_bounds aggregation
This aggregation computes the smallest area that covers every geo_point in the query results, returned as the smallest enclosing rectangle:
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": { "lat": 24.46667, "lon": 118.10000 }
        }
      }
    }
  },
  "aggs": {
    "map-zoom": {
      "geo_bounds": {
        "field": "location"
      }
    }
  }
}
The result is:
"aggregations": {
  "map-zoom": {
    "bounds": {
      "top_left": { "lat": 39.91666993126273, "lon": 113.2333298586309 },
      "bottom_right": { "lat": 23.16666992381215, "lon": 121.43332997336984 }
    }
  }
}
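In other words, the rectangle defined by these two points contains every city that lies within 10000 km of Xiamen. If we reduce the distance to 500 km, the rectangle covering the matching cities becomes:
"aggregations": {
  "map-zoom": {
    "bounds": {
      "top_left": { "lat": 26.083329990506172, "lon": 118.0999999679625 },
      "bottom_right": { "lat": 24.46666999720037, "lon": 119.29999999701977 }
    }
  }
}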
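The bounding rectangle is simply the extreme latitudes and longitudes of the matching points. The sketch below recomputes both rectangles from the test data; geo_bounds here is a hypothetical local helper, and it assumes the points do not straddle the 180° meridian. The small differences in the trailing digits of the ES output presumably come from how ES stores coordinates internally.

```python
ALL_CITIES = [
    (39.91667, 116.41667),   # Beijing
    (34.50000, 121.43333),   # Shanghai, as given in the test data
    (24.46667, 118.10000),   # Xiamen
    (26.08333, 119.30000),   # Fuzhou
    (23.16667, 113.23333),   # Guangzhou
]
WITHIN_500_KM = [(24.46667, 118.10000), (26.08333, 119.30000)]  # Xiamen, Fuzhou


def geo_bounds(points):
    """Smallest lat/lon rectangle enclosing all (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


bounds_all = geo_bounds(ALL_CITIES)
bounds_500 = geo_bounds(WITHIN_500_KM)
print(bounds_all)
print(bounds_500)
```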
5. References
An illustrated guide to how MongoDB's geospatial index works: http://blog.nosqlfan.com/html/1811.html
The geo_point data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html