Querying Geographic Data with Elasticsearch (Part 2)

Source: Internet  Editor: 程序博客网  Time: 2024/04/28 00:12

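In Elasticsearch (ES), geographic locations are supported through the geo_point data type. A location must be supplied as a latitude/longitude pair, and ES rejects a new document whose latitude or longitude is invalid. Fields of this type support distance calculations, range queries and similar operations. Under the hood, the index is implemented with Geohash.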
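
The validity rule above can be illustrated with a small client-side check. This is only a sketch of the range test (latitude within [-90, 90], longitude within [-180, 180]); the function name `is_valid_geo_point` is hypothetical and is not part of any Elasticsearch API.

```python
def is_valid_geo_point(lat, lon):
    """Return True when lat/lon parse as numbers inside the valid ranges."""
    try:
        lat, lon = float(lat), float(lon)
    except (TypeError, ValueError):
        # Not numeric at all, e.g. "abc" or None
        return False
    # NaN fails both comparisons, so it is rejected as well
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


print(is_valid_geo_point("24.46667", "118.10000"))  # numeric strings are accepted
print(is_valid_geo_point(124.46667, 118.1))         # latitude out of range
```

Running such a check before indexing makes it easier to see which document would be refused, instead of finding out from the server response.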

1. Creating the index

Use PUT to create an index named cn_large_cities with a mapping type named city:

{
  "mappings": {
    "city": {
      "properties": {
        "city": {"type": "string"},
        "state": {"type": "string"},
        "location": {"type": "geo_point"}
      }
    }
  }
}

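The geo_point type must be declared explicitly; ES cannot infer it from the data. In ES, a location can be written in three forms, as a string, an object, or an array: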

# string: "lat,lon"
"location": "40.715,-74.011"

# object
"location": {
  "lat": 40.715,
  "lon": -74.011
}

# array: [lon, lat]
"location": [-74.011, 40.715]
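
Because the array form reverses the order of latitude and longitude, it is easy to mix the three forms up. The following Python sketch normalizes each of them to a `(lat, lon)` tuple. The helper `parse_geo_point` is a hypothetical name used for illustration, and it only handles the three forms shown above.

```python
def parse_geo_point(value):
    """Normalize a geo_point value to a (lat, lon) tuple of floats."""
    if isinstance(value, str):
        # String form: "lat,lon"
        lat, lon = value.split(",")
    elif isinstance(value, dict):
        # Object form: {"lat": ..., "lon": ...}
        lat, lon = value["lat"], value["lon"]
    elif isinstance(value, (list, tuple)):
        # Array form: [lon, lat] (note the reversed order)
        lon, lat = value
    else:
        raise TypeError("unsupported geo_point representation")
    return float(lat), float(lon)


print(parse_geo_point("40.715,-74.011"))
print(parse_geo_point({"lat": 40.715, "lon": -74.011}))
print(parse_geo_point([-74.011, 40.715]))
```

All three calls describe the same point, so they produce the same tuple.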

POST the following 5 test documents:

{"city": "Beijing", "state": "BJ", "location": {"lat": "39.91667", "lon": "116.41667"}}
{"city": "Shanghai", "state": "SH", "location": {"lat": "34.50000", "lon": "121.43333"}}
{"city": "Xiamen", "state": "FJ", "location": {"lat": "24.46667", "lon": "118.10000"}}
{"city": "Fuzhou", "state": "FJ", "location": {"lat": "26.08333", "lon": "119.30000"}}
{"city": "Guangzhou", "state": "GD", "location": {"lat": "23.16667", "lon": "113.23333"}}
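
Instead of posting the documents one by one, they could also be loaded in a single request. The sketch below only builds a newline-delimited request body in the shape expected by the `_bulk` endpoint and does not send anything. It assumes an ES version that still uses mapping types, as the rest of this article does; `build_bulk_body` is a hypothetical helper name.

```python
import json

CITIES = [
    {"city": "Beijing", "state": "BJ", "location": {"lat": "39.91667", "lon": "116.41667"}},
    {"city": "Shanghai", "state": "SH", "location": {"lat": "34.50000", "lon": "121.43333"}},
    {"city": "Xiamen", "state": "FJ", "location": {"lat": "24.46667", "lon": "118.10000"}},
    {"city": "Fuzhou", "state": "FJ", "location": {"lat": "26.08333", "lon": "119.30000"}},
    {"city": "Guangzhou", "state": "GD", "location": {"lat": "23.16667", "lon": "113.23333"}},
]


def build_bulk_body(docs, index="cn_large_cities", doc_type="city"):
    """Build an NDJSON body: one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The body has to end with a newline
    return "\n".join(lines) + "\n"


body = build_bulk_body(CITIES)
print(body)
```

The resulting text would be sent as the body of a POST request to `http://localhost:9200/_bulk`.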

List all documents:

curl -X GET "http://localhost:9200/cn_large_cities/city/_search?pretty=true"

All 5 documents are returned, each with a score of 1.

2. Geo filters

ES has 4 location-related filters for filtering on geographic data:

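  • geo_distance: finds locations within a given distance of a center point
  • geo_bounding_box: finds locations inside a rectangular area
  • geo_distance_range: finds locations whose distance from a center point lies between min and max
  • geo_polygon: finds locations inside a polygon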

geo_distance

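This filter matches locations that lie within a given distance of a center point, that is, inside a circle around that point.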

Here is an example query:

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "1km",
          "location": {
            "lat": 40.715,
            "lon": -73.988
          }
        }
      }
    }
  }
}

The following query finds the cities within 500 km of Xiamen:

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "500km",
          "location": {
            "lat": 24.46667,
            "lon": 118.10000
          }
        }
      }
    }
  }
}
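
To see which of the test cities such a filter should keep, the distances can be estimated by hand. The Python sketch below uses the haversine formula with a mean Earth radius of 6371 km. That is an assumption made for illustration; ES's own distance calculation may differ slightly, so treat the numbers as approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (lat, lon) of the five test documents
CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}

center = CITIES["Xiamen"]
within_500 = sorted(
    name for name, (lat, lon) in CITIES.items()
    if haversine_km(center[0], center[1], lat, lon) <= 500
)
print(within_500)
```

With the test data, only Xiamen itself and Fuzhou fall inside the 500 km circle; Guangzhou lies just outside it.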

geo_distance_range

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance_range": {
          "gte": "1km",
          "lt": "2km",
          "location": {
            "lat": 40.715,
            "lon": -73.988
          }
        }
      }
    }
  }
}

geo_bounding_box

{
  "query": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": {
              "lat": 40.8,
              "lon": -74.0
            },
            "bottom_right": {
              "lat": 40.715,
              "lon": -73.0
            }
          }
        }
      }
    }
  }
}
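
The test a bounding box applies is simple enough to sketch in Python. The function `in_bounding_box` is a hypothetical helper, and for brevity it assumes the box does not cross the 180° meridian.

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """Check whether (lat, lon) lies inside the rectangle.

    top_left and bottom_right are (lat, lon) tuples. The box is assumed
    not to cross the antimeridian.
    """
    top, left = top_left
    bottom, right = bottom_right
    return bottom <= lat <= top and left <= lon <= right


# Same corners as in the query above
TOP_LEFT = (40.8, -74.0)
BOTTOM_RIGHT = (40.715, -73.0)

print(in_bounding_box(40.73, -73.9, TOP_LEFT, BOTTOM_RIGHT))     # inside the box
print(in_bounding_box(40.715, -74.011, TOP_LEFT, BOTTOM_RIGHT))  # west of the left edge
```

The top-left corner carries the largest latitude and the smallest longitude, and the bottom-right corner the opposite. Swapping the two corners is a common mistake.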

3. Sorting by distance

Next, we sort the cities by their distance from Xiamen:

{
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 24.46667,
          "lon": 118.10000
        },
        "order": "asc",
        "unit": "km"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  }
}

The result is shown below. The order is Xiamen, Fuzhou, Guangzhou and so on, which matches what we would expect:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "us_large_cities",
        "_type": "city",
        "_id": "AVaiSGXXjL0tfmRppc_p",
        "_score": null,
        "_source": {
          "city": "Xiamen",
          "state": "FJ",
          "location": {
            "lat": "24.46667",
            "lon": "118.10000"
          }
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "us_large_cities",
        "_type": "city",
        "_id": "AVaiSSuNjL0tfmRppc_r",
        "_score": null,
        "_source": {
          "city": "Fuzhou",
          "state": "FJ",
          "location": {
            "lat": "26.08333",
            "lon": "119.30000"
          }
        },
        "sort": [
          216.61105485607183
        ]
      },
      {
        "_index": "us_large_cities",
        "_type": "city",
        "_id": "AVaiSd02jL0tfmRppc_s",
        "_score": null,
        "_source": {
          "city": "Guangzhou",
          "state": "GD",
          "location": {
            "lat": "23.16667",
            "lon": "113.23333"
          }
        },
        "sort": [
          515.9964950041397
        ]
      },
      {
        "_index": "us_large_cities",
        "_type": "city",
        "_id": "AVaiR7_5jL0tfmRppc_o",
        "_score": null,
        "_source": {
          "city": "Shanghai",
          "state": "SH",
          "location": {
            "lat": "34.50000",
            "lon": "121.43333"
          }
        },
        "sort": [
          1161.512141925948
        ]
      },
      {
        "_index": "us_large_cities",
        "_type": "city",
        "_id": "AVaiRwLUjL0tfmRppc_n",
        "_score": null,
        "_source": {
          "city": "Beijing",
          "state": "BJ",
          "location": {
            "lat": "39.91667",
            "lon": "116.41667"
          }
        },
        "sort": [
          1725.4543712286697
        ]
      }
    ]
  }
}

The sort field in each hit is the distance in kilometers. Adding from and size limits the result to the single nearest city:

{
  "from": 0,
  "size": 1,
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 24.46667,
          "lon": 118.10000
        },
        "order": "asc",
        "unit": "km"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  }
}
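
The effect of from and size on a distance-sorted result can be mimicked in a few lines of Python. The sketch below reuses the distances that ES reported in the sort fields of the full response. `page_by_distance` is a hypothetical helper, and the slicing only models how a window is cut out of the ordered hits.

```python
# (city, distance in km) pairs, taken from the "sort" values of the full response
HITS = [
    ("Beijing", 1725.4543712286697),
    ("Shanghai", 1161.512141925948),
    ("Xiamen", 0.0),
    ("Fuzhou", 216.61105485607183),
    ("Guangzhou", 515.9964950041397),
]


def page_by_distance(hits, from_=0, size=10):
    """Order hits by ascending distance, then return the [from_, from_ + size) window."""
    ordered = sorted(hits, key=lambda hit: hit[1])
    return ordered[from_:from_ + size]


print(page_by_distance(HITS, from_=0, size=1))  # nearest city only
print(page_by_distance(HITS, from_=1, size=2))  # the next two
```

With from set to 0 and size set to 1, only Xiamen remains, since it is the center point itself.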

4. Geo aggregations

ES provides 3 kinds of geo aggregation:

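  • geo_distance: groups documents by their distance from a given center point
  • geohash_grid: groups documents by Geohash cell
  • geo_bounds: computes the bounding area that covers the matching points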

4.1 geo_distance aggregation

The following query groups the cities by their distance from Xiamen into two buckets, 0-500 km and 500-8000 km:

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": {
            "lat": 24.46667,
            "lon": 118.10000
          }
        }
      }
    }
  },
  "aggs": {
    "per_ring": {
      "geo_distance": {
        "field": "location",
        "unit": "km",
        "origin": {
          "lat": 24.46667,
          "lon": 118.10000
        },
        "ranges": [
          {"from": 0, "to": 500},
          {"from": 500, "to": 8000}
        ]
      }
    }
  }
}

The aggregation part of the response looks like this:

"aggregations": {
  "per_ring": {
    "buckets": [
      {
        "key": "*-500.0",
        "from": 0,
        "from_as_string": "0.0",
        "to": 500,
        "to_as_string": "500.0",
        "doc_count": 2
      },
      {
        "key": "500.0-8000.0",
        "from": 500,
        "from_as_string": "500.0",
        "to": 8000,
        "to_as_string": "8000.0",
        "doc_count": 3
      }
    ]
  }
}

As we can see, 2 cities lie within 0-500 km of Xiamen and 3 cities lie within 500-8000 km.
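
The bucket counts can be reproduced from the distances that ES reported in the sorted search result. The Python sketch below assumes each range includes its from value and excludes its to value, which is how range-style aggregations are usually described. `bucket_counts` is a hypothetical helper name.

```python
# Distances from Xiamen in km, copied from the "sort" values of the sorted search
DISTANCES_KM = {
    "Xiamen": 0.0,
    "Fuzhou": 216.61105485607183,
    "Guangzhou": 515.9964950041397,
    "Shanghai": 1161.512141925948,
    "Beijing": 1725.4543712286697,
}

RANGES = [(0, 500), (500, 8000)]


def bucket_counts(distances, ranges):
    """Count how many distances fall into each [from, to) range."""
    return {
        (lo, hi): sum(1 for d in distances if lo <= d < hi)
        for lo, hi in ranges
    }


print(bucket_counts(DISTANCES_KM.values(), RANGES))
```

Xiamen and Fuzhou land in the first ring, and Guangzhou, Shanghai and Beijing in the second, which agrees with the doc_count values returned by ES.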

4.2 geohash_grid aggregation

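This aggregation groups geo_point values by the cell that their geohash falls into. The granularity of the cells is controlled by the precision property: the precision is the number of times the grid is subdivided, which is also the length of the geohash key, so a higher precision yields smaller cells.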

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": {
            "lat": 24.46667,
            "lon": 118.10000
          }
        }
      }
    }
  },
  "aggs": {
    "grid_agg": {
      "geohash_grid": {
        "field": "location",
        "precision": 2
      }
    }
  }
}

The aggregation result is:

"aggregations": {
  "grid_agg": {
    "buckets": [
      {
        "key": "ws",
        "doc_count": 3
      },
      {
        "key": "wx",
        "doc_count": 1
      },
      {
        "key": "ww",
        "doc_count": 1
      }
    ]
  }
}

As we can see, 3 cities share the geohash prefix ws. Raising the precision to 5 gives the following result:

"aggregations": {
  "grid_agg": {
    "buckets": [
      {
        "key": "wx4g1",
        "doc_count": 1
      },
      {
        "key": "wwnk7",
        "doc_count": 1
      },
      {
        "key": "wssu6",
        "doc_count": 1
      },
      {
        "key": "ws7gp",
        "doc_count": 1
      },
      {
        "key": "ws0eb",
        "doc_count": 1
      }
    ]
  }
}
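
To make the relation between precision and cell more concrete, here is a sketch of the standard geohash encoding in Python: the longitude and latitude ranges are halved alternately, and every 5 bits become one base-32 character. This is the textbook algorithm, not ES's internal code; `geohash_encode` is a hypothetical name.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# (lat, lon) of the five test documents
CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}

cells = Counter(geohash_encode(lat, lon, 2) for lat, lon in CITIES.values())
print(cells)
```

At precision 2 the five cities fall into the cells ws (3 cities), wx (1) and ww (1), the same grouping the aggregation returned. A longer key is always an extension of the shorter one, which is why raising the precision only splits existing buckets.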

4.3 geo_bounds aggregation

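This aggregation computes the smallest area that covers every geo_point in the query result and returns it as the smallest rectangle containing all of the locations: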

{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10000km",
          "location": {
            "lat": 24.46667,
            "lon": 118.10000
          }
        }
      }
    }
  },
  "aggs": {
    "map-zoom": {
      "geo_bounds": {
        "field": "location"
      }
    }
  }
}

The result is:

"aggregations": {
  "map-zoom": {
    "bounds": {
      "top_left": {
        "lat": 39.91666993126273,
        "lon": 113.2333298586309
      },
      "bottom_right": {
        "lat": 23.16666992381215,
        "lon": 121.43332997336984
      }
    }
  }
}

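In other words, the rectangle defined by these two points contains every city that lies within 10000 km of Xiamen. If we reduce the distance to 500 km, the rectangle covering the remaining cities becomes:

"aggregations": {
  "map-zoom": {
    "bounds": {
      "top_left": {
        "lat": 26.083329990506172,
        "lon": 118.0999999679625
      },
      "bottom_right": {
        "lat": 24.46666999720037,
        "lon": 119.29999999701977
      }
    }
  }
}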
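
The rectangle can be checked by hand: the top-left corner takes the largest latitude and the smallest longitude among the points, and the bottom-right corner takes the smallest latitude and the largest longitude. The Python sketch below does that for the test cities. `geo_bounds` here is a hypothetical local function, and it assumes the points do not straddle the 180° meridian.

```python
def geo_bounds(points):
    """Smallest lat/lon rectangle covering all (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


# (lat, lon) of the five test documents
CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}

print(geo_bounds(CITIES.values()))                             # all five cities
print(geo_bounds([CITIES["Xiamen"], CITIES["Fuzhou"]]))        # the two cities within 500 km
```

The corners agree with the ES responses up to the last few decimal places; the small differences come from how ES stores the coordinates internally.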

5. References

An illustrated explanation of how MongoDB's geospatial index is implemented: http://blog.nosqlfan.com/html/1811.html
The geo_point data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html
