Indexed Nearest Neighbour Search in PostGIS
Original source: http://blog.opengeo.org/2011/09/28/indexed-nearest-neighbour-search-in-postgis/
A perennially popular question on the PostGIS users mailing list has been “how do I find the N nearest things to this point?”
To date, the answer has generally been quite convoluted. PostGIS supports bounding box index searches, so in order to get the N nearest things you need a box large enough to capture at least N things. That means you need to know in advance how big to make your search box, which is not possible in general.
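To make the difficulty concrete, here is a minimal sketch of the older workaround, written in plain Python rather than against PostGIS. The function name `nearest_by_expanding_box`, the `start_radius` parameter, and the doubling strategy are all assumptions made for illustration; real queries of this kind would filter with a bounding box test in SQL and re-run with a bigger box when too few rows came back.

```python
import math


def nearest_by_expanding_box(points, query, n, start_radius=1.0):
    """Return the n points nearest to query by growing a square search box.

    Illustrative only: points is a list of (x, y) tuples and the "index"
    is simulated with a linear scan.
    """
    if n > len(points):
        raise ValueError("fewer than n points available")
    if start_radius <= 0:
        raise ValueError("start_radius must be positive")

    qx, qy = query
    radius = start_radius
    while True:
        # Box filter: the kind of test a bounding box index can answer.
        in_box = [(x, y) for (x, y) in points
                  if abs(x - qx) <= radius and abs(y - qy) <= radius]
        # Only points inside the circle inscribed in the box are safe to
        # rank: a point outside the box is always farther than radius.
        in_circle = [p for p in in_box if math.dist(p, query) <= radius]
        if len(in_circle) >= n:
            return sorted(in_circle, key=lambda p: math.dist(p, query))[:n]
        # Too few candidates, so guess again with a bigger box.
        radius *= 2
```

The loop is the awkward part: the right box size depends on the local density of the data, so the caller either guesses too small and repeats the search, or guesses too large and sorts far more candidates than needed.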
PostgreSQL has the ability to return ordered information where an index exists, but the ability has been restricted to B-Tree indexes until recently. Thanks to one of our clients, we were able to directly fund PostgreSQL developers Oleg Bartunov and Teodor Sigaev in adding the ability to return sorted results from a GiST index. And since PostGIS indexes use GiST, that means that now we can also return sorted results from our indexes.
Which is a very long way of saying that PostGIS (the development code in the source repository) now has the ability to do index-assisted nearest neighbour searching.
This feature (the PostGIS side of it) was funded by Vizzuality, and hopefully it comes in useful in their CartoDB work.
You will need PostgreSQL 9.1 and the PostGIS source code from the repository, but this is what a nearest neighbour search looks like:
    SELECT name, gid
    FROM geonames
    ORDER BY geom <-> st_setsrid(st_makepoint(-90,40),4326)
    LIMIT 10;
Note the <-> operator in the ORDER BY clause. This is where the magic occurs. The <-> is a “distance” operator, but it only makes use of the index when it appears in the ORDER BY clause. Between putting the operator in the ORDER BY and using a LIMIT to truncate the result set, we can very quickly (in less than 10ms on a 2M record table, in this case) get the 10 nearest points to our test point.
“It can’t possibly be this easy!!” You’re right. It can’t. Because it is traversing the index, which is made of bounding boxes, the distance operator only works with bounding boxes. For point data, the bounding boxes are equivalent to the points, so the answers are exact. But for any other geometry types (lines, polygons, etc) the results are approximate.
There are actually two different approximations available for you to choose from.
- Using the <-> operator, you get the nearest neighbour using the centers of the bounding boxes to calculate the inter-object distances.
- Using the <#> operator, you get the nearest neighbour using the bounding boxes themselves to calculate the inter-object distances.
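The difference between the two approximations is easier to see with numbers. The sketch below is plain Python, not PostGIS internals: boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples, and the helper names `center_distance` and `box_distance` are made up for the example. They mimic the two behaviours as this post describes them.

```python
import math


def box_center(box):
    """Center point of an (xmin, ymin, xmax, ymax) box."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def center_distance(a, b):
    """Distance between box centers (the <-> style approximation)."""
    return math.dist(box_center(a), box_center(b))


def box_distance(a, b):
    """Shortest distance between two boxes, 0 if they touch or overlap
    (the <#> style approximation)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


# A query point is a box with zero width and height.
query = (0, 0, 0, 0)
long_road = (1, 0, 21, 0)      # box of a long line that starts near the query
small_parcel = (3, -1, 5, 1)   # box of a compact polygon a bit farther away
```

With these values, `center_distance` ranks `small_parcel` first (4.0 against 11.0), while `box_distance` ranks `long_road` first (1.0 against 3.0). Large or elongated objects are where the two operators are most likely to disagree.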
In general, because the box calculations are approximations of calculations on the objects themselves, getting a more exact “nearest N objects” is going to require a two-phase query, where the first phase grabs a larger candidate set, and the second phase does an exact test (just like all the other index-assisted predicates). So, for example:
    with index_query as (
      select
        st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as distance,
        parcel_id, address
      from parcels
      order by geom <#> 'SRID=3005;POINT(1011102 450541)'
      limit 100
    )
    select * from index_query
    order by distance
    limit 10;
The indexed query pulls the 100 nearest objects by box distance, and the second query pulls the 10 actual closest from that set.
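The same two-phase idea can be sketched outside the database to show why the candidate set has to be larger than the final result. This is an illustrative Python model only: geometries are assumed to be polylines with at least two vertices, stored in a dict keyed by name, and every function name here (`two_phase_nearest` and its helpers) is hypothetical.

```python
import math


def bbox(line):
    """Bounding box (xmin, ymin, xmax, ymax) of a list of (x, y) vertices."""
    xs = [x for x, _ in line]
    ys = [y for _, y in line]
    return (min(xs), min(ys), max(xs), max(ys))


def point_box_distance(p, box):
    """Distance from point p to a box, 0 if p lies inside it."""
    dx = max(box[0] - p[0], p[0] - box[2], 0.0)
    dy = max(box[1] - p[1], p[1] - box[3], 0.0)
    return math.hypot(dx, dy)


def point_segment_distance(p, a, b):
    """Exact distance from point p to the segment a-b."""
    vx, vy = b[0] - a[0], b[1] - a[1]
    length_sq = vx * vx + vy * vy
    if length_sq == 0:
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * vx + (p[1] - a[1]) * vy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * vx, a[1] + t * vy))


def point_line_distance(p, line):
    """Exact distance from p to a polyline with at least two vertices."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(line, line[1:]))


def two_phase_nearest(lines, p, n, candidates):
    """Names of the n lines nearest to p, using a box-distance shortlist."""
    # Phase 1: cheap, approximate ordering by bounding box distance.
    shortlist = sorted(
        lines, key=lambda name: point_box_distance(p, bbox(lines[name]))
    )[:candidates]
    # Phase 2: exact ordering, but only over the shortlist.
    return sorted(
        shortlist, key=lambda name: point_line_distance(p, lines[name])
    )[:n]


lines = {
    "diagonal": [(0, 10), (10, 0)],   # its box contains the origin
    "near": [(2, -1), (2, 1)],        # actually the closest line
    "far": [(20, 20), (30, 30)],
}
```

Here `two_phase_nearest(lines, (0, 0), n=1, candidates=2)` gives `["near"]`, but shrinking the shortlist to `candidates=1` gives `["diagonal"]`, because the diagonal's box contains the query point even though the line itself is about 7 units away. How much larger than the final limit the candidate set must be depends on the data, so a shortlist of 100 for a final 10 is a judgment call rather than a guarantee.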