CVPR 2011: Image Retrieval with Geometry-Preserving Visual Phrases


Yimeng Zhang, Zhaoyin Jia, and Tsuhan Chen. “Image Retrieval with Geometry-Preserving Visual Phrases.” IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) (oral presentation, 3.5% acceptance rate), 2011.

This paper adds spatial information on top of the BoV (bag of visual words) model and proposes GVP [1] (Geometry-Preserving Visual Phrases). The basic idea is fairly simple, as the following figure shows:

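For two images, compute the offset between the positions of every pair of identical words, and place each match into the corresponding cell of the offset space. Given the threshold k (the number of words in a phrase), check for each cell of the offset-space plane whether it holds at least k words, then count the GVPs: a cell holding n words contributes "n choose k" GVPs. For example, in the figure above: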

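With k=1 the number of GVPs is 8; with k=2 it is 1 (B,F) + 1 (D,F) + 3 (3 choose 2, for the bin containing A, B, C) = 5. The GVPs are then incorporated into the index, whose structure is shown in the following figure: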

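In effect, the only thing added to the index is position information, that is, the grid cell each word falls into (which depends on how many cells the image is divided into).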
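The counting rule from the example above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' code: the function name `count_gvps`, the `(word, x, y)` feature format, and the two toy images are assumptions. The toy layout is hypothetical and was chosen solely so that the offset bins match the example (one bin with A, B, C; one with B, F; one with D, F; one with a single word); it is not taken from the paper's figure.

```python
from collections import Counter
from math import comb

def count_gvps(features_a, features_b, k):
    """Count co-occurring GVPs of length k between two images.

    Each feature is a (word, x, y) tuple, where x and y are already
    quantized to grid-cell coordinates (an assumption of this sketch).
    """
    offset_bins = Counter()
    for word_a, xa, ya in features_a:
        for word_b, xb, yb in features_b:
            if word_a == word_b:
                # Every matching word pair votes for one offset-space cell.
                offset_bins[(xb - xa, yb - ya)] += 1
    # A cell holding n matches contributes C(n, k) phrases of length k.
    return sum(comb(n, k) for n in offset_bins.values())

# Hypothetical toy images, built only to reproduce the counts in the example.
image_1 = [("A", 0, 0), ("B", 1, 0), ("C", 2, 0),
           ("D", 0, 2), ("F", 1, 2), ("E", 3, 3)]
image_2 = [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),   # offset (1, 1): A, B, C
           ("B", 1, 2), ("F", 1, 4),                # offset (0, 2): B, F
           ("D", 2, 2), ("F", 3, 2),                # offset (2, 0): D, F
           ("E", 0, 0)]                             # offset (-3, -3): E alone

gvp_k1 = count_gvps(image_1, image_2, 1)
gvp_k2 = count_gvps(image_1, image_2, 2)
```

With this layout the four offset cells hold 3, 2, 2 and 1 matches, so `gvp_k1` is 3 + 2 + 2 + 1 = 8 and `gvp_k2` is 3 + 1 + 1 + 0 = 5, in line with the numbers quoted above. The double loop is quadratic and is meant only to make the rule explicit.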

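Retrieval is carried out in combination with TF-IDF weighting, and min-hash can also be incorporated. The results are better than those of [2] and [3]. The method as presented accounts only for translation; scale and rotation, as handled in [4], can also be fitted into this framework.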
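To make the combination of a position-aware index and IDF weighting more concrete, here is a simplified sketch for phrases of length 1. It assumes that each posting stores the image id plus the grid cell of the word, that every match votes with the squared IDF weight of its word into an offset bin, and that an image is scored by its best bin. The names `build_index`, `idf_weights` and `search`, the toy database, and the exact scoring formula (no normalization, maximum over bins) are assumptions made for illustration and may differ from what the paper does.

```python
import math
from collections import defaultdict

def build_index(images):
    """images: {image_id: [(word, x, y), ...]} with grid-cell coordinates."""
    index = defaultdict(list)            # word -> [(image_id, x, y), ...]
    for image_id, features in images.items():
        for word, x, y in features:
            index[word].append((image_id, x, y))
    return index

def idf_weights(index, num_images):
    """Plain log(N / document frequency) for every indexed word."""
    return {word: math.log(num_images / len({img for img, _, _ in postings}))
            for word, postings in index.items()}

def search(query, index, idf):
    """Rank images by their best-supported translation offset."""
    votes = defaultdict(lambda: defaultdict(float))   # image -> offset -> score
    for word, qx, qy in query:
        for image_id, x, y in index.get(word, ()):
            votes[image_id][(x - qx, y - qy)] += idf[word] ** 2
    scores = {img: max(bins.values()) for img, bins in votes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical database: "shifted" is a translated copy of the query,
# "scrambled" has the same words in unrelated positions.
database = {
    "shifted":   [("A", 1, 1), ("B", 2, 1), ("C", 3, 1)],
    "scrambled": [("A", 3, 0), ("B", 0, 2), ("C", 1, 3)],
    "other":     [("X", 0, 0), ("Y", 1, 1)],
}
query = [("A", 0, 0), ("B", 1, 0), ("C", 2, 0)]

index = build_index(database)
idf = idf_weights(index, len(database))
results = search(query, index, idf)
```

A plain bag-of-words score would not separate "shifted" from "scrambled", since both contain exactly the query's words. Here all three matches of "shifted" land in the same offset bin, while those of "scrambled" are spread over three bins, so "shifted" ranks first.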

[1] Yimeng Zhang and Tsuhan Chen. “Efficient Kernels for Identifying Unbounded-Order Spatial Features.” IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2009.

[2] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, 2007.

[3] Y. Cao, C. Wang, Z. Li, L. Zhang, and L. Zhang. Spatial-bag-of-features. In CVPR, 2010.

[4] Yimeng Zhang and Tsuhan Chen. “Weakly Supervised Object Recognition and Localization with Invariant High Order Features.” British Machine Vision Conference (BMVC), 2010.

Yimeng Zhang's homepage: http://chenlab.ece.cornell.edu/people/yimeng/