Fast Exact Search in Hamming Space with Multi-Index Hashing
September 22, 2016
Introduction
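Goals: (1) sublinear query time, (2) storage efficiency, (3) ease of implementation.
Approach:
(1) k-nearest-neighbor search is first converted into a search by Hamming distance.
(2) The Hamming distance search is converted into a search on the Hamming distance of each substring; the query results from the m substrings are then merged into a candidate set.
(3) Finally, candidates whose Hamming distance exceeds r are removed from the candidate set.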
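The three steps above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: codes are stored as Python integers, q is assumed to be divisible by m, and each substring is probed within radius floor(r/m) (a pigeonhole argument: if two codes differ in at most r bits, at least one of the m substrings differs in at most floor(r/m) bits). The function names `split`, `ball`, `build_index` and `r_neighbors` are made up for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def split(code, q, m):
    """Split a q-bit integer code into m substrings of s = q // m bits each."""
    s = q // m
    mask = (1 << s) - 1
    return [(code >> (i * s)) & mask for i in range(m)]


def ball(value, s, radius):
    """Yield every s-bit value within Hamming distance `radius` of `value`."""
    for k in range(radius + 1):
        for positions in combinations(range(s), k):
            flipped = value
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def build_index(codes, q, m):
    """One hash table per substring position: substring value -> list of code ids."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for i, sub in enumerate(split(code, q, m)):
            tables[i][sub].append(idx)
    return tables


def r_neighbors(query, codes, tables, q, m, r):
    """Return the ids of all codes within Hamming distance r of the query."""
    s = q // m
    sub_radius = r // m  # assumed per-substring radius (pigeonhole bound)
    candidates = set()
    # Step (2): probe each substring table and merge the results.
    for i, sub in enumerate(split(query, q, m)):
        for probe in ball(sub, s, sub_radius):
            candidates.update(tables[i].get(probe, ()))
    # Step (3): drop candidates whose full Hamming distance exceeds r.
    return sorted(i for i in candidates if bin(codes[i] ^ query).count("1") <= r)
```

Because of the final filtering step the result is exact: it should match what a linear scan over all codes returns, while only the merged candidates are compared in full.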
Two problems
Define
• r-neighbors
A binary code g is an r-neighbor of the query code if and only if g differs from the query code in at most r bits.
r-neighbor search
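As a small illustration of these two definitions, the sketch below represents binary codes as Python integers and finds r-neighbors by brute force. The helper names `hamming`, `is_r_neighbor` and `linear_scan` are made up for this note; the linear scan is the baseline that the hashing methods try to beat.

```python
def hamming(a, b):
    """Number of bit positions in which the two codes differ."""
    return bin(a ^ b).count("1")


def is_r_neighbor(g, query, r):
    """True if g differs from the query in at most r bits."""
    return hamming(g, query) <= r


def linear_scan(query, codes, r):
    """r-neighbor search by brute force: test every code in the database."""
    return [g for g in codes if is_r_neighbor(g, query, r)]


print(hamming(0b1011, 0b0010))  # 2 (the codes differ in the first and last bit)
print(linear_scan(0b1011, [0b1011, 0b0010, 0b0100], 2))  # [11, 2]
```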
Tackle r-neighbor
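Intuitive Method
As r grows, L(q, r), the number of hash buckets within Hamming radius r of a q-bit query, increases sharply. This method is only practical for small search radii and short codes.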
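To get a feel for how quickly the number of buckets grows, here is a small sketch that counts all q-bit codes within Hamming distance r of a query, i.e. the sum of the binomial coefficients C(q, k) for k = 0..r. The function name `num_lookups` is just a label chosen for this note.

```python
from math import comb


def num_lookups(q, r):
    """Number of distinct q-bit codes within Hamming distance r of a query."""
    return sum(comb(q, k) for k in range(r + 1))


for r in range(4):
    print(r, num_lookups(64, r))
# 0 1
# 1 65
# 2 2081
# 3 43745
```

Even for 64-bit codes the count climbs by more than an order of magnitude with each extra bit of radius, which is why probing a single hash table over full-length codes quickly becomes impractical.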
Multi-Index hashing
Two propositions
MIH for r-neighbor Search
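(1) number of lookups
The number of buckets that have to be searched.
(2) number of candidates tested
Depends on the substring length s. All binary substrings within one hash bucket are identical, so if a bucket is among those to be probed, every binary code in that bucket is a candidate. With substring length s there are 2^s hash buckets, and the database holds n binary codes in total. Assuming the codes are uniformly distributed, each hash bucket contains n/2^s codes on average, so the number of candidates for one substring is lookups * n/2^s.
The upper bound on the cost depends on s.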
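The dependence on s can be made concrete with a rough cost model. The sketch below follows the reasoning above (candidates per substring = lookups * n/2^s) and, as a simplifying assumption of this note, adds lookups and candidates and multiplies by the number of substrings m = q/s, with a per-substring radius of floor(r/m). It assumes uniformly distributed codes and that s divides q; `mih_cost` and `best_substring_length` are hypothetical names.

```python
from math import comb


def mih_cost(q, s, n, r):
    """Rough cost estimate: lookups plus expected candidates over all substrings."""
    m = q // s                      # number of substrings (assumes s divides q)
    sub_radius = r // m             # assumed search radius per substring
    lookups = sum(comb(s, k) for k in range(sub_radius + 1))
    candidates = lookups * n / 2 ** s   # n / 2^s codes per bucket on average
    return m * (lookups + candidates)


def best_substring_length(q, n, r):
    """Pick the divisor s of q with the smallest estimated cost."""
    divisors = [s for s in range(1, q + 1) if q % s == 0]
    return min(divisors, key=lambda s: mih_cost(q, s, n, r))


print(mih_cost(64, 16, 2 ** 16, 4))  # 136.0
```

In the printed example there are 4 substrings of 16 bits, each probed within radius 1 (17 buckets), and each bucket holds one code on average, giving 4 * (17 + 17) = 136. Short substrings keep the number of lookups small but fill every bucket with many candidates; long substrings do the opposite, so the estimate is smallest somewhere in between.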
Choosing an Effective Substring Length
search ratio r/q
Figure: cost as a function of substring length s, for 240-bit codes, different database sizes n, and different search ratios.
Complexity
k-NEAREST NEIGHBOR SEARCH
EXPERIMENTS
Multi-Index Hashing vs. Linear Scan
Substring Optimization
Method:
(1) Initial: a random bit is assigned to the first substring.
(2) A bit is assigned to substring j that is maximally correlated with the bit assigned to substring j − 1. After this step, every substring contains exactly one bit.
(3) Repeat:
An unused bit is assigned to substring j if the maximum correlation between that bit and the other bits already assigned to substring j is minimal.
This approach significantly decreases the correlation between bits within a single substring. This should make the distribution of codes within substring buckets more uniform, and thereby lower the number of candidates within a given search radius.
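The greedy assignment above might look roughly like the sketch below. Several details are assumptions made for this note rather than something stated in the text: correlation is taken as the absolute Pearson correlation between bit positions, constant bits are treated as uncorrelated, ties are broken by bit index, and step (3) visits the substrings in round-robin order. The names `correlation_matrix` and `assign_bits` are made up.

```python
import random


def correlation_matrix(codes, q):
    """Absolute Pearson correlation between every pair of bit positions."""
    n = len(codes)
    bits = [[(c >> b) & 1 for c in codes] for b in range(q)]
    mean = [sum(col) / n for col in bits]
    std = [(sum((x - mu) ** 2 for x in col) / n) ** 0.5
           for col, mu in zip(bits, mean)]
    corr = [[0.0] * q for _ in range(q)]
    for a in range(q):
        for b in range(a + 1, q):
            if std[a] == 0 or std[b] == 0:
                continue  # constant bit: treated as uncorrelated (assumption)
            cov = sum((x - mean[a]) * (y - mean[b])
                      for x, y in zip(bits[a], bits[b])) / n
            corr[a][b] = corr[b][a] = abs(cov / (std[a] * std[b]))
    return corr


def assign_bits(corr, q, m, seed=0):
    """Greedily distribute q bit positions over m substrings."""
    rng = random.Random(seed)
    unused = set(range(q))
    # Step (1): a random bit goes to the first substring.
    first = rng.choice(sorted(unused))
    substrings = [[first]]
    unused.remove(first)
    # Step (2): substring j gets the bit most correlated with substring j-1's bit.
    for j in range(1, m):
        prev = substrings[j - 1][0]
        pick = max(sorted(unused), key=lambda b: corr[prev][b])
        substrings.append([pick])
        unused.remove(pick)
    # Step (3): repeatedly give substring j the unused bit whose maximum
    # correlation with the bits already in substring j is smallest.
    j = 0
    while unused:
        pick = min(sorted(unused),
                   key=lambda b: max(corr[b][a] for a in substrings[j]))
        substrings[j].append(pick)
        unused.remove(pick)
        j = (j + 1) % m  # round-robin order is an assumption
    return substrings
```

The returned lists give, for each substring, the bit positions of the original code that it should be built from; when m divides q, every substring ends up with q/m bits.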
Future work
each substring hash table, thereby making the distribution of substrings as uniform as possible. However, this entropic approach is left to future work.