Basics of Hash Table--Data Structure
Source: Internet; Editor: 程序博客网; Time: 2024/06/06 06:30
Intro:
API: Python: dict; Java: HashMap
Applications: file systems; password verification; storage optimization
IP Address:
Main Loop
log - array of log lines (time, IP)
C - mapping from IPs to counters
i - first unprocessed log line
j - first line in current 1h window
i ← 0
j ← 0
C ← ∅
Each second:
    UpdateAccessList(log, i, j, C)
UpdateAccessList(log, i, j, C)
    while log[i].time ≤ Now():
        C[log[i].IP] ← C[log[i].IP] + 1
        i ← i + 1
    while log[j].time ≤ Now() − 3600:
        C[log[j].IP] ← C[log[j].IP] − 1
        j ← j + 1
AccessedLastHour(IP, C)
    return C[IP] > 0
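The main loop above can be sketched in Python as follows. This is a minimal sketch, not the original author's code: the class name `AccessCounter`, the `(time, ip)` tuple layout, and the explicit `now` argument (instead of a `Now()` call) are assumptions made for illustration, and bounds checks are added that the pseudocode leaves out.

```python
from collections import defaultdict

WINDOW = 3600  # seconds in one hour


class AccessCounter:
    """Sliding one-hour window of per-IP access counts (hypothetical helper)."""

    def __init__(self, log):
        self.log = log                  # list of (time, ip) tuples, sorted by time
        self.i = 0                      # first unprocessed log line
        self.j = 0                      # first line in the current 1h window
        self.counts = defaultdict(int)  # C: IP -> number of accesses in the window

    def update(self, now):
        # Count every log line that has happened up to `now`.
        while self.i < len(self.log) and self.log[self.i][0] <= now:
            self.counts[self.log[self.i][1]] += 1
            self.i += 1
        # Uncount lines that are now more than one hour old.
        while self.j < self.i and self.log[self.j][0] <= now - WINDOW:
            self.counts[self.log[self.j][1]] -= 1
            self.j += 1

    def accessed_last_hour(self, ip):
        # .get avoids creating an entry for an IP that was never seen.
        return self.counts.get(ip, 0) > 0
```

Calling `update(now)` once per second and then querying `accessed_last_hour(ip)` mirrors the loop in the notes. The built-in dictionary plays the role of C here, which is exactly the data structure the rest of the notes build up to.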
Direct Addressing
Need a data structure for C
There are 2^32 different IPv4 addresses
Convert IP to 32-bit integer
Create an integer array A of size 2^32
Use A[int(IP)] as C[IP]
int(IP)
    return IP[1]·2^24 + IP[2]·2^16 + IP[3]·2^8 + IP[4]
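As an illustration of the int(IP) conversion, here is a small Python sketch. The function name `ip_to_int` and the dotted-string input format are assumptions; the notes only describe the IP as four numbers.

```python
import ipaddress


def ip_to_int(ip):
    """Convert a dotted IPv4 string such as '192.168.0.1' to a 32-bit integer."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return a * 2**24 + b * 2**16 + c * 2**8 + d


# Cross-check against the standard library's own conversion.
assert ip_to_int("192.168.0.1") == int(ipaddress.IPv4Address("192.168.0.1"))
```

Each of the four parts occupies one byte of the result, so distinct addresses always map to distinct integers in the range 0 to 2^32 − 1.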
UpdateAccessList(log, i, j, A)
    while log[i].time ≤ Now():
        A[int(log[i].IP)] ← A[int(log[i].IP)] + 1
        i ← i + 1
    while log[j].time ≤ Now() − 3600:
        A[int(log[j].IP)] ← A[int(log[j].IP)] − 1
        j ← j + 1
AccessedLastHour(IP)
    return A[int(IP)] > 0
Asymptotics
UpdateAccessList is O(1) per log line
AccessedLastHour is O(1)
But need 2^32 memory even for few IPs
IPv6: 2^128 won't fit in memory
In general: O(N) memory, N = |S|
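To put numbers on the memory cost, the arithmetic can be checked in a few lines of Python. The 4-byte counter size is an assumption; the notes do not say how large each array cell is.

```python
BYTES_PER_COUNTER = 4  # assumption: one 32-bit counter per address

ipv4_bytes = 2**32 * BYTES_PER_COUNTER   # array with one cell per IPv4 address
ipv6_bytes = 2**128 * BYTES_PER_COUNTER  # same idea for IPv6

print(ipv4_bytes // 2**30, "GiB for IPv4")
print(ipv6_bytes, "bytes for IPv6")
```

Under that assumption the IPv4 array takes 16 GiB regardless of how many addresses are actually active, and the IPv6 figure is 2^130 bytes, far beyond any real machine.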
List-based Mapping:
Direct addressing requires too much memory
Let’s store only active IPs
Store them in a list
Store only last occurrence of each IP
Keep the order of occurrence
UpdateAccessList(log, i, L)
    while log[i].time ≤ Now():
        log_line ← L.FindByIP(log[i].IP)
        if log_line != NULL:
            L.Erase(log_line)
        L.Append(log[i])
        i ← i + 1
    while L.Top().time ≤ Now() − 3600:
        L.Pop()
AccessedLastHour(IP, L)
    return L.FindByIP(IP) != NULL
Asymptotics
n is the number of active IPs
Memory usage is Θ(n)
L.Append, L.Top, L.Pop are Θ(1)
L.FindByIP and L.Erase are Θ(n)
UpdateAccessList is Θ(n) per log line
AccessedLastHour is Θ(n)
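The list-based approach can be sketched in Python like this. The function names, the `(time, ip)` tuple layout, and the explicit `now` parameter are assumptions for illustration. A plain Python list stands in for the list L, so removing from the front is itself linear here, whereas the notes assume a list with Θ(1) Pop.

```python
WINDOW = 3600  # seconds in one hour


def update_access_list(log, i, active, now):
    """Process log lines up to `now`.

    `active` holds only the last occurrence of each IP as (time, ip),
    oldest first. Returns the index of the first unprocessed log line.
    """
    while i < len(log) and log[i][0] <= now:
        time, ip = log[i]
        # Linear scan (FindByIP + Erase): drop the previous occurrence, if any.
        for k, (_, seen_ip) in enumerate(active):
            if seen_ip == ip:
                del active[k]
                break
        active.append((time, ip))
        i += 1
    # Expire entries that are more than one hour old.
    while active and active[0][0] <= now - WINDOW:
        active.pop(0)
    return i


def accessed_last_hour(ip, active):
    # Linear scan over all active IPs.
    return any(seen_ip == ip for _, seen_ip in active)
```

Memory now grows only with the number of active IPs, but every lookup walks the whole list, which is the Θ(n) cost listed above.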
Encoding IPs
Encode IPs with small numbers
I.e. numbers from 0 to 999
Different codes for currently active IPs
Hash Function
Definition
For any set of objects S and any integer
m > 0, a function h : S → {0, 1, ..., m−1} is called a hash function.
Definition
m is called the cardinality of hash function h.
Desirable Properties
h should be fast to compute.
Different values for different objects.
Direct addressing with O(m) memory.
Want small cardinality m.
Impossible to have all different values if the number of objects |S| is more than m.
Collisions
Definition
When h(o1) = h(o2) and o1 ≠ o2, this is a collision.
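A toy example may make these definitions concrete. The hash function below (remainder modulo m, with m = 1000 to match the "0 to 999" codes mentioned earlier) is only an illustrative choice, not one prescribed by the notes.

```python
def h(obj_as_int, m=1000):
    """A toy hash function: map an integer to a code in {0, ..., m-1}."""
    return obj_as_int % m


o1 = 3232235521   # 192.168.0.1 written as a 32-bit integer
o2 = o1 + 1000    # a different address, exactly 1000 further on

# Different objects, same hash value: a collision.
collision = (o1 != o2) and (h(o1) == h(o2))

# With more objects than codes, some two objects must share a code.
codes = [h(x) for x in range(1001)]
distinct_codes = len(set(codes))
```

Here `collision` is true because the two integers differ by a multiple of m, and `distinct_codes` cannot exceed 1000 even though 1001 objects were hashed.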
Map
Store mapping from objects to other objects:
Filename → location of the file on disk
Student ID → student name
Contact name → contact phone number
Definition
Map from S to V is a data structure with methods HasKey(O), Get(O), Set(O, v), where O ∈ S, v ∈ V.
h : S → {0, 1, ..., m−1}
O, O′ ∈ S
v, v′ ∈ V
A ← array of m lists (chains) of pairs (O, v)
HasKey(O)
    L ← A[h(O)]
    for (O′, v′) in L:
        if O′ == O:
            return true
    return false
Get(O)
    L ← A[h(O)]
    for (O′, v′) in L:
        if O′ == O:
            return v′
    return n/a
Set(O, v)
    L ← A[h(O)]
    for p in L:
        if p.O == O:
            p.v ← v
            return
    L.Append(O, v)
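The three map operations above translate almost line for line into Python. This is a minimal sketch under stated assumptions: the class name `ChainedMap`, the default m = 8, and the use of Python's built-in `hash` reduced modulo m as h are all illustrative choices.

```python
class ChainedMap:
    """Map implemented as an array of m chains of (key, value) pairs."""

    def __init__(self, m=8):
        self.m = m
        self.chains = [[] for _ in range(m)]  # A: array of m lists

    def _chain(self, key):
        # h(O): Python's built-in hash reduced to {0, ..., m-1}.
        return self.chains[hash(key) % self.m]

    def has_key(self, key):
        return any(k == key for k, _ in self._chain(key))

    def get(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None  # stands in for the pseudocode's "n/a"

    def set(self, key, value):
        chain = self._chain(key)
        for idx, (k, _) in enumerate(chain):
            if k == key:
                chain[idx] = (key, value)  # overwrite the existing pair
                return
        chain.append((key, value))
```

With m = 8, the keys 1 and 9 land in the same chain, yet `get` still returns the right value for each because every operation compares keys inside the chain.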
Set
Definition
Set is a data structure with methods Add(O), Remove(O), Find(O).
Examples
IPs accessed during last hour
Students on campus
Keywords in a programming language
h : S → {0, 1, ..., m−1}
O, O′ ∈ S
A ← array of m lists (chains) of objects O
Find(O)
    L ← A[h(O)]
    for O′ in L:
        if O′ == O:
            return true
    return false
Add(O)
    L ← A[h(O)]
    for O′ in L:
        if O′ == O:
            return
    L.Append(O)
Remove(O)
    if not Find(O):
        return
    L ← A[h(O)]
    L.Erase(O)
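The set operations can be sketched the same way. As before, the class name `ChainedSet`, the default m = 8, and Python's built-in `hash` modulo m in the role of h are assumptions for illustration.

```python
class ChainedSet:
    """Set implemented as an array of m chains of objects."""

    def __init__(self, m=8):
        self.m = m
        self.chains = [[] for _ in range(m)]  # A: array of m lists

    def _chain(self, obj):
        # h(O): Python's built-in hash reduced to {0, ..., m-1}.
        return self.chains[hash(obj) % self.m]

    def find(self, obj):
        return obj in self._chain(obj)  # linear scan of one chain

    def add(self, obj):
        chain = self._chain(obj)
        if obj not in chain:            # keep each object at most once
            chain.append(obj)

    def remove(self, obj):
        chain = self._chain(obj)
        if obj in chain:
            chain.remove(obj)
```

Only one chain is ever scanned per operation, so the cost depends on the length of that chain rather than on the total number of stored objects.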
Hash Table:
Definition
An implementation of a set or a map using hashing is called a hash table.
Programming Language:
Set:
unordered_set in C++
HashSet in Java
set in Python
Map:
unordered_map in C++
HashMap in Java
dict in Python
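For Python, the built-in types cover the Set and Map interfaces described earlier. The short sketch below uses made-up sample data (`recent_ips`, `phone_book`) purely for illustration.

```python
# Set: Add / Remove / Find
recent_ips = set()
recent_ips.add("10.0.0.1")
recent_ips.add("10.0.0.1")       # adding the same object twice keeps one copy
recent_ips.discard("10.0.0.2")   # removing an absent object is not an error
found = "10.0.0.1" in recent_ips

# Map: Set / Get / HasKey
phone_book = {}
phone_book["Alice"] = "555-0100"
phone_book["Alice"] = "555-0199"  # setting an existing key overwrites the value
has_bob = "Bob" in phone_book
bob_number = phone_book.get("Bob")  # returns None when the key is absent
```

The `in` operator corresponds to Find and HasKey, and `dict.get` plays the role of Get with `None` as the "n/a" result.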
Conclusion
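Chaining is a technique to implement a hash table
Memory consumption is O(n + m), where n is the number of stored objects and m is the cardinality of the hash function
Operations work in time O(c + 1), where c is the length of the longest chain
How to make both m and c small?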
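A small experiment illustrates the tension between m and c. The keys and the hash function h(key) = key mod m are chosen only for illustration; the notes do not prescribe either.

```python
def longest_chain(keys, m):
    """Length c of the longest chain when `keys` are hashed into m chains."""
    lengths = [0] * m
    for key in keys:
        lengths[key % m] += 1  # toy hash function h(key) = key mod m
    return max(lengths)


keys = range(0, 10_000, 10)  # 1000 keys, all multiples of 10

c_small_m = longest_chain(keys, 10)    # tiny table: every key in one chain
c_round_m = longest_chain(keys, 1000)  # m = n, but h fits the keys badly
c_prime_m = longest_chain(keys, 1009)  # similar m, keys spread evenly
```

With m = 10 the table is tiny but c is 1000; with m = 1000 the chains still reach length 10 because all keys share a factor with m; with the prime m = 1009 every chain has at most one key. Memory is nearly the same in the last two cases, so the choice of hash function, not just the size of m, determines how long the chains get.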