[Python] Deduplication with a Bloom Filter


Environment

  • Python 3.5
  • pip3 install bitarray-0.8.1-cp35-cp35m-win_amd64.whl
  • pip3 install pybloom_live
  • Reference: https://github.com/jaybaird/python-bloomfilter

Usage

  • ScalableBloomFilter
from pybloom_live import ScalableBloomFilter

# A scalable filter grows beyond initial_capacity as items are added
sbf = ScalableBloomFilter(initial_capacity=100, error_rate=0.001, mode=ScalableBloomFilter.LARGE_SET_GROWTH)

url = "www.baidu.com"
url2 = "www.douban.com"

sbf.add(url)
print(url in sbf)   # True
print(url2 in sbf)  # False (barring a false positive)
  • BloomFilter
from pybloom_live import BloomFilter

bf = BloomFilter(capacity=1000)
bf.add("www.baidu.com")
print("www.baidu.com" in bf)   # True
print("www.douban.com" in bf)  # False
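
The snippets above only test membership. As a rough sketch of the deduplication step itself, the code below filters a list of URLs so that each one is kept only the first time it appears. To stay runnable without extra packages it uses a small standard-library Bloom filter; TinyBloomFilter and dedupe are hypothetical names made up for this illustration, and the class is not how pybloom_live is implemented. It only mirrors the add / in interface used above, and its sizing follows the textbook formulas rather than the library's.

```python
import hashlib
import math


class TinyBloomFilter:
    """Illustrative Bloom filter (hypothetical; NOT pybloom_live's code)."""

    def __init__(self, capacity=1000, error_rate=0.001):
        # Textbook sizing: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hashes.
        self.num_bits = max(
            1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two 64-bit slices
        # of a single SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        # Set every bit position derived from the key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # The key may be present only if all of its bits are set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def dedupe(items, seen):
    """Yield each item the first time it is encountered.

    `seen` can be any object that supports `in` and `.add()`.
    """
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


urls = ["www.baidu.com", "www.douban.com", "www.baidu.com"]
unique_urls = list(dedupe(urls, TinyBloomFilter(capacity=1000)))
print(unique_urls)
```

Because dedupe only relies on in and add, a BloomFilter or ScalableBloomFilter from pybloom_live can be passed as seen in place of the sketch class. Keep in mind that a Bloom filter can report false positives, so a small fraction of genuinely new items (bounded by roughly error_rate while the filter is within capacity) may be dropped as duplicates; it never reports an added item as missing.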