Python语言实现哈夫曼编码
来源:互联网 发布:网络安全技术的建议 编辑:程序博客网 时间:2024/05/16 13:52
汉语版:使用python实现huffman编码是一个能够很快地实现。所以我们选择使用python来实现我们这个程序。 l
E-version: we will use python to realize this program called huffman encoding and decoding. why we use python, because in python we can finish this program faster then other codes. this program are not the final implementation. actually, this is the first version i commit to git. i changed a lot in the least version . so if you run those codes on your environment. some problems may be exist; don`t worry, the first four drafts are right, you can build everything based on them. so good lucky to you.
I:实现节点类
class Node: def __init__(self,freq): self.left = None self.right = None self.father = None self.freq = freq def is_left(self): return self.father.left == self
II:为每一个节点赋权值
def create_nodes(frequencies): return [Node(freq) for freq in frequencies]
III:创建哈夫曼树
def create_huffman_tree(nodes): queue = nodes[:] while len(queue) > 1: queue.sort(key=lambda item: item.freq) node_left = queue.pop(0) node_right = queue.pop(0) node_father = Node(node_left.freq + node_right.freq) node_father.left = node_left node_father.right = node_right node_left.father = node_father node_right.father = node_father queue.append(node_father) queue[0].father = None return queue[0]
III:遍历叶节点
def huffman_encoding(nodes, root): codes = [''] * len(nodes) for i in range(len(nodes)): node_tmp = nodes[i] while node_tmp != root: if node_tmp.is_left(): codes[i] = '0' + codes[i] else: codes[i] = '1' + codes[i] node_tmp = node_tmp.father return codes
IV:获取字符出现的频数
# 获取字符出现的频数def count_frequency(input_string): # 用于存放字符 char_store = [] # 用于存放频数 freq_store = [] # 解析字符串 for index in range(len(input_string)): if char_store.count(input_string[index]) > 0: temp = int(freq_store[char_store.index(input_string[index])]) temp = temp + 1 freq_store[char_store.index(input_string[index])] = temp else: char_store.append(input_string[index]) freq_store.append(1) # 返回字符列表和频数列表 return char_store, freq_store
V:获取字符、频数的列表
# 获取字符、频数的列表def get_char_frequency(char_store=[], freq_store=[]): # 用于存放char_frequency char_frequency = [] for item in zip(char_store, freq_store): temp = (item[0], item[1]) char_frequency.append(temp) return char_frequency
VI:将字符转换成哈夫曼编码
# 将字符转换成huffman编码def get_huffman_file(input_string, char_frequency, codes): # 逐个字符替换 file_content = '' for index in range(len(input_string)): for item in zip(char_frequency, codes): if input_string[index] == item[0][0]: file_content = file_content + item[1] file_name = 'huffman_' + str(uuid.uuid1())+'.txt' with open(file_name, 'w+') as destination: destination.write(file_content) return file_name
VII:解压缩哈夫曼文件
# 解压缩huffman文件def decode_huffman(input_string, char_store, freq_store): encode = '' decode = '' for index in range(len(input_string)): encode = encode + input_string[index] for item in zip(char_store, freq_store): if encode == item[1]: decode = decode + item[0] encode = '' return decode
VIII:计算压缩比(写错了,可以自行改写)
# 计算压缩比def get_encode_ration(codes): # 计算所需要的二进制个数 h_length = 0 for item in codes: h_length = h_length + len(item) t_length = bin_middle(len(codes))*len(codes) ratio = t_length/h_length return str(ratio)[0:3]# 计算所在的二进制空间def bin_middle(number): n, i = 1, 0 while n < number: n = n * 2 i = i + 1 return i
最后:Django文件接收,并返回
def upload(request): ctx = {} if request.method == "POST": file_name = str(request.FILES['file']) if not file_name.endswith('txt'): ctx['fail'] = 'file format exist wrong!' else: file = request.FILES['file'] ctx['success'] = 'Successful' input_string = tool.read_file(tool.save_file(file)) char_store, freq_store = tool.count_frequency(input_string) char_frequency = tool.get_char_frequency(char_store, freq_store) nodes = huf.create_nodes([item[1] for item in char_frequency]) root = huf.create_huffman_tree(nodes) codes = huf.huffman_encoding(nodes, root) save_file_name = tool.get_huffman_file(input_string, char_frequency, codes) for item in zip(char_frequency, codes): print('Character:%s freq:%-2d encoding: %s', item[0][0], item[0][1], item[1]) ctx['node'] = char_frequency def file_iterator(files, chunk_size=512): with open(files) as f: while True: c = f.read(chunk_size) if c: yield c else: break the_file_name = tool.get_encode_ration(codes)+'_'+str(uuid.uuid1())+'.txt' response = StreamingHttpResponse(file_iterator(save_file_name)) response['Content-Type'] = 'application/octet-stream' response['Content-Disposition'] = 'attachment;filename="{0}"'.format(the_file_name) return response
阅读全文
0 0
- Python语言实现哈夫曼编码
- python实现哈夫曼编码
- c语言实现哈夫曼编码
- c语言实现哈夫曼编码
- C语言实现哈夫曼编码
- C语言实现哈夫曼编码
- C语言实现哈夫曼编码
- 哈夫曼编码C语言实现
- 哈夫曼编码的C语言实现
- 哈夫曼编码的C语言实现
- C语言哈夫曼编码的实现
- C语言实现哈夫曼编码与译码
- C语言实现哈夫曼编码与译码
- 哈夫曼编码的C语言实现
- python 实现base64编码
- PYTHON设置默认语言编码
- PYTHON设置默认语言编码
- PYTHON设置默认语言编码
- Sumsets
- MySQL5.7免安装配置
- JS中的prototype的作用
- 关于set/multiset和map/multimap总结
- hdu-2700-Parity
- Python语言实现哈夫曼编码
- redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refu
- CSS学习之旅
- python发送邮件
- C/C++中 extern 关键字详解
- http(车贴码查询结果站点)
- 我学会了整数幂
- lapack def文件的建立
- 浏览器F12下面怎么玩