htmldiff

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 16:46
Writing programs in Shell is still inconvenient, so today I rewrote yesterday's Bash script in Python and ran into two small string-related problems:


1. Produce something like the output of a diff tool, roughly pointing out where two strings differ. The difflib module handles this.


#!/usr/bin/python
import difflib

text1 = """http://www.vpsee.com is a website which is dedicated for
building scalable websites on cloud platforms. The keywords are: Linux, Mac,
Cloud Computing, C, Python, MySQL, Nginx, VPS, Performance, Scalability,
Architecture, ..., etc. Have fun!"""
text1_lines = text1.splitlines()

text2 = """http://VPSee.com is a website which is dedicated for
building scalable websites on cloud platforms. The keywords are: Linux, Mac,
Cloud Computing, C, Python, MySQL, Nginx, VPS, Performance, Scalability,
Programming, Optimisation, Architecture, ... , etc. Have fun !"""
text2_lines = text2.splitlines()

# Differ compares two lists of lines and yields the annotated diff line by line.
d = difflib.Differ()
diff = d.compare(text1_lines, text2_lines)
print('\n'.join(diff))

The program prints:

- http://www.vpsee.com is a website which is dedicated for
?        ^^^^^^^
+ http://VPSee.com is a website which is dedicated for
?        ^^^
  building scalable websites on cloud platforms. The keywords are: Linux, Mac,
  Cloud Computing, C, Python, MySQL, Nginx, VPS, Performance, Scalability,
- Architecture, ..., etc. Have fun!
+ Programming, Optimisation, Architecture, ... , etc. Have fun !
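In Differ output, every line starts with a two-character code: "- " for a line only in the first text, "+ " for a line only in the second, two spaces for a line common to both, and "? " for a hint line that points at changed characters. As a rough sketch (the helper name `changed_lines` and the sample strings are my own, not from the script above), here is one way to keep only the removed and added lines:

```python
import difflib

def changed_lines(old_text, new_text):
    """Return only the Differ lines marked as removed ('- ') or added ('+ ')."""
    diff = difflib.Differ().compare(old_text.splitlines(), new_text.splitlines())
    # Skip common lines ('  ') and hint lines ('? ').
    return [line for line in diff if line.startswith(('- ', '+ '))]

# Hypothetical sample input, only for illustration.
old = "alpha\nbeta\ngamma"
new = "alpha\nBETA\ngamma"

for line in changed_lines(old, new):
    print(line)
```

This is handy when the texts are long and only the differences matter; the common lines are simply filtered out.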


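2. Compare two strings while ignoring case, whitespace, TAB characters, newlines, and so on. This is easy to solve: convert each string to lowercase, split() it, then join() the pieces back together with a single space as the separator.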


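#!/usr/bin/python
a = " \t\n\n a B C d\t\n\n\n"
b = "\t\t\n\n a b c D\n\n\n\n"

# Lowercase each string, then collapse every run of whitespace into one space.
s1 = a.lower()
s1 = ' '.join(s1.split())
s2 = b.lower()
s2 = ' '.join(s2.split())

# Compare the two normalized strings.
if s1 == s2:
    print("==")
else:
    print("!=")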
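The same idea can be wrapped in a small helper so it is reusable. This is only a sketch; the function names `normalize` and `loosely_equal` are hypothetical and not part of the original script:

```python
def normalize(text):
    """Lowercase the text and collapse every run of whitespace into one space."""
    # split() with no argument splits on any whitespace (spaces, tabs, newlines)
    # and drops leading and trailing whitespace.
    return ' '.join(text.lower().split())

def loosely_equal(first, second):
    """Compare two strings, ignoring case and whitespace differences."""
    return normalize(first) == normalize(second)

a = " \t\n\n a B C d\t\n\n\n"
b = "\t\t\n\n a b c D\n\n\n\n"

print(loosely_equal(a, b))
print(loosely_equal(a, "a b c e"))
```

Note that this keeps word boundaries: "a b" and "ab" still compare as different, because the whitespace is collapsed rather than removed.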


HtmlDiff returns its result as a string of HTML, as the following example shows.


import difflib

pprint_left = '''A Good Babysitter Is Hard To Find

This is Frederick
by Leo Lionni, the first book I picked for myself.
I was in kindergarten, I believe, which would be either 1968 or 1969.
Frederick has a specific lesson for children about how art is as
important in life as bread, but there's a secondary consideration
I took away: if we pool our talents our lives are immeasurably better.

Poor Impulse Control: A Good Babysitter Is Hard To Find

Curiously, this book is the story of my life, however one interprets
those things. I expect Mickey Rooney to show up any time with a barn
and a plan for a show, though my mom is not making costumes. My sisters
own a toy store with a fantastic selection of imaginative children's books.
I try not to open them because I can't close them and put them back.
My tantrums are setting a bad example for the kids. Anyway, I mention
this because yesterday was Mr. Rogers' 40th anniversary. I appreciate
the peaceful gentleman more as time passes, as I play with finger puppets
in department meetings, as I eye hollow trees for Lady Elaine Fairchild
infestations. Maybe Pete can build me trolley tracks!
Labels: To Take
Your Heart Away'''

pprint_right = """     A Good Babysitter Is Hard To Find

This is Frederick
by Leo Lionni, the first book I picked for myself.
I was in kindergarten, I believe, which would be either 1968 or 1969.
Frederick has a specific lesson for children about how art is as
important in life as bread, but there's a secondary consideration
I took away: if we pool our talents our lives are immeasurably better.
Curiously, this book is the story of my life, however one interprets
those things. I expect Mickey Rooney to show up any time with a barn
and a plan for a show, though my mom is not making costumes. My sisters
own a toy store with a fantastic selection of imaginative children's books.
I try not to open them because I can't close them and put them back.
My tantrums are setting a bad example for the kids.
Anyway, I mention
this because yesterday was Mr. Rogers' 40th anniversary. I appreciate

Poor Impulse Control: A Good Babysitter Is Hard To Find

the peaceful gentleman more as time passes, as I play with finger puppets
in department meetings, as I eye hollow trees for Lady Elaine Fairchild
infestations. Maybe Pete can build me trolley tracks!
Labels: To Take
Your Heart Away   """

# wrapcolumn=50 wraps long lines in the side-by-side table at 50 characters.
diff = difflib.HtmlDiff(wrapcolumn=50)
# make_file() takes two lists of lines and returns a complete HTML page as a string.
s = diff.make_file(pprint_left.strip().split('\n'),
                   pprint_right.strip().split('\n'))

with open(r"c:\compareresult.html", 'w') as f:
    f.write(s)
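For a quicker experiment, here is a minimal sketch under a few assumptions of my own: the two short line lists, the "left"/"right" column captions, and the temporary-directory output path are all made up for illustration. make_file() returns a whole HTML page; make_table() returns only the comparison table, which is useful when embedding the diff in an existing page.

```python
import difflib
import os
import tempfile

# Hypothetical sample input, only for illustration.
left = ["one", "two", "three"]
right = ["one", "2", "three", "four"]

html_diff = difflib.HtmlDiff(wrapcolumn=50)

# A complete HTML document, with captions above the two columns.
page = html_diff.make_file(left, right, fromdesc="left", todesc="right")

# Only the <table> element, without the surrounding page.
table = html_diff.make_table(left, right, fromdesc="left", todesc="right")

# Write to the system temporary directory instead of a hard-coded drive path.
out_path = os.path.join(tempfile.gettempdir(), "compareresult.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(page)

print(out_path)
```

Open the printed path in a browser to see the side-by-side comparison with the changed lines highlighted.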


