Counting the words in a txt file with Python

Source: Internet  Editor: 程序博客网  Time: 2024/05/02 23:22
def count_words(filename):
    try:
        with open(filename) as file:
            contents = file.read()
    except FileNotFoundError:
        msg = 'Sorry, the file '+filename+' does not exist'
        print(msg)
    else:
        words = contents.split()
        n_words = len(words)
        print(n_words)

filenames = ['alice.txt','pi_digits.txt','hh.txt','little_women.txt','moby_dick.txt','siddhartha.txt']
for filename in filenames:
    count_words(filename)

The output is:

29461
3
Sorry, the file hh.txt does not exist
189079
215136
42172
>>> 
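As a variation, the same function can return the count instead of printing it, which makes the result easier to reuse or test. The following is only a sketch: the `encoding` parameter, the `None` return value for a missing file, and the throwaway `sample.txt` file are assumptions made for illustration and are not part of the original post.

```python
import os
import tempfile


def count_words(filename, encoding="utf-8"):
    """Return the number of whitespace-separated words, or None if the file is missing."""
    try:
        with open(filename, encoding=encoding) as file:
            contents = file.read()
    except FileNotFoundError:
        return None
    return len(contents.split())


# Demo on a throwaway file so the example does not depend on the text files above.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as file:
        file.write("Alice was beginning\nto get very tired")
    sample_count = count_words(path)
    missing_count = count_words(os.path.join(tmp, "hh.txt"))

print(sample_count)   # 7
print(missing_count)  # None
```

Returning a value leaves it to the caller to decide whether to print a message, skip the file, or add the counts together.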

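Note that when the file holds digits rather than letters (words), split() gives a misleading count and should be left out if the goal is to count the digits. split() does not raise an error here: called without arguments it splits on whitespace, so each unbroken run of digits becomes a single string, however long it is. Take pi_digits.txt from the list above, whose contents are: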

3.1415926535
  8979323846
  2643383279

After splitting:

>>> with open('pi_digits.txt') as file:
    w = file.read()
    w.split()

['3.1415926535', '8979323846', '2643383279']

This is where the 3 in the first output comes from: split() produced three strings, and len() counted those three strings.
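The difference between counting strings and counting characters can be checked without the file. The sketch below uses an in-memory string and assumes that the three groups of digits sit on separate lines, the second and third indented by two spaces. That layout is an assumption, chosen because it is consistent with the character count of 38 reported for pi_digits.txt.

```python
# Assumed layout of pi_digits.txt: three lines, the last two indented by two spaces.
text = "3.1415926535\n  8979323846\n  2643383279"

tokens = text.split()   # no argument: split on any run of whitespace
print(tokens)           # ['3.1415926535', '8979323846', '2643383279']
print(len(tokens))      # 3: number of whitespace-separated strings
print(len(text))        # 38: number of characters, whitespace included
```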
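For a file of digits, drop the split() call so that len() counts characters instead. Note that len(contents) counts every character read, including the decimal point and any whitespace. Using the pi file again: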

def count_words(filename):
    try:
        with open(filename) as file:
            contents = file.read()
    except FileNotFoundError:
        msg = 'Sorry, the file '+filename+' does not exist'
        print(msg)
    else:
        n_words = len(contents)
        print(n_words)

filename = 'pi_digits.txt'
count_words(filename)

The output is:

38
>>> 
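If only the digits themselves should be counted, one option is to filter the characters before counting. This is a sketch, not part of the original post: the helper name `count_digits` is hypothetical, and the sample string assumes the file has three lines with the last two indented by two spaces.

```python
import string


def count_digits(text):
    """Count only the ASCII digits 0-9, ignoring whitespace and punctuation."""
    return sum(1 for ch in text if ch in string.digits)


# Assumed contents of pi_digits.txt (three lines, two-space indents).
sample = "3.1415926535\n  8979323846\n  2643383279"
print(len(sample))           # 38 characters in total
print(count_digits(sample))  # 31 digits
```

The gap between 38 and 31 is the decimal point, four spaces, and two newlines.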