Python: count word frequencies in a text, sort by frequency (ties broken alphabetically), then take the top N

Source: Internet  Posted: ios7越狱后优化  Editor: 程序博客网  Time: 2024/06/07 09:49

Python Code:

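def count_words(s, n):
    """Return the n most frequent words in s as (word, count) pairs."""
    dic = {}
    words = s.split(" ")
    # Map each distinct word to the number of times it occurs.
    for word in words:
        dic[word] = words.count(word)
    # Sort by count descending, then alphabetically for ties; keep the first n.
    wordslist = sorted(dic.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
    return wordslist


def test_run():
    # print() with a single argument works in both Python 2 and Python 3.
    print(count_words("cat bat mat cat bat cat baa", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 3))


if __name__ == '__main__':
    test_run()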
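Calling `words.count(word)` inside the loop rescans the whole list for every word, so the cost grows quadratically with the length of the text. A possible alternative, sketched here with the standard library's `collections.Counter`, counts everything in a single pass and keeps the same sort key. The name `count_words_counter` is hypothetical and not part of the original post, and `s.split()` is an assumption that differs from `s.split(" ")` in that runs of whitespace produce no empty-string words.

```python
from collections import Counter


def count_words_counter(s, n):
    """Sketch: top-n (word, count) pairs, ties broken alphabetically."""
    # Counter tallies every word in one pass over the list.
    counts = Counter(s.split())
    # Same ordering as the original: count descending, then word ascending.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == '__main__':
    print(count_words_counter("cat bat mat cat bat cat baa", 3))
    # [('cat', 3), ('bat', 2), ('baa', 1)]
```

`Counter.most_common(n)` is not used here because it does not sort equal counts alphabetically, so the explicit `sorted` call is kept.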