Python: the Gutenberg corpus in NLTK

Source: Internet | Published by: 淘宝老店铺 | Editor: 程序博客网 | Time: 2024/06/06 17:58

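import nltk
from nltk.corpus import gutenberg

# Needs the corpus data; if it is missing, run nltk.download("gutenberg") once.
# gutenberg.sents() may also need NLTK's Punkt sentence tokenizer data.

# List the file identifiers of every text in the Gutenberg corpus.
file_ids = gutenberg.fileids()
print(file_ids)

# Load Macbeth as a sequence of word tokens and print a short slice.
macbeth = gutenberg.words("shakespeare-macbeth.txt")
print(macbeth[1030:1037])

# For each text, print its character, word, and sentence counts.
for fileid in file_ids:
    num_chars = len(gutenberg.raw(fileid))
    num_words = len(gutenberg.words(fileid))
    num_sents = len(gutenberg.sents(fileid))
    print(num_chars, num_words, num_sents, fileid)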
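
The three counts printed in the loop above are easier to compare across texts once they are turned into ratios. The following is a minimal sketch of that step, not part of the original post: the helper name `summarize_counts` and its returned keys are hypothetical, and it uses plain Python only, so it runs without the corpus data. Note that `num_chars` comes from the raw text and so includes spaces and punctuation, which means the characters-per-word figure overstates the true average word length.

```python
def summarize_counts(num_chars, num_words, num_sents):
    """Turn raw character/word/sentence counts into simple ratios.

    Hypothetical helper for illustration; the arguments are the three
    counts computed per file in the loop above.
    """
    # Guard against empty texts to avoid division by zero.
    chars_per_word = num_chars / num_words if num_words else 0.0
    words_per_sent = num_words / num_sents if num_sents else 0.0
    return {
        "chars_per_word": chars_per_word,   # includes whitespace and punctuation
        "words_per_sent": words_per_sent,
    }


# Example with made-up counts (not real corpus figures).
stats = summarize_counts(100, 20, 4)
print(stats["chars_per_word"], stats["words_per_sent"])
```

Inside the loop, it could be called as summarize_counts(num_chars, num_words, num_sents) and the result printed next to fileid.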
