Generating word clouds locally with Python (wordcloud)


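A word cloud combines statistics and design, and it is a nice way to bring art and big data together.

A word cloud makes the most frequent words easy to spot, and it can also take on a vivid shape.

There are plenty of online generators as well; searching for "word cloud online" turns up many of them.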
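
To make the "word frequency" idea concrete, here is a minimal sketch of the counting step that sits underneath any word cloud. It uses only the standard library; the function name `word_frequencies` and the sample sentence are made up for illustration and are not part of the wordcloud package.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=()):
    """Count how often each word occurs, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    blocked = set(stopwords)
    return Counter(w for w in words if w not in blocked)


sample = "the cat and the hat and the bat"
freqs = word_frequencies(sample, stopwords={"the", "and"})
print(freqs.most_common(3))
```

A word cloud then draws the words with the highest counts in the largest type.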



The source can be downloaded here:

https://github.com/amueller/word_cloud/

The system used here is Ubuntu Kylin 15.10, 64-bit.

Related site: http://minimaxir.com/2016/05/wordclouds/


The image above is one I generated myself: the original is on the left and the generated word cloud is on the right.

http://blog.csdn.net/shenmifangke


First, install pip:
sudo apt-get install python-pip


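wget https://github.com/amueller/word_cloud/archive/master.zip
If wget reports "not found" or says the host cannot be resolved, the network is not reachable; try a few more times (sometimes the terminal may need to be restarted).
If it complains that it couldn't create the file, use sudo: sudo wget https://github.com/amueller/word_cloud/archive/master.zip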
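
The "try a few more times" advice can also be scripted. Below is a rough sketch, assuming Python 3 and the standard library only; `retry` and `download_archive` are hypothetical helper names, and the retry count and delay are arbitrary choices, not something wget or the wordcloud project prescribes.

```python
import time
import urllib.request

# URL taken from the wget command above.
ARCHIVE_URL = "https://github.com/amueller/word_cloud/archive/master.zip"


def retry(action, attempts=3, delay=2.0):
    """Call action() until it succeeds or the attempts run out (hypothetical helper)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


def download_archive(url=ARCHIVE_URL, target="master.zip"):
    """Fetch the archive to `target`, retrying on network errors."""
    return retry(lambda: urllib.request.urlretrieve(url, target)[0])


# download_archive()  # uncomment to actually download master.zip
```

Catching `OSError` covers name-resolution failures as well as "permission denied" when the target file cannot be created, so the second case would still need a writable directory or sudo.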

Then, from inside the unpacked source directory, install it the way the GitHub page describes:


sudo pip install -r requirements.txt
python setup.py install


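Of course, you can also go through the steps by hand.

Download the archive yourself:
wget https://github.com/amueller/word_cloud/archive/master.zip

Unpack it once the download has finished:

unzip master.zip

Delete the archive:

rm master.zip

Enter the directory:

cd word_cloud-master

Install the package itself:

python setup.py install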
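
For anyone who prefers to script the unzip and rm steps above, here is a small sketch using Python's standard `zipfile` module. The helper name `unpack_and_remove` is made up for this post; it only covers unpacking and deleting the archive, and the install step still has to be run from inside the extracted directory.

```python
import os
import zipfile


def unpack_and_remove(archive_path, dest_dir="."):
    """Extract a zip archive into dest_dir, delete it, and return its top-level names."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    os.remove(archive_path)
    return top_level


# unpack_and_remove("master.zip")  # run this next to the downloaded archive
```

The returned list names the directories created at the top level, which is the one to cd into.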


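Inside you will find an examples folder:

cd examples

Try running a few of them:
python xxx.py (xxx is the script name; for example, try python colored.py or python masked.py)

masked.py produces the single-color image

colored.py produces the color image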

If you get the following error:
no module named matplotlib.pyplot
try the command below; note that it has to be entered in a separate terminal:

sudo apt-get install python-matplotlib
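
Before running the examples it can help to check which modules are missing, rather than finding out one traceback at a time. The sketch below assumes Python 3 and uses `importlib.util.find_spec` from the standard library; `missing_modules` is a hypothetical helper name, and the list of modules is taken from the imports used by colored.py.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name raises ImportError when its parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print(missing_modules(["matplotlib.pyplot", "numpy", "PIL", "wordcloud"]))
```

An empty list means everything needed is importable; any name that shows up still has to be installed through apt-get or pip.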


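Also, to get the result shown in my image at the top, colored.py has to be modified a little so that the generated file shows up in the directory: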

#!/usr/bin/env python2
"""
Image-colored wordcloud
========================
You can color a word-cloud by using an image-based coloring strategy implemented in
ImageColorGenerator. It uses the average color of the region occupied by the word
in a source image. You can combine this with masking - pure-white will be interpreted
as 'don't occupy' by the WordCloud object when passed as mask.
If you want white as a legal color, you can just pass a different image to "mask",
but make sure the image shapes line up.
"""
from os import path
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

d = path.dirname(__file__)

# Read the whole text. name.txt is the name of the text file.
text = open(path.join(d, 'name.txt')).read()

# read the mask / color image
# taken from http://jirkavinse.deviantart.com/art/quot-Real-Life-quot-Alice-282261010
# name.png is the image to process; a white background is best, a transparent one also works.
alice_coloring = np.array(Image.open(path.join(d, "name.png")))

# "aaaaa" is an extra stop word. set.add() returns None, so build the set
# first instead of passing STOPWORDS.add("aaaaa") as the argument.
stopwords = set(STOPWORDS)
stopwords.add("aaaaa")

wc = WordCloud(background_color="white", max_words=39000, mask=alice_coloring, stopwords=stopwords, max_font_size=100, random_state=2)

# generate word cloud
wc.generate(text)

# create coloring from image
image_colors = ImageColorGenerator(alice_coloring)

# show
plt.imshow(wc)
plt.axis("off")
plt.figure()

# recolor wordcloud and show
# we could also give color_func=image_colors directly in the constructor
plt.imshow(wc.recolor(color_func=image_colors))
plt.axis("off")
plt.figure()
plt.imshow(alice_coloring, cmap=plt.cm.gray)
plt.axis("off")
plt.show()

# store to file; name_output.png is the output file name
wc.to_file(path.join(d, "name_output.png"))
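
One detail to watch when adding a custom stop word such as "aaaaa": Python's `set.add()` changes the set in place and returns None, so the result of the call itself should not be handed to the `stopwords` parameter. The sketch below shows this with a small stand-in set; `DEFAULT_STOPWORDS` is a made-up placeholder for the STOPWORDS set that the wordcloud package exports.

```python
# Stand-in for wordcloud.STOPWORDS, which is a set of common English words.
DEFAULT_STOPWORDS = {"the", "and", "of"}

# set.add() modifies the set in place and returns None.
result = set(DEFAULT_STOPWORDS).add("aaaaa")
print(result)  # None

# Copy first, then add: the default set stays untouched.
custom_stopwords = set(DEFAULT_STOPWORDS)
custom_stopwords.add("aaaaa")
print(sorted(custom_stopwords))  # ['aaaaa', 'and', 'of', 'the']
```

The copied set is what would then be passed as `stopwords=custom_stopwords` when constructing the WordCloud object.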

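After a short wait the figure windows pop up, and the output file is also created automatically in the directory.
Unfortunately, as set up here it cannot handle Chinese text, which is a pity.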

Below is another test.




