Installing Scrapy 1.0 on Red Hat Linux 6.4
Source: Internet  Editor: 程序博客网  Date: 2024/05/23 21:40
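Scrapy is an open-source, single-machine Python crawler built on the Twisted framework. It bundles most of the tools needed for web scraping and covers both the download side and the extraction side of a crawler.
Installation environment:
Red Hat 6.4, Python 2.7.3
Installation steps:
1. Download Python 2.7: http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz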
[root@zxy-websgs ~]# wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz -P /opt
[root@zxy-websgs opt]# tar xvf Python-2.7.3.tgz
[root@zxy-websgs Python-2.7.3]# ./configure
[root@zxy-websgs Python-2.7.3]# make && make install
Verify the Python 2.7 installation:
[root@zxy-websgs Python-2.7.3]# python2.7
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
2. Install setuptools: http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
[root@zxy-websgs ~]# wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz -P /opt/
[root@zxy-websgs opt]# tar zxvf setuptools-0.6c11.tar.gz
[root@zxy-websgs setuptools-0.6c11]# python2.7 setup.py install
3. Install Twisted 12.1 and zope.interface 3.8+
[root@zxy-websgs setuptools-0.6c11]# easy_install Twisted
......
Installed /usr/local/lib/python2.7/site-packages/Twisted-12.3.0-py2.7-linux-x86_64.egg
......
Installed /usr/local/lib/python2.7/site-packages/zope.interface-4.0.4-py2.7-linux-x86_64.egg
Twisted requires zope.interface. Both packages can also be downloaded from the following addresses:
zope.interface: http://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.1.tar.gz
twisted: http://twistedmatrix.com/Releases/Twisted/12.1/Twisted-12.1.0.tar.bz2
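After easy_install finishes, it can help to confirm that the packages are importable from the new interpreter. The following is a minimal sketch, not part of the original procedure: the helper name `missing_modules` is hypothetical, and the list uses the import names (`twisted`, `zope.interface`) rather than the egg names.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names for the packages installed in step 3.
    required = ["twisted", "zope.interface"]
    print("missing: %s" % missing_modules(required))
```

Saved to a file and run with python2.7, an empty list would suggest that both packages ended up in the site-packages directory of the interpreter that was just built.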
4. Install SQLite, then recompile Python 2.7
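The recompile is presumably needed because Python's build only produces the sqlite3 module when the SQLite development headers are already present. A small check like the one below, a sketch with a hypothetical function name `sqlite_available`, can show whether the rebuilt interpreter picked SQLite up.

```python
def sqlite_available():
    """Return the SQLite library version string, or None if sqlite3 is missing."""
    try:
        import sqlite3
    except ImportError:
        return None
    # An in-memory database avoids touching the filesystem.
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT sqlite_version()").fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print("sqlite version: %s" % sqlite_available())
```

A result of None would mean the interpreter was built without the sqlite3 module, in which case the SQLite headers should be installed before running ./configure and make again.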
5. Install w3lib 1.4+
[root@zxy-websgs setuptools-0.6c11]# easy_install -U w3lib
Searching for w3lib
Reading http://pypi.python.org/simple/w3lib/
Reading http://github.com/scrapy/w3lib
Best match: w3lib 1.2
Downloading http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz#md5=f929d5973a9fda59587b09a72f185a9e
Processing w3lib-1.2.tar.gz
Running w3lib-1.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-wm_1BB/w3lib-1.2/egg-dist-tmp-2DQHY_
zip_safe flag not set; analyzing archive contents...
Adding w3lib 1.2 to easy-install.pth file
Installed /usr/local/lib/python2.7/site-packages/w3lib-1.2-py2.7.egg
Processing dependencies for w3lib
Finished processing dependencies for w3lib
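Note that the log reports w3lib 1.2 while the step heading asks for 1.4+, so it is worth comparing installed versions against the stated minimums. The sketch below is one simple way to do that. The helpers `version_tuple` and `meets_minimum` are hypothetical, and the comparison is deliberately crude: it looks only at the numeric runs in each version string.

```python
import re


def version_tuple(version):
    """Extract the numeric runs of a version string, e.g. '12.3.0' -> (12, 3, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that '1.4' and '1.4.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Versions taken from the logs and step headings in this post.
    print(meets_minimum("1.2", "1.4"))      # w3lib
    print(meets_minimum("12.3.0", "12.1"))  # Twisted
    print(meets_minimum("4.0.4", "3.8"))    # zope.interface
```

With the versions shown in this post, the w3lib check comes out False and the other two True, which suggests the w3lib install may need a newer release than the one easy_install selected here.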
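6. Install libxml2-2.7.6-5, libxml2-devel-2.7.6-5, libxslt-1.1.28, and lxml 3.6
7. Install pyOpenSSL (optional; its main purpose is to let Scrapy support HTTPS)
Running easy_install pyOpenSSL fetches pyOpenSSL 0.13, which failed to install, so I downloaded version 0.11 manually and installed that instead.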
[root@zxy-websgs opt]# wget http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz -P /opt
[root@zxy-websgs opt]# tar zxvf pyOpenSSL-0.11.tar.gz
[root@zxy-websgs pyOpenSSL-0.11]# python2.7 setup.py install
pyOpenSSL: http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz
8. Install Scrapy 1.0
[root@zxy-websgs pyOpenSSL-0.11]# easy_install -U Scrapy
Verify the installation:
[root@zxy-websgs pyOpenSSL-0.11]# scrapy
Scrapy 1.0 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
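If this check needs to be scripted, for example in a provisioning script, the version can be read from the first line of the output. The sketch below assumes the banner format shown in the transcript ("Scrapy 1.0 - no active project"); the function name `banner_version` is hypothetical.

```python
import re


def banner_version(first_line):
    """Extract the version from a line such as 'Scrapy 1.0 - no active project'."""
    match = re.match(r"Scrapy\s+(\d+(?:\.\d+)*)", first_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Banner line copied from the transcript in this post.
    print(banner_version("Scrapy 1.0 - no active project"))
```

Any other first line, such as a shell "command not found" message, yields None, which a script can treat as a failed installation.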