Deploying scrapyd on Ubuntu


I. Install scrapyd and scrapyd-client

1. pip install scrapyd

2. pip install scrapyd-client
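
To confirm both packages installed correctly, you can query pip and look for the deploy tool (an optional sanity check, not a step from the original post; scrapyd-client is the package that provides the scrapyd-deploy command):

pip show scrapyd scrapyd-client
which scrapyd-deploy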

II. Run scrapyd from the command line to start the web service


If the startup log prints without errors, scrapyd has started successfully.

If you see the following error instead:

 File "/usr/local/lib/python2.7/dist-packages/scrapyd-1.1.0-py2.7.egg/scrapyd/utils.py", line 61, in get_spider_queues
    d[project] = SqliteSpiderQueue(dbpath)
  File "/usr/local/lib/python2.7/dist-packages/scrapyd-1.1.0-py2.7.egg/scrapyd/spiderqueue.py", line 12, in __init__
    self.q = JsonSqlitePriorityQueue(database, table)
  File "/usr/local/lib/python2.7/dist-packages/scrapyd-1.1.0-py2.7.egg/scrapyd/sqlite.py", line 95, in __init__
    self.conn = sqlite3.connect(self.database, check_same_thread=False)
sqlite3.OperationalError: unable to open database file

This error can be fixed by installing sqlite3.

sqlite3 can be downloaded from https://github.com/lgastako/db-sqlite3
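
On Ubuntu, one way to get sqlite3 onto the machine is through apt (a minimal sketch; sqlite3 and libsqlite3-dev are the standard Ubuntu package names for the binary and the development headers):

sudo apt-get install sqlite3 libsqlite3-dev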

Once scrapyd is running, you can browse its web UI at http://localhost:6800
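
You can also verify the service from the shell instead of a browser. listprojects.json is a standard scrapyd API endpoint; on a fresh install it should return a JSON object with "status": "ok" and an empty project list:

curl http://localhost:6800/listprojects.json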

III. Open the scrapy project's scrapy.cfg

[settings]
default = courtannounce.settings

[deploy]
#url = http://localhost:6800/
project = courtannounce
Remove the comment from the url line under [deploy].
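
After uncommenting, the [deploy] section should read:

[deploy]
url = http://localhost:6800/
project = courtannounce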

The default target name is default, but it can also be changed:

[deploy:scrapy1]

which renames the target to scrapy1.

Then run scrapyd-deploy scrapy1 -p courtannounce to deploy the project to scrapyd (scrapy1 is the target, courtannounce is the project name, not the spider name); see the full example below.
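
Putting the pieces together, a scrapy.cfg with a named target would look like this (url and project are the values used throughout this post):

[settings]
default = courtannounce.settings

[deploy:scrapy1]
url = http://localhost:6800/
project = courtannounce

With that in place, the deploy command is:

scrapyd-deploy scrapy1 -p courtannounce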

IV. curl http://localhost:6800/schedule.json -d project=courtannounce -d spider=courtannouncement

This curl request asks scrapyd to schedule the spider and start the crawl.

On the web page, the Jobs column shows each crawl's status and its log output.
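
The same job information is available over HTTP. listjobs.json and cancel.json are standard scrapyd API endpoints; <jobid> below stands for the job id that schedule.json returns:

curl http://localhost:6800/listjobs.json?project=courtannounce
curl http://localhost:6800/cancel.json -d project=courtannounce -d job=<jobid>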

