Scrapy pipeline spider_opened and spider_closed not being called
Source: Internet  Editor: 程序博客网  Time: 2024/06/07 09:33
http://stackoverflow.com/questions/4113275/scrapy-pipeline-spider-opened-and-spider-closed-not-being-called
I am having some trouble with a Scrapy pipeline. My information is being scraped from the sites OK and the process_item method is being called correctly. However, the spider_opened and spider_closed methods are not being called.
    from scrapy import log

    class MyPipeline(object):
        def __init__(self):
            log.msg("Initializing Pipeline")
            self.conn = None
            self.cur = None

        def spider_opened(self, spider):
            log.msg("Pipeline.spider_opened called", level=log.DEBUG)

        def spider_closed(self, spider):
            log.msg("Pipeline.spider_closed called", level=log.DEBUG)

        def process_item(self, item, spider):
            log.msg("Processing item " + item['title'], level=log.DEBUG)
Both the __init__ and process_item logging messages are displayed in the log, but the spider_opened and spider_closed logging messages are not.
I need to use the spider_opened and spider_closed methods as I want to use them to open and close a connection to a database, but nothing is showing up in the log for them.
If anyone has any suggestions, that would be very useful.
2 Answers
Sorry, found it just after I posted this. You have to add:
    dispatcher.connect(self.spider_opened, signals.spider_opened)
    dispatcher.connect(self.spider_closed, signals.spider_closed)
in __init__; otherwise the pipeline never receives the signals, so the two methods are never called.
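To see why the connection matters, here is a minimal pure-Python sketch of the signal/receiver pattern. `ToyDispatcher`, `SPIDER_OPENED`, `SPIDER_CLOSED` and the `connect` flag are hypothetical stand-ins written only for this illustration; they are not Scrapy's dispatcher or signals. The point is that a method named `spider_opened` is just an ordinary method until something registers it for the matching signal.

```python
class ToyDispatcher:
    """Hypothetical stand-in for a signal dispatcher (not Scrapy's API)."""

    def __init__(self):
        self._receivers = {}  # maps each signal to its list of callables

    def connect(self, receiver, signal):
        # Register a callable to be invoked when `signal` is sent.
        self._receivers.setdefault(signal, []).append(receiver)

    def send(self, signal, **kwargs):
        # Only receivers that were connected are ever called.
        for receiver in self._receivers.get(signal, []):
            receiver(**kwargs)


# Hypothetical signal identities; any unique objects would do.
SPIDER_OPENED = object()
SPIDER_CLOSED = object()


class MyPipeline:
    def __init__(self, dispatcher, connect=True):
        self.calls = []
        if connect:
            # The equivalent of the two dispatcher.connect(...) lines above.
            dispatcher.connect(self.spider_opened, SPIDER_OPENED)
            dispatcher.connect(self.spider_closed, SPIDER_CLOSED)

    def spider_opened(self, spider):
        self.calls.append("opened")

    def spider_closed(self, spider):
        self.calls.append("closed")


def run(connect):
    """Send both signals and report which pipeline methods were reached."""
    dispatcher = ToyDispatcher()
    pipeline = MyPipeline(dispatcher, connect=connect)
    dispatcher.send(SPIDER_OPENED, spider="demo")
    dispatcher.send(SPIDER_CLOSED, spider="demo")
    return pipeline.calls
```

Calling `run(False)` mirrors the situation in the question: the signals are sent, but the pipeline never hears about them.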
dispatcher variable? And how come I can't find this in doc.scrapy.org/en/latest/topics/item-pipeline.html? :( – wrongusername Oct 8 '12 at 18:05

from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
– herrherr Oct 28 '13 at 15:08
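
On the documentation point: as far as I know, the item pipeline page describes two optional hooks, open_spider(self, spider) and close_spider(self, spider), which more recent Scrapy releases call on a pipeline without any explicit signal connection. Below is a rough sketch of the database use case from the question built on those hooks. The class name `SQLitePipeline`, the `items` table and the choice of `sqlite3` are assumptions for illustration only, since the question does not say which database is used.

```python
import sqlite3


class SQLitePipeline:
    """Sketch: hold one DB connection for the lifetime of the spider."""

    def __init__(self, db_path=":memory:"):
        # db_path is a hypothetical parameter; ":memory:" keeps the demo self-contained.
        self.db_path = db_path
        self.conn = None
        self.cur = None

    def open_spider(self, spider):
        # Open the connection once, when the spider starts.
        self.conn = sqlite3.connect(self.db_path)
        self.cur = self.conn.cursor()
        self.cur.execute("CREATE TABLE IF NOT EXISTS items (title TEXT)")

    def process_item(self, item, spider):
        self.cur.execute("INSERT INTO items (title) VALUES (?)", (item["title"],))
        return item  # hand the item on to any later pipeline stage

    def close_spider(self, spider):
        # Commit pending inserts and release the connection when the spider ends.
        self.conn.commit()
        self.conn.close()
        self.conn = None
        self.cur = None
```

Outside Scrapy the hooks can be called by hand to check the behaviour, for example `p = SQLitePipeline(); p.open_spider(None); p.process_item({"title": "hello"}, None); p.close_spider(None)`.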