Scrapy error: AssertionError from assert callable(callback)

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 21:54

Scrapy raised the following error while scraping data:

Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
    result = g.send(result)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1363, in returnValue
    raise _DefGen_Return(val)
twisted.internet.defer._DefGen_Return: <200 http://ios.jobbole.com/all-posts/page/2/>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\spidermw.py", line 49, in process_spider_input
    return scrape_func(response, request, spider)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\scraper.py", line 146, in call_spider
    dfd.addCallbacks(request.callback or spider.parse, request.errback)
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 303, in addCallbacks
    assert callable(callback)
AssertionError


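The last frame fails on assert callable(callback), so my guess was that the callback handed to the request is not something that can be called. I checked the spider's source code: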
  def parse(self, response):
      selector = Selector(response)
      # Collect the article links
      article_urls = selector.xpath('//a[@class="archive-title"]/@href').extract()
      for article_url in article_urls:
          yield Request(url=article_url, callback=self.parse_content)
      # Follow the link to the next page
      next_page_url = selector.xpath('//a[contains(@class, "next")]/@href').extract()
      if next_page_url:
          # Bug: the callback is the string "parse", not the method self.parse
          yield Request(url=next_page_url[0], callback="parse")
      else:
          print("Already at the last page...........")
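The second Request (the one for the next page) was the one that never took effect, so I guessed that was where the problem lay. Its callback is the string "parse", and a string is not callable, which is exactly what assert callable(callback) rejects. After changing callback="parse" to callback=self.parse, the problem was solved.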
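To see why the string fails while the bound method passes, here is a minimal sketch that runs without Scrapy or Twisted. DemoSpider and add_callback are hypothetical names made up for illustration. add_callback only imitates the assert callable(callback) check shown in the traceback; it is not the real Twisted implementation.

```python
class DemoSpider:
    """Hypothetical stand-in for a spider; not a real Scrapy class."""

    def parse(self, response):
        return "parsed: " + response


def add_callback(callback):
    # Imitates the check from the traceback: assert callable(callback).
    # This is an illustration only, not Twisted's actual addCallbacks.
    assert callable(callback)
    return callback


spider = DemoSpider()

# Passing the method name as a string trips the assertion,
# because a str object cannot be called.
try:
    add_callback("parse")
    string_accepted = True
except AssertionError:
    string_accepted = False

# Passing the bound method works, because it is callable.
bound = add_callback(spider.parse)
result = bound("page 2")

print(string_accepted)
print(result)
```

The same reasoning applies to the spider above: self.parse is a bound method object, while "parse" is only its name.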


