A scrapy troubleshooting log
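I had been writing crawlers with scrapy on a server, and it had always worked fine. Then, the day before yesterday, a classmate installed NLTK on that machine, and scrapy stopped working altogether (both scrapy shell and scrapy crawl). The error was: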
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 9, in <module>
    load_entry_point('Scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 46, in run
    self.crawler_process.start_crawling()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 46, in configure
    self.extensions = ExtensionManager.from_crawler(self)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 50, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 29, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 42, in load_object
    raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.telnet.TelnetConsole': No module named conch
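The message says the conch module cannot be found, which suggested that sys.path had been changed. Luckily I had a Python interactive session left running in tmux, so I could still read the old sys.path from it. Comparing it with the current sys.path, I found two extra entries: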
'/usr/local/lib/python2.7/dist-packages/jieba-0.36.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg'
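In principle, a module that cannot be found should mean sys.path has lost something, not gained something, so this did not explain anything yet.
So I followed the Python traceback and tried to reproduce the error in a minimal way.
The error is raised in /usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py, by this code: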
try:
    mod = import_module(module)
except ImportError as e:
    raise ImportError("Error loading object '%s': %s" % (path, e))
So the error can be reproduced like this:
>>> from importlib import import_module
>>> import_module('scrapy.telnet')
This gives the same "No module named conch" error. Right at the top of telnet.py in the scrapy package there is this line:
from twisted.conch import manhole, telnet
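That line fails because the conch module cannot be found. Importing twisted.conch directly fails as well.
On this system, third-party Python packages live in dist-packages directories. In /usr/local/lib/python2.7/dist-packages I found a twisted directory, and it does contain conch!
Next I used locate to see where twisted directories exist on the system, since something newly installed might have taken the place of the copy that used to work.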
$ locate twisted
In the end I found that there is another twisted directory under /usr/lib/python2.7/dist-packages, and that one really has no conch subdirectory. Its _version.py contains this line:
version = versions.Version('twisted', 11, 1, 0)
whereas in the _version.py of the copy I had been using, /usr/local/lib/python2.7/dist-packages/twisted, the line is:
version = versions.Version('twisted', 14, 0, 2)
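This shows that the twisted now found along sys.path is an old version (probably installed by someone long ago): after sys.path was modified, imports resolve to this old twisted. Comparing the two sys.path values carefully, two entries have changed order.
This is the sys.path from before, when everything worked: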
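To check for this kind of shadowing without locate, one can walk sys.path in order and list every entry that holds a copy of the package. The following is only a rough sketch: the helper name find_package_locations is made up for illustration, and it looks only for plain directories and .py files, ignoring zipped eggs, namespace packages and import hooks.

```python
import os
import sys


def find_package_locations(name, search_path=None):
    """List the directories on the search path that hold a top-level
    package or module called `name`, in the order Python would try them.

    Hypothetical helper: it only checks for `name/__init__.py` and
    `name.py`, so zipped eggs and namespace packages are not detected.
    """
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        base = entry or os.getcwd()  # '' on sys.path means the current directory
        package_init = os.path.join(base, name, "__init__.py")
        module_file = os.path.join(base, name + ".py")
        if os.path.isfile(package_init) or os.path.isfile(module_file):
            hits.append(base)
    return hits


if __name__ == "__main__":
    # More than one line of output means the first copy shadows the others.
    for location in find_package_locations("twisted"):
        print(location)
```

When the package can still be imported, printing twisted.__file__ in the interpreter is a quicker way to see which copy actually won.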
''
'/usr/local/lib/python2.7/dist-packages/requests-2.0.0-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/kafka_python-0.8.1_1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/tox-1.6.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/py-1.4.19-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/virtualenv-1.10.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/pymongo-2.6.3-py2.7-linux-x86_64.egg'
'/usr/lib/python2.7'
'/usr/lib/python2.7/plat-linux2'
'/usr/lib/python2.7/lib-tk'
'/usr/lib/python2.7/lib-old'
'/usr/lib/python2.7/lib-dynload'
'/usr/local/lib/python2.7/dist-packages'   ### note this line
'/usr/lib/python2.7/dist-packages'   ### and this line
'/usr/lib/python2.7/dist-packages/PIL'
'/usr/lib/python2.7/dist-packages/gst-0.10'
'/usr/lib/python2.7/dist-packages/gtk-2.0'
'/usr/lib/pymodules/python2.7'
'/usr/lib/python2.7/dist-packages/ubuntu-sso-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-control-panel'
'/usr/lib/python2.7/dist-packages/ubuntuone-couch'
'/usr/lib/python2.7/dist-packages/ubuntuone-installer'
'/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol'
This is the current sys.path, after the new packages were installed:
''
'/usr/local/lib/python2.7/dist-packages/requests-2.0.0-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/kafka_python-0.8.1_1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/tox-1.6.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/py-1.4.19-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/virtualenv-1.10.1-py2.7.egg'
'/usr/local/lib/python2.7/dist-packages/pymongo-2.6.3-py2.7-linux-x86_64.egg'
'/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg'
'/usr/lib/python2.7/dist-packages'   # this line has been moved forward
'/usr/local/lib/python2.7/dist-packages/jieba-0.36.1-py2.7.egg'
'/usr/lib/python2.7'
'/usr/lib/python2.7/plat-linux2'
'/usr/lib/python2.7/lib-tk'
'/usr/lib/python2.7/lib-old'
'/usr/lib/python2.7/lib-dynload'
'/usr/local/lib/python2.7/dist-packages'   # this line now comes later by comparison
'/usr/lib/python2.7/dist-packages/PIL'
'/usr/lib/python2.7/dist-packages/gst-0.10'
'/usr/lib/python2.7/dist-packages/gtk-2.0'
'/usr/lib/pymodules/python2.7'
'/usr/lib/python2.7/dist-packages/ubuntu-sso-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-client'
'/usr/lib/python2.7/dist-packages/ubuntuone-control-panel'
'/usr/lib/python2.7/dist-packages/ubuntuone-couch'
'/usr/lib/python2.7/dist-packages/ubuntuone-installer'
'/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol'
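Look at the two commented lines: in the new sys.path, /usr/lib/python2.7/dist-packages has been moved ahead of /usr/local/lib/python2.7/dist-packages, so the old twisted is found first! At last I knew why it failed, and could breathe a sigh of relief.
The fix, then, is to delete (or rename) the old twisted directory, together with its corresponding egg-info files. The names are: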
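Spotting a reordering by eye in two long lists is tedious, so the comparison can be automated. The sketch below is not part of the original investigation: compare_search_paths is a hypothetical helper, it assumes neither list contains duplicate entries, and OLD and NEW are shortened excerpts of the two sys.path values shown above.

```python
def compare_search_paths(old, new):
    """Summarise how `new` differs from `old`: entries added, entries
    removed, and pairs of surviving entries whose relative order flipped.

    Hypothetical helper; assumes no duplicate entries in either list.
    """
    old_set, new_set = set(old), set(new)
    added = [p for p in new if p not in old_set]
    removed = [p for p in old if p not in new_set]
    common = [p for p in old if p in new_set]  # keeps the old order
    new_rank = {p: i for i, p in enumerate(new)}
    swapped = []
    for i, first in enumerate(common):
        for second in common[i + 1:]:
            # `first` preceded `second` before; report it if that is no longer true.
            if new_rank[first] > new_rank[second]:
                swapped.append((first, second))
    return added, removed, swapped


# Shortened excerpts of the two sys.path values from this post.
OLD = [
    '/usr/lib/python2.7',
    '/usr/local/lib/python2.7/dist-packages',
    '/usr/lib/python2.7/dist-packages',
]
NEW = [
    '/usr/local/lib/python2.7/dist-packages/setuptools-15.0-py2.7.egg',
    '/usr/lib/python2.7/dist-packages',
    '/usr/local/lib/python2.7/dist-packages/jieba-0.36.1-py2.7.egg',
    '/usr/lib/python2.7',
    '/usr/local/lib/python2.7/dist-packages',
]

if __name__ == "__main__":
    added, removed, swapped = compare_search_paths(OLD, NEW)
    print("added:   %s" % added)
    print("removed: %s" % removed)
    for first, second in swapped:
        print("%s used to come before %s" % (first, second))
```

On these excerpts the two eggs show up as added, and the pair of dist-packages directories is reported as swapped, which is exactly the change that let the old twisted win.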
twisted
Twisted_Core-11.1.0.egg-info
Twisted_Names-11.1.0.egg-info
Twisted_Web-11.1.0.egg-info
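Of course, one could also change the default sys.path so that /usr/local/lib/python2.7/dist-packages comes first. But since that might affect other users of the server in the same way, I decided to simply throw the old version away; nobody needs it anyway.
The final fix was simple, but tracking the problem down took quite some time. Troubleshooting a server is a test of patience in itself!
This post may not have much direct reference value, because everyone's environment is different and so are the causes of their errors. But the troubleshooting approach might help someone who feels stuck: when I first ran into this problem I was very frustrated too, and could not find anything useful online. In the end I had to calm down and deepen my understanding of Python. Above all, keep this conviction: the problem can always be solved!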
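For completeness, here is roughly what the sys.path alternative could look like if applied inside a single process rather than system-wide. This is only a sketch under assumptions: move_before is a made-up helper, and how to make such a change permanent for every interpreter on the machine is deliberately left out.

```python
import sys


def move_before(search_path, preferred, other):
    """Return a copy of `search_path` in which `preferred` sits directly
    before `other`, if it currently comes after it. Hypothetical helper."""
    result = list(search_path)
    if preferred in result and other in result:
        if result.index(preferred) > result.index(other):
            result.remove(preferred)
            result.insert(result.index(other), preferred)
    return result


if __name__ == "__main__":
    # Only affects this process, and only imports performed after this point.
    sys.path[:] = move_before(
        sys.path,
        '/usr/local/lib/python2.7/dist-packages',
        '/usr/lib/python2.7/dist-packages',
    )
```

Because the change has to run before twisted is first imported, it only helps in scripts you control, which is one more reason removing the stale copy was the simpler choice here.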