Fixing the "SSL: CERTIFICATE_VERIFY_FAILED" error in a Python web crawler

Source: Internet | Published by: socket网络通信 | Editor: 程序博客网 | Time: 2024/05/01 20:41

import urllib.request

weburl = "https://www.douban.com/"
# Accept-Encoding is left out on purpose: urllib does not decompress
# gzip/deflate responses, so decode('utf-8') below would fail on a compressed body.
webheader = {
    'Accept': 'text/html, application/xhtml+xml, */*',
    'Accept-Language': 'zh-CN',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'DNT': '1',
    'Connection': 'Keep-Alive',
    'Host': 'www.douban.com'
    }
req = urllib.request.Request(url=weburl, headers=webheader)
webPage = urllib.request.urlopen(req)  # the SSL error is raised here
data = webPage.read().decode('utf-8')
print(data)
print(type(webPage))
print(webPage.geturl())
print(webPage.info())
print(webPage.getcode())

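The code above crawls Douban and fails with "SSL: CERTIFICATE_VERIFY_FAILED". After some searching, the cause turned out to be a change introduced in Python 2.7.9 (and in 3.4.3 on the Python 3 line, see PEP 476): when urlopen opens an https link, the SSL certificate is now verified by default. If the certificate cannot be validated, for example because the target site uses a self-signed certificate, this exception is raised.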
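Before turning verification off, it can help to confirm that the failure really is a certificate problem. The following is a minimal sketch, assuming Python 3.7 or later (where ssl.SSLCertVerificationError exists); the helper names is_cert_failure and fetch are made up for illustration and are not part of the original code.

```python
import ssl
import urllib.error
import urllib.request


def is_cert_failure(exc):
    """Return True if a URLError was caused by failed certificate verification."""
    # urlopen wraps the underlying SSL error in URLError.reason.
    return isinstance(getattr(exc, "reason", None), ssl.SSLCertVerificationError)


def fetch(url):
    """Fetch url with verification left on; report certificate failures clearly."""
    try:
        with urllib.request.urlopen(url, timeout=10) as page:
            return page.read()
    except urllib.error.URLError as exc:
        if is_cert_failure(exc):
            print("certificate verification failed:", exc.reason)
            return None
        raise  # any other network error is not handled here
```

Calling fetch("https://www.douban.com/") would then either return the page bytes or print the verification message, which makes it easier to decide whether one of the workarounds is needed at all.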

There are two solutions:

1) Use ssl to create an unverified context and pass it to urlopen through the context parameter

import ssl

context = ssl._create_unverified_context()

webPage = urllib.request.urlopen(req,context=context)
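To make clear what the unverified context actually gives up, here is a small sketch that compares it with the default context. It only inspects the two context objects and makes no network request; the variable names are chosen for illustration.

```python
import ssl

# Default context: verifies the certificate chain and the hostname.
verified = ssl.create_default_context()

# Unverified context, as used in solution 1: both checks are switched off.
unverified = ssl._create_unverified_context()

for name, ctx in (("verified", verified), ("unverified", unverified)):
    print(name, ctx.verify_mode, "check_hostname =", ctx.check_hostname)
```

Note that ssl._create_unverified_context has a leading underscore, so it is a private helper, and a connection made with it is open to man-in-the-middle attacks. If the site's self-signed certificate is available as a PEM file, passing it as ssl.create_default_context(cafile=...) is a way to keep verification on instead.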

2) Disable certificate verification globally

import ssl

ssl._create_default_https_context = ssl._create_unverified_context
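Because the assignment above changes behaviour for every HTTPS request in the process, one option is to limit how long the override is active. The following sketch wraps it in a context manager and restores the original factory afterwards; the name unverified_https is hypothetical, and the approach still relies on the private ssl._create_default_https_context hook.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily disable HTTPS certificate verification for stdlib clients."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always restore the verifying default, even if the body raised.
        ssl._create_default_https_context = original
```

Usage would look like "with unverified_https(): webPage = urllib.request.urlopen(req)", so that only the requests inside the with block skip verification. The override is process-wide while it is active, so it is not safe to rely on in multi-threaded code.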

In addition, if you use the get method of the requests module, it has a verify parameter; setting it to False disables certificate verification.
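A minimal sketch of the requests variant, assuming the third-party requests package is installed. The function name fetch_without_verification and the 10-second timeout are illustrative choices, not part of the original post; requests is imported inside the function only so that the snippet can be loaded without the package present.

```python
def fetch_without_verification(url, timeout=10):
    """GET url with certificate verification disabled (insecure)."""
    import requests  # third-party: pip install requests

    # verify=False skips certificate checks; urllib3 emits an
    # InsecureRequestWarning for each request made this way.
    response = requests.get(url, verify=False, timeout=timeout)
    response.raise_for_status()
    return response.text
```

As with the urllib workarounds, this is best kept to trusted test targets, since the response can no longer be tied to the real server.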
