[Python] Fixing garbled Chinese output with urllib2

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 01:07

While scraping data with Python, I found that the Chinese text in the fetched data showed up garbled in VSCode:

The broken version looks like this:

import sys
import urllib2

# Encoding used by the local system (not used yet in this version)
local_encoding = sys.getfilesystemencoding()
page = 1
url = 'http://www.qiushibaike.com/hot/page/' + str(page)
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
try:
    request = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(request)
    data = response.read()
    # The raw bytes are printed as-is, so Chinese text is garbled
    # whenever the console expects a different encoding
    print data
except urllib2.URLError as e:
    if hasattr(e, "code"):
        print e.code
    if hasattr(e, "reason"):
        print e.reason
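The garbling can be reproduced without any network access. The following is only a minimal sketch, assuming the page bytes are UTF-8 and the console expects GBK (the post does not name the console encoding, so GBK is an assumption here). The variable names are illustrative, and the sketch is written so that it also runs on Python 3.

```python
# Assumption: the server sends UTF-8 and the console expects GBK.
text = u"\u4e2d\u6587"              # the two characters of "Chinese"
raw = text.encode("utf-8")          # bytes as they would arrive in the response

# Interpreting UTF-8 bytes as GBK yields different characters (mojibake).
garbled = raw.decode("gbk", "replace")

# Interpreting them with the encoding they were written in restores the text.
restored = raw.decode("utf-8")

print(garbled == text)   # False
print(restored == text)  # True
```

The bytes themselves are never damaged; only the encoding used to interpret them differs.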

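After looking up some references online, I finally fixed the garbled output by decoding the response as UTF-8 and then re-encoding it in the local system encoding: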

import sys
import urllib2

# Encoding used by the local system, e.g. for the console output
local_encoding = sys.getfilesystemencoding()
page = 1
url = 'http://www.qiushibaike.com/hot/page/' + str(page)
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
try:
    request = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(request)
    data = response.read()
    # Decode the UTF-8 bytes of the page into a unicode string
    data = data.decode('UTF-8')
    # Re-encode the text in the local system encoding before printing
    data = data.encode(local_encoding)
    print data
except urllib2.URLError as e:
    if hasattr(e, "code"):
        print e.code
    if hasattr(e, "reason"):
        print e.reason
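The decode-then-encode step can be isolated in a small helper. This is a sketch rather than part of the original script: the function name `transcode` and its `source` and `target` parameters are hypothetical, and it assumes the response body really is encoded in `source`. It is written to run on Python 3 as well.

```python
import sys


def transcode(data, source="utf-8", target=None):
    """Decode bytes from `source` and re-encode them in `target`.

    `transcode`, `source` and `target` are hypothetical names for this
    sketch. When `target` is omitted, the local system encoding is used,
    mirroring sys.getfilesystemencoding() in the script above.
    """
    if target is None:
        target = sys.getfilesystemencoding() or "utf-8"
    text = data.decode(source)
    # "replace" substitutes "?" for characters the target cannot represent
    # instead of raising UnicodeEncodeError.
    return text.encode(target, "replace")


# UTF-8 bytes for two Chinese characters, converted to GBK bytes.
utf8_bytes = u"\u4e2d\u6587".encode("utf-8")
gbk_bytes = transcode(utf8_bytes, "utf-8", "gbk")
print(gbk_bytes == u"\u4e2d\u6587".encode("gbk"))  # True
```

In the script above, the two lines `data.decode('UTF-8')` and `data.encode(...)` would correspond to a single call such as `transcode(response.read())`. Note that `sys.getfilesystemencoding()` describes file names; if the console uses a different encoding, `sys.stdout.encoding` may be the more accurate target.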

