UnicodeDecodeError: 'gb2312' codec can't decode byte 0x88 in position 164111: illegal multibyte sequ

Source: Internet  Editor: 程序博客网  Time: 2024/05/11 21:45

While using Python, I ran into this error: UnicodeDecodeError: 'gb2312' codec can't decode byte 0x88 in position 164111: illegal multibyte sequence

# Fund scraping
from urllib import request
import chardet

page1_url = "http://fund.eastmoney.com/fund.html"

def getHtml(pageUrl):
    response = request.urlopen(pageUrl)
    raw_html = response.read()
    # Guess the page encoding from the raw bytes
    getEncoding = chardet.detect(raw_html)['encoding']
    # Strict decoding: raises UnicodeDecodeError on any invalid byte
    src = raw_html.decode(getEncoding)
    print(src)

getHtml(page1_url)

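What to do? The error roughly means that the page contains bytes that are not valid in the detected encoding (gb2312). Pass 'ignore' as the second argument to decode() so that those bytes are skipped instead of raising an exception: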


# Fund scraping
from urllib import request
import chardet

page1_url = "http://fund.eastmoney.com/fund.html"

def getHtml(pageUrl):
    response = request.urlopen(pageUrl)
    raw_html = response.read()
    # Guess the page encoding from the raw bytes
    getEncoding = chardet.detect(raw_html)['encoding']
    # 'ignore' drops bytes that cannot be decoded instead of raising
    src = raw_html.decode(getEncoding, 'ignore')
    print(src)

getHtml(page1_url)
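
To see what 'ignore' actually does without hitting the network, here is a small sketch on a hand-made byte string. The sample bytes and the helper name decode_page are made up for illustration and are not part of the original script. The helper also rests on an assumption: since GB18030 is a superset of GBK and GB2312, a page that chardet labels GB2312 may decode more completely as gb18030, and 'replace' leaves a visible U+FFFD marker where 'ignore' would silently drop data.

```python
# Sample bytes: valid GB2312 text with one stray byte (0x88) in the middle.
raw = "基金".encode("gb2312") + b"\x88" + b" fund"

# Strict decoding (the default) fails on the stray byte.
try:
    raw.decode("gb2312")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = raw.decode("gb2312", "ignore")    # the bad byte is silently dropped
replaced = raw.decode("gb2312", "replace")  # the bad byte becomes U+FFFD


def decode_page(raw_html, detected_encoding, errors="replace"):
    """Hypothetical helper: decode page bytes leniently.

    Assumption: pages reported as GB2312/GBK are decoded as GB18030,
    a superset of both, so fewer bytes count as invalid.
    """
    # chardet.detect() can report None as the encoding; fall back to UTF-8.
    encoding = (detected_encoding or "utf-8").lower()
    if encoding in ("gb2312", "gbk"):
        encoding = "gb18030"
    return raw_html.decode(encoding, errors)


print(strict_failed)
print(repr(ignored))
print(repr(replaced))
print(repr(decode_page(raw, "GB2312")))
```

In the script above, this would mean replacing raw_html.decode(getEncoding, 'ignore') with decode_page(raw_html, getEncoding). Whether gb18030 really matches the page is something to check against the actual output.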
