'gbk' codec can't encode character '\xa0' in position 1534: illegal multibyte sequence
Source: Internet  Editor: 程序博客网  Date: 2024/05/10 20:28
Running the following code raises the error in the title:
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

url = 'http://blog.csdn.net/dc_726/article/details/45399457'
# pretend to be a browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) '
                         'Gecko/20100101 Firefox/23.0'}
req = Request(url, headers=headers)
html = urlopen(req)
bsHtml = BeautifulSoup(html, 'html.parser')
text = bsHtml.find('div', id="article_content")
print(text)  # this is the line that raises UnicodeEncodeError
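1. The page has already been read and decoded correctly (it is UTF-8), so the error does not come from parsing. It comes from the print() call, as the traceback states explicitly: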
File "D:/Project/python/Text/text.py", line 34, in <module>
print(text)
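2. On this system print() writes to a console whose default encoding is GBK, so the text is encoded to GBK when it is output. The character '\xa0' (a non-breaking space) has no GBK representation, which triggers the error. One workaround is to drop the unencodable characters before printing, as in the line below. (Changing the system or console output encoding should also work.)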
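print(str(text).encode('gbk', 'ignore').decode('gbk'))
Note: if you remove the decode('gbk') above, the output is shown in bytes form (b'...'), because print() then receives a bytes object and displays its representation instead of encoding it as text.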
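The effect of the error handler can be seen on a small string. This is a minimal sketch: the sample string and the helper name `to_gbk_safe` are made up for illustration and are not part of the original script.

```python
# Hypothetical sample text containing U+00A0 (non-breaking space),
# the same character named in the error message.
sample = "price:\xa0100"


def to_gbk_safe(s, errors="ignore"):
    """Round-trip s through GBK so that only GBK-encodable text remains."""
    return s.encode("gbk", errors).decode("gbk")


# Strict encoding (the default) fails, just like print() on a GBK console.
try:
    sample.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

ignored = to_gbk_safe(sample, "ignore")    # drops the character
replaced = to_gbk_safe(sample, "replace")  # substitutes '?'

print(strict_failed, repr(ignored), repr(replaced))
```

'ignore' silently loses the character, while 'replace' leaves a visible '?' marker, which may be preferable when debugging scraped text.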
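Related encoding issues:
1. Source file encoding: this mainly matters for Chinese characters that appear in the source file itself. Add the following at the top of the file: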
# -*- coding:utf-8 -*-
2. Runtime encoding: as in the example above.
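For the runtime case, an alternative to converting every string by hand is to make the output stream itself tolerant of unencodable characters. The following is a rough sketch, not the original author's method: `wrap_with_encoding` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for the console's binary buffer so the result can be inspected.

```python
import io


def wrap_with_encoding(binary_stream, encoding, errors="replace"):
    """Return a text writer that encodes with `encoding` and substitutes
    '?' for characters the encoding cannot represent."""
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors=errors)


# Stand-in for sys.stdout.buffer on a GBK console (assumption for the demo).
fake_console = io.BytesIO()
out = wrap_with_encoding(fake_console, "gbk")

# U+00A0 cannot be encoded in GBK; with errors="replace" it becomes '?'.
print("a\xa0b", end="", file=out)
out.flush()

result = fake_console.getvalue()
print(result)
```

In a real script the same idea could be applied with `sys.stdout = wrap_with_encoding(sys.stdout.buffer, sys.stdout.encoding)`. On Python 3.7 or later, `sys.stdout.reconfigure(errors="replace")` has a similar effect.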