Handling UnicodeDecodeError: 'XXX' codec can't decode bytes in position...
Source: Internet  Editor: 程序博客网  Time: 2024/05/01 09:40
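Error message:
UnicodeDecodeError: 'XXX' codec can't decode bytes in position 2-5: illegal multibyte sequence
This happens because the data contains illegal characters. For example, the full-width space often shows up as several different byte sequences, such as \xa3\xa0 or \xa4\x57. These all look like full-width spaces, but they are not "legal" full-width spaces.
The real full-width space is \xa1\xa1, so the decoder raises an exception when it meets one of the others.
I ran into this earlier while processing Sina Weibo data: illegal spaces made it impossible to parse the data correctly.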
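To see how a codec reacts to these byte pairs, here is a minimal sketch, assuming Python 3 and the gbk codec. The helper name try_decode is made up for illustration. It returns the decoded text or the exception instead of crashing, so you can inspect what your own interpreter reports for each sequence.

```python
def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc

# b'\xa1\xa1' is the genuine GBK full-width space (U+3000).
text, err = try_decode(b'\xa1\xa1', 'gbk')
print(repr(text), err)

# The look-alike byte pairs quoted in the text above.
# Print whatever the codec reports for each of them.
for candidate in (b'\xa3\xa0', b'\xa4\x57'):
    text, err = try_decode(candidate, 'gbk')
    print(candidate, repr(text), err)
```

When a decode fails, the exception's start and end attributes give the byte range that the "position 2-5" part of the message refers to.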
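[Solution]
# When decoding the fetched byte string strTxt, pass 'ignore' so that illegal bytes are skipped.
# The same approach works for gbk and other encodings.
# In Python 3, strTxt must be bytes (in Python 2, a str).
def decode_ignoring_errors(strTxt):
    strTest = strTxt.decode('utf-8', 'ignore')
    return strTest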
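[Supplement]
The default errors argument is strict, which raises an exception on an illegal character; ignore silently drops illegal characters;
replace substitutes a replacement marker: U+FFFD when decoding, ? when encoding;
xmlcharrefreplace substitutes an XML character reference, but it applies only when encoding, not when decoding.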
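The handlers can be compared side by side. This is a rough sketch, assuming Python 3; the sample bytes and the helper name decode_with are made up for illustration. Note that in Python 3 the replace handler inserts U+FFFD when decoding, and xmlcharrefreplace is accepted only by encode, so it is shown on the encoding side.

```python
raw = b'abc\xffdef'  # 0xff never occurs in well-formed UTF-8

def decode_with(data, errors):
    """Decode UTF-8 bytes with the given error handler."""
    return data.decode('utf-8', errors)

ignored = decode_with(raw, 'ignore')    # the bad byte is dropped
replaced = decode_with(raw, 'replace')  # the bad byte becomes U+FFFD
print(repr(ignored))
print(repr(replaced))

try:
    decode_with(raw, 'strict')          # same as passing no errors argument
except UnicodeDecodeError as exc:
    print('strict raised:', exc)

# xmlcharrefreplace works in the other direction (str -> bytes).
encoded = '\u3000'.encode('ascii', 'xmlcharrefreplace')
print(encoded)
```

ignore is convenient but loses data without telling you. If you need to find the damaged spots afterwards, replace is usually the safer choice.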