utf-8 and utf-8-sig
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 08:44
As UTF-8 is an 8-bit encoding no BOM is required and any U+FEFF character in the decoded Unicode string (even if it’s the first character) is treated as a ZERO WIDTH NO-BREAK SPACE.
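UTF-8 uses the byte as its code unit, so its byte order is the same on every system and it has no endianness problem. For that reason it does not actually need a BOM ("Byte Order Mark"). UTF-8 with BOM, which Python calls utf-8-sig, is different: when encoding, it writes a BOM (the three bytes EF BB BF) at the start of the data, and when decoding, it skips that BOM if it is present.
Simply put, utf-8-sig is UTF-8 with a leading signature. The BOM here does not indicate a byte order; it only marks the data as UTF-8.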
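The difference between the two codecs can be checked directly in Python. The following is a minimal sketch using only the standard library; the sample string and variable names are made up for illustration.

```python
import codecs

text = "id,name"  # hypothetical sample text

plain = text.encode("utf-8")       # no BOM written
signed = text.encode("utf-8-sig")  # BOM (EF BB BF) written first

# The only difference between the two byte strings is the 3-byte signature.
has_bom = signed == codecs.BOM_UTF8 + plain

# Decoding BOM-prefixed bytes as plain utf-8 keeps U+FEFF as a character.
decoded_plain = signed.decode("utf-8")

# Decoding as utf-8-sig strips the signature.
decoded_sig = signed.decode("utf-8-sig")

# utf-8-sig also accepts input that has no BOM at all.
decoded_no_bom = plain.decode("utf-8-sig")

print(has_bom)
print(repr(decoded_plain))
print(repr(decoded_sig))
print(repr(decoded_no_bom))
```

Because utf-8-sig tolerates input without a BOM, it is usually a safe choice for reading files whose origin is unknown.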
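This comes up when pandas reads a CSV file with read_csv. If looking up a column then fails with a KeyError, a likely cause is that a BOM at the start of the file has been left attached to the first column name. The traceback looks like this:
File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)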
File "pandas/index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)
File "pandas/hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)
File "pandas/hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)
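In this case, consider reading the file with the utf-8-sig encoding, for example by passing encoding='utf-8-sig' to read_csv.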
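The header problem can be reproduced without pandas. The sketch below uses the standard-library csv module on an in-memory file, so the data and the helper name first_row are hypothetical. It shows how the BOM ends up in the first column name when the bytes are decoded as plain utf-8, and how utf-8-sig avoids that.

```python
import csv
import io

# Hypothetical CSV content, saved the way many editors save "UTF-8 with BOM".
raw = "name,age\nAlice,30\n".encode("utf-8-sig")


def first_row(data: bytes, encoding: str) -> dict:
    """Decode the bytes with the given codec and return the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.DictReader(text))


row_plain = first_row(raw, "utf-8")    # first key carries the BOM
row_sig = first_row(raw, "utf-8-sig")  # first key is clean

print(list(row_plain))
print(list(row_sig))

# Looking up "name" fails in the first case, just like the KeyError above.
try:
    row_plain["name"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
print(row_sig["name"])
```

With pandas, the equivalent fix is the encoding argument of read_csv, as in pd.read_csv(path, encoding="utf-8-sig"), where path stands for your own file.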