Python error when processing Windows text: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4

Source: Internet  Editor: 程序博客网  Date: 2024/05/19 16:03

A txt file was generated on Windows. Formatting it with Python on Linux raised the following error:

Traceback (most recent call last):
  File "txt2xml.py", line 55, in <module>
    tree.write("1.xml")
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 820, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 937, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1073, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 42: ordinal not in range(128)
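
The last frame of the traceback shows where the error comes from: in Python 2, calling .encode() on a byte string (str) first decodes it implicitly with the default ascii codec, and any byte above 0x7F makes that step fail. Below is a minimal sketch of that implicit step, written so that it runs on either Python 2 or Python 3. The sample bytes and the helper name try_ascii_decode are assumptions for illustration (the UTF-8 encoding of U+4E2D, which happens to start with 0xe4), not data from the original file.

```python
# -*- coding: utf-8 -*-
# Illustrative sample: the UTF-8 bytes of U+4E2D; the first byte is 0xe4.
raw = b"\xe4\xb8\xad"


def try_ascii_decode(data):
    """Return the error message from decoding bytes as ASCII, or None on success."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = try_ascii_decode(raw)
print(message)

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("utf-8")
```

The message has the same shape as the one in the traceback; only the byte position differs, because the sample is shorter than the real line.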

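Searching around suggested an encoding mismatch, so I added an explicit encoding setting. The script converts a text listing into the annotation format used for a YOLO training set.
It uses xml.etree.ElementTree. The full code is as follows: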

[root@localhost tools]# vi txt2xml.py

#encoding=utf-8
from xml.etree import ElementTree as ET
import sys

#print sys.getdefaultencoding()
# Reset the default encoding here (Python 2 only)
reload(sys)
sys.setdefaultencoding('utf-8')

f = open("/data/1xiu/darknet/carData/car.list1")
for line in f:
    # Convert the GBK bytes written on Windows to UTF-8 bytes
    line = line.decode("gbk").encode("utf-8").strip()
    lineArray = line.split(" ")
    root=ET.Element('annotation')
    folder = ET.SubElement(root, "folder")
    folder.text = "general_text"
    filename = ET.SubElement(root, "filename")
    # line is already UTF-8 here; this second gbk decode is only safe for ASCII names
    filename.text=lineArray[0].decode("gbk").encode("utf8")
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = lineArray[2]
    height = ET.SubElement(size,"height")
    height.text = lineArray[1]
    depth = ET.SubElement(size, "depth")
    depth.text = str(3)
    obj = ET.SubElement(root, "object")
    name = ET.SubElement(obj, "name")
    name.text = "plate"
    bnd = ET.SubElement(obj, "bndbox")
    xmin = ET.SubElement(bnd, "xmin")
    xmin.text = lineArray[3]
    ymin = ET.SubElement(bnd, "ymin")
    ymin.text = lineArray[4]
    # lineArray values are strings, so + concatenates here rather than adding
    xmax = ET.SubElement(bnd, "xmax")
    xmax.text = lineArray[3] + lineArray[5]
    ymax = ET.SubElement(bnd, "ymax")
    ymax.text = lineArray[4] + lineArray[6]
    tree=ET.ElementTree(root)
    #tree.write(lineArray[0].decode("gbk").encode("utf8")+".xml")
    tar = lineArray[0] + ".xml"
    tree.write("1.xml")
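
An alternative that avoids reload(sys) and sys.setdefaultencoding is to decode the GBK input once, keep every .text value as Unicode text, and let ElementTree encode on output. The following is a minimal sketch under those assumptions, not the original script. The helper name build_annotation, the sample line, and the field order (name, height, width, x, y, box width, box height, inferred from the indices the script uses) are hypothetical. It also adds the box size to the corner numerically, assuming integer pixel values.

```python
# -*- coding: utf-8 -*-
from xml.etree import ElementTree as ET


def build_annotation(line):
    """Build an <annotation> element from one already-decoded text line.

    Assumed field order (hypothetical): name height width x y box_w box_h.
    """
    fields = line.strip().split(u" ")
    x, y, box_w, box_h = (int(v) for v in fields[3:7])

    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = u"general_text"
    ET.SubElement(root, "filename").text = fields[0]

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = fields[2]
    ET.SubElement(size, "height").text = fields[1]
    ET.SubElement(size, "depth").text = u"3"

    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = u"plate"
    bnd = ET.SubElement(obj, "bndbox")
    ET.SubElement(bnd, "xmin").text = u"%d" % x
    ET.SubElement(bnd, "ymin").text = u"%d" % y
    ET.SubElement(bnd, "xmax").text = u"%d" % (x + box_w)
    ET.SubElement(bnd, "ymax").text = u"%d" % (y + box_h)
    return root


# Hypothetical sample: one line as it would appear in a GBK file from Windows.
raw_line = u"\u4eacA12345.jpg 480 640 100 200 50 20\r\n".encode("gbk")

# Decode exactly once, at the input boundary.
root = build_annotation(raw_line.decode("gbk"))

# Encode exactly once, at the output boundary. Writing to a file with
# ET.ElementTree(root).write(path, encoding="utf-8") performs the same step.
xml_bytes = ET.tostring(root, encoding="utf-8")
```

Because the element text is Unicode rather than a byte string, the serializer never has to fall back to the implicit ascii decode that raised the error.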