Two Ways to Parse XML in Python

Source: Internet; Editor: 程序博客网; Date: 2024/06/05 19:15

The two parsing approaches

1. ET.XML(): parsing from a string (from xml.etree import ElementTree as ET)

 The XML() function in the ElementTree module parses a string into an Element object, so that the methods of the Element class can then be used on it.

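XML(string) returns a bare Element, which has no write() method, so by itself this approach can read a document but not save it (wrap the root in ET.ElementTree(root) if you need to write).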

    from xml.etree import ElementTree as ET

    # Open the file and read the XML content
    str_xml = open('xo.xml', 'r').read()

    # Parse the string into an Element; root refers to the root node of the XML document
    root = ET.XML(str_xml)

 

    from xml.etree import ElementTree as ET

    a = open("first_xml", "r", encoding="utf-8").read()
    print(type(a))      # <class 'str'>: the input is a string
    b = ET.XML(a)       # XML() returns an Element; b is the root node
    print(type(b))      # <class 'xml.etree.ElementTree.Element'>
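Since XML() hands back an Element rather than an ElementTree, here is a minimal sketch of how the parsed root could still be saved by wrapping it. The sample XML string and the in-memory buffer are assumptions made for illustration; a real script would pass a filename to write().

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical sample document, inlined so the sketch is self-contained
xml_text = "<data><country name='A'><year>2023</year></country></data>"

root = ET.XML(xml_text)              # XML() returns the root Element directly
has_write = hasattr(root, "write")   # Element itself offers no write() method

# Wrap the root in an ElementTree to gain write()
tree = ET.ElementTree(root)
buffer = io.BytesIO()                # in-memory stand-in for a real file
tree.write(buffer, encoding="utf-8")
saved = buffer.getvalue().decode("utf-8")
print(has_write, saved)
```

The wrapping step is cheap: ElementTree only holds a reference to the root, so no copy of the document is made.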

 

    def XML(text, parser=None):
        """Parse XML document from string constant.

        This function can be used to embed "XML Literals" in Python code.

        *text* is a string containing XML data, *parser* is an
        optional parser instance, defaulting to the standard XMLParser.

        Returns an Element instance.
        """

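 Methods and attributes of Element:

1. iter(tag): returns an iterator over all matching elements; use it to find every node of one kind and loop over them

  Return an iterator containing all the matching elements.

2. tag: the tag name of the node

3. attrib: the attributes of the node, as a dict

4. find(): finds the first matching element and returns it; with a plain tag name it searches only children, not grandchildren

5. text: the text content of the node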
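To make the list above concrete, here is a small sketch that exercises each of these on an inline document. The country/year data is made up for the example and is not the author's first_xml file.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample data, not from the original file
xml_text = """
<data>
    <country name="A">
        <year>2023</year>
    </country>
    <country name="B">
        <year>2024</year>
    </country>
</data>
"""
root = ET.XML(xml_text)

print(root.tag)                     # tag name of the root
first = root.find("country")        # first direct child named "country"
print(first.tag, first.attrib)      # attrib is a dict of attributes

# find() with a plain tag name looks only at direct children,
# so the grandchild <year> is not found from the root
missing = root.find("year")

# iter() walks the whole subtree, so it does reach the grandchildren
years = [node.text for node in root.iter("year")]
print(missing, years)
```

If a deeper lookup is needed with find(), a path such as ".//year" searches all descendants instead of only the children.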

2. parse(filename) opens the file and parses it; compared with XML(), you skip the step of opening and reading the file yourself

    from xml.etree import ElementTree as ET

    # Parse the XML file directly
    tree = ET.parse("xo.xml")

    # Get the root node of the XML document via getroot()
    root = tree.getroot()

 

    from xml.etree import ElementTree as ET

    a = ET.parse("first_xml")     # parse into an ElementTree object
    b = a.getroot()               # get the root node, an Element object
    print(a.getroot(), type(a))
    # <Element 'data' at 0x00000033D062F958> <class 'xml.etree.ElementTree.ElementTree'>

 

    def parse(source, parser=None):
        """Parse XML document into element tree.

        *source* is a filename or file object containing XML data,
        *parser* is an optional parser instance defaulting to XMLParser.

        Return an ElementTree instance.
        """
        tree = ElementTree()
        tree.parse(source, parser)
        return tree
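The docstring says source may be a filename or a file object. A quick sketch of the file-object case, using io.StringIO as a stand-in for an opened file (the XML content is an assumption for the example):

```python
import io
from xml.etree import ElementTree as ET

# StringIO stands in for an opened file; the content is hypothetical
source = io.StringIO("<data><year>2023</year></data>")

tree = ET.parse(source)     # returns an ElementTree, not an Element
root = tree.getroot()       # the root Element
print(type(tree).__name__, root.tag, root.find("year").text)
```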

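 Working with an ElementTree:

1. getroot(): gets the root node of the XML document. Unlike XML(), which returns the root node directly, parse() returns an ElementTree, so the root has to be fetched with getroot()

2. root.tag: the tag name of the node (the difference from XML() is that you call getroot() first, then read tag)

3. root.attrib: the attributes of the node (same idea as above)

4. text: the text content of a node

5. a.write(filename): writes the tree to a file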

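    from xml.etree import ElementTree as ET

    a = ET.parse("first_xml")
    b = a.getroot()

    # Add 1 to every <year> value; text is a string, so convert both ways
    for i in b.iter("year"):
        new_year = int(i.text) + 1
        i.text = str(new_year)

    # Save the modified tree back to the same file
    a.write("first_xml")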

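 6. element.set("k1", "k2"): adds an attribute to the element (or overwrites it if it already exists)

7. del element.attrib["k1"]: deletes an attribute of the element; raises a KeyError if the element does not have that attribute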
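A short sketch of items 6 and 7, including the error case. The one-element document and the attribute names are placeholders chosen for the example.

```python
from xml.etree import ElementTree as ET

node = ET.XML("<year>2023</year>")   # hypothetical one-element document

node.set("k1", "v1")                 # add an attribute
node.set("k1", "v2")                 # set() on an existing key overwrites it
after_set = dict(node.attrib)

del node.attrib["k1"]                # remove the attribute
after_del = dict(node.attrib)

# Deleting a missing attribute raises KeyError, since attrib is a dict
try:
    del node.attrib["k1"]
    raised = False
except KeyError:
    raised = True

# pop() with a default is one way to delete without risking the error
node.attrib.pop("k1", None)
print(after_set, after_del, raised)
```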

3. Creating an XML document

Method 1

    from xml.etree import ElementTree as ET

    a = ET.Element("aaa")                   # create the root node
    b = ET.Element("bbb", {"k1": "k2"})     # create the child nodes
    c = ET.Element("ccc", {"k2": "k3"})
    d = ET.Element("ddd", {"k3": "k4"})
    a.append(b)
    b.append(c)
    c.append(d)

    # Build the document object
    et = ET.ElementTree(a)
    et.write("test.xml", encoding="utf-8", xml_declaration=True, short_empty_elements=False)

 Method 2

    from xml.etree import ElementTree as ET

    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    # son1 = ET.Element('son', {'name': '儿1'})
    son1 = root.makeelement('son', {'name': '儿1'})
    # Create the younger son
    # son2 = ET.Element('son', {"name": '儿2'})
    son2 = root.makeelement('son', {"name": '儿2'})

    # Create two grandsons under the elder son
    # grandson1 = ET.Element('grandson', {'name': '儿11'})
    grandson1 = son1.makeelement('grandson', {'name': '儿11'})
    # grandson2 = ET.Element('grandson', {'name': '儿12'})
    grandson2 = son1.makeelement('grandson', {'name': '儿12'})
    son1.append(grandson1)
    son1.append(grandson2)

    # Append both sons to the root node
    root.append(son1)
    root.append(son2)

    # Build the document object
    tree = ET.ElementTree(root)
    tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)

 Method 3

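    from xml.etree import ElementTree as f

    # Create the root node
    a = f.Element("QWE")
    # Create a child; SubElement also attaches it to its parent
    b = f.SubElement(a, "asd", {"k1": "v1"})
    # Create a grandchild
    c = f.SubElement(b, "fgh", {"k2": "v2"})

    # Build the document object
    z = f.ElementTree(a)
    z.write("ad.xml", encoding="utf-8")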
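The three methods differ mainly in whether creating and attaching a child are one step or two. A side-by-side sketch, reusing the tag names from Method 3 and serializing to a string so nothing is written to disk:

```python
from xml.etree import ElementTree as ET

# Element + append: create first, attach second
root1 = ET.Element("QWE")
child1 = ET.Element("asd", {"k1": "v1"})
root1.append(child1)

# makeelement + append: the new element is NOT attached until append()
root2 = ET.Element("QWE")
child2 = root2.makeelement("asd", {"k1": "v1"})
children_before_append = len(root2)
root2.append(child2)

# SubElement: creates the child and attaches it in one call
root3 = ET.Element("QWE")
ET.SubElement(root3, "asd", {"k1": "v1"})

outputs = [ET.tostring(r, encoding="unicode") for r in (root1, root2, root3)]
print(children_before_append, outputs[0])
```

All three produce the same document, so the choice is mostly about how much bookkeeping you want; SubElement makes it hard to forget an append().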

 Controlling self-closing tags

short_empty_elements=False

With it, empty elements are not self-closed:    <grandson name="儿12"></grandson>

Without it (the default is True), empty elements are self-closed:    <grandson name="儿12" />
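The effect can be checked without writing a file, since ET.tostring() accepts the same short_empty_elements keyword as write(). A minimal sketch, with a made-up two-level tree:

```python
from xml.etree import ElementTree as ET

root = ET.Element("son")
ET.SubElement(root, "grandson", {"name": "儿12"})

# Default behaviour: short_empty_elements=True
closed = ET.tostring(root, encoding="unicode")
# Explicit open/close pair for empty elements
expanded = ET.tostring(root, encoding="unicode", short_empty_elements=False)
print(closed)
print(expanded)
```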

XML declaration

xml_declaration=True

With it, the file starts with an XML declaration:  <?xml version='1.0' encoding='utf-8'?>
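A small sketch comparing output with and without the flag. The dump() helper and the in-memory buffer are assumptions for illustration only; in a real script you would pass a filename to write().

```python
import io
from xml.etree import ElementTree as ET

tree = ET.ElementTree(ET.Element("aaa"))

def dump(**kwargs):
    """Hypothetical helper: write the tree to memory and return the text."""
    buffer = io.BytesIO()
    tree.write(buffer, encoding="utf-8", **kwargs)
    return buffer.getvalue().decode("utf-8")

without_decl = dump()
with_decl = dump(xml_declaration=True)
print(without_decl)
print(with_decl)
```

With encoding="utf-8" the declaration is left out unless you ask for it, which is why the flag matters for the examples in this post.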

XML saved by ElementTree has no indentation by default. To get indented output, change how the document is saved:

    from xml.etree import ElementTree as ET
    from xml.dom import minidom


    def prettify(elem):
        """Convert the node to a string and add indentation."""
        rough_string = ET.tostring(elem, 'utf-8')
        reparsed = minidom.parseString(rough_string)
        return reparsed.toprettyxml(indent="\t")


    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    # son1 = ET.Element('son', {'name': '儿1'})
    son1 = root.makeelement('son', {'name': '儿1'})
    # Create the younger son
    # son2 = ET.Element('son', {"name": '儿2'})
    son2 = root.makeelement('son', {"name": '儿2'})

    # Create two grandsons under the elder son
    # grandson1 = ET.Element('grandson', {'name': '儿11'})
    grandson1 = son1.makeelement('grandson', {'name': '儿11'})
    # grandson2 = ET.Element('grandson', {'name': '儿12'})
    grandson2 = son1.makeelement('grandson', {'name': '儿12'})
    son1.append(grandson1)
    son1.append(grandson2)

    # Append both sons to the root node
    root.append(son1)
    root.append(son2)

    # prettify() returns a string, so write it with a normal text file handle
    raw_str = prettify(root)
    with open("xxxoo.xml", 'w', encoding='utf-8') as f:
        f.write(raw_str)

 My own version

    from xml.etree import ElementTree as f
    from xml.dom import minidom


    def prettify(elem):
        """Convert the node to a string and add indentation."""
        rough_string = f.tostring(elem, 'utf-8')
        reparsed = minidom.parseString(rough_string)
        return reparsed.toprettyxml(indent="\t")


    a = f.Element("QWE")
    b = a.makeelement("asd", {"K1": "V2"})
    c = a.makeelement("zxc", {"K1": "V2"})
    a.set("name", "lu")
    a.append(b)
    b.append(c)

    z = prettify(a)     # already a string, so it can be written directly
    with open("xxxx.xml", "w") as s:
        s.write(z)
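On Python 3.9 and later there is also ET.indent(), which adds the whitespace in place and avoids the round trip through minidom. This is an alternative to the prettify() approach above, not what the original code uses; the sketch below reuses the same tag names and assumes a 3.9+ interpreter.

```python
from xml.etree import ElementTree as ET

root = ET.Element("QWE")
child = ET.SubElement(root, "asd", {"K1": "V2"})
ET.SubElement(child, "zxc", {"K1": "V2"})

ET.indent(root, space="\t")     # modifies the tree in place (Python 3.9+)
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Because indent() changes the tree itself, a later tree.write() call produces the indented layout as well.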

 
