wurenhai's XML Study Notes (reposted by kshen)

Source: Internet. Published: Software Development Challenges. Editor: 程序博客网. Time: 2024/04/29 08:33
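I have been at the company for a little over a week. During this time I have mainly been studying XML: first the XML syntax, then DTD validation, then XPath. I also used the XPath API in Java to parse an XML document, and yesterday I tried XSD validation as well.
Before learning XML (Extensible Markup Language) I had already worked with HTML, another markup language, so getting started was fairly easy. XML offers more powerful markup capabilities, though, and can be transformed into PDF, HTML and other presentation formats. It is also not too hard to master (mainly because there are many tools; this time I mostly used XMLSpy, which works well), so it is widely used in many areas.
This round of study mainly followed "Learning XML, 2nd Edition": the history of XML, its strengths and weaknesses, the XML syntax, Quality Control (DTD, Schema…), XPath and so on (these were the main things I read this time). I also wrote a small "plan" example.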
•         On XML syntax, the following are the most useful pieces:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plan SYSTEM "plan.dtd">
<plan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="plan.xsd">
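In the first line, version is generally left at 1.0; encoding can be UTF-8 (which also supports Chinese), UTF-16 or GB2312, among others. The second line asks for DTD validation: SYSTEM points to a DTD by its location (here a local file), while PUBLIC references a DTD through a public identifier (I am not yet familiar with this and need to read more). The third line asks for XSD validation.
A few more points of XML syntax are worth emphasizing:
<!ENTITY grd "展现组">: once this entity is declared, writing "&grd;" anywhere later in the document stands for "展现组". Note that the declaration has to go inside the internal subset, i.e. <!DOCTYPE plan [ here ]>.
To display "<", ">", "&", single quotes and double quotes, XML uses the escapes &lt;, &gt;, &amp;, &apos; and &quot;. Alternatively, a CDATA section such as "<![CDATA[if (&x < &y)]]>" gives what-you-see-is-what-you-get text (this applies only to the part inside "<![CDATA[ ... ]]>"). The plan.xml document follows (plan.dtd is given further below).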
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plan SYSTEM "plan.dtd" [
         <!ENTITY grd "展现组">
]>
<plan>
         <person>
                  <group>&grd;</group>
                   <name>吴仁海</name>
                   <task>
                            <contend>JAVA技术学习</contend>
                            <date type="begintime">
                                     <year>2006</year>
                                     <month>7</month>
                                     <day>31</day>
                            </date>
                            <date type="deadtime">
                                     <year>2006</year>
                                     <month>8</month>
                                     <day>4</day>
                            </date>
                   </task>
         </person>
</plan>
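To see how a parser treats the entity reference and a CDATA section, here is a minimal Java sketch. It is not part of the original notes: the class name, the `note` element and the ASCII value "Display Group" (standing in for 展现组) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class EntityAndCdataDemo {
    // Hypothetical sample: an entity declared in the internal subset plus a CDATA section.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE plan [ <!ENTITY grd \"Display Group\"> ]>\n"
        + "<plan><group>&grd;</group>"
        + "<note><![CDATA[if (x < y && y > 0)]]></note></plan>";

    /** Parses the sample and returns the text content of the first element with this tag name. */
    static String textOf(String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf("group")); // the entity reference, expanded by the parser
        System.out.println(textOf("note"));  // the CDATA text, with < and & taken literally
    }
}
```

The parser expands `&grd;` before the application ever sees the text, which is why the DOM contains the replacement text rather than the reference.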

•         Quality Control: I only looked at DTD and Schema

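A DTD mainly declares ELEMENT (elements), ATTLIST (an element's attributes), ENTITY (I have not used this one) and NOTATION (not used either). For example, <!ELEMENT plan (person+)> means one or more person elements ("?" means zero or one, "*" means zero or more). Attributes can be marked "#REQUIRED" or "#IMPLIED" for mandatory or optional, and #FIXED "3" fixes the value at 3 (a plain default is written as just the quoted value). The code of plan.dtd follows: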

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT plan (person+)>
<!ELEMENT person (group, name, task+)>
    <!ELEMENT group (#PCDATA)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT task (contend, date+)>
        <!ELEMENT contend (#PCDATA)>
        <!ELEMENT date (year, month, day)>
        <!ATTLIST date
            type (begintime | deadtime) #REQUIRED
        >
            <!ELEMENT year (#PCDATA)>
            <!ELEMENT month (#PCDATA)>
            <!ELEMENT day (#PCDATA)>
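DTD validation can also be switched on from Java. The sketch below is an illustration rather than the author's code: it uses a cut-down DTD placed in the internal subset (an assumption, so the example needs no files) and collects the errors the validating parser reports.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class DtdValidationDemo {
    // Assumption: a simplified version of plan.dtd, declared inline.
    static final String DTD =
        "<!DOCTYPE plan [\n"
        + "<!ELEMENT plan (person+)>\n"
        + "<!ELEMENT person (group, name)>\n"
        + "<!ELEMENT group (#PCDATA)>\n"
        + "<!ELEMENT name (#PCDATA)>\n"
        + "]>\n";

    /** Parses DTD + body with validation on and returns the validation errors. */
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = "<plan><person><group>g</group><name>n</name></person></plan>";
        String invalid = "<plan><person><name>n</name></person></plan>"; // group is missing
        System.out.println("valid document errors:   " + validate(valid));
        System.out.println("invalid document errors: " + validate(invalid));
    }
}
```

Validity errors are recoverable, so they arrive through `error(...)` and parsing continues; only well-formedness problems reach `fatalError(...)`.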

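Nowadays, however, XML Schema is generally preferred. Here is a comparison of XML Schema and DTD:

1.  The former is itself a well-formed XML document; the latter uses its own, non-XML syntax.
2.  A schema can specify in detail how many times each element may occur (minOccurs, maxOccurs) and the type of each element (complexType, simpleType), as well as element order. A DTD can only express zero, one, or many occurrences, and its types are limited to PCDATA, enumerations and the like.
3.  On the tooling side, XMLSpy offers a graphical design view for XSD documents, which is much more convenient than working with a DTD.
4.  An XSD does not fix the root node of the XML document: any global element it declares can be validated as the root. With a DTD, the root node is fixed by the DOCTYPE declaration.

In XMLSpy's Schema view you can then look at plan.xsd (which defines essentially the same content as plan.dtd), shown below: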

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2006 rel. 3 sp1 (http://www.altova.com) by WuRH (NewlandComputer) -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:element name="plan">
    <xsd:annotation>
      <xsd:documentation>Work plan</xsd:documentation>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="person" maxOccurs="unbounded">
          <xsd:annotation>
            <xsd:documentation>Employee</xsd:documentation>
          </xsd:annotation>
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="group" type="xsd:string">
                <xsd:annotation>
                  <xsd:documentation>Employee's work group</xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              <xsd:element name="name" type="xsd:string">
                <xsd:annotation>
                  <xsd:documentation>Employee name</xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              <xsd:element name="task" maxOccurs="unbounded">
                <xsd:annotation>
                  <xsd:documentation>Employee's task</xsd:documentation>
                </xsd:annotation>
                <xsd:complexType>
                  <xsd:sequence>
                    <xsd:element name="contend" type="xsd:string">
                      <xsd:annotation>
                        <xsd:documentation>Task details</xsd:documentation>
                      </xsd:annotation>
                    </xsd:element>
                    <xsd:element name="date" minOccurs="2" maxOccurs="2">
                      <xsd:annotation>
                        <xsd:documentation>Task period</xsd:documentation>
                      </xsd:annotation>
                      <xsd:complexType>
                        <xsd:sequence>
                          <xsd:element name="year" type="xsd:gYear">
                            <xsd:annotation>
                              <xsd:documentation>gYear</xsd:documentation>
                            </xsd:annotation>
                          </xsd:element>
                          <xsd:element name="month" type="xsd:gMonth">
                            <xsd:annotation>
                              <xsd:documentation>gMonth</xsd:documentation>
                            </xsd:annotation>
                          </xsd:element>
                          <xsd:element name="day" type="xsd:gDay">
                            <xsd:annotation>
                              <xsd:documentation>gDay</xsd:documentation>
                            </xsd:annotation>
                          </xsd:element>
                        </xsd:sequence>
                        <xsd:attribute name="type" type="dateType" use="required">
                          <xsd:annotation>
                            <xsd:documentation>begintime/deadtime</xsd:documentation>
                          </xsd:annotation>
                        </xsd:attribute>
                      </xsd:complexType>
                    </xsd:element>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:simpleType name="dateType" final="list">
    <xsd:annotation>
      <xsd:documentation>Date type</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="begintime"/>
      <xsd:enumeration value="deadtime"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="per">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name"/>
        <xsd:element name="group"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
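The point that an XSD accepts any of its global elements as the document root can be checked from Java with the standard javax.xml.validation API. This is a sketch under assumptions: the schema below is a much smaller stand-in for plan.xsd (two global elements, `plan` and `per`), and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class XsdRootDemo {
    // Assumption: a reduced schema with two global elements, loosely modeled on plan.xsd.
    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"plan\"><xsd:complexType><xsd:sequence>"
        + "<xsd:element name=\"person\" type=\"xsd:string\" maxOccurs=\"unbounded\"/>"
        + "</xsd:sequence></xsd:complexType></xsd:element>"
        + "<xsd:element name=\"per\" type=\"xsd:string\"/>"
        + "</xsd:schema>";

    /** Returns true if the document validates against the sample schema. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or the schema itself) was rejected
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<plan><person>a</person></plan>")); // root is plan
        System.out.println(isValid("<per>x</per>"));                    // root is per
        System.out.println(isValid("<plan/>"));                         // person is required
        System.out.println(isValid("<other/>"));                        // not declared at all
    }
}
```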

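All of the code above can be generated with XMLSpy. A few things to note:

1.  An <xsd:element/> must contain an <xsd:complexType/> before child nodes can be added to it. <xsd:sequence/> holds children in a fixed order (there are also <xsd:choice/>, meaning one of the alternatives, and <xsd:all/>, meaning any order).

2.  You can define your own simpleType. For example:

<xsd:simpleType name="dateType" final="list">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="begintime"/>
    <xsd:enumeration value="deadtime"/>
  </xsd:restriction>
</xsd:simpleType>

creates an enumeration type. The following creates a type for numbers in the range 1 to 20: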

<xsd:simpleType name="finiteNum">
  <xsd:restriction base="xsd:int">
    <xsd:minInclusive value="1"/>
    <xsd:maxInclusive value="20"/>
  </xsd:restriction>
</xsd:simpleType>

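One more note: the values of xsd:gYear, xsd:gMonth and xsd:gDay are written as 2006, --08 and ---01 respectively. The plain month and day values in plan.xml above (7, 31, 8, 4) would therefore have to be written as --07, ---31, --08 and ---04 to validate against this schema.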
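The range facets and the gMonth/gDay lexical forms can be tried out with a small validator. This is a sketch, not the author's code: the global elements `num`, `month` and `day` are assumptions added so each type can be tested on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class XsdFacetDemo {
    // finiteNum as in the text; the three global elements are hypothetical test hooks.
    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:simpleType name=\"finiteNum\"><xsd:restriction base=\"xsd:int\">"
        + "<xsd:minInclusive value=\"1\"/><xsd:maxInclusive value=\"20\"/>"
        + "</xsd:restriction></xsd:simpleType>"
        + "<xsd:element name=\"num\" type=\"finiteNum\"/>"
        + "<xsd:element name=\"month\" type=\"xsd:gMonth\"/>"
        + "<xsd:element name=\"day\" type=\"xsd:gDay\"/>"
        + "</xsd:schema>";

    /** Returns true if the one-element document validates against the sample schema. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // value rejected by the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<num>20</num>"));       // inside 1..20
        System.out.println(isValid("<num>21</num>"));       // above maxInclusive
        System.out.println(isValid("<month>--08</month>")); // gMonth lexical form
        System.out.println(isValid("<month>8</month>"));    // plain number is not a gMonth
        System.out.println(isValid("<day>---01</day>"));    // gDay lexical form
    }
}
```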

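•         Finally, parsing XML in Java.

First, a few words on XPath (XML Path Language):

1.       An absolute path starts with "/", e.g. "/plan/person" selects person. A leading "//" searches at any depth, so "//plan/person" works as well.

2.       Searching. "//plan/person/name[.='吴仁海']" selects the name node whose value is 吴仁海 (a bare string predicate such as name['吴仁海'] is always true and filters nothing). To match an attribute value, add "@", e.g. "//plan/person/task/date[@type='deadtime']".

3.       Parent nodes, sibling nodes and so on. Note that XPath axes have a direction: following-sibling selects the siblings after the current node, while preceding-sibling selects the ones before it.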
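The three points above can be tried with the javax.xml.xpath API from the JDK. This is a minimal sketch: the sample document only mimics the shape of plan.xml, with ASCII stand-in values ("Wu Renhai", "Display Group") and made-up years, all of which are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathExpressionDemo {
    // Hypothetical sample data shaped like plan.xml.
    static final String XML =
        "<plan><person><group>Display Group</group><name>Wu Renhai</name>"
        + "<task><contend>Java study</contend>"
        + "<date type=\"begintime\"><year>2006</year></date>"
        + "<date type=\"deadtime\"><year>2007</year></date>"
        + "</task></person></plan>";

    /** Evaluates an XPath expression against the sample and returns its string value. */
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/plan/person/name"));                // absolute path
        System.out.println(eval("//person[name='Wu Renhai']/group")); // predicate on a child value
        System.out.println(eval("//date[@type='deadtime']/year"));    // predicate on an attribute
        System.out.println(eval("//group/following-sibling::name"));  // sibling after group
        System.out.println(eval("//name/preceding-sibling::group"));  // sibling before name
        // A bare string predicate is always true, so both date elements are counted:
        System.out.println(eval("count(//date['deadtime'])"));
    }
}
```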

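JDK 1.5 provides the javax.xml.xpath.* package; on JDK 1.4 you use the org.apache.xpath.* package. Setting up the document is basically the same with both, but their usage differs. In my tests the former needed precise expressions such as "//plan/person/node()" or "//plan/person/name/text()", while with the latter "//plan/person" was enough to reach everything under the person node. (javax.xml.xpath returns a string by default; asking for XPathConstants.NODE or XPathConstants.NODESET returns nodes instead.) This time I mainly used the latter.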
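As a rough illustration of the return-type point, the sketch below evaluates the same kind of expression with javax.xml.xpath once as a string and once as a node set. The two-person sample document and the helper names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XPathReturnTypeDemo {
    // Hypothetical sample with two person elements.
    static final String XML = "<plan><person><name>A</name></person>"
            + "<person><name>B</name></person></plan>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates the expression as a NODESET and returns how many nodes matched. */
    static int countNodes(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, parse(), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Evaluates the expression with the default return type (the string value). */
    static String asString(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(asString("//plan/person"));   // string value of the first match only
        System.out.println(countNodes("//plan/person")); // every matching element node
    }
}
```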

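1.  Load the XML document. An org.w3c.dom.Document is obtained through the DocumentBuilderFactory factory pattern (reportedly this keeps the code independent of the parser implementation; I do not fully understand it yet).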

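    import java.io.IOException;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;

    /**
     * Loads an XML document.
     *
     * @param url String location (URI) of the XML file
     * @return Document the parsed DOM tree
     * @throws Exception if the file cannot be read or parsed
     */
    public static Document readerXmlFile(String url) throws Exception {
        try {
            // Obtain a parser from the factory, then parse the document
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(url);
        } catch (IOException ex) {
            throw new Exception("Failed to read file", ex);
        } catch (SAXException ex) {
            throw new Exception("XML parse error", ex);
        } catch (ParserConfigurationException ex) {
            throw new Exception("Parser configuration error", ex);
        }
    }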

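2.  Read a node. The result can be held as either of two types:

Node person = XPathAPI.selectSingleNode(planDoc.getDocumentElement(), "//person[name='吴仁海']/group");

Element person = (Element) XPathAPI.selectSingleNode(planDoc.getDocumentElement(), "//person[name='吴仁海']/group");

PS: Element extends Node (it is a sub-interface that adds element-specific methods); the material I found online says the latter is what people generally use now.

After that, person.getFirstChild().getNodeValue() returns the value inside group (the text content is itself a node, as the simpleType used when building the schema also suggests).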
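The same lookup can be sketched with the JDK's javax.xml.xpath in place of XPathAPI (a deliberate swap, so the example runs without Xalan on the classpath). The sample document and its ASCII values are assumptions; the point is that the element's value lives in its text child node.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ReadNodeDemo {
    // Hypothetical sample shaped like plan.xml, with ASCII stand-in values.
    static final String XML = "<plan><person><group>Display Group</group>"
            + "<name>Wu Renhai</name></person></plan>";

    /** Selects the group element of the person named "Wu Renhai". */
    static Element selectGroup() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Element) xpath.evaluate("//person[name='Wu Renhai']/group",
                    doc.getDocumentElement(), XPathConstants.NODE);
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element group = selectGroup();
        Node text = group.getFirstChild();
        System.out.println(text.getNodeType() == Node.TEXT_NODE); // the value is a text node
        System.out.println(text.getNodeValue());                  // value read from the text node
        System.out.println(group.getNodeValue());                 // an element itself has no node value
        System.out.println(group.getTextContent());               // shortcut available since DOM Level 3
    }
}
```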

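3.  Add, modify, delete.

To modify, simply call person.getFirstChild().setNodeValue(...). To delete, note that DOM has no removeNode() method; remove the node through its parent with person.getParentNode().removeChild(person).

Adding, however, must go through the document: Element person = doc.createElement("person"), where doc is the Document instance. Likewise, Text name = doc.createTextNode("吴仁海") creates the actual text value of a node, which can then be attached with person.appendChild(name).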
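Put together, the add, modify and delete steps might look like the sketch below. It uses only standard org.w3c.dom calls; the class name, the helper names and the ASCII sample value are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomEditDemo {
    /** Add: builds plan/person/name with a text value, all created through the Document. */
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element plan = doc.createElement("plan");
            Element person = doc.createElement("person");
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("Wu Renhai")); // the value is its own text node
            person.appendChild(name);
            plan.appendChild(person);
            doc.appendChild(plan);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Modify: overwrites the text of the first name element and returns the new text. */
    static String renameFirst(Document doc, String newName) {
        Node name = doc.getElementsByTagName("name").item(0);
        name.getFirstChild().setNodeValue(newName);
        return name.getTextContent();
    }

    /** Delete: removes every person through its parent and returns how many remain. */
    static int removePersons(Document doc) {
        NodeList persons = doc.getElementsByTagName("person");
        while (persons.getLength() > 0) { // the list is live, so it shrinks as nodes go
            Node person = persons.item(0);
            person.getParentNode().removeChild(person);
        }
        return doc.getElementsByTagName("person").getLength();
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(renameFirst(doc, "New Name"));
        System.out.println(removePersons(doc));
    }
}
```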

4.  Save the XML. The code is as follows:

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;

    /**
     * Saves a DOM document to an XML file.
     *
     * @param doc Document the DOM tree to write
     * @param url String path of the output file
     * @throws Exception if the file cannot be written
     */
    public static void saveXMLFile(Document doc, String url) throws Exception {
        // try-with-resources closes the stream even if the transform fails
        try (OutputStream out = new FileOutputStream(url)) {
            // Create a TransformerFactory first, then a Transformer from it
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer t = tf.newTransformer();
            // Output character encoding: GB2312, which supports Chinese
            t.setOutputProperty(OutputKeys.ENCODING, "GB2312");
            // System identifier written into the DOCTYPE declaration
            t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "D://XML//plan.dtd");
            // Run the XSLT identity transform: write the DOM tree to the output
            t.transform(new DOMSource(doc), new StreamResult(out));
        } catch (TransformerException | IOException ex) {
            // Also covers TransformerConfigurationException and FileNotFoundException
            throw new Exception("Failed to save file", ex);
        }
    }
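To check what the Transformer actually emits without touching the file system, the same idea can be sketched with a StringWriter. This is a variation, not the method above: it writes to a string, uses UTF-8 and a relative "plan.dtd" system identifier, and builds a tiny stand-in document, all of which are assumptions for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SerializeDemo {
    /** Builds a tiny stand-in document: plan/name with a text value. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element plan = doc.createElement("plan");
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("Wu Renhai"));
            plan.appendChild(name);
            doc.appendChild(plan);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the document to a string with an identity transform. */
    static String toXml(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "plan.dtd"); // adds a DOCTYPE line
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample()));
    }
}
```

The DOCTYPE_SYSTEM property is what makes the saved file refer back to its DTD, so a later validating parse can find it.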

 

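To sum up: having worked through all of the above, I can probably say I have made a start with XML. There is still a lot I do not really understand, so the study continues.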
