Easy Java/XML Integration with JDOM (Part 3)



Reading the document type

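Now let's look at how to read the details of a document. One thing many XML documents have is a document type, which JDOM represents with the DocType class. In case you are not an XML expert (hey, don't feel bad, you are exactly the audience we are writing for), a document type declaration looks like this: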

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

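The first word after DOCTYPE names the element being constrained, the string after PUBLIC is the document type's public identifier, and the last string is its system identifier. The document type is available by calling getDocType() on the Document, and the DocType class provides methods for reading the individual parts of the declaration.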

DocType docType = doc.getDocType();
System.out.println("Element: " + docType.getElementName());
System.out.println("Public ID: " + docType.getPublicID());
System.out.println("System ID: " + docType.getSystemID());
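To make the three parts of a DOCTYPE declaration concrete without adding a library to the classpath, here is a minimal sketch that reads them with the JDK's built-in DOM parser. This is a stand-in, not JDOM: DocumentType.getName(), getPublicId(), and getSystemId() play the role of the DocType getters shown above. The class name DocTypeParts and the method describe() are made up for this illustration, and the entity resolver is only there so the parser does not try to download the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Assumption: the JDK's DOM API stands in for JDOM so the sketch runs as-is.
class DocTypeParts {
    // Returns { element name, public ID, system ID } from the DOCTYPE,
    // or null when the document has no DOCTYPE declaration.
    static String[] describe(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve every external entity to an empty stream so the
            // DTD named in the declaration is never fetched.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            DocumentType docType = doc.getDoctype();
            if (docType == null) {
                return null;
            }
            return new String[] {
                docType.getName(), docType.getPublicId(), docType.getSystemId()
            };
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n"
          + "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n"
          + "<html><head><title>t</title></head><body/></html>";
        String[] parts = describe(xml);
        System.out.println("Element: " + parts[0]);
        System.out.println("Public ID: " + parts[1]);
        System.out.println("System ID: " + parts[2]);
    }
}
```

The main method feeds in the same XHTML declaration used in the text, so the three printed lines correspond to the three parts described above.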

Reading document data

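Every XML document must have a root element. That element is the starting point for reaching everything else inside the document. For example, this document fragment has <web-app> as its root element: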

<web-app id="demo">
<description>Gotta fit servlets in somewhere!</description>
<distributable/>
</web-app>

The root Element instance is available directly from the Document:

Element webapp = doc.getRootElement();

From there you can access the element's attributes (such as the id above), its content, and its child elements.

Accessing child elements

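XML documents are tree structures, and any element may contain any number of child elements. For example, the <web-app> element has <description> and <distributable> as children. There are several methods for obtaining an element's children; getChild() returns null when there is no child with the given name.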

List getChildren(); // return all children
List getChildren(String name); // return all children by name
Element getChild(String name); // return first child by name

For example:

// Get a List of all direct children as Element objects
List allChildren = element.getChildren();
out.println("First kid: " + ((Element)allChildren.get(0)).getName());
// Get a List of all direct children with a given name
List namedChildren = element.getChildren("name");
// Get the first kid with a given name (null if there is none)
Element kid = element.getChild("name");

When the document structure is known in advance, getChild() makes it quick to reach nested elements. Given this XML document:

<?xml version="1.0"?>
<linux-config>
  <gui>
    <window-manager>
      <name>Enlightenment</name>
      <version>0.16.2</version>
    </window-manager>
  </gui>
</linux-config>

the following code retrieves the window manager's name directly:

String windowManager = rootElement.getChild("gui")
                                  .getChild("window-manager")
                                  .getChild("name")
                                  .getText();

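Watch out for NullPointerException here: if the document does not have the expected structure (for example, because it was never validated), any getChild() call in the chain may return null. To make simple document navigation easier, future versions of JDOM may support XPath. A child element can reach its parent with getParent().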
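One way to guard a chained lookup against a missing element is to walk the path one level at a time and stop at the first gap. The sketch below shows that idea using the JDK's built-in DOM API as a stand-in for JDOM, so it runs without extra libraries. The names SafeNavigation, firstChild(), textAt(), and parseRoot() are hypothetical helpers written for this example; firstChild() imitates the "first child by name, or null" contract described for getChild().

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Assumption: the JDK's DOM API stands in for JDOM so the sketch runs as-is.
class SafeNavigation {
    // Returns the first direct child element with the given name, or null.
    static Element firstChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        return null;
    }

    // Follows the path level by level and stops at the first missing
    // element, returning an empty Optional instead of throwing a
    // NullPointerException.
    static Optional<String> textAt(Element root, String... path) {
        Element current = root;
        for (String name : path) {
            current = firstChild(current, name);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current.getTextContent());
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(
            "<linux-config><gui><window-manager>"
          + "<name>Enlightenment</name><version>0.16.2</version>"
          + "</window-manager></gui></linux-config>");
        // Path that exists in the document
        System.out.println(textAt(root, "gui", "window-manager", "name"));
        // Path with a missing middle element
        System.out.println(textAt(root, "gui", "desktop", "name"));
    }
}
```

Returning an Optional pushes the "what if it is missing" decision to the caller, which is one reasonable design when the document has not been validated against a DTD.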

Reading element attributes

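Attributes are another piece of information that an element holds. They are familiar to any HTML programmer: a <table> element, for instance, typically carries width and border attributes.

These attributes are available directly from the element: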

String width = table.getAttributeValue("width");

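You can also retrieve the attribute itself as an Attribute instance. This ability helps JDOM support advanced concepts such as attributes that live in a namespace. (See the discussion of namespaces later in the article.)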

Attribute widthAttrib = table.getAttribute("width");
String width = widthAttrib.getValue();

For convenience, you can also retrieve an attribute's value as a primitive type:

int border = table.getAttribute("border").getIntValue();

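The value can be converted to any Java primitive type. If it cannot be converted to the requested type, a DataConversionException is thrown. If the attribute does not exist, getAttribute() returns null, so a chained call like the one above would fail with a NullPointerException.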
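The two failure modes just described, a missing attribute and a value that is not a number, can both be handled in one small helper. The sketch below illustrates the idea with the JDK's built-in DOM API rather than JDOM, so NumberFormatException takes the place of DataConversionException and hasAttribute() takes the place of the null check. The names AttributeValues, intAttribute(), and parseRoot(), as well as the sample attribute values, are invented for this example.

```java
import java.io.StringReader;
import java.util.OptionalInt;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Assumption: the JDK's DOM API stands in for JDOM so the sketch runs as-is.
class AttributeValues {
    // Returns the attribute as an int, or an empty OptionalInt when the
    // attribute is absent or its text is not a valid integer.
    static OptionalInt intAttribute(Element element, String name) {
        if (!element.hasAttribute(name)) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(
                Integer.parseInt(element.getAttribute(name).trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample values: border is numeric, width is not.
        Element table = parseRoot("<table width=\"100%\" border=\"0\"/>");
        System.out.println("border: " + intAttribute(table, "border"));
        System.out.println("width:  " + intAttribute(table, "width"));
        System.out.println("height: " + intAttribute(table, "height"));
    }
}
```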

Extracting element content

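Getting at element content is just as simple. Let's see how easy it is to extract an element's text with element.getText(). This is the standard case, and it suits elements like the following: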

<name>Enlightenment</name>

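Sometimes, though, an element contains comments, text content, and child elements all at once. In more advanced documents it may even contain processing instructions:

<table>
  <!-- Some comment -->
  Some text
  <tr>Some child</tr>
  <?pi Some processing instruction?>
</table>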

You can always obtain the text content and the child elements in the following way:

String text = table.getText(); // "Some text"
Element tr = table.getChild("tr"); // <tr> child

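This keeps the standard usage simple. Sometimes, for example when producing output, the exact order of everything inside an element matters. For that purpose you can use a special method named getMixedContent(). It returns a List whose entries may be instances of Comment, String, Element, and ProcessingInstruction, and a Java programmer can use instanceof to tell them apart. (In JDOM 1.0 and later this method is called getContent(), and text is returned as Text objects rather than Strings.) The following code prints a summary of an element's content:
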
  List mixedContent = table.getMixedContent();
  Iterator i = mixedContent.iterator();
  while (i.hasNext()) {
    Object o = i.next();
    if (o instanceof Comment) {
      // Comment has a toString()
      out.println("Comment: " + o);
    }
    else if (o instanceof String) {
      out.println("String: " + o);
    }
    else if (o instanceof ProcessingInstruction) {
      out.println("PI: " + ((ProcessingInstruction)o).getTarget());
    }
    else if (o instanceof Element) {
      out.println("Element: " + ((Element)o).getName());
    }
  }
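For comparison, the same ordered walk can be sketched with the JDK's built-in DOM API, which runs without JDOM on the classpath. Here the node type constant plays the role of the instanceof checks above. The names ContentSummary, summarize(), and parseRoot() are hypothetical, and skipping whitespace-only text is a choice made for this example rather than something the article prescribes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Assumption: the JDK's DOM API stands in for JDOM so the sketch runs as-is.
class ContentSummary {
    // Lists the direct content of an element in document order, labelling
    // each item by kind. Whitespace-only text nodes are skipped.
    static List<String> summarize(Element element) {
        List<String> lines = new ArrayList<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            switch (n.getNodeType()) {
                case Node.COMMENT_NODE:
                    lines.add("Comment: " + n.getNodeValue().trim());
                    break;
                case Node.TEXT_NODE: {
                    String text = n.getNodeValue().trim();
                    if (!text.isEmpty()) {
                        lines.add("String: " + text);
                    }
                    break;
                }
                case Node.PROCESSING_INSTRUCTION_NODE:
                    // For a processing instruction the node name is its target
                    lines.add("PI: " + n.getNodeName());
                    break;
                case Node.ELEMENT_NODE:
                    lines.add("Element: " + n.getNodeName());
                    break;
                default:
                    break;
            }
        }
        return lines;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element table = parseRoot(
            "<table><!-- Some comment -->Some text<tr>Some child</tr>"
          + "<?pi Some processing instruction?></table>");
        for (String line : summarize(table)) {
            System.out.println(line);
        }
    }
}
```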

Resources:

Read Part 2 of "Easy Java/XML Integration with JDOM," Jason Hunter and Brett McLaughlin (July 2000), to learn how to use JDOM to create and mutate XML:
http://www.javaworld.com/javaworld/jw-07-2000/jw-0728-jdom2.html

The home of JDOM:
http://jdom.org/
Mailing list sign-up for jdom-interest and jdom-announce, as well as list archives:
http://jdom.org/involved/lists.html
The JDOM announcement press release:
http://www.oreillynet.com/pub/a/mediakit/pressrelease/20000427.html
More information on DOM:
http://www.w3.org/DOM/
More information on SAX:
http://www.megginson.com/SAX/
More information on JAXP:
http://java.sun.com/xml/