.NET Compact Framework Development (3): XML SAX Operations

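  • XmlTextReader is a read-only, forward-only XML parser. It throws an XmlException when a parsing error occurs and does not support DTDs. An XmlTextReader can load an XML document in several ways, and its XmlResolver property is used to resolve remote resources;
// Relative paths are not supported: use a full path, or place the file in the root directory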
XmlTextReader reader = new XmlTextReader("filename.xml");
XmlTextReader xmlReader = new XmlTextReader("http://foo.com/bar.xml");
// If the web site requires password authentication, you can use the following approach
static void Main()
{
  NetworkCredential cred =
    new NetworkCredential("usrnm", "psswd", "domain");

  Stream s =
    GetDocumentStream("http://www.foo.com/bar.xml", cred);
  XmlTextReader reader = new XmlTextReader(s);

  // Do something interesting with the XmlTextReader
}

static Stream GetDocumentStream(string address, ICredentials cred)
{
  XmlUrlResolver xur = new XmlUrlResolver();
  Uri uri = new Uri(address);

  if(cred != null)
    xur.Credentials = cred;

  try
  {
    return (Stream)xur.GetEntity(uri, null,null);
  }
  catch(Exception e)
  {
    MessageBox.Show(e.ToString());
    throw; // rethrow without resetting the stack trace
  }
}
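Besides a file name, URL, or Stream, an XmlTextReader can also be built on a TextReader, which is convenient for experiments. The following is a minimal sketch of the XmlException behaviour described above. The class and method names are illustrative, and the code is written against the desktop System.Xml API on the assumption that the Compact Framework behaves the same way.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderLoadSketch
{
    // Returns true if the text can be read to the end without an XmlException.
    public static bool IsWellFormed(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // Read() walks the document node by node, in one direction only.
            while (reader.Read()) { }
            return true;
        }
        catch (XmlException)
        {
            // Raised by the reader when it meets malformed markup.
            return false;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Calling ReaderLoadSketch.IsWellFormed with a document whose tags do not match, such as "<a><b></a>", should take the catch branch.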
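  • XmlTextReader supports namespaces when its Namespaces property is true. When it is false, the prefix and local name are combined into the local name. This property must be set before any read operation, while the reader's ReadState property is still ReadState.Initial;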
reader.Namespaces = true;
reader.MoveToContent();
MessageBox.Show("Local Name: " + reader.LocalName);
MessageBox.Show("Prefix: " + reader.Prefix);
MessageBox.Show("Namespace: " + reader.NamespaceURI);
reader.Close();
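To compare the two settings side by side, here is a small sketch that reads the same element with Namespaces switched on and off. The names NamespacesSketch and Describe and the urn:example namespace are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NamespacesSketch
{
    // Returns "[prefix|localName|namespaceURI]" for the first content node.
    public static string Describe(string xml, bool namespaces)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        // Must be set before the first read, while ReadState is Initial.
        reader.Namespaces = namespaces;
        reader.MoveToContent();
        string result = "[" + reader.Prefix + "|" + reader.LocalName +
                        "|" + reader.NamespaceURI + "]";
        reader.Close();
        return result;
    }
}
```

For the input <po:item xmlns:po='urn:example'/>, the namespace-aware call reports the prefix, local name, and URI separately, while the other call reports po:item as the local name with an empty prefix and URI.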
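  • The Normalization property of XmlTextReader controls whether white space and attribute values are normalized: adjacent white space characters are collapsed into one, and entity references are replaced by their resolved values (DTDs are not supported). Because the latter is always performed (for example, &gt; is always resolved), in practice the Compact Framework only collapses white space. This property can be changed at any point during parsing;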
public static void TestNormalization(bool normOn)
{
  string data =
     @"<Root>
     <Element attr='     &lt;Testing Normalization&gt;
     New Line'/>
      </Root>";
     StringReader str = new StringReader(data);
     XmlTextReader reader = new XmlTextReader(str);
     reader.WhitespaceHandling=WhitespaceHandling.None;

     reader.MoveToContent();
     reader.ReadStartElement("Root");

     reader.Normalization=normOn;
     MessageBox.Show("Normalization On: " + normOn.ToString());
     MessageBox.Show("attr's Value: " + reader.GetAttribute("attr"));
     reader.Close();
}

public static void Main(string[] args)
{

     // Test with normalization off
     TestNormalization(false);
     // Test with normalization on
     TestNormalization(true);
}
  • The WhitespaceHandling property of XmlTextReader determines how white space is handled:
All  ------    Both Whitespace and SignificantWhitespace nodes are returned.
None  ------   No Whitespace or SignificantWhitespace nodes are returned.
Significant  ------    Only SignificantWhitespace is returned.
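The effect of the three values can be checked by counting the white space nodes the reader returns. This is a sketch only; WhitespaceSketch and CountWhitespaceNodes are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WhitespaceSketch
{
    // Counts Whitespace and SignificantWhitespace nodes reported by the reader.
    public static int CountWhitespaceNodes(string xml, WhitespaceHandling handling)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = handling;
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Whitespace ||
                reader.NodeType == XmlNodeType.SignificantWhitespace)
            {
                count++;
            }
        }
        reader.Close();
        return count;
    }
}
```

With the input "<a>\n  <b/>\n</a>", WhitespaceHandling.All reports the two runs of white space around <b/>, while None reports none. Significant also reports none here, because there is no xml:space='preserve' scope.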

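  • While parsing an XML stream, the XmlTextReader is positioned on the node it has just encountered. Each node carries four kinds of information (although not all four are meaningful for every node type): Node Name, Node Namespace, Node Value, and Node Attributes;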
NODE TYPE                              VALUE
Attribute ------   The string value of the attribute
CDATA  ------  The content of the CDATA section
Comment  ------  The content of the comment
ProcessingInstruction  ------  The entire content, not including the target
SignificantWhitespace  ------  The white space within an xml:space = 'preserve' scope
Text  ------  The content of the text node
Whitespace  ------  The white space between markup
XmlDeclaration  ------  The content of the declaration

NODE TYPE            Available Attribute
Element  ------  Any custom attribute
XmlDeclaration  ------  Version, encoding, & standalone

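  • XmlTextReader exposes the type of the current node through the NodeType property. The HasValue property tells whether the current node has a Value, and the HasAttributes property whether it has attributes. Call the Read method repeatedly to walk through the XML content;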
while(reader.Read())
{
  switch(reader.NodeType)
  {
  case XmlNodeType.Element:
    if(reader.IsEmptyElement)
      MessageBox.Show("<" + reader.Name + "/>");
    else
      MessageBox.Show("<" + reader.Name + ">");
    break;
  case XmlNodeType.EndElement:
    MessageBox.Show("</" + reader.Name + ">");
    break;
  case XmlNodeType.CDATA:
    MessageBox.Show("<![CDATA[" + reader.Value + "]]>");
    break;
  case XmlNodeType.Comment:
    MessageBox.Show("<!-- " + reader.Value + " -->");
    break;
  case XmlNodeType.Document:
    MessageBox.Show("Reading an XML document");
    break;
  case XmlNodeType.DocumentFragment:
    MessageBox.Show("Reading an XML document fragment");
    break;
  case XmlNodeType.ProcessingInstruction:
    MessageBox.Show("<? " +
              reader.Name + " " +
              reader.Value + "?>");
    break;
  case XmlNodeType.Text:
    MessageBox.Show("Text: " + reader.Value);
    break;
  case XmlNodeType.XmlDeclaration:
    MessageBox.Show("<?xml " + reader.Value + "?>");
    break;
  }
}
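  • The ReadStartElement method of XmlTextReader is equivalent to calling IsStartElement followed by Read: it checks that the current node is a start element and advances to the next node. It can also take a name, or a name and namespace, to check that the current node matches. It is mainly used to skip over a start tag and move directly to the element's content. If you call ReadStartElement and consume the element's content, call ReadEndElement symmetrically (unless the element's IsEmptyElement property is true);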
using (XmlReader reader = XmlReader.Create("book3.xml")) {

  // Parse the XML document.  ReadString is used to
  // read the text content of the elements.
  reader.Read();
  reader.ReadStartElement("book");  
  reader.ReadStartElement("title");   
  Console.Write("The content of the title element:  ");
  Console.WriteLine(reader.ReadString());
  reader.ReadEndElement();
  reader.ReadStartElement("price");
  Console.Write("The content of the price element:  ");
  Console.WriteLine(reader.ReadString());
  reader.ReadEndElement();
  reader.ReadEndElement();

}
//book3.xml
<book>
  <title>Pride And Prejudice</title>
  <price>19.95</price>
</book>
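    The ReadElementString method of XmlTextReader reads the content of a text-only element. Internally it first calls MoveToContent, then reads the content and returns it as a string, and it also consumes the end tag, as if ReadEndElement had been called;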
reader.Read();
reader.ReadStartElement("Exercise");

string name = reader.ReadElementString();
string bodypart = reader.ReadElementString();

reader.Close();
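The fragment above does not show the document it reads. A self-contained variant might look like the sketch below, where the Exercise document and its Name and BodyPart child elements are assumptions made up for the example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ElementStringSketch
{
    // Reads two text-only child elements of a hypothetical <Exercise> element.
    public static string ReadExercise(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = WhitespaceHandling.None;

        reader.MoveToContent();
        reader.ReadStartElement("Exercise");

        // Each call checks the element name, returns its text,
        // and moves past the end tag.
        string name = reader.ReadElementString("Name");
        string bodyPart = reader.ReadElementString("BodyPart");

        // Balance the earlier ReadStartElement("Exercise").
        reader.ReadEndElement();
        reader.Close();
        return name + "/" + bodyPart;
    }
}
```

Passing "<Exercise><Name>Squat</Name><BodyPart>Legs</BodyPart></Exercise>" returns the two values joined by a slash.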
  • The MoveToContent method of XmlTextReader moves quickly forward to a content node (a content node is an element, an end element, an entity reference, an end entity, or non-white space text). While searching forward for a content node it skips DocumentType, ProcessingInstruction, Whitespace, and SignificantWhitespace nodes. If the current node is an attribute of a content node, the reader moves back to the element that owns the attribute. MoveToContent returns a System.Xml.XmlNodeType; if no suitable node is found before the end of the file, it returns XmlNodeType.None;
while( XmlNodeType.None != reader.MoveToContent())
{
  if(XmlNodeType.Element == reader.NodeType
     && reader.Name == "book")
  {
    MessageBox.Show(reader.ReadElementString());
  }
  else
  {
    // Advance past other content nodes; otherwise MoveToContent
    // would keep returning the same node
    reader.Read();
  }
}
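As a quick check of what MoveToContent skips, this sketch starts from a fresh reader over a document that begins with an XML declaration, a comment, and a processing instruction. The class and method names are illustrative.

```csharp
using System;
using System.IO;
using System.Xml;

public static class MoveToContentSketch
{
    // Returns "NodeType:Name" of the first content node in the document.
    public static string FirstContentNode(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        // Skips the declaration, comment, processing instruction, and white space.
        XmlNodeType type = reader.MoveToContent();
        string result = type + ":" + reader.Name;
        reader.Close();
        return result;
    }
}
```

For "<?xml version='1.0'?><!-- header --><?app hint?>\n<root>text</root>" the reader ends up on the root element.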
    When XmlTextReader reads attributes without moving the reader onto them, only the Value is available, not the Name:
public int
SearchAttributes(string value, XmlReader reader)
{
  if(!reader.HasAttributes)
    return -1;

  for(int ndx = 0;ndx<reader.AttributeCount;++ndx)
  {
    // Attributes can be indexed by position, or by attribute name / local name + namespace URI, for example reader["standalone"]
    if(reader[ndx] == value)
      return ndx;
  }
  return -1;
}
  • XmlTextReader can move from an element to its attributes with the MoveToAttribute / MoveToFirstAttribute / MoveToNextAttribute methods;
public string
SearchAttributes(string value, XmlReader reader)
{
  if(!reader.HasAttributes)
    return string.Empty;

  for(int ndx = 0;ndx<reader.AttributeCount;++ndx)
  {
    reader.MoveToAttribute(ndx);
    if(reader.Value == value)
    {
      // Capture the attribute name before moving back to the element
      string name = reader.Name;
      reader.MoveToElement();
      return name;
    }
  }
  reader.MoveToElement();
  return string.Empty;
}
    This is equivalent to:
public string
SearchAttributes(string value, XmlReader reader)
{
  if(!reader.MoveToFirstAttribute())
    return string.Empty;

  do{
    if(reader.Value == value)
    {
      // Capture the attribute name before moving back to the element
      string name = reader.Name;
      reader.MoveToElement();
      return name;
    }
  }while(reader.MoveToNextAttribute());
  reader.MoveToElement();
  return string.Empty;
}
  • After moving the reader to search through attributes, it is best to call MoveToElement so that the reader returns to the owning element;
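The attribute-walking pattern can be exercised end to end with a small sketch that lists every name/value pair and then returns to the element. AttributeSketch and ListAttributes are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class AttributeSketch
{
    // Lists "name=value;" for each attribute, then the owning element's name.
    public static string ListAttributes(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.MoveToContent();

        StringBuilder sb = new StringBuilder();
        if (reader.MoveToFirstAttribute())
        {
            do
            {
                // While positioned on an attribute, both Name and Value are available.
                sb.Append(reader.Name).Append('=').Append(reader.Value).Append(';');
            } while (reader.MoveToNextAttribute());

            // Return to the element that owns the attributes.
            reader.MoveToElement();
        }
        sb.Append("element=").Append(reader.Name);
        reader.Close();
        return sb.ToString();
    }
}
```

For "<item id='7' lang='en'/>", the result ends with the element name item, which shows that MoveToElement restored the position.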
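  • XmlTextReader provides several methods for reading element content: ReadString, ReadChars, ReadBase64, ReadBinHex, ReadInnerXml, and ReadOuterXml;
  • When the current node is an element, ReadString concatenates all of the text, white space, significant white space, and CDATA nodes inside it into a single string, stopping as soon as it meets any markup. When the current node is a text node, it concatenates the same kinds of nodes until it reaches an end tag or other markup. A return value of String.Empty means that there is no text to read, or that the current node is neither an element nor a text node;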
string data =
  @"<Root>Text data followed by mark up.<Child/></Root>";

StringReader str = new StringReader(data);
XmlTextReader reader = new XmlTextReader(str);
reader.WhitespaceHandling=WhitespaceHandling.None;

reader.MoveToContent();
MessageBox.Show("Content of Root: " + reader.ReadString());
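  • ReadChars takes three parameters: the destination buffer, the offset in the buffer at which to start writing, and the number of characters to copy. It returns the number of characters actually read. It works only on element nodes, not on text nodes, and is mainly used to read a large amount of data inside an element in chunks;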
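A chunked read with ReadChars could be sketched as follows. The buffer size and the ReadCharsSketch name are arbitrary choices for the example. Real code would normally process each chunk as it arrives rather than reassemble the whole string.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ReadCharsSketch
{
    // Reads the text content of the first element in small chunks.
    public static string ReadInChunks(string xml, int chunkSize)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.MoveToContent();            // ReadChars must start on an element node

        char[] buffer = new char[chunkSize];
        StringBuilder sb = new StringBuilder();
        int read;
        // ReadChars returns 0 once the element's content is exhausted.
        while ((read = reader.ReadChars(buffer, 0, buffer.Length)) > 0)
        {
            sb.Append(buffer, 0, read);
        }
        reader.Close();
        return sb.ToString();
    }
}
```

For example, ReadCharsSketch.ReadInChunks("<data>0123456789</data>", 4) reassembles the ten characters from several smaller reads.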
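  • ReadInnerXml and ReadOuterXml read everything inside an element and return it as a string. ReadOuterXml returns the start tag, content, and end tag of the current node, while ReadInnerXml returns only the content. Both methods work on element nodes and on attributes;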
ELEMENT XML:
<author>
  <fn>Ronnie</fn>
  <ln>Yates</ln>
</author>
POSITIONED ON  ------  <author>
ReadInnerXml  ------  <fn>Ronnie</fn><ln>Yates</ln>
ReadOuterXml  ------  <author><fn>Ronnie</fn><ln>Yates</ln></author>

ELEMENT XML:
<auth fn="Ronnie"/>
POSITIONED ON  ------  fn
ReadInnerXml  ------  Ronnie
ReadOuterXml  ------  fn="Ronnie"
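The element case from the table can be reproduced with a short sketch. InnerOuterSketch is a hypothetical name, and the input is written without white space so that the returned strings are easy to compare.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterSketch
{
    // Positions a new reader on the first element of the document.
    private static XmlTextReader OpenOnFirstElement(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.MoveToContent();
        return reader;
    }

    // Content only, without the element's own start and end tags.
    public static string Inner(string xml)
    {
        XmlTextReader reader = OpenOnFirstElement(xml);
        string result = reader.ReadInnerXml();
        reader.Close();
        return result;
    }

    // Start tag, content, and end tag.
    public static string Outer(string xml)
    {
        XmlTextReader reader = OpenOnFirstElement(xml);
        string result = reader.ReadOuterXml();
        reader.Close();
        return result;
    }
}
```

Both methods consume the element, which is why the sketch opens a separate reader for each call.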

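  • XmlTextReader also supports the Skip method and the Depth property;
  • XmlTextWriter generates an XML stream in a forward-only manner and, like the reader, does not support DTDs;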
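The text gives no example for Skip and Depth, so here is a minimal sketch. The element names skipme and keep are invented for the illustration. Skip jumps over the current element together with its children, and Depth reports how deeply the current node is nested.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SkipDepthSketch
{
    // Skips the first child of the root and reports where the reader lands.
    public static string SkipFirstChild(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = WhitespaceHandling.None;

        reader.MoveToContent();   // on the root element, Depth 0
        reader.Read();            // on the first child, Depth 1
        reader.Skip();            // jump over that child and everything inside it

        string result = reader.Name + "@" + reader.Depth;
        reader.Close();
        return result;
    }
}
```

For "<root><skipme><deep/></skipme><keep>v</keep></root>" the reader lands on keep at depth 1 without ever visiting deep.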
XmlTextWriter writer = new XmlTextWriter("output.xml", Encoding.UTF8);
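  • The Namespaces property of XmlTextWriter indicates whether namespaces are supported; the default is true. The property must be changed before any write operation (WriteState must be WriteState.Start);
XmlTextWriter writer =
  new XmlTextWriter("namespaces.xml", Encoding.UTF8);
writer.Namespaces = true;

// WriteStartElement(prefix, localName, ns) binds the prefix to the namespace
writer.WriteStartElement("po",
                         "test",
                         "http://www.fake.com");
writer.WriteEndElement();
writer.Close();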
Output
<po:test xmlns:po="http://www.fake.com" />
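  • Before you start writing XML data, set the output format of the XmlTextWriter through the Formatting property, which can be Formatting.Indented or Formatting.None. The IndentChar and Indentation properties set the indentation character and the number of characters per level. These settings can be changed at any time and apply to subsequent write operations;
  • XmlTextWriter uses WriteStartDocument to write the XML declaration; it must be the first write operation after the constructor;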
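To see the effect of these properties without touching the file system, the sketch below writes into a StringWriter. The FormattingSketch name and the element names are illustrative.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FormattingSketch
{
    // Writes a tiny document using the given indentation settings.
    public static string Write(int indentation, char indentChar)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        writer.Formatting = Formatting.Indented;
        writer.Indentation = indentation;   // number of IndentChars per level
        writer.IndentChar = indentChar;     // typically ' ' or '\t'

        writer.WriteStartElement("root");
        writer.WriteElementString("child", "text");
        writer.WriteEndElement();
        writer.Close();

        return output.ToString();
    }
}
```

Calling FormattingSketch.Write(4, ' ') places the child element on its own line behind four spaces, and Write(1, '\t') indents it with a single tab.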
XmlTextWriter writer =
  new XmlTextWriter("startdoc.xml", Encoding.UTF8);
writer.Formatting = Formatting.Indented;

writer.WriteStartDocument(true);
writer.WriteElementString("root", null);
writer.WriteEndDocument();
writer.Close();
Output
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root />
  • XmlTextWriter provides the WriteStartElement and WriteEndElement methods for writing an element step by step; WriteElementString writes an element that contains only text in a single call;
XmlTextWriter writer =
  new XmlTextWriter("startelem.xml", Encoding.UTF8);
writer.Formatting = Formatting.Indented;

writer.WriteStartDocument();
// Make a true empty element
writer.WriteStartElement("root");
writer.WriteEndElement();
// Make an empty element with start and end tags, <Empty></Empty>
// writer.WriteStartElement("Empty");
// writer.WriteFullEndElement();

writer.Close();
Output
<?xml version="1.0" encoding="utf-8"?>
<root />
    Another example:
XmlTextWriter writer =
  new XmlTextWriter("elementstring.xml", Encoding.UTF8);
writer.Formatting = Formatting.Indented;

writer.WriteStartElement("StockQuote", "http://fakequote.com");
writer.WriteElementString
  ("Symbol", "http://fakequote.com", "MSFT");

writer.WriteElementString("Value",
                          "http://fakequote.com",
                          XmlConvert.ToString(123.32));
writer.WriteEndElement();
writer.Close();
Output
<StockQuote xmlns="http://fakequote.com">
  <Symbol>MSFT</Symbol>
  <Value>123.32</Value>
</StockQuote>
  • XmlTextWriter uses the WriteStartAttribute and WriteEndAttribute methods to write an attribute step by step; it also supports WriteAttributeString for writing an attribute in a single call;
public static void Main()
{
  XmlTextWriter writer =
    new XmlTextWriter("startatt.xml", Encoding.UTF8);
  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");
  writer.WriteStartAttribute("po", "att1", "http://bogus");
  writer.WriteString("value");
  writer.WriteEndAttribute();
  writer.WriteEndElement();
  writer.Close();
}
Output
<root po:att1="value" xmlns:po="http://bogus" />
    Another example:
public static void Main()
{
  XmlTextWriter writer =
    new XmlTextWriter("attstring.xml", Encoding.UTF8);
  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");
  writer.WriteAttributeString("att1", "http://bogus", "value1");
  writer.WriteEndElement();
  writer.Close();
}
Output
<root d1p1:att1="value1" xmlns:d1p1="http://bogus" />
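  • WriteAttributeString can also write the special XML attributes xml:space and xml:lang. The former determines how white space inside the element is handled (preserve or default), and the latter states the language of the element's content;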
public static void Main()
{
  XmlTextWriter writer =
    new XmlTextWriter("nsdecl.xml", Encoding.UTF8);
  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");

  // set the xml:space attribute to preserve
  writer.WriteAttributeString("xml",
                              "space",
                              null,
                              "preserve");

  // set the xml:lang attribute to en
  writer.WriteAttributeString("xml",
                              "lang",
                              null,
                              "en");
  writer.WriteEndElement();
  writer.Close();
}
Output
<root xml:space="preserve" xml:lang="en" />
  • WriteAttributeString can also be used to declare namespaces:
public static void Main()
{
  XmlTextWriter writer =
    new XmlTextWriter("nsdecl.xml", Encoding.UTF8);
  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");

  // redefine the default namespace
  writer.WriteAttributeString("xmlns",
                              null,
                              "http://default");

  // define a namespace prefix "po"
  writer.WriteAttributeString("xmlns",
                              "po",
                              null,
                              "http://post_office");
  writer.WriteEndElement();
  writer.Close();
}
Output
<root xmlns="http://default" xmlns:po="http://post_office" />
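  • XmlTextWriter uses the WriteString method to write a content string in one call; to write the content in several pieces, use the WriteChars method. WriteBase64 and WriteBinHex write binary data;
  • XmlConvert converts .NET Compact Framework types to their XML Schema string representations and converts XML strings back to .NET Compact Framework types. This is safer than using the built-in ToString methods and System.Convert;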
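The three kinds of content writes can be combined in one small sketch. The element names and the ContentWriteSketch class are made up for the example, and the output goes to a StringWriter so that it can be inspected directly.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ContentWriteSketch
{
    // Writes text, a character buffer, and binary data into three elements.
    public static string Write()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        writer.WriteStartElement("root");

        writer.WriteStartElement("text");
        writer.WriteString("a < b");                  // markup characters are escaped
        writer.WriteEndElement();

        writer.WriteStartElement("chars");
        char[] chars = "Hello".ToCharArray();
        writer.WriteChars(chars, 0, chars.Length);    // buffer, start index, count
        writer.WriteEndElement();

        writer.WriteStartElement("bin");
        byte[] bytes = new byte[] { 1, 2, 3 };
        writer.WriteBase64(bytes, 0, bytes.Length);   // encoded as Base64 text
        writer.WriteEndElement();

        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }
}
```

In the result, the less-than sign written by WriteString appears as &lt; and the three bytes appear as the Base64 text AQID.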
  XmlTextWriter writer =
    new XmlTextWriter("xmlconvert.xml", Encoding.UTF8);
  writer.Formatting = Formatting.Indented;
  writer.Indentation = 2;
    
  writer.WriteStartElement("root");
  writer.WriteElementString("boolean",
     XmlConvert.ToString(false));
  writer.WriteElementString("Single",
    XmlConvert.ToString(Single.PositiveInfinity));
  writer.WriteElementString("Double",
    XmlConvert.ToString(Double.NegativeInfinity));
  writer.WriteElementString("DateTime",
    XmlConvert.ToString(DateTime.Now));
  writer.WriteEndElement();
  writer.Close();
 
  XmlTextReader reader = new XmlTextReader("xmlconvert.xml");
  reader.WhitespaceHandling = WhitespaceHandling.None;
 
  reader.MoveToContent();
  reader.ReadStartElement("root");
  bool b = XmlConvert.ToBoolean(reader.ReadElementString());
  float s = XmlConvert.ToSingle(reader.ReadElementString());
  double d = XmlConvert.ToDouble(reader.ReadElementString());
  DateTime dt = XmlConvert.ToDateTime(reader.ReadElementString());
  reader.Close();

  MessageBox.Show("Boolean: " + b.ToString());
  MessageBox.Show("Single: " + s.ToString());
  MessageBox.Show("Double: " + d.ToString());
  MessageBox.Show("DateTime: " + dt.ToString());
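The difference from the ordinary ToString methods is easiest to see in the strings themselves. The sketch below (the XmlConvertSketch name is hypothetical) prints the XML Schema lexical forms that XmlConvert produces and converts one of them back.

```csharp
using System;
using System.Xml;

public static class XmlConvertSketch
{
    // XML Schema string forms of a few values, joined by commas.
    public static string SchemaForms()
    {
        return XmlConvert.ToString(false) + "," +
               XmlConvert.ToString(Single.PositiveInfinity) + "," +
               XmlConvert.ToString(Double.NegativeInfinity) + "," +
               XmlConvert.ToString(1.5);
    }

    // Parses the XML Schema form of a double, independent of the current culture.
    public static double ParseDouble(string text)
    {
        return XmlConvert.ToDouble(text);
    }
}
```

XmlConvert writes booleans in lower case and infinities as INF and -INF, and it always uses a period as the decimal separator, whereas the results of the culture-sensitive ToString overloads can vary with the current culture.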
  • Miscellaneous: to write raw data to an XmlTextWriter, use the WriteRaw method; to write a comment, use WriteComment; WriteCData writes CDATA sections; WriteProcessingInstruction writes processing instructions; WriteWhitespace writes white space into the XML;
public static void Main()
{
  string rawText = "<bad_xml att=\"bad chars ='\"<>\">\n\t" +
                   "This is bad content — & < > = ' \"\n" +
                   "</bad_xml>";
  XmlTextWriter writer =
    new XmlTextWriter("xmlconvert.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;
  writer.Indentation = 2;

  writer.WriteStartElement("root");
  writer.WriteRaw(rawText);
  writer.WriteEndElement();
  writer.Close();
}
Output
<root><bad_xml att="bad chars ='"<>">
      This is bad content — & < > = ' "
</bad_xml></root>
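The other miscellaneous methods have no example above, so a brief sketch follows. The MiscWriteSketch name, the comment text, and the processing instruction target are invented for the illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class MiscWriteSketch
{
    // Writes a comment, a CDATA section, and a processing instruction.
    public static string Write()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        writer.WriteStartElement("root");
        writer.WriteComment("note");                          // <!--note-->
        writer.WriteCData("a < b");                           // content is not escaped
        writer.WriteProcessingInstruction("target", "data");  // <?target data?>
        writer.WriteEndElement();
        writer.Close();

        return output.ToString();
    }
}
```

Unlike WriteRaw, these methods add the surrounding delimiters themselves, so the caller supplies only the inner text.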