Parsing XML Files with PowerShell
Source: Internet | Published by: 下载安装包软件 | Editor: 程序博客网 | Time: 2024/06/07 10:38
When you use Windows PowerShell for lightweight software test automation, one of the most common tasks is parsing data from XML files. For example, you may want to extract test case input and expected result data from an XML test cases file, or pull results data out of an XML test results file. Compared to parsing a flat text file, parsing most XML files is a bit tricky because of XML's hierarchical structure. There are several approaches you can take when parsing XML with PowerShell. In general, the most flexible technique is to read the entire XML file into memory as an XmlDocument object and then use methods such as SelectNodes(), SelectSingleNode(), GetAttribute(), and get_InnerXml() to parse the object in memory. Let me demonstrate with a typical example. Suppose you want to parse this dummy XML test case data file:
<?xml version="1.0" ?>
<testCases>
  <testCase id="001">
    <inputs>
      <arg1 optional="no">3</arg1>
      <arg2>4</arg2>
    </inputs>
    <expected>7</expected>
  </testCase>
  <testCase id="002">
    <inputs>
      <arg1 optional="yes">5</arg1>
      <arg2>6</arg2>
    </inputs>
    <expected>11</expected>
  </testCase>
</testCases>
The dummy file represents test case data for a hypothetical Sum() method. A PowerShell script that parses this XML file produces the following output:
PS C:/XMLwithPowerShell> ./parseXML.ps1
Parsing file testCases.xml
Case ID = 001 Arg1 = 3 Optional = no Arg2 = 4 Expected value = 7
Case ID = 002 Arg1 = 5 Optional = yes Arg2 = 6 Expected value = 11
End parsing
The complete script is:
# parseXML.ps1
write-host "`nParsing file testCases.xml`n"
[System.Xml.XmlDocument] $xd = new-object System.Xml.XmlDocument
$file = resolve-path("testCases.xml")
$xd.load($file)
$nodelist = $xd.selectnodes("/testCases/testCase") # XPath is case sensitive
foreach ($testCaseNode in $nodelist) {
  $id = $testCaseNode.getAttribute("id")
  $inputsNode = $testCaseNode.selectSingleNode("inputs")
  $arg1 = $inputsNode.selectSingleNode("arg1").get_InnerXml()
  $optional = $inputsNode.selectSingleNode("arg1").getAttribute("optional")
  $arg2 = $inputsNode.selectSingleNode("arg2").get_InnerXml()
  $expected = $testCaseNode.selectSingleNode("expected").get_innerXml()
  #$expected = $testCaseNode.expected
  write-host "Case ID = $id Arg1 = $arg1 Optional = $optional Arg2 = $arg2 Expected value = $expected"
}
write-host "`nEnd parsing`n"
After the opening write-host statement, the next three statements of the script load the file testCases.xml into memory as an XmlDocument object:
[System.Xml.XmlDocument] $xd = new-object System.Xml.XmlDocument
$file = resolve-path("testCases.xml")
$xd.load($file)
I could have loaded the XML file in a single line like so:
[xml] $xd = get-content "./testCases.xml"
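The two loading approaches can be tried side by side. The following is a minimal sketch rather than part of the original script: it writes a one-case sample file to the temp directory (the file name testCases-sample.xml and the variable names are assumptions made for illustration), loads it both ways, and counts the testCase nodes each document contains.

```powershell
# Illustrative only: create a small sample file in the temp directory.
# The file name is a made-up example, not the article's testCases.xml.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'testCases-sample.xml'
$sample = @'
<?xml version="1.0" ?>
<testCases>
  <testCase id="001">
    <inputs>
      <arg1 optional="no">3</arg1>
      <arg2>4</arg2>
    </inputs>
    <expected>7</expected>
  </testCase>
</testCases>
'@
Set-Content -Path $path -Value $sample

# Approach 1: create an XmlDocument explicitly, then call Load() with a full path.
$xd1 = New-Object System.Xml.XmlDocument
$xd1.Load($path)

# Approach 2: let PowerShell convert the file's text with the [xml] cast.
[xml] $xd2 = Get-Content -Path $path

# Count the testCase nodes found in each document.
$count1 = $xd1.SelectNodes("/testCases/testCase").Count
$count2 = $xd2.SelectNodes("/testCases/testCase").Count
Write-Host "Load(): $count1 node(s); [xml] cast: $count2 node(s)"

# Clean up the sample file.
Remove-Item -Path $path
```

Note that Load() is a .NET method, and .NET's current directory can differ from PowerShell's current location, which is one reason to hand it a full path (as resolve-path does in the script) rather than a relative one.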
Using the three-statement approach in the script has no technical advantage but is somewhat more readable to an engineer with C# coding experience. Next I fetch all the testCase nodes into memory:
$nodelist = $xd.selectnodes("/testCases/testCase")
The SelectNodes() method accepts an XPath string which is case-sensitive. With the testCase nodes now in memory I can iterate through each node with a foreach loop. Alternatively I could have iterated using a for loop with an index variable (say $i) in conjunction with the Item() method. For each node, I first fetch the test case ID attribute:
$id = $testCaseNode.getAttribute("id")
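The two points made above, that the XPath string is case-sensitive and that a for loop with Item() can stand in for foreach, can be checked with a short sketch. The inline XML string and the variable names $wrongCase, $ids, and $node are assumptions for illustration and are not part of the original script.

```powershell
# Illustrative data: an inline document holding only the two test case IDs.
[xml] $xd = '<testCases><testCase id="001" /><testCase id="002" /></testCases>'

# XPath is case-sensitive, so an all-lowercase path matches nothing.
$wrongCase = $xd.SelectNodes("/testcases/testcase")
Write-Host "Lowercase XPath matched $($wrongCase.Count) node(s)"

# Index-based iteration with Item() instead of foreach.
$nodelist = $xd.SelectNodes("/testCases/testCase")
$ids = @()
for ($i = 0; $i -lt $nodelist.Count; $i++) {
    $node = $nodelist.Item($i)
    $id = $node.GetAttribute("id")
    $ids += $id
    Write-Host "Case ID = $id"
}
```

The foreach form in the script is shorter; the indexed form is mainly useful when the position of a node matters.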
I use the GetAttribute() method of the XmlElement class. Interestingly I could have written this instead:
$id = $testCaseNode.id
This alternative illustrates an important point. In an effort to make parsing XML with PowerShell easier than with C# or VB.NET, the designers of PowerShell decided to directly expose attributes and values of XML elements in the form of properties. But since arbitrary XML data is available as properties, PowerShell does not expose standard .NET Framework properties (such as InnerXml) because there could be a name conflict. Note that PowerShell does expose standard .NET Framework methods such as GetAttribute(). Continuing in my script, next I grab the values of arg1:
$inputsNode = $testCaseNode.selectSingleNode("inputs")
$arg1 = $inputsNode.selectSingleNode("arg1").get_InnerXml()
$optional = $inputsNode.selectSingleNode("arg1").getAttribute("optional")
I use the SelectSingleNode() method to grab the single <inputs> node. Now instead of using the standard InnerXml property, which PowerShell does not expose, I use the underlying get_InnerXml() method which corresponds to the non-exposed InnerXml property. OK, but just how did I know about this get_InnerXml() method? As with many PowerShell scripting tasks, before writing my script I had previously experimented by issuing interactive commands at the PowerShell prompt. For example, after interactively loading the XML file into memory (by typing the three loading statements from my script), I typed commands such as:
> $nodelist = $xd.selectnodes("/testCases/testCase")
> $firstnode = $nodelist.item(0)
> $inputs = $firstnode.selectSingleNode("inputs")
> $arg1 = $inputs.selectSingleNode("arg1")
> $arg1 | get-member | more
Using the get-member cmdlet is the key to discovering exactly what properties and methods are available to an object. Anyway, the rest of the script should be reasonably self-explanatory because I use the same coding techniques. To summarize, although there are several ways to parse an XML file using PowerShell, a flexible approach is to use the XmlDocument class. After reading an XML file into memory as an XmlDocument object, you can select multiple nodes into a collection using the SelectNodes() method, grab a single node using the SelectSingleNode() method, retrieve an attribute using either the standard GetAttribute() method or the name of the attribute which PowerShell exposes as a property, and you can obtain an element value using the special get_InnerXml() PowerShell method.
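As a companion to the summary, the property-style access that PowerShell layers over XML can be sketched as follows. This is an illustration under stated assumptions rather than the author's script: the XML is an inline string holding a single test case, and '#text' is the property name PowerShell uses for the text of an element that also carries attributes (arg1 here).

```powershell
# Illustrative data: one test case from the dummy file, as an inline string.
[xml] $xd = @'
<testCases>
  <testCase id="001">
    <inputs>
      <arg1 optional="no">3</arg1>
      <arg2>4</arg2>
    </inputs>
    <expected>7</expected>
  </testCase>
</testCases>
'@

foreach ($testCaseNode in $xd.SelectNodes("/testCases/testCase")) {
    $id       = $testCaseNode.id                    # attribute exposed as a property
    $optional = $testCaseNode.inputs.arg1.optional  # attribute of a nested element
    $arg1     = $testCaseNode.inputs.arg1.'#text'   # arg1 has an attribute, so its text is under '#text'
    $arg2     = $testCaseNode.inputs.arg2           # text-only element comes back as a string
    $expected = $testCaseNode.expected
    Write-Host "Case ID = $id Arg1 = $arg1 Optional = $optional Arg2 = $arg2 Expected value = $expected"
}
```

Property-style access is compact for simple files like this one, while the SelectNodes() and SelectSingleNode() calls used in the script give explicit XPath control.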