CMarkup: fast simple C++ XML parser (study notes)

Source: Internet  Editor: 程序博客网  Time: 2024/05/29 12:13
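CMarkup is a handy open-source tool for creating and parsing XML in C++, and it covers most day-to-day development needs.
Official site: www.firstobject.com. You can download the CMarkup source code there; the download also includes a demo that is useful for learning.
Note: for some reason this site is slow to load and is sometimes unreachable, so the CMarkup source code and the parsing tool are also offered for download here.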

The following is excerpted from www.firstobject.com; note the text in red on a yellow background:
Create new XML documents, parse and modify existing XML documents from the methods of one simple C++ XML parser class.

Quick Start

  • Open the zip file and copy Markup.cpp and Markup.h into your C++ project folder
  • Add Markup.cpp and Markup.h to your project (makefile or IDE)
  • #include "Markup.h" where you use the CMarkup class

    Visual C++ specific:

  • In Visual C++ projects that use precompiled headers you will need to turn them off for Markup.cpp (see Pre-compiled Header Issue)
  • In Visual C++ to use STL string instead of MFC CString add MARKUP_STL to your C++ Preprocessor Definitions
CMarkup Methods

    This is the master list of CMarkup class methods. The CMarkup methods are based on the original EDOM design. The shaded methods are only available in the Developer Version of CMarkup.

    Initialization

  • Load: Populates the CMarkup object from a file and parses it
  • SetDoc: Populates the CMarkup object from a string and parses it

    Output

  • Save: Writes the document to file
  • GetDoc: Returns the whole document as a markup string
  • GetDocFormatted: Returns the formatted markup string of the whole document

    File mode

  • Open: Opens file, initiating file mode for read or write (append is a special case of write mode)
  • Close: Closes file and ends file mode
  • Flush: For file write mode, this flushes any partial document in memory (up to the closing tags) and the file stream itself

    Changing the current position

  • FindElem: Locates next element, optionally matching tag name or path
  • FindChildElem: Locates next child element matching tag name or path
  • FindPrevElem: Locates previous element, optionally matching tag name
  • FindPrevChildElem: Locates previous child element, optionally matching tag name
  • FindNode: Locates next node, optionally matching node type(s)
  • IntoElem: Goes "into" the current main position element such that it becomes the current parent position
  • OutOfElem: Makes the current parent position into the current main position
  • ResetPos: Resets the current position to the start of the document
  • ResetMainPos: Resets the current main position to before the first sibling
  • ResetChildPos: Resets the current child position to before the first child

    Adding to the Document

  • AddElem: Adds an element after the current main position element or last sibling
  • InsertElem: Inserts an element before the current main position element or first sibling
  • AddChildElem: Adds an element after the current child position element or last child
  • InsertChildElem: Inserts an element before the current child position element or first child
  • AddSubDoc: Adds a subdocument after the current main position element or last sibling
  • InsertSubDoc: Inserts a subdocument before the current main position element or first sibling
  • AddChildSubDoc: Adds a subdocument after the current child position element or last child
  • InsertChildSubDoc: Inserts a subdocument before the current child position element or first child
  • AddNode: Adds a node after the current node or at the end of the parent element content
  • InsertNode: Inserts a node before the current node or at the beginning of the parent element content

    Removing from the Document

  • RemoveElem: Removes the current main position element including child elements
  • RemoveChildElem: Removes the current child position element including its child elements
  • RemoveNode: Removes the current node
  • RemoveAttrib: Removes the specified attribute from the current main position element
  • RemoveChildAttrib: Removes the specified attribute from the current child position element

    Getting Values

  • GetData: Returns the string value of the current main position element or node
  • GetChildData: Returns the string value of the current child position element
  • GetElemContent: Returns the string markup content of the current main position element including child elements
  • GetSubDoc: Returns the subdocument markup string of the current main position element including child elements
  • GetChildSubDoc: Returns the subdocument markup string of the current child position element including child elements
  • GetAttrib: Returns the string value of the specified attribute of the main position element (or processing instruction)
  • GetChildAttrib: Returns the string value of the specified attribute of the child position element
  • HasAttrib: Returns true if the specified attribute exists in the main position element (or processing instruction)
  • HasChildAttrib: Returns true if the specified attribute exists in the child position element
  • GetTagName: Returns the tag name of the main position element (or processing instruction)
  • GetChildTagName: Returns the tag name of the child position element
  • FindGetData: Locates the next element matching the specified path and returns the string value

    Setting Values

  • SetData: Sets the value of the current main position element or node
  • SetChildData: Sets the value of the current child position element
  • SetElemContent: Sets the markup content of the current main position element
  • SetAttrib: Sets the value of the specified attribute of the current main position element (or processing instruction)
  • SetChildAttrib: Sets the value of the specified attribute of the current child position element
  • FindSetData: Locates the next element matching the specified path and sets the value

    Other Info

  • GetNthAttrib: Returns the name and value of attribute specified by number for the current main position element
  • GetAttribName: Returns the name of attribute specified by number for the current main position element
  • GetNodeType: Returns the node type of the current node
  • GetElemLevel: Returns the level of the current main position
  • GetElemFlags: Returns the current main position element's flags
  • SetElemFlags: Sets the current main position element's flags
  • GetOffsets: Obtains the document text offsets of the current main position
  • GetAttribOffsets: Obtains the document text offsets of the specified attribute in the current main position

    Remembering positions

  • SavePos: Saves the current position with an optional string name using a hash map
  • RestorePos: Goes to the position saved with SavePos
  • SetMapSize: Sets the size of a map for use with the SavePos and RestorePos methods
  • GetElemIndex: Returns the integer index of the current main position element
  • GotoElemIndex: Sets the current main position element to that of the given integer index
  • GetChildElemIndex: Returns the integer index of the current child position element
  • GotoChildElemIndex: Sets the current child position element to that of the given integer index
  • GetParentElemIndex: Returns the integer index of the current parent position element
  • GotoParentElemIndex: Sets the current parent position element to that of the given integer index
  • GetElemPath: Returns a string representing the absolute path of the main position element
  • GetChildElemPath: Returns a string representing the absolute path of the child position element
  • GetParentElemPath: Returns a string representing the absolute path of the parent position element

    Document Status

  • IsWellFormed: Determines if document has a single root element and properly contained elements
  • GetResult: Returns result markup from last parse or file operation
  • GetError: Returns English error/result synopsis string from last parse or file operation
  • GetDocFlags: Returns the document flags
  • SetDocFlags: Sets the document flags
  • GetDocElemCount: Returns the number of elements in the document

    Static Utility Functions

  • ReadTextFile: Reads a text file into a string
  • WriteTextFile: Writes a string to a text file
  • GetDeclaredEncoding: Returns the encoding name as a string from the XML declaration
  • EscapeText: Returns the string with special characters encoded for markup
  • UnescapeText: Returns the string with special characters unencoded for a string value
  • UTF8ToA: Converts a UTF-8 string to a non-Unicode ("ANSI") string
  • AToUTF8: Converts a non-Unicode ("ANSI") string to UTF-8
  • UTF16To8: Converts a UTF-16 string to UTF-8
  • UTF8To16: Converts a UTF-8 string to UTF-16
  • EncodeBase64: Encodes a binary data buffer to a Base64 string
  • DecodeBase64: Decodes a Base64 string to a binary data buffer
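
    As a rough illustration of what "special characters encoded for markup" means, the sketch below replaces the five characters that XML reserves with their predefined entities. escapeForMarkup is a hypothetical standalone helper written for these notes; it is not CMarkup's EscapeText, whose exact handling (of quotes, for example) may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration only; not part of CMarkup.
// Replaces the five characters XML reserves with predefined entities.
std::string escapeForMarkup(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        switch (text[i]) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += text[i];  break;
        }
    }
    return out;
}

// Small usage check: the ampersand is escaped once, not twice.
void escapeExample() {
    assert(escapeForMarkup("a<b") == "a&lt;b");
    assert(escapeForMarkup("R&D") == "R&amp;D");
}
```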

    Fast start to XML in C++

    Enough bull. You want to create XML or read and find things in XML. All you need to know about CMarkup is that it is just one object per XML document (for the API design concept see EDOM). And by the way the free firstobject XML Editor generates C++ source code for creating and navigating your own XML documents with CMarkup.

    Creating an XML Document

    To create an XML document, instantiate a CMarkup object and call AddElem to create the root element. At this point your document would simply contain the empty root element e.g. <ORDER/>. Then call IntoElem to go "inside" the ORDER element so that you can create child elements under the root element (i.e. the root element will be the "container" of the child elements).

    The following example code creates an XML document.

    CMarkup xml;
    xml.AddElem( "ORDER" );
    xml.IntoElem();
    xml.AddElem( "ITEM" );
    xml.IntoElem();
    xml.AddElem( "SN", "132487A-J" );
    xml.AddElem( "NAME", "crank casing" );
    xml.AddElem( "QTY", "1" );

    This code generates the following XML. The root is the ORDER element; notice that its start tag <ORDER> is at the beginning and its end tag </ORDER> is at the bottom. When an element is under (i.e. inside or contained by) a parent element, the parent's start tag is before it and the parent's end tag is after it. The ORDER element contains one ITEM element. That ITEM element contains 3 child elements: SN, NAME, and QTY.

    <ORDER>
    <ITEM>
    <SN>132487A-J</SN>
    <NAME>crank casing</NAME>
    <QTY>1</QTY>
    </ITEM>
    </ORDER>

    As shown in the example, you create elements under an element by calling IntoElem to make your current main position (or "place holder") into your current parent position so you can begin adding child elements. CMarkup maintains a current position in order to keep your source code shorter and simpler. This same position logic is used when navigating a document.
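
    To make the position logic concrete, here is a minimal toy model of a parent position and a main position. ToyMarkup and buildToyOrder are hypothetical names invented for these notes; this is not how CMarkup is implemented, and it leaves out the child position, attributes, escaping and output formatting. It only shows why calling IntoElem before AddElem produces nesting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ToyMarkup is a hypothetical teaching model, NOT CMarkup's implementation.
class ToyMarkup {
public:
    ToyMarkup() : parent_(0), main_(0) {
        nodes_.push_back(Node());  // slot 0 stands for the document itself
    }

    // Append an element as the last child of the parent position and make
    // it the main position. (Simplified: this toy always appends at the end.)
    bool AddElem(const std::string& name, const std::string& data = "") {
        Node n;
        n.name = name;
        n.data = data;
        n.parent = parent_;
        nodes_.push_back(n);
        main_ = nodes_.size() - 1;
        nodes_[parent_].children.push_back(main_);
        return true;
    }

    // The main position element becomes the parent position.
    bool IntoElem() {
        if (main_ == 0) return false;  // nothing to go into
        parent_ = main_;
        main_ = 0;
        return true;
    }

    // The parent position element becomes the main position again.
    bool OutOfElem() {
        if (parent_ == 0) return false;  // already at document level
        main_ = parent_;
        parent_ = nodes_[parent_].parent;
        return true;
    }

    // Serialize without whitespace so the nesting is easy to compare.
    std::string GetDoc() const {
        std::string out;
        const std::vector<std::size_t>& top = nodes_[0].children;
        for (std::size_t i = 0; i < top.size(); ++i) write(top[i], out);
        return out;
    }

private:
    struct Node {
        std::string name, data;
        std::size_t parent;
        std::vector<std::size_t> children;
        Node() : parent(0) {}
    };

    void write(std::size_t index, std::string& out) const {
        const Node& n = nodes_[index];
        if (n.data.empty() && n.children.empty()) {
            out += "<" + n.name + "/>";
            return;
        }
        out += "<" + n.name + ">" + n.data;
        for (std::size_t i = 0; i < n.children.size(); ++i) write(n.children[i], out);
        out += "</" + n.name + ">";
    }

    std::vector<Node> nodes_;
    std::size_t parent_;
    std::size_t main_;
};

// Same call sequence as the ORDER example above, run against the toy.
std::string buildToyOrder() {
    ToyMarkup xml;
    xml.AddElem("ORDER");
    assert(xml.GetDoc() == "<ORDER/>");  // just the empty root so far
    xml.IntoElem();
    xml.AddElem("ITEM");
    xml.IntoElem();
    xml.AddElem("SN", "132487A-J");
    xml.AddElem("NAME", "crank casing");
    xml.AddElem("QTY", "1");
    return xml.GetDoc();
}
```

    Dropping either IntoElem call in buildToyOrder makes the following elements siblings instead of children, which is the quickest way to see what the position does.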

    You can write the above document to file with Save:

    xml.Save( "C:\\Sample.xml" );

    And you can retrieve the XML into a string with GetDoc:

    MCD_STR strXML = xml.GetDoc();

    Markup.h defines MCD_STR to the string type you compile CMarkup for, so we use MCD_STR in these examples, but you can use your own string type explicitly (e.g. std::string or CString).

    Navigating an XML Document

    You can navigate the data right inside the same CMarkup object you created in the example above; just call ResetPos if you want to go back to the beginning of the document. Or you can populate a new CMarkup object:

    CMarkup xml;

    From a file with Load:

    xml.Load( "C:\\Sample.xml" );

    Or from an XML string with SetDoc:

    xml.SetDoc( strXML );

    In the following example, we go inside the root ORDER element and loop through all ITEM elements with FindElem to get the serial number and quantity of each with GetData. The serial number is treated as a string and the quantity is converted to an integer using atoi (MCD_2PCSZ is defined in Markup.h to return the string's const pointer).

    xml.FindElem(); // root ORDER element
    xml.IntoElem(); // inside ORDER
    while ( xml.FindElem("ITEM") )
    {
        xml.IntoElem();
        xml.FindElem( "SN" );
        MCD_STR strSN = xml.GetData();
        xml.FindElem( "QTY" );
        int nQty = atoi( MCD_2PCSZ(xml.GetData()) );
        xml.OutOfElem();
    }
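
    The loop above converts QTY with atoi, which returns 0 when the text is not a number, so a malformed quantity looks the same as a real 0. A minimal sketch of a stricter conversion follows; parseQty is a hypothetical helper that is not part of CMarkup, and it assumes the element data is available as a std::string (for example in an STL build).

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of CMarkup: converts element text to int
// and reports failure instead of silently returning 0 as atoi does.
bool parseQty(const std::string& text, int& qty) {
    if (text.empty()) return false;
    errno = 0;
    char* end = 0;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0') return false;  // no digits, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    qty = static_cast<int>(value);  // only written on success
    return true;
}

void parseQtyExample() {
    int qty = 0;
    assert(parseQty("15", qty) && qty == 15);
    assert(!parseQty("fifteen", qty));  // atoi would have returned 0 here
}
```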

    For each item we find, we call IntoElem to interrogate its child elements, and then OutOfElem afterwards. As you get accustomed to this type of navigation you will know to check in your loops to make sure there is a corresponding OutOfElem call for every IntoElem call.
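
    One way to keep IntoElem and OutOfElem paired, even when a loop body exits early, is a small scope guard. This is only a sketch: ElemScope, DepthDoc and depthAfterEarlyExit are hypothetical names, not part of CMarkup. The guard is a template over any type that offers IntoElem() and OutOfElem(), and DepthDoc is a stand-in so the sketch compiles without Markup.h.

```cpp
#include <cassert>

// Hypothetical helper, not part of CMarkup. Doc is any type that offers
// IntoElem() and OutOfElem() returning bool.
template <class Doc>
class ElemScope {
public:
    explicit ElemScope(Doc& doc) : doc_(doc), entered_(doc.IntoElem()) {}
    ~ElemScope() { if (entered_) doc_.OutOfElem(); }
    bool entered() const { return entered_; }
private:
    ElemScope(const ElemScope&);             // non-copyable
    ElemScope& operator=(const ElemScope&);
    Doc& doc_;
    bool entered_;
};

// Stand-in used only so this sketch compiles without Markup.h.
struct DepthDoc {
    int depth;
    DepthDoc() : depth(0) {}
    bool IntoElem() { ++depth; return true; }
    bool OutOfElem() { if (depth == 0) return false; --depth; return true; }
};

// Leaves the loop body early on one iteration and reports the final depth.
int depthAfterEarlyExit() {
    DepthDoc doc;
    for (int i = 0; i < 3; ++i) {
        ElemScope<DepthDoc> scope(doc);
        assert(doc.depth == 1);  // inside the element
        if (i == 1) continue;    // the destructor still calls OutOfElem
    }
    return doc.depth;
}
```

    With the real class, the same idea would presumably read ElemScope<CMarkup> scope(xml); as the first statement inside the while loop.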

    Adding Elements and Attributes

    The above example for creating a document only created one ITEM element. Here is an example that creates multiple items loaded from a previously populated data source, plus a SHIPMENT information element in which one of the elements has an attribute we set with SetAttrib.

    CMarkup xml;
    xml.AddElem( "ORDER" );
    xml.IntoElem(); // inside ORDER
    for ( int nItem=0; nItem<aItems.GetSize(); ++nItem )
    {
        xml.AddElem( "ITEM" );
        xml.IntoElem(); // inside ITEM
        xml.AddElem( "SN", aItems[nItem].strSN );
        xml.AddElem( "NAME", aItems[nItem].strName );
        xml.AddElem( "QTY", aItems[nItem].nQty );
        xml.OutOfElem(); // back out to ITEM level
    }
    xml.AddElem( "SHIPMENT" );
    xml.IntoElem(); // inside SHIPMENT
    xml.AddElem( "POC" );
    xml.SetAttrib( "type", strPOCType );
    xml.IntoElem(); // inside POC
    xml.AddElem( "NAME", strPOCName );
    xml.AddElem( "TEL", strPOCTel );

    This code generates the following XML. The root ORDER element contains 2 ITEM elements and a SHIPMENT element. The ITEM elements both contain SN, NAME and QTY elements. The SHIPMENT element contains a POC element which has a type attribute, and NAME and TEL child elements.

    <ORDER>
    <ITEM>
    <SN>132487A-J</SN>
    <NAME>crank casing</NAME>
    <QTY>1</QTY>
    </ITEM>
    <ITEM>
    <SN>4238764-A</SN>
    <NAME>bearing</NAME>
    <QTY>15</QTY>
    </ITEM>
    <SHIPMENT>
    <POC type="non-emergency">
    <NAME>John Smith</NAME>
    <TEL>555-1234</TEL>
    </POC>
    </SHIPMENT>
    </ORDER>
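
    The example above never declares aItems; its GetSize() call suggests an MFC-style array. As a sketch of the same ITEM loop over a standard container, the code below assumes a hypothetical OrderItem record and a std::vector. addItems is a template so it can be exercised here with LogDoc, a stand-in that only records calls; with the real library you would pass a CMarkup object, assuming a non-Unicode build where AddElem accepts const char* and int values as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type standing in for the undeclared aItems entries.
struct OrderItem {
    std::string strSN;
    std::string strName;
    int nQty;
};

// Doc is any type with CMarkup-style AddElem/IntoElem/OutOfElem members.
template <class Doc>
void addItems(Doc& xml, const std::vector<OrderItem>& items) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        xml.AddElem("ITEM");
        xml.IntoElem();   // inside ITEM
        xml.AddElem("SN", items[i].strSN.c_str());
        xml.AddElem("NAME", items[i].strName.c_str());
        xml.AddElem("QTY", items[i].nQty);
        xml.OutOfElem();  // back out to ITEM level
    }
}

// Stand-in that records element names so the sketch runs without Markup.h.
struct LogDoc {
    std::vector<std::string> calls;
    int depth;
    LogDoc() : depth(0) {}
    bool AddElem(const char* name, const char* = "") { calls.push_back(name); return true; }
    bool AddElem(const char* name, int) { calls.push_back(name); return true; }
    bool IntoElem() { ++depth; return true; }
    bool OutOfElem() { --depth; return true; }
};

// Feeds the two items from the XML above through addItems.
std::size_t elementsAddedForTwoItems() {
    std::vector<OrderItem> items(2);
    items[0].strSN = "132487A-J"; items[0].strName = "crank casing"; items[0].nQty = 1;
    items[1].strSN = "4238764-A"; items[1].strName = "bearing";      items[1].nQty = 15;
    LogDoc xml;
    addItems(xml, items);
    assert(xml.depth == 0);  // every IntoElem was matched by an OutOfElem
    return xml.calls.size();
}
```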

    Finding Elements

    The FindElem method goes to the next sibling element. If the optional tag name argument is specified, then it goes to the next element with a matching tag name. The element that is found becomes the current element, and the next call to FindElem will go to the next sibling or matching sibling after that current position.

    When you cannot assume the order of the elements, you must move the position back before the first sibling with ResetMainPos in between your calls to the FindElem method. Looking at the ITEM element in the above example, if someone else is creating the XML and you cannot assume the SN element is before the QTY element, then call ResetMainPos before finding the QTY element.

    {
        xml.IntoElem();
        xml.FindElem( "SN" );
        MCD_STR strSN = xml.GetData();
        xml.ResetMainPos();
        xml.FindElem( "QTY" );
        int nQty = atoi( MCD_2PCSZ(xml.GetData()) );
        xml.OutOfElem();
    }
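
    Why the reset matters is easiest to see in a toy model of one sibling list where the search only moves forward. ToySiblings and findQtyAfterSN are hypothetical names for this sketch; they imitate the behavior described above and are not CMarkup's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one sibling list, NOT CMarkup's implementation.
class ToySiblings {
public:
    ToySiblings() : pos_(0) {}

    void Add(const std::string& name, const std::string& data) {
        elems_.push_back(std::make_pair(name, data));
    }

    // Search forward only, starting after the current main position.
    bool FindElem(const std::string& name) {
        for (std::size_t i = pos_; i < elems_.size(); ++i) {
            if (elems_[i].first == name) { pos_ = i + 1; return true; }
        }
        return false;  // this toy leaves the position unchanged on failure
    }

    // Move back to before the first sibling.
    void ResetMainPos() { pos_ = 0; }

    std::string GetData() const {
        return pos_ == 0 ? std::string() : elems_[pos_ - 1].second;
    }

private:
    std::vector<std::pair<std::string, std::string> > elems_;
    std::size_t pos_;  // siblings already passed; 0 means "before the first"
};

// QTY comes before SN here, the opposite of the order the reader expects.
bool findQtyAfterSN(bool resetFirst) {
    ToySiblings item;
    item.Add("QTY", "15");
    item.Add("SN", "4238764-A");
    item.FindElem("SN");
    assert(item.GetData() == "4238764-A");
    if (resetFirst) item.ResetMainPos();
    return item.FindElem("QTY");  // without the reset, only siblings after SN are searched
}
```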

    To find the item with a particular serial number, you can loop through the ITEM elements and compare the SN element data to the serial number you are searching for. By specifying the "ITEM" element tag name in the FindElem method we ignore all other sibling elements such as the SHIPMENT element. Also, instead of going into and out of the ITEM element to look for the SN child element, we use the FindChildElem and GetChildData methods for convenience.

    xml.ResetPos(); // top of document
    xml.FindElem(); // ORDER element is root
    xml.IntoElem(); // inside ORDER
    while ( xml.FindElem("ITEM") )
    {
        xml.FindChildElem( "SN" );
        if ( xml.GetChildData() == strFindSN )
            break; // found
    }
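
    After the loop above ends, nothing records whether it stopped on a match or simply ran out of ITEM elements. A minimal sketch of wrapping the search in a function that reports the outcome is shown below. findItemBySN, ToyOrder and findExample are hypothetical names, not part of CMarkup; the template accepts any type with CMarkup-style FindElem, FindChildElem and GetChildData members, and it assumes the parent position is already the ORDER element and that GetChildData returns something comparable to std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of CMarkup.
template <class Doc>
bool findItemBySN(Doc& xml, const std::string& sn) {
    while (xml.FindElem("ITEM")) {
        // Also check that the ITEM actually has an SN child.
        if (xml.FindChildElem("SN") && xml.GetChildData() == sn)
            return true;   // main position is left on the matching ITEM
    }
    return false;          // no ITEM carries that serial number
}

// Stand-in with one SN per ITEM so the sketch runs without Markup.h.
struct ToyOrder {
    std::vector<std::string> serials;
    std::size_t passed;  // number of ITEM elements already visited
    ToyOrder() : passed(0) {}
    bool FindElem(const char*) {
        if (passed >= serials.size()) return false;
        ++passed;
        return true;
    }
    bool FindChildElem(const char*) const { return passed > 0; }
    std::string GetChildData() const { return serials[passed - 1]; }
};

// Searches the two serial numbers used in the example document.
bool findExample(const std::string& sn) {
    ToyOrder order;
    order.serials.push_back("132487A-J");
    order.serials.push_back("4238764-A");
    const bool found = findItemBySN(order, sn);
    assert(!found || order.GetChildData() == sn);
    return found;
}
```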

    You are NOT on your own

    This site has all kinds of examples of doing various XML operations. CMarkup has been widely used for many years. Of course it doesn't do everything, but almost every purpose has at least been discussed. Don't hesitate to ask if you have questions. A good place to go next is the CMarkup Methods.

