Parsing XML with DOM (Part 4)
Source: Internet. Editor: 程序博客网. Time: 2024/05/21 11:21
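In the previous post we discussed how the parser turns an XML file into a Document (I have a rough picture of how the parse works, but still need to study the underlying code properly later). Now let's look at how element nodes are retrieved from the Document once it has been built.
Before reading the code, some background is needed:
java.util.Vector: as I understand it, it is essentially a resizable array, with a set of methods layered on top for storing and retrieving data.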
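As a quick illustration of the Vector calls that appear in the Xerces code later in this post (addElement, elementAt, lastElement, size), here is a minimal sketch. The class name VectorDemo and the sample strings are made up for this example.

```java
import java.util.Vector;

class VectorDemo {
    /** Builds a Vector holding the given names, in order. */
    static Vector<String> build(String... names) {
        Vector<String> v = new Vector<>();   // starts empty, grows on demand
        for (String n : names) {
            v.addElement(n);                 // legacy equivalent of add()
        }
        return v;
    }

    public static void main(String[] args) {
        Vector<String> v = build("book1", "book2", "book3");
        System.out.println(v.size());        // number of stored elements
        System.out.println(v.elementAt(0));  // indexed access, like an array
        System.out.println(v.lastElement()); // most recently appended element
    }
}
```

Vector is also synchronized, which is the main practical difference from ArrayList.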
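From the previous post we know that Document document = builder.parse(inputStream); returns a DeferredDocumentImpl.
Let's continue through the example code line by line: Element element = document.getDocumentElement();
This calls the method inherited from the parent class com.sun.org.apache.xerces.internal.dom.CoreDocumentImpl: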
    /**
     * Convenience method, allowing direct access to the child node
     * which is considered the root of the actual document content. For
     * HTML, where it is legal to have more than one Element at the top
     * level of the document, we pick the one with the tagName
     * "HTML". For XML there should be only one top-level
     *
     * (HTML not yet supported.)
     */
    public Element getDocumentElement() {
        if (needsSyncChildren()) {
            synchronizeChildren();
        }
        return docElement;
    }
The object finally returned is a com.sun.org.apache.xerces.internal.dom.ElementImpl (for a deferred document, its subclass DeferredElementImpl).
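To see this from the caller's side, the sketch below parses a small XML string and asks for the root element. The sample XML and the class name RootElementDemo are assumptions made for illustration, since the XML file used in this series is not shown here. The printed implementation class depends on which JAXP provider the JDK picks up, so no particular class name is promised.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RootElementDemo {
    /** Parses the XML text and returns its root element. */
    static Element root(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return document.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** Convenience wrapper: tag name of the root element. */
    static String rootTag(String xml) {
        return root(xml).getTagName();
    }

    public static void main(String[] args) {
        String xml = "<books><book id=\"1\">Java</book></books>"; // sample input
        Element element = root(xml);
        System.out.println(element.getTagName());
        // Concrete implementation class; varies with the JAXP provider in use.
        System.out.println(element.getClass().getName());
    }
}
```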
Next, look at this line: NodeList bookNodes = element.getElementsByTagName("book");
    /**
     * Returns a NodeList of all descendent nodes (children,
     * grandchildren, and so on) which are Elements and which have the
     * specified tag name.
     * <p>
     * Note: NodeList is a "live" view of the DOM. Its contents will
     * change as the DOM changes, and alterations made to the NodeList
     * will be reflected in the DOM.
     *
     * @param tagname The type of element to gather. To obtain a list of
     * all elements no matter what their names, use the wild-card tag
     * name "*".
     *
     * @see DeepNodeListImpl
     */
    public NodeList getElementsByTagName(String tagname) {
        return new DeepNodeListImpl(this, tagname);
    }
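The "live" behaviour mentioned in the Javadoc can be observed through the public DOM API alone. The following is a minimal sketch: it builds a tiny tree in memory (the element names books and book are sample values), takes a NodeList, then adds one more matching element and queries the same NodeList again.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class LiveNodeListDemo {
    /** Returns {length before, length after} appending one more book element. */
    static int[] lengthsAroundAppend() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("books");
            doc.appendChild(root);
            root.appendChild(doc.createElement("book"));

            NodeList bookNodes = root.getElementsByTagName("book");
            int before = bookNodes.getLength();

            root.appendChild(doc.createElement("book")); // mutate the tree
            int after = bookNodes.getLength();           // same NodeList object
            return new int[] { before, after };
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = lengthsAroundAppend();
        System.out.println(lengths[0] + " -> " + lengths[1]); // 1 -> 2
    }
}
```

Note that getElementsByTagName itself does no searching; it only wraps the element and the tag name, and the work happens when the list is queried.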
Now this line: bookNodes.getLength(), implemented in DeepNodeListImpl:
    /** Returns the length of the node list. */
    public int getLength() {
        // Preload all matching elements. (Stops when we run out of subtree!)
        item(java.lang.Integer.MAX_VALUE);
        return nodes.size();
    }
    /** Returns the node at the specified index. */
    public Node item(int index) {
        Node thisNode;

        // Tree changed. Do it all from scratch!
        if (rootNode.changes() != changes) {
            nodes = new Vector();
            changes = rootNode.changes();
        }

        // In the cache
        if (index < nodes.size())
            return (Node) nodes.elementAt(index);

        // Not yet seen
        else {

            // Pick up where we left off (Which may be the beginning)
            if (nodes.size() == 0)
                thisNode = rootNode;
            else
                thisNode = (NodeImpl) (nodes.lastElement());

            // Add nodes up to the one we're looking for
            while (thisNode != null && index >= nodes.size()) {
                thisNode = nextMatchingElementAfter(thisNode);
                if (thisNode != null)
                    nodes.addElement(thisNode);
            }

            // Either what we want, or null (not avail.)
            return thisNode;
        }
    } // item(int):Node
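The pattern in item() is: throw the cache away if the tree's change counter has moved, answer from the cache when possible, otherwise resume the search after the last cached match. The sketch below models only that pattern on a flat list of strings. Source and LazyMatchList are hypothetical stand-ins, not Xerces classes, and the "tree" is deliberately reduced to a list so the caching logic stays visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the tree: a list of names plus a change counter. */
class Source {
    final List<String> items = new ArrayList<>();
    int changes = 0;                          // bumped on every mutation

    void add(String s) { items.add(s); changes++; }
}

/** Lazily collects the positions in a Source whose value equals a wanted name. */
class LazyMatchList {
    private final Source source;
    private final String wanted;
    private List<Integer> cache = new ArrayList<>(); // matches found so far
    private int seenChanges;

    LazyMatchList(Source source, String wanted) {
        this.source = source;
        this.wanted = wanted;
        this.seenChanges = source.changes;
    }

    /** Position in the source of the index-th match, or -1 if there is none. */
    int item(int index) {
        if (source.changes != seenChanges) {  // source changed: start over
            cache = new ArrayList<>();
            seenChanges = source.changes;
        }
        if (index < cache.size()) {           // already cached
            return cache.get(index);
        }
        // Resume scanning just after the last cached match.
        int pos = cache.isEmpty() ? 0 : cache.get(cache.size() - 1) + 1;
        while (pos < source.items.size() && index >= cache.size()) {
            if (source.items.get(pos).equals(wanted)) {
                cache.add(pos);
            }
            pos++;
        }
        return index < cache.size() ? cache.get(index) : -1;
    }

    /** Forces a full scan, then reports how many matches exist. */
    int getLength() {
        item(Integer.MAX_VALUE);
        return cache.size();
    }

    public static void main(String[] args) {
        Source s = new Source();
        s.add("book"); s.add("title"); s.add("book");
        LazyMatchList list = new LazyMatchList(s, "book");
        System.out.println(list.getLength()); // 2
        s.add("book");                        // invalidates the cache
        System.out.println(list.getLength()); // 3
    }
}
```

This also shows why getLength() is the expensive call: asking for index Integer.MAX_VALUE forces the scan to run to the end.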
In the method below, the Node current that is passed in on the first call is a DeferredElementImpl.
    /**
     * Iterative tree-walker. When you have a Parent link, there's often no
     * need to resort to recursion. NOTE THAT only Element nodes are matched
     * since we're specifically supporting getElementsByTagName().
     */
    protected Node nextMatchingElementAfter(Node current) {
        Node next;
        while (current != null) {
            // Look down to first child.
            if (current.hasChildNodes()) {
                current = (current.getFirstChild());
            }

            // Look right to sibling (but not from root!)
            else if (current != rootNode && null != (next = current.getNextSibling())) {
                current = next;
            }

            // Look up and right (but not past root!)
            else {
                next = null;
                for (; current != rootNode; // Stop when we return to starting point
                     current = current.getParentNode()) {
                    next = current.getNextSibling();
                    if (next != null)
                        break;
                }
                current = next;
            }

            // Have we found an Element with the right tagName?
            // ("*" matches anything.)
            if (current != rootNode
                    && current != null
                    && current.getNodeType() == Node.ELEMENT_NODE) {
                if (!enableNS) {
                    if (tagName.equals("*") ||
                            ((ElementImpl) current).getTagName().equals(tagName)) {
                        return current;
                    }
                } else {
                    // DOM2: Namespace logic.
                    if (tagName.equals("*")) {
                        if (nsName != null && nsName.equals("*")) {
                            return current;
                        } else {
                            ElementImpl el = (ElementImpl) current;
                            if ((nsName == null && el.getNamespaceURI() == null)
                                    || (nsName != null && nsName.equals(el.getNamespaceURI()))) {
                                return current;
                            }
                        }
                    } else {
                        ElementImpl el = (ElementImpl) current;
                        if (el.getLocalName() != null && el.getLocalName().equals(tagName)) {
                            if (nsName != null && nsName.equals("*")) {
                                return current;
                            } else {
                                if ((nsName == null && el.getNamespaceURI() == null)
                                        || (nsName != null && nsName.equals(el.getNamespaceURI()))) {
                                    return current;
                                }
                            }
                        }
                    }
                }
            }

            // Otherwise continue walking the tree
        }

        // Fell out of tree-walk; no more instances found
        return null;
    } // nextMatchingElementAfter(int):Node
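The walk order (down to the first child, else right to the next sibling, else up until some ancestor has a next sibling, never leaving the root's subtree) can be reproduced with the public org.w3c.dom API. The sketch below is a simplified version without the tag-name and namespace matching. TreeWalkDemo and its method names are invented for this example, and the XML string is sample input.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class TreeWalkDemo {
    /** Next node after current in document order, staying inside root's subtree. */
    static Node nextInDocumentOrder(Node root, Node current) {
        if (current.hasChildNodes()) {
            return current.getFirstChild();           // look down
        }
        while (current != root) {
            Node sibling = current.getNextSibling();  // look right
            if (sibling != null) {
                return sibling;
            }
            current = current.getParentNode();        // look up, then right again
        }
        return null;                                  // back at root: walk is over
    }

    /** Names of all descendant elements of the root, in document order. */
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node root = doc.getDocumentElement();
            List<String> names = new ArrayList<>();
            for (Node n = nextInDocumentOrder(root, root); n != null;
                    n = nextInDocumentOrder(root, n)) {
                if (n.getNodeType() == Node.ELEMENT_NODE) { // skip text nodes etc.
                    names.add(n.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b><c/></b>text<d/></a>")); // [b, c, d]
    }
}
```

Because every node carries parent and sibling links, no recursion or explicit stack is needed; the current node alone is enough state to continue the walk.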
Let's first see what the call current.hasChildNodes(), defined in com.sun.org.apache.xerces.internal.dom.ParentNode, actually does:
    /**
     * Test whether this node has any children. Convenience shorthand
     * for (Node.getFirstChild()!=null)
     */
    public boolean hasChildNodes() {
        if (needsSyncChildren()) {
            synchronizeChildren();
        }
        return firstChild != null;
    }
The synchronizeChildren() method called here is implemented by the subclass DeferredElementImpl as follows:
    protected final void synchronizeChildren() {
        DeferredDocumentImpl ownerDocument =
            (DeferredDocumentImpl) ownerDocument();
        ownerDocument.synchronizeChildren(this, fNodeIndex);
    } // synchronizeChildren()
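That call lands in DeferredDocumentImpl.synchronizeChildren(ParentNode, int). This is where the text node gets attached to the element (p.firstChild = firstNode;), which is why it can be retrieved later: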
    /**
     * Synchronizes the node's children with the internal structure.
     * Fluffing the children at once solves a lot of work to keep
     * the two structures in sync. The problem gets worse when
     * editing the tree -- this makes it a lot easier.
     * This is not directly used in this class but this method is
     * here so that it can be shared by all deferred subclasses of ParentNode.
     */
    protected final void synchronizeChildren(ParentNode p, int nodeIndex) {

        // we don't want to generate any event for this so turn them off
        boolean orig = getMutationEvents();
        setMutationEvents(false);

        // no need to sync in the future
        p.needsSyncChildren(false);

        // create children and link them as siblings
        ChildNode firstNode = null;
        ChildNode lastNode = null;
        for (int index = getLastChild(nodeIndex);
             index != -1;
             index = getPrevSibling(index)) {

            ChildNode node = (ChildNode) getNodeObject(index);
            if (lastNode == null) {
                lastNode = node;
            } else {
                firstNode.previousSibling = node;
            }
            node.ownerNode = p;
            node.isOwned(true);
            node.nextSibling = firstNode;
            firstNode = node;
        }
        if (lastNode != null) {
            p.firstChild = firstNode;
            firstNode.isFirstChild(true);
            p.lastChild(lastNode);
        }

        // set mutation events flag back to its original value
        setMutationEvents(orig);

    } // synchronizeChildren(ParentNode,int):void
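The loop above starts at the last child and follows previous-sibling indexes backwards, putting each newly created node in front of the ones already linked, so that firstNode ends up pointing at the real first child. A minimal sketch of that "walk backwards, prepend" idea follows. The flat prevSibling array, the Item class and the sample names are simplifications invented for this example; the real document keeps its links in its own internal tables.

```java
class ReverseLinkDemo {
    /** Hypothetical node with only a forward link. */
    static final class Item {
        final String name;
        Item next;
        Item(String name) { this.name = name; }
    }

    /**
     * prevSibling[i] holds the index of the sibling before i, or -1 for the
     * first one. Starting at the last index, each new item is placed in front
     * of the chain built so far, so the returned head is the first sibling.
     */
    static Item link(String[] names, int[] prevSibling, int last) {
        Item first = null;
        for (int i = last; i != -1; i = prevSibling[i]) {
            Item node = new Item(names[i]);
            node.next = first;   // new node goes in front of what we have
            first = node;
        }
        return first;
    }

    /** Joins the names of a chain with commas, following the forward links. */
    static String join(Item head) {
        StringBuilder sb = new StringBuilder();
        for (Item it = head; it != null; it = it.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(it.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] names = { "title", "author", "price" };
        int[] prev = { -1, 0, 1 };
        System.out.println(join(link(names, prev, 2))); // title,author,price
    }
}
```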
In getNodeObject we can see that when type == 3 (Node.TEXT_NODE), a new DeferredTextImpl object is created:
    /** Instantiates the requested node object. */
    public DeferredNode getNodeObject(int nodeIndex) {

        // is there anything to do?
        if (nodeIndex == -1) {
            return null;
        }

        // get node type
        int chunk = nodeIndex >> CHUNK_SHIFT;
        int index = nodeIndex & CHUNK_MASK;
        int type = getChunkIndex(fNodeType, chunk, index);
        if (type != Node.TEXT_NODE && type != Node.CDATA_SECTION_NODE) {
            clearChunkIndex(fNodeType, chunk, index);
        }

        // create new node
        DeferredNode node = null;
        switch (type) {

            //
            // Standard DOM node types
            //

            case Node.ATTRIBUTE_NODE: {
                if (fNamespacesEnabled) {
                    node = new DeferredAttrNSImpl(this, nodeIndex);
                } else {
                    node = new DeferredAttrImpl(this, nodeIndex);
                }
                break;
            }

            case Node.CDATA_SECTION_NODE: {
                node = new DeferredCDATASectionImpl(this, nodeIndex);
                break;
            }

            case Node.COMMENT_NODE: {
                node = new DeferredCommentImpl(this, nodeIndex);
                break;
            }

            // NOTE: Document fragments can never be "fast".
            //
            //       The parser will never ask to create a document
            //       fragment during the parse. Document fragments
            //       are used by the application *after* the parse.
            //
            // case Node.DOCUMENT_FRAGMENT_NODE: { break; }

            case Node.DOCUMENT_NODE: {
                // this node is never "fast"
                node = this;
                break;
            }

            case Node.DOCUMENT_TYPE_NODE: {
                node = new DeferredDocumentTypeImpl(this, nodeIndex);
                // save the doctype node
                docType = (DocumentTypeImpl) node;
                break;
            }

            case Node.ELEMENT_NODE: {

                if (DEBUG_IDS) {
                    System.out.println("getNodeObject(ELEMENT_NODE): " + nodeIndex);
                }

                // create node
                if (fNamespacesEnabled) {
                    node = new DeferredElementNSImpl(this, nodeIndex);
                } else {
                    node = new DeferredElementImpl(this, nodeIndex);
                }

                // save the document element node
                if (docElement == null) {
                    docElement = (ElementImpl) node;
                }

                // check to see if this element needs to be
                // registered for its ID attributes
                if (fIdElement != null) {
                    int idIndex = binarySearch(fIdElement, 0, fIdCount - 1, nodeIndex);
                    while (idIndex != -1) {

                        if (DEBUG_IDS) {
                            System.out.println("  id index: " + idIndex);
                            System.out.println("  fIdName[" + idIndex +
                                               "]: " + fIdName[idIndex]);
                        }

                        // register ID
                        String name = fIdName[idIndex];
                        if (name != null) {
                            if (DEBUG_IDS) {
                                System.out.println("  name: " + name);
                                System.out.print("getNodeObject()#");
                            }
                            putIdentifier0(name, (Element) node);
                            fIdName[idIndex] = null;
                        }

                        // continue if there are more IDs for
                        // this element
                        if (idIndex + 1 < fIdCount &&
                            fIdElement[idIndex + 1] == nodeIndex) {
                            idIndex++;
                        } else {
                            idIndex = -1;
                        }
                    }
                }
                break;
            }

            case Node.ENTITY_NODE: {
                node = new DeferredEntityImpl(this, nodeIndex);
                break;
            }

            case Node.ENTITY_REFERENCE_NODE: {
                node = new DeferredEntityReferenceImpl(this, nodeIndex);
                break;
            }

            case Node.NOTATION_NODE: {
                node = new DeferredNotationImpl(this, nodeIndex);
                break;
            }

            case Node.PROCESSING_INSTRUCTION_NODE: {
                node = new DeferredProcessingInstructionImpl(this, nodeIndex);
                break;
            }

            case Node.TEXT_NODE: {
                node = new DeferredTextImpl(this, nodeIndex);
                break;
            }

            //
            // non-standard DOM node types
            //

            case NodeImpl.ELEMENT_DEFINITION_NODE: {
                node = new DeferredElementDefinitionImpl(this, nodeIndex);
                break;
            }

            default: {
                throw new IllegalArgumentException("type: " + type);
            }

        } // switch node type

        // store and return
        if (node != null) {
            return node;
        }

        // error
        throw new IllegalArgumentException();

    } // createNodeObject(int):Node
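The first lines of getNodeObject split nodeIndex into a chunk number (nodeIndex >> CHUNK_SHIFT) and an offset inside that chunk (nodeIndex & CHUNK_MASK), which is the usual way to address a table that is stored as an array of fixed-size arrays. The sketch below shows the arithmetic in isolation. The value CHUNK_SHIFT = 8 is an assumption picked for the example; the real constants are defined in DeferredDocumentImpl and are not shown in this post.

```java
class ChunkIndexDemo {
    // Assumed values, for illustration only.
    static final int CHUNK_SHIFT = 8;
    static final int CHUNK_SIZE = 1 << CHUNK_SHIFT;  // 256 entries per chunk
    static final int CHUNK_MASK = CHUNK_SIZE - 1;    // low 8 bits

    /** Which chunk holds the entry for this node index. */
    static int chunkOf(int nodeIndex) {
        return nodeIndex >> CHUNK_SHIFT;
    }

    /** Position of the entry inside its chunk. */
    static int offsetOf(int nodeIndex) {
        return nodeIndex & CHUNK_MASK;
    }

    /** Reads one entry from a chunked table. */
    static int get(int[][] table, int nodeIndex) {
        return table[chunkOf(nodeIndex)][offsetOf(nodeIndex)];
    }

    public static void main(String[] args) {
        int[][] table = new int[3][CHUNK_SIZE];
        table[2][88] = 3;                      // pretend node 600 is a text node
        System.out.println(chunkOf(600));      // 2
        System.out.println(offsetOf(600));     // 88
        System.out.println(get(table, 600));   // 3
    }
}
```

Chunking lets the table grow by allocating another fixed-size chunk instead of copying one large array.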
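There is one thing I don't understand here: right after the statement node = new DeferredElementImpl(this, nodeIndex); executes, the node's name field already has a value.
I stepped through it many more times and noticed that the nodeIndex passed in is different on every call; I wonder whether that is the reason.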
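One plausible reading, which the code shown in this post does not confirm: a deferred node may keep only its nodeIndex at construction time and fetch its name from the document's tables the first time something asks for it. A debugger that displays the node, for example by calling toString(), would trigger that lookup, so the field would appear to be filled in immediately. The sketch below illustrates this lazy-loading idea only. LazyNode, NAME_TABLE and isLoaded are hypothetical names, not Xerces classes or methods.

```java
class LazyNameDemo {
    /** Hypothetical stand-in for the document's per-node name table. */
    static final String[] NAME_TABLE = { "books", "book", "title" };

    /** Hypothetical node that knows only its index until its name is needed. */
    static final class LazyNode {
        private final int nodeIndex;
        private String name;               // stays null until first use
        private boolean needsSync = true;

        LazyNode(int nodeIndex) {
            this.nodeIndex = nodeIndex;    // the constructor stores the index only
        }

        boolean isLoaded() {
            return !needsSync;
        }

        String getName() {
            if (needsSync) {               // first access: pull data from the table
                name = NAME_TABLE[nodeIndex];
                needsSync = false;
            }
            return name;
        }

        @Override
        public String toString() {
            return "[" + getName() + "]";  // displaying the node forces the load
        }
    }

    public static void main(String[] args) {
        LazyNode node = new LazyNode(1);
        System.out.println(node.isLoaded()); // false
        System.out.println(node);            // [book]
        System.out.println(node.isLoaded()); // true
    }
}
```

Under this reading, a different nodeIndex on each call simply selects a different row of the table, which would explain why each new node shows a different name.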