CMarkup FindNode Method

int CMarkup::FindNode( int nNodeType = 0 );

FindNode moves the current main position to the next node, whether or not it is an element. It works together with the element navigation methods to provide access to the nodes under the current parent element position in the document. Elements are nodes too, so you can use FindNode instead of FindElem to locate elements as well as other kinds of nodes. This table shows the node types:

Node	Example of Node
`MNT_ELEMENT`	`<ITEM>data</ITEM>`
`MNT_COMMENT`	`<!-- comment -->`
`MNT_PROCESSING_INSTRUCTION`	`<?xml version="1.0"?>`
`MNT_DOCUMENT_TYPE`	`<!DOCTYPE greeting SYSTEM "hello.dtd">`
`MNT_CDATA_SECTION`	`<![CDATA[data]]>`
`MNT_TEXT`	`hello`
`MNT_WHITESPACE`	(spaces, tabs, carriage returns or linefeeds between tags)
`MNT_LONE_END_TAG`	`</P>`

The FindNode method sets the current main position to the next sibling node under the current parent position that matches nNodeType, if one is specified. It returns the node type of the new current node. If it does not find a (matching) node, it returns 0 and leaves the current position where it was.
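Because FindNode returns 0 when no further node is found, it can drive a simple loop over sibling nodes. The following is a minimal sketch of that idea, not code from CMarkup itself: CountSiblingNodes is a hypothetical helper written as a template so that it accepts any object with a FindNode(int) method, and SiblingStub is a hypothetical stand-in that models only the behaviour described above so the sketch compiles without the CMarkup header.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not part of CMarkup): a flat list of sibling node
// types with a current position, modelling the documented FindNode behaviour.
struct SiblingStub {
    std::vector<int> types;  // node types of the siblings, in document order
    std::size_t pos = 0;     // 0 = before the first sibling, k = at sibling k

    int FindNode(int nNodeType = 0) {
        for (std::size_t i = pos; i < types.size(); ++i) {
            // A mask of 0 accepts any node; otherwise one bit must match.
            if (nNodeType == 0 || (types[i] & nNodeType) != 0) {
                pos = i + 1;      // move to the matching node
                return types[i];  // report its node type
            }
        }
        return 0;  // nothing found: the position is left where it was
    }
};

// Hypothetical helper: counts the remaining sibling nodes that match
// nNodeType by calling FindNode until it returns 0.
template <class Markup>
int CountSiblingNodes(Markup& xml, int nNodeType = 0) {
    int count = 0;
    while (xml.FindNode(nNodeType) != 0) {
        ++count;
    }
    return count;
}
```

With a CMarkup object the same template would presumably be called as CountSiblingNodes(xml) after xml.ResetPos(); the loop ends on the first call that returns 0.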

The following example document starts with an XML declaration (which is a processing instruction), followed by the root TESTDOC element which contains a comment and an ITEM element.

<?xml version="1.0"?>
<TESTDOC>
  <!-- comment -->
  <ITEM>one</ITEM>
</TESTDOC>

When the document has just been initialized or ResetPos has been called, FindNode() sets the main position to the XML declaration (a processing instruction). Subsequent calls to FindNode() set the main position to the whitespace node between the XML declaration and the TESTDOC element, then to the TESTDOC element, then to the newline whitespace node following the TESTDOC element (if there is a newline there). The next call returns 0 and leaves the current position at the last node found.

xml.ResetPos();
int nNodeType = xml.FindNode(); // MNT_PROCESSING_INSTRUCTION
nNodeType = xml.FindNode(); // MNT_WHITESPACE
nNodeType = xml.FindNode(); // MNT_ELEMENT
nNodeType = xml.FindNode(); // MNT_WHITESPACE
nNodeType = xml.FindNode(); // returns 0, stays at whitespace

Notice that when the current node is an element, the next call to FindNode continues after the end of that element and skips its content. To scan the nodes inside the element you must call IntoElem.
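Visiting every node, including those nested inside elements, therefore means stepping into each element as it is found and stepping back out afterwards. The sketch below illustrates that pattern under stated assumptions: CountNodesDeep is a hypothetical helper, kAssumedElementType stands in for MNT_ELEMENT with an assumed value, and Node and TreeStub are hypothetical stand-ins that model only sibling scanning plus IntoElem and OutOfElem so the code compiles without the CMarkup header.

```cpp
#include <cstddef>
#include <vector>

const int kAssumedElementType = 1;  // assumption: stand-in for MNT_ELEMENT

// Hypothetical stand-in tree node: a node type plus child nodes.
struct Node {
    int type;
    std::vector<Node> children;
};

// Hypothetical stand-in navigator (not CMarkup). Do not copy it: it keeps
// pointers into its own tree.
struct TreeStub {
    Node root;                           // holds the top-level nodes
    std::vector<const Node*> parents;    // stack of current parents
    std::vector<std::size_t> positions;  // position under each parent, 0 = before first

    explicit TreeStub(const Node& doc) : root(doc) {
        parents.push_back(&root);
        positions.push_back(0);
    }
    int FindNode(int nNodeType = 0) {
        const std::vector<Node>& sibs = parents.back()->children;
        for (std::size_t i = positions.back(); i < sibs.size(); ++i) {
            if (nNodeType == 0 || (sibs[i].type & nNodeType) != 0) {
                positions.back() = i + 1;
                return sibs[i].type;
            }
        }
        return 0;  // not found: position unchanged
    }
    bool IntoElem() {
        std::size_t p = positions.back();
        if (p == 0) return false;  // no current node
        const Node& cur = parents.back()->children[p - 1];
        if (cur.type != kAssumedElementType) return false;
        parents.push_back(&cur);  // current element becomes the parent
        positions.push_back(0);
        return true;
    }
    bool OutOfElem() {
        if (parents.size() <= 1) return false;
        parents.pop_back();  // the parent becomes the current node again
        positions.pop_back();
        return true;
    }
};

// Hypothetical helper: counts all nodes under the current parent, descending
// into each element with IntoElem and returning with OutOfElem.
template <class Markup>
int CountNodesDeep(Markup& xml) {
    int count = 0;
    for (int type = xml.FindNode(); type != 0; type = xml.FindNode()) {
        ++count;
        if (type == kAssumedElementType && xml.IntoElem()) {
            count += CountNodesDeep(xml);
            xml.OutOfElem();
        }
    }
    return count;
}
```

After OutOfElem the element that was entered is the current node again, so the next FindNode call in the loop carries on with the node that follows it.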

Update March 24, 2009: FindNode can be used with CMarkup release 11.0 developer version file read mode (see C++ XML reader). Unlike in regular mode, if a node is not found in file read mode then the current position will be at the end tag of the parent element or at the end of the document if the starting position was not within a parent element.

Specifying the node type

You can also pass the optional nNodeType argument to locate only a certain type or types of node. The node type integers can be combined with the bitwise OR operator to find the next node matching any one of the combined node types. For example, MNT_COMMENT | MNT_PROCESSING_INSTRUCTION would find comments or processing instructions.
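Combining node types this way works when each type occupies its own bit, so that a single bitwise AND tells whether a node belongs to the requested set. The sketch below illustrates the idea only. The DEMO_ constants and their numeric values are assumptions made for this example, and MatchesNodeType is a hypothetical helper, not a CMarkup function; real code should use the MNT_ definitions from the CMarkup header.

```cpp
// Assumed one-bit-per-type values, for illustration only.
enum DemoNodeType {
    DEMO_ELEMENT = 1,
    DEMO_TEXT = 2,
    DEMO_WHITESPACE = 4,
    DEMO_CDATA_SECTION = 8,
    DEMO_PROCESSING_INSTRUCTION = 16,
    DEMO_COMMENT = 32,
    DEMO_DOCUMENT_TYPE = 64
};

// Hypothetical helper: true when foundType is one of the types combined in
// nNodeType. A mask of 0 means "no type specified", so every node matches.
inline bool MatchesNodeType(int foundType, int nNodeType) {
    return nNodeType == 0 || (foundType & nNodeType) != 0;
}
```

For instance, MatchesNodeType(DEMO_COMMENT, DEMO_COMMENT | DEMO_PROCESSING_INSTRUCTION) is true, while the same mask rejects DEMO_ELEMENT.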

If the main position is the TESTDOC element, calling IntoElem() and then FindNode(MNT_COMMENT) will locate the Comment node.

xml.ResetPos();
xml.FindNode(MNT_ELEMENT); // TESTDOC
xml.IntoElem(); // TESTDOC becomes current parent
xml.FindNode(MNT_COMMENT); // comment

MNT_EXCLUDE_WHITESPACE is a precombined define for all node types except whitespace, so that you can scan for non-whitespace nodes.

xml.ResetPos();
xml.FindNode(MNT_EXCLUDE_WHITESPACE); // XML Declaration
xml.FindNode(MNT_EXCLUDE_WHITESPACE); // TESTDOC root element
xml.FindNode(MNT_EXCLUDE_WHITESPACE); // returns 0, stays at TESTDOC

Whitespace in mixed content

Whitespace nodes consist of any combination of spaces, tabs, carriage returns and linefeeds that exist between markup tags in the XML. With these in addition to the obvious types of nodes, CMarkup provides access to every character in the document and gives complete control over the appearance of the XML when viewed as text. Leading or trailing whitespace in a text node is not considered a separate whitespace node; it is part of the adjacent text node. In the following example there are two nodes at the top level, and three nodes immediately inside the TEST element.

<TEST>
<INFO> this <B>is</B> <I>mixed</I> </INFO>
</TEST>

The individual breakdown of nodes in this document can be represented as follows:

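(1) Element TEST
  (1.1) Whitespace newline
  (1.2) Element INFO
    (1.2.1) Text " this "
    (1.2.2) Element B
      (1.2.2.1) Text "is"
    (1.2.3) Whitespace " "
    (1.2.4) Element I
      (1.2.4.1) Text "mixed"
    (1.2.5) Whitespace " "
  (1.3) Whitespace newline
(2) Whitespace newline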

Whitespace nodes are often ignored. However, if you are processing mixed content such as that in the INFO element, you would treat the whitespace nodes just as you treat the text nodes.
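The distinction between a whitespace node and a text node described in this section comes down to whether the characters between two tags are all whitespace. A minimal sketch of that test follows. IsWhitespaceOnly is a hypothetical helper written for illustration, not part of CMarkup and not its parser logic.

```cpp
#include <string>

// Hypothetical helper: true when the text between two tags is non-empty and
// consists only of spaces, tabs, carriage returns and linefeeds, which is
// what this section calls a whitespace node. Anything else, including text
// with leading or trailing whitespace, counts as a text node.
bool IsWhitespaceOnly(const std::string& text) {
    return !text.empty() &&
           text.find_first_not_of(" \t\r\n") == std::string::npos;
}
```

Applied to the INFO example, the single space between the B and I elements passes this test, while " this " does not because it contains non-whitespace characters.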