CMarkup GetDocFormatted Method

const MCD_STR& CMarkup::GetDocFormatted( int nFormatFlags = 0 ) const;

CMarkup Developer License

GetDocFormatted is available only in CMarkup Developer and in the free XML editor's FOAL C++ scripting.

The GetDocFormatted method can be used at any time to return a formatted copy of the full document as a string. For example:

CMarkup xml;
xml.SetDoc( "<root><msg>Hello World</msg></root>" );
str sXML = xml.GetDocFormatted( 2 );

This returns the following text:

<root>
  <msg>Hello World</msg>
</root>

The document and current position are not affected by this call. It is generally used after creating or modifying the document, when it is time to transport or save it. GetDoc can return immediately because it returns the document string as it is stored in the CMarkup object. GetDocFormatted, by contrast, must process the document to format it, but it does so quickly (many megabytes per second).

Update February 5, 2011: With release 11.4, the space between an attribute name and value is formatted (previously it was untouched). Also, there are improvements in speed and memory efficiency.

Use the nFormatFlags argument to specify indentation as follows:

nFormatFlags        Indentation
0 or unspecified    align at left
2                   indent with 2 spaces
4                   indent with 4 spaces
17                  indent with 1 tab
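
The flag values listed above can be given readable names in calling code. The sketch below is illustrative only: the enum DocIndent and the helper indentUnit are hypothetical names, not part of CMarkup, and the mapping covers only the four documented values.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical names for the documented nFormatFlags values.
// These constants are NOT defined by CMarkup; they only mirror the list above.
enum DocIndent
{
    INDENT_NONE    = 0,   // align at left
    INDENT_2SPACES = 2,   // indent with 2 spaces
    INDENT_4SPACES = 4,   // indent with 4 spaces
    INDENT_TAB     = 17   // indent with 1 tab
};

// Returns the text that one nesting level is indented with for a
// documented flag value. Other values are not described in this
// section, so the sketch rejects them instead of guessing.
inline std::string indentUnit( int nFormatFlags )
{
    switch ( nFormatFlags )
    {
    case INDENT_NONE:    return "";
    case INDENT_2SPACES: return "  ";
    case INDENT_4SPACES: return "    ";
    case INDENT_TAB:     return "\t";
    default:
        throw std::invalid_argument( "undocumented nFormatFlags value" );
    }
}
```

With names like these, a call such as xml.GetDocFormatted( INDENT_TAB ) passes the value 17.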

If you want the document object itself to contain the formatted text, you must parse the formatted result. Remember that this loses the current position and any saved positions in the CMarkup object.

xml.SetDoc( xml.GetDocFormatted( 17 ) );

Mixed content

The format process can only modify whitespace (spaces, returns, newlines and tabs) deemed to be insignificant. For a group of sibling elements to be formatted, they must be separated by whitespace only.
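
The whitespace-only condition can be sketched as a small predicate. This is a minimal illustration, assuming the four characters listed are the complete set; the name isWhitespaceOnly is hypothetical and this is not CMarkup's own code.

```cpp
#include <string>

// True if the text between two sibling elements consists only of
// space, return, newline and tab characters. An empty string also
// counts, since there is no text that formatting could disturb.
inline bool isWhitespaceOnly( const std::string& sText )
{
    return sText.find_first_not_of( " \r\n\t" ) == std::string::npos;
}
```

Any other character between siblings means the text is treated as content, so the surrounding whitespace cannot be safely changed.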

GetDocFormatted is not intended for HTML or XML documents with mixed content. The contents of elements containing a mix of elements and text (as in HTML paragraphs) will usually be left untouched. However, you can have an HTML paragraph that looks like two discrete elements:


The format process would put the em and strong elements on separate lines, causing them to render as two words instead of one.
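
To make the effect concrete, the sketch below uses a hypothetical paragraph and an assumed formatted version of it. The helper renderedText is also an assumption for illustration: a deliberately crude stand-in for a browser that drops tags and collapses each run of whitespace to a single space.

```cpp
#include <string>

// Hypothetical fragment before formatting: em and strong touch each other.
const std::string sBefore = "<p><em>un</em><strong>usual</strong></p>";
// Assumed result after formatting: each element on its own indented line.
const std::string sAfter = "<p>\n  <em>un</em>\n  <strong>usual</strong>\n</p>";

// Crude model of HTML text rendering (an assumption, not a real browser):
// tags are dropped, each run of whitespace collapses to a single space,
// and leading/trailing whitespace is trimmed.
inline std::string renderedText( const std::string& sHtml )
{
    std::string sOut;
    bool bInTag = false;
    bool bPendingSpace = false;
    for ( char c : sHtml )
    {
        if ( c == '<' )
            bInTag = true;
        else if ( c == '>' )
            bInTag = false;
        else if ( bInTag )
            continue; // skip tag names and attributes
        else if ( c == ' ' || c == '\t' || c == '\r' || c == '\n' )
            bPendingSpace = true; // remember the run, emit at most one space
        else
        {
            if ( bPendingSpace && !sOut.empty() )
                sOut += ' ';
            bPendingSpace = false;
            sOut += c;
        }
    }
    return sOut;
}
```

Under this model, renderedText( sBefore ) gives "unusual" while renderedText( sAfter ) gives "un usual": the newline and indentation inserted between the two elements survive as a visible space.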