XML Namespaces are an attempt by the creators of the XML standard to systematize the development of markup vocabularies of element and attribute names (see Namespaces in XML for the motivation behind XML namespaces and all the gory details). The bottom line is that XML Namespaces often become an obstacle to developers needing to deploy applications and parse existing documents, but CMarkup allows you to overcome these obstacles painlessly.
XML Namespaces by Example, Tim Bray on XML.com, is a great quick start, but I'll make it even quicker and throw in default namespaces to boot. In the root element of the document you will often find attributes that declare namespaces such as:
<h:html xmlns:h="http://www.w3.org/HTML/1998/html4"
xmlns:xdc="http://www.xml.com/books">
This declares the h and xdc prefixes, which will be used in the document. It is important to see the names "http://www.w3.org/HTML/1998/html4" and "http://www.xml.com/books" simply as unique identifiers; no attempt is made to access those URLs during manipulation of the XML document. In this example the prefixes let you show that the title element is from the XDC vocabulary rather than the HTML one.
<xdc:title h:style="font-weight:bold;">
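To make the "unique identifier" point concrete, here is a minimal sketch of how a prefixed name could be expanded into a namespace name plus a local name. ExpandedName, PrefixMap and ExpandName are hypothetical names made up for this illustration; they are not part of CMarkup or MSXML, and the sketch deliberately ignores non-prefixed names.

```cpp
#include <map>
#include <string>

// Hypothetical types for illustration only (not CMarkup or MSXML API).
struct ExpandedName {
    bool resolved;              // false if the name could not be expanded
    std::string namespaceName;  // the URI, used purely as an identifier
    std::string localName;      // the part after the colon
};

// Maps a declared prefix ("h", "xdc") to its namespace name.
typedef std::map<std::string, std::string> PrefixMap;

// Expands a prefixed name such as "xdc:title" using the prefixes
// declared by xmlns:prefix attributes.
ExpandedName ExpandName(const PrefixMap& prefixes, const std::string& qname)
{
    const std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos)
        return ExpandedName{false, "", ""};   // no prefix: outside this sketch
    const PrefixMap::const_iterator it = prefixes.find(qname.substr(0, colon));
    if (it == prefixes.end())
        return ExpandedName{false, "", ""};   // prefix was never declared
    // The URI is only compared as a string; nothing is ever fetched.
    return ExpandedName{true, it->second, qname.substr(colon + 1)};
}
```

With the two declarations from the example in the map, "xdc:title" expands to the "http://www.xml.com/books" namespace and the local name title, while "h:style" expands to the HTML namespace.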
There is also a default namespace (declared with the xmlns attribute alone) which applies to all non-prefixed elements within and including the element where it is specified. It can be overridden by re-specifying the default namespace in a contained element (using an empty string to turn off the default). Note that in this example the second title element is from the "http://www.xml.com/books" namespace, not "http://www.w3.org/HTML/1998/html4".
<html xmlns="http://www.w3.org/HTML/1998/html4">
<head><title>Book Review</title></head>
<body>
<bookreview xmlns="http://www.xml.com/books">
<title>XML: A Primer</title>
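The scoping rule for the default namespace can be sketched with a simple stack, one entry per open element. DefaultNamespaceScope is a hypothetical class written for this illustration, not something CMarkup or MSXML provides, and it tracks only the default namespace (no prefixes, no attributes).

```cpp
#include <string>
#include <vector>

// Hypothetical tracker for illustration only: holds, for each open
// element, the default namespace in effect at that depth.
class DefaultNamespaceScope {
public:
    // Call when an element opens. Pass hasXmlns=true together with the
    // value of its xmlns attribute if the element declares one.
    void Open(bool hasXmlns, const std::string& xmlnsValue = std::string())
    {
        if (hasXmlns)
            m_stack.push_back(xmlnsValue);  // declared or overridden here ("" turns it off)
        else
            m_stack.push_back(Current());   // inherited from the parent
    }

    // Call when the element closes.
    void Close()
    {
        if (!m_stack.empty())
            m_stack.pop_back();
    }

    // Namespace of a non-prefixed element at the current depth; "" means none.
    std::string Current() const
    {
        return m_stack.empty() ? std::string() : m_stack.back();
    }

private:
    std::vector<std::string> m_stack;
};
```

Walking the example through it, the first title inherits the HTML namespace from html, while the title inside bookreview picks up "http://www.xml.com/books"; once bookreview closes, the HTML default is back in effect.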
That is basically all there is to it! Well, there are some complications, such as default namespaces not applying to attributes, and the issue of an element from one vocabulary having an attribute from another, which was hinted at in the example with the h:style attribute.
Well, as soon as you try to hide the complexity of colliding vocabularies from the developer, you are sure to cause twice the grief. Aside from making XML less human readable, the main problem with XML Namespaces really is in trying to gracefully support them in an API.
How do you do it with CMarkup? Don't do anything! Just treat all the tag names and attributes literally and you can access, create and modify any document format at all. CMarkup holds to the tenet that if you cannot greatly simplify it, don't even try.
In MSXML you cannot set the xmlns attribute the way you set other attributes; it needs to be set as part of the createNode function.
I am able to reproduce what you are observing. Any element I add under mim_0003 gets an xmlns="" attribute. This is because MSXML requires the default namespace to be set as part of the createNode function, not with setAttribute. The default namespace needs to be specified for every element. When you retrieve the XML string with GetDoc or Save, MSXML generates xmlns attributes for each element based on comparing its namespace to that of the parent. The solution in CMarkupMSXML is a function added in release 7.3 called SetDefaultNamespace. Here is how you would use it:
CMarkupMSXML xml;
xml.SetDefaultNamespace( _T("http://www.mimosa.org/TechXMLV3-0") );
xml.AddElem( _T("mim_0003") );
xml.AddChildElem( _T("connect_req") );
Essentially, what this does in MarkupMSXML.cpp is replace:
MSXMLNS::IXMLDOMElementPtr pNew = m_pDOMDoc->createElement( _bstr_t(szName) );
with:
MSXMLNS::IXMLDOMElementPtr pNew;
if ( m_strDefaultNamespace.IsEmpty() )
  pNew = m_pDOMDoc->createElement( _bstr_t(szName) );
else
  pNew = m_pDOMDoc->createNode( _variant_t((short)MSXMLNS::NODE_ELEMENT),
    _bstr_t(szName), _bstr_t(m_strDefaultNamespace) );
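To see why the xmlns="" attribute described above shows up, it can help to model the serialization rule in a few lines. This is a toy model under my own assumptions, not MSXML's actual implementation: Node and Serialize are hypothetical names, and the only rule modeled is "emit an xmlns attribute where an element's namespace differs from its parent's".

```cpp
#include <string>
#include <vector>

// Toy DOM node, hypothetical and unrelated to MSXML's real classes.
struct Node {
    std::string name;
    std::string ns;              // namespace fixed when the node is created
    std::vector<Node> children;
};

// Writes the node as a string, emitting an xmlns attribute only where
// the node's namespace differs from that of its parent.
std::string Serialize(const Node& node, const std::string& parentNs = std::string())
{
    std::string out = "<" + node.name;
    if (node.ns != parentNs)
        out += " xmlns=\"" + node.ns + "\"";
    if (node.children.empty())
        return out + "/>";
    out += ">";
    for (const Node& child : node.children)
        out += Serialize(child, node.ns);
    return out + "</" + node.name + ">";
}
```

In this model, a connect_req child created with an empty namespace under a mim_0003 root in the "http://www.mimosa.org/TechXMLV3-0" namespace is written with xmlns="", because its namespace differs from the parent's. Creating the child in the same namespace as the parent makes the extra attribute disappear, which corresponds to what passing the namespace to createNode achieves.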
MSXML tracks namespaces to ensure correctness, but in doing so creates extra work for the developer. The SetDefaultNamespace function has been added to CMarkupMSXML to make it easier. With CMarkup you just set the attributes and tags literally with no intervention (or correctness checking) from the XML tool as shown in the example below.
XML namespaces
David Carroll 28-Jul-2006
While checking KDM import, we found that the new KDM SHA-256 message format includes external namespaces - "ETM" and "KDM". The XML parser is not expecting the added namespaces in the new format. These cannot be resolved in our current system configuration (no external access and no DNS). Can you advise how best to parse both formats with CMarkup? We currently cannot resolve external DTD references from the computer doing the parsing.
Old KDM (SHA-1):
<?xml version="1.0" encoding="utf-8"?>
<DCinemaSecurityMessage xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
xmlns:enc="http://www.w3.org/2001/04/xmlenc#"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="./KDM.xsd">
....
New KDM (SHA-256):
<?xml version="1.0" encoding="utf-8"?>
<etm:DCinemaSecurityMessage
xmlns:etm="http://www.smpte-ra.org/schemas/430-3/2006/ETM"
xmlns:kdm="http://www.smpte-ra.org/schemas/430-1/2006/KDM"
xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
xmlns:xenc="http://www.w3.org/2001/04/xmlenc#">
....
You will have to adjust your tag name logic. For example, if the root element is etm:DCinemaSecurityMessage then other elements may be named like etm:UserText (just an example I found online in the SMPTE Standard for Digital Cinema). One way to do this is to test the root tag name at the start and then use the prefix there (if any) from then on. In MFC it might look like this:
xml.ResetPos();
xml.FindElem();
CString csRootTag = xml.GetTagName();
CString csPre; // namespace prefix
int nColon = csRootTag.Find(':');
if ( nColon > 0 )
  csPre = csRootTag.Left( nColon+1 );
then later on you can do tag names like
xml.FindElem( csPre+"UserText" );
xml.FindElem( csPre+"AuthenticatedPublicType" );
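If you are not using MFC, the same prefix test can be written with std::string. This is only a sketch of the idea above; GetTagPrefix is a hypothetical helper name and not a CMarkup function. It mirrors the nColon > 0 check, so a name with no colon, or with a colon in first position, yields an empty prefix.

```cpp
#include <string>

// Hypothetical helper: returns the prefix of a tag name including the
// trailing colon (for example "etm:"), or an empty string if the name
// has no prefix.
std::string GetTagPrefix(const std::string& tagName)
{
    const std::string::size_type colon = tagName.find(':');
    if (colon == std::string::npos || colon == 0)
        return std::string();
    return tagName.substr(0, colon + 1);
}
```

Because the returned prefix keeps its colon, building a tag name is a plain concatenation such as GetTagPrefix(rootTag) + "UserText", and the same code handles the old format, where the prefix is simply empty.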
Steve Reilly 29-Oct-2004
This problem only occurs when you use MarkupMSXML, not standard CMarkup.
I'm having a problem - the following code:
should produce the following [formatting added for clarity]
instead, I get
The xmlns attribute on element mim_0003 is being duplicated on the connect_req element and there seems to be no way to get rid of it. What am I doing wrong?