The Versatile Way to Program XML

CMarkup avoids the aspects of XML that are complex and prone to incompatibility and dispute. As mentioned in the EDOM Specification Draft, the EDOM methods cut to the core functionality of XML files and strings.

If you are a programmer, you probably understand the potential of encapsulating an XML string with simple tree-like methods that go to the next sibling or child and create them. Simple enough, but many tools are built around XML documents residing in files and streams rather than strings. A common question seen on the comp.text.xml newsgroup is "how do I get my DOM object into a string?"

Something you may not realize is that you can implement your own XML validation and transformation easily and efficiently using the simple CMarkup object methods. Validation with a DTD or XML Schema involves a structural definition at the top of the document or in a separate file, which in practice leads to architectural complexity and rigidity (see The Problem With DTD and XML Schema Validation). With the CMarkup object methods, however, you can check parent-child relationships and value constraints right there in your code. My experience is that keeping the software relating to your document in as few places as possible results in better flexibility as the XML structures evolve.
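To make the idea of in-code validation concrete, here is a minimal sketch. It is not CMarkup code: it uses Python's standard xml.etree.ElementTree module as a stand-in so that it runs on its own, and the ORDER, ITEM, SN and QTY element names and the rules themselves are hypothetical, chosen only for illustration. The same parent-child and value checks would be written against CMarkup's navigation methods in the same procedural style.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; element names are made up for this sketch.
ORDER_XML = """<ORDER>
  <ITEM><SN>132487A-J</SN><QTY>1</QTY></ITEM>
  <ITEM><SN>4238764-A</SN><QTY>0</QTY></ITEM>
  <NOTE>rush</NOTE>
</ORDER>"""

def validate_order(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "ORDER":
        return ["root element must be ORDER, found " + root.tag]
    for child in root:
        # Parent-child rule (assumed): ORDER may only contain ITEM elements.
        if child.tag != "ITEM":
            problems.append("unexpected child of ORDER: " + child.tag)
            continue
        # Structural rule (assumed): every ITEM needs a serial number.
        if child.find("SN") is None:
            problems.append("ITEM is missing SN")
        # Value rule (assumed): QTY must be a positive integer.
        qty = (child.findtext("QTY") or "").strip()
        if not qty.isdecimal() or int(qty) < 1:
            problems.append("ITEM has invalid QTY: " + repr(qty))
    return problems

for problem in validate_order(ORDER_XML):
    print(problem)
```

Each rule is an ordinary if statement sitting next to the code that uses the document, so when the structure changes there is only one place to update.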

DTD has its shortcomings, which is why the W3C developed the XML Schema Definition Language. XML Schema, in turn, is not without its own disputes and inadequacies. While these technologies sort themselves out, you may be able to use CMarkup simply and effectively.

XSLT is the established mechanism for generating HTML from XML. A difficulty with XSLT is that you need to use declarative logic, which often requires quite an intellectual stretch and a lot of XSL code to do simple checks that just happen to be a little unusual. Most of us think more quickly in procedural logic because of all our time spent writing procedures. With CMarkup, you can merge two XML files, process an XML file and generate an HTML file by concatenating strings (ASP/Perl style), or even generate XHTML. XSLT is powerful in some ways but not in others, and as it is improved, compatibility and version issues may arise.
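As a rough illustration of the procedural approach, the sketch below generates an HTML list from an XML document by concatenating strings, and merges two documents. Again, this is not CMarkup code: Python's standard xml.etree.ElementTree module stands in for it so the example runs on its own, and the CATALOG, BOOK and TITLE names, the price attribute and both function names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical catalog document; names are made up for this sketch.
CATALOG_XML = """<CATALOG>
  <BOOK price="12.50"><TITLE>XML &amp; You</TITLE></BOOK>
  <BOOK><TITLE>Untitled Draft</TITLE></BOOK>
</CATALOG>"""

def catalog_to_html(xml_text):
    """Build an HTML list from the catalog by concatenating strings."""
    root = ET.fromstring(xml_text)
    parts = ["<ul>"]
    for book in root.findall("BOOK"):
        # Escape text taken from the XML before placing it in HTML.
        title = escape(book.findtext("TITLE", default=""))
        price = book.get("price")
        # A check that is "a little unusual" is just an if statement here.
        if price is None:
            parts.append("<li>" + title + " (price on request)</li>")
        else:
            parts.append("<li>" + title + ": $" + escape(price) + "</li>")
    parts.append("</ul>")
    return "\n".join(parts)

def merge_catalogs(first_xml, second_xml):
    """Append every BOOK of the second catalog to the first one."""
    merged = ET.fromstring(first_xml)
    for book in ET.fromstring(second_xml).findall("BOOK"):
        merged.append(book)
    return merged

print(catalog_to_html(CATALOG_XML))
```

The missing-price case is handled inline with one branch, which is the kind of small exception that the paragraph above describes as awkward to express declaratively.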

These are problematic aspects of XML and related technologies that CMarkup purposefully avoids. Just keep in mind that you don't always have to set up a complex validation and/or transformation architecture. It may be easier and better to do it with the simple methods of CMarkup!

Comment posted: Possibly interesting article: SAX beats DOM model in XML for video games

Matt Jessick 27-Apr-2006

http://www.gamasutra.com/php-bin/news_index.php?story=9080
I am quite happy with using CMarkup for game development and use it for rather large files with good success. Perhaps my application is not as difficult as theirs... or perhaps CMarkup just shields me from perceiving the need for the author's multi-tiered solution to the problems identified. ;)
- Matt Jessick, Director of Game Development/Senior Producer, ISE Games

You hit the nail on the head. CMarkup is many times lighter and faster than a typical DOM, and is probably the ideal solution for large gaming configuration files. Unfortunately, developers often consider only the major DOM and SAX solutions and miss out on CMarkup.

The article you showed me did not mention any sizes that I noticed, but you want to stop using a typical DOM implementation after a few megabytes, while CMarkup may be adequate even up to 50 MB (some customers use it with several hundred megabytes, but it does require contiguous memory for the document text). Although the article mentioned that SAX is more complicated to implement, it did not make clear that SAX is unnecessarily so (despite the word "Simple" in its acronym) because of its callback architecture. CMarkup's API (EDOM) was designed with file-based access in mind, but with a much simpler architecture along the lines of "pull-parsing". It will eventually support reading (and writing) documents without holding the whole document in memory, using the same methods used for in-memory access.