Standard Special Characters

There are five characters that are critical to parsing XML. When they appear in the text values of an XML document they are "escaped," meaning that they are encoded so that they will not confuse the parser. For example, the less than sign < takes the form &lt;. These special characters are handled the same way in HTML too.

CMarkup escapes and unescapes (encodes and decodes) five special characters: the ampersand, the double and single quotes, and the less than and greater than signs. The ampersand is important because it is the escape character used to encode special characters. The XML specification allows both single quoted and double quoted attribute values; CMarkup uses double quotes when it writes attribute values. Prior to release 6.3, the CMarkup parser did not accept single quoted attribute values, but it has always encoded and decoded single quote characters inside attribute values.

Character                        Encoded  Usage
<  less than                     &lt;     attribute values and content
>  greater than                  &gt;     attribute values and content
&  ampersand                     &amp;    attribute values and content
'  apostrophe or single quote    &apos;   attribute values only
"  double quote                  &quot;   attribute values only
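The table above can be sketched in standalone C++ as follows. This is only an illustration of the mapping, not CMarkup's actual implementation: the function names escapeXml and unescapeXml and the inAttribute parameter are hypothetical, and the sketch assumes that quotes are escaped only when the text is destined for an attribute value, as the Usage column suggests.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of CMarkup): encodes the five standard
// special characters. Quotes are encoded only when inAttribute is true,
// following the "attribute values only" rows of the table.
std::string escapeXml(const std::string& text, bool inAttribute)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '<':  out += "&lt;";  break;
        case '>':  out += "&gt;";  break;
        case '&':  out += "&amp;"; break;
        case '\'':
            if (inAttribute) out += "&apos;"; else out += c;
            break;
        case '"':
            if (inAttribute) out += "&quot;"; else out += c;
            break;
        default:
            out += c;
        }
    }
    return out;
}

// Hypothetical inverse (not part of CMarkup): decodes only the five named
// entities in a single left-to-right pass. Any other text, including
// unrecognized entities, is copied through unchanged.
std::string unescapeXml(const std::string& text)
{
    static const struct { const char* entity; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&apos;", '\''}, {"&quot;", '"'}
    };
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const auto& e : table) {
                const std::string ent(e.entity);
                if (text.compare(i, ent.size(), ent) == 0) {
                    out += e.ch;       // emit the decoded character
                    i += ent.size();   // skip past the whole entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched)
            out += text[i++];          // ordinary character, copy as-is
    }
    return out;
}
```

Because the decoder makes a single pass, the text &amp;lt; decodes to the literal text &lt; rather than being decoded twice into <. In real code the CMarkup functions EscapeText and UnescapeText do this work; the sketch only shows the idea behind them.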

See also Numeric Character References, EscapeText and UnescapeText.

Update September 27, 2008: With CMarkup release 10.0, over 200 standard HTML entities are now also unescaped by UnescapeText.