Transformation Example: Apples to Oranges

It is truly impressive that some developers have mastered XSLT, a challenging technology that makes any moderately complex task nearly impossible. But is it really worth it? I find it funny when someone being helpful on the Microsoft XSL newsgroup produces a long stylesheet solution that appears absolutely cryptic.

Below is a 55-line transformation example taken from a post called "how to keep a running count without variable reassigment". As is typical with XSLT, the developer posing the question is having trouble implementing something that is very simple in familiar procedural languages like C++. He just wants to increment an integer for every line of output. It often turns out that there is a way to do it in XSLT, though it might strike you as non-intuitive. I provide this example to show how using CMarkup produces a much faster, cleaner and more straightforward solution to the same problem (see also Transformation Using CMarkup).

Buma writes:

the inability of variable reassignment to keep a running count in xsl is giving me fits, can someone take the following xml:

<columns>
  <column>
    <col>apple</col>
    <col>orange</col>
    <col>banana</col>
  </column>
  <column>
    <col>car</col>
    <col>train</col>
    <col>boat</col>
  </column>
  <column>
    <col>a</col>
    <col>b</col>
    <col>c</col>
  </column>
</columns>
and produce the following output?

1 apple car a
2 apple car b
3 apple car c
4 apple train a
5 apple train b
6 apple train c
7 apple boat a
8 apple boat b
9 apple boat c
10 orange car a
11 orange car b
12 orange car c
13 orange train a
14 orange train b
15 orange train c
16 orange boat a
17 orange boat b
18 orange boat c
19 banana car a
20 banana car b
21 banana car c
22 banana train a
23 banana train b
24 banana train c
25 banana boat a
26 banana boat b
27 banana boat c

Dimitre Novatchev responds:

It is quite simple, if the transformation is performed in two passes with the results being numbered in the second pass. Below is a one-pass solution:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="xsl msxsl">
<xsl:output omit-xml-declaration="yes"/>
<xsl:variable name="vrtfnextCombinations">
<xsl:for-each select="*/*">
<n>
<xsl:call-template name="numCombinations">
<xsl:with-param name="pCurGroup" select="."/>
</xsl:call-template>
<xsl:text>
</xsl:text>
</n>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="vnextCombinations"
select="msxsl:node-set($vrtfnextCombinations)/*"/>
<xsl:template match="/">
<xsl:call-template name="combineSiblings">
<xsl:with-param name="pCurGroup" select="*/*[1]"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="combineSiblings">
<xsl:param name="pCurGroup" select="/.."/>
<xsl:param name="pCurCombination"/>
<xsl:param name="pNum" select="1"/>
<xsl:choose>
<xsl:when test="$pCurGroup">
<xsl:for-each select="$pCurGroup/*">
<xsl:variable name="vcurPos" select="position()"/>
<xsl:call-template name="combineSiblings">
<xsl:with-param name="pCurGroup"
select="$pCurGroup/following-sibling::*[1]"/>
<xsl:with-param name="pCurCombination"
select="concat($pCurCombination, ., ' ')"/>
<xsl:with-param name="pNum"
select="$pNum
+
$vnextCombinations[count($pCurGroup/preceding-sibling::*)+1]
*
($vcurPos - 1)"/>
</xsl:call-template>
</xsl:for-each>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat($pNum, ' ', $pCurCombination)"/>
<xsl:text>
</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="numCombinations">
<xsl:param name="pCurGroup" select="/.."/>
<xsl:choose>
<xsl:when test="not($pCurGroup/following-sibling::*[1])">1</xsl:when>
<xsl:otherwise>
<xsl:variable name="vNextCombinations">
<xsl:call-template name="numCombinations">
<xsl:with-param name="pCurGroup"
select="$pCurGroup/following-sibling::*[1]"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="count($pCurGroup/*) * $vNextCombinations"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
If the sight of this does not make you want to avoid XSLT (see Why To Avoid XSLT), consider also that it is an order of magnitude slower than the CMarkup solution I give below. Plus, while the XSL works on the given example, it produced invalid line numbers when I added more column and col elements. It appears to depend on the number of column elements being the same as the number of col elements in each to get the line numbering right. Finding the bug in that stylesheet is your homework if you want to expend the effort :). To be fair, Dimitre said there was a two-pass solution that would be quite simple.

Unlike in the error-prone XSL above, incrementing the line count is easy in C++. The complexity is in iterating over all of the combinations, but with CMarkup we are free to implement the combinations efficiently, rather than being restricted to the pseudo-functions in XSLT. Also, the CMarkup solution is shorter and easier to grasp than the stylesheet.

int nCount = 0;
xml.FindElem();
xml.IntoElem();
RecurseCols( xml, csResult, nCount, "" );

And this function:

void RecurseCols( CMarkup& xml, CString& csResult, int& nCount, CString csRowUpToThis )
{
    while ( xml.FindChildElem( ) ) // for each col
    {
        CString csRowNow = csRowUpToThis;
        csRowNow += xml.GetChildData() + " "; // col element value
        int nIndex = xml.GetChildElemIndex(); // remember col position
        if ( xml.FindElem() ) // another column?
            RecurseCols( xml, csResult, nCount, csRowNow ); // run against all combos
        else
        {
            CString csOutput;
            // cast explicitly: a CString object should not be passed through "..."
            csOutput.Format( "%d %s\r\n", ++nCount, (LPCTSTR)csRowNow );
            csResult += csOutput;
        }
        xml.GotoChildElemIndex( nIndex ); // restore col position
    }
}
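For readers who do not have CMarkup at hand, the same recursion can be sketched in standard C++. This is only an illustration under stated assumptions, not the code from the comparison: the `Groups` type and the `RecurseGroups` and `Combine` names are hypothetical, and the parsed column and col values are assumed to be already sitting in nested vectors. As in RecurseCols, each row keeps a trailing space after the last value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsed document:
// one inner vector per column element, one string per col element.
using Groups = std::vector<std::vector<std::string>>;

// Walk group nGroup; for each value either descend into the next group
// or, at the last group, emit a numbered row.
void RecurseGroups( const Groups& groups, std::size_t nGroup,
                    std::string& result, int& nCount,
                    const std::string& rowUpToThis )
{
    for ( const std::string& col : groups[nGroup] )
    {
        std::string rowNow = rowUpToThis + col + " ";
        if ( nGroup + 1 < groups.size() ) // another group?
            RecurseGroups( groups, nGroup + 1, result, nCount, rowNow );
        else
            result += std::to_string( ++nCount ) + " " + rowNow + "\n";
    }
}

// Build the whole numbered listing for the given groups.
std::string Combine( const Groups& groups )
{
    std::string result;
    int nCount = 0;
    if ( !groups.empty() )
        RecurseGroups( groups, 0, result, nCount, "" );
    return result;
}
```

Calling `Combine` with the three groups from the example XML should give the same 27 rows listed in the question. The running count is just an `int` passed by reference, which is the whole point of the comparison.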
I compared the two solutions using CMarkup for my C++ solution and CMarkupMSXML to run the stylesheet. Below is the code I used to run the two transformations and then compare the two string results.

// Transform with MSXML plus style sheet
CMarkupMSXML msxml;
msxml.Load( "c:\\temp\\test.xml" );
CMarkupMSXML msxmlStylesheet;
msxmlStylesheet.Load( "c:\\temp\\test.xsl" );
CString csResultXSL = (LPCTSTR)msxml.m_pDOMDoc->transformNode(
    msxmlStylesheet.m_pDOMDoc );

// Transform with CMarkup
CMarkup xml;
xml.Load( "c:\\temp\\test.xml" );
int nCount = 0;
CString csRow;
CSmartStr ssResult( 1024 );
xml.FindElem();
xml.IntoElem();
// note: this call assumes RecurseCols takes the result as CSmartStr& rather than CString&
RecurseCols( xml, ssResult, nCount, csRow );

// Compare results (they are identical)
BOOL bIdentical = (ssResult.csStr == csResultXSL);
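The string comparison above only shows that the two outputs agree with each other. As an independent check of the numbering, particularly for inputs where the column elements hold different numbers of col elements, the line number of any row can also be computed directly from the positions of its values. The sketch below is an illustration in standard C++, not part of the original test; `LineNumber` and its parameters are hypothetical names. It treats the positions as digits of a mixed-radix number, so each position is weighted by the product of the sizes of the groups that follow it.

```cpp
#include <cstddef>
#include <vector>

// 1-based line number of the row that takes positions[i] (0-based) from a
// group holding sizes[i] values, with the last group varying fastest.
// Assumes positions.size() == sizes.size() and positions[i] < sizes[i].
std::size_t LineNumber( const std::vector<std::size_t>& sizes,
                        const std::vector<std::size_t>& positions )
{
    std::size_t nLine = 0;
    for ( std::size_t i = 0; i < sizes.size(); ++i )
        nLine = nLine * sizes[i] + positions[i]; // Horner-style mixed radix
    return nLine + 1;
}
```

For the example data, "orange boat a" has positions 1, 2, 0 in groups of size 3, which gives (1 * 3 + 2) * 3 + 0 + 1 = 16, matching the expected output.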
To compare performance I created a version of the XML data containing 6 column elements with 6 col elements in each, yielding a transformation result of over a megabyte of text. I wrote a little class called CSmartStr:

struct CSmartStr
{
    CSmartStr( int nStartLen ) { nStrLen = nStartLen; csStr.GetBuffer(nStrLen); };
    void operator+=( const CString& a )
    {
        if ( csStr.GetLength() + a.GetLength() > nStrLen )
        {
            // grow the buffer geometrically so appends rarely reallocate
            nStrLen = nStrLen * 2 + a.GetLength();
            csStr.GetBuffer( nStrLen );
        }
        csStr += a;
    };
    int nStrLen;
    CString csStr;
};
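The idea in CSmartStr, reserving room up front and then roughly doubling the buffer whenever an append would overflow it, carries over to standard C++. The following is a hedged sketch with std::string; `SmartStr` is a hypothetical name and this is not the class used in the timing test. Mainstream std::string implementations already grow geometrically on their own, so this mostly makes the policy explicit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical std::string analogue of CSmartStr.
struct SmartStr
{
    explicit SmartStr( std::size_t nStartLen ) { str.reserve( nStartLen ); }

    void operator+=( const std::string& a )
    {
        // Grow to twice the current capacity plus the incoming text
        // whenever the append would not fit.
        if ( str.size() + a.size() > str.capacity() )
            str.reserve( str.capacity() * 2 + a.size() );
        str += a;
    }

    std::string str;
};
```

With a result of over a megabyte built from tens of thousands of short rows, the point of either version is to keep the number of reallocations small compared with the number of appends.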
Well, we did not transform apples to oranges, but maybe I was comparing apples to oranges (declarative vs. procedural method, or XSLT vs. CMarkup). Anyway, the simple point is to consider ways other than XSLT to do transformation; you may be glad you did.
Drazen wrote a good response to me in "Quick, which is better suited for XML transformation: C++ or XSLT?" It is nicely presented, but I've got to say it ultimately only makes the point here stronger. It is significant that he uses "EXSLT" to get the

Developing in C++, using CMarkup for XML navigation and creation, is a versatile and scalable way to do transformation. CMarkup does not wrap some handy capabilities like sorting, and on the face of it this may seem like a disadvantage. Then again, many of the handy features of XSLT allow you to get started quickly but then hamper your progress on that one additional thing that is not so simple in XSLT. For example, there are many different ways of sorting different types of data, and if XSLT does provide a mechanism for extending its sorting functionality, that process must ultimately be very costly for it to preside over. This is why I began this article by referring to "moderately complex tasks", which is where investing in XSLT carries serious risks. On that note I am going to refer this discussion to the bottom of Why To Avoid XSLT.
At the end of "More on XSLT vs C++ for XML transformation" Drazen provided a two-pass solution which works in my MSXML 4.0 test, is a simpler stylesheet, and appears to run at the same speed. It is still a bit of a leap to get into the declarative frame of mind, but you can see the two templates: the first one takes the result set from the second one and puts line numbers at the beginning of the lines. Drazen concludes:
Actually, I think most people assume you need to use XSLT to do transformation, so the purpose of this article is to provide an example with CMarkup that demonstrates the potential of C++ based transformation. If you are already using CMarkup or have the option of using it, I would certainly recommend against going the XSLT route for transformation because of XSLT's deployment complexities (stylesheet, component availability and version).
Posted December 21, 2005; updated January 10, 2006. ©Copyright 2008 First Objective Software, Inc. All rights reserved.