XslCompiledTransform

Happy new year. (I was totally hibernating this winter ;-)

Recently Microsoft pushed out another CTP version of Whidbey, and I found that there is a new XSLT implementation named XslCompiledTransform (BTW the MSDN documentation is so obsolete that it still contains XsltCommand and XQueryCommand). It is in System.Data.SqlXml.dll, and (as far as I can see in Object Browser) that assembly contains nothing but XslCompiledTransform-related types. (Given the assembly file name, that is somewhat funky.)

XslCompiledTransform looks like it comes from XsltCommand, which was based on compiling the stylesheet into executable IL code, like Apache XSLTC. I wonder how many of the existing types, such as XPathExpression and XPathNodeIterator, are used in this new implementation. They might exist just to support historical extensions.

I noticed that XslCompiledTransform is pretty complete. It looks like great work. I guess it will be the best improvement in System.Xml 2.0.

Here are two important behavioral differences from the existing XslTransform:

Error recovery

According to the XSLT 1.0 specification, it is an error to add an attribute node to an element after child elements or text have already been added to it. In such cases, XslCompiledTransform throws an exception, while XslTransform ignores the attribute. Both behaviors are allowed by the specification.

Similarly, it is an error to add element nodes to an attribute as its content. Here too, XslCompiledTransform rejects such output (XslTransform doesn't).
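To make the two recovery policies concrete, here is a toy sketch in Python. It is not .NET code and not how either engine is implemented; ElementBuilder and OutputError are made-up names for illustration. It only models the first case (an attribute arriving after children), which XSLT 1.0 section 7.1.3 lets a processor either signal or silently ignore.

```python
class OutputError(Exception):
    """Raised in strict mode when the result tree would be malformed."""


class ElementBuilder:
    """Toy result-tree element (hypothetical, for illustration only)."""

    def __init__(self, name, strict=True):
        self.name = name
        self.strict = strict      # True: signal the error; False: recover
        self.attributes = {}
        self.children = []

    def add_child(self, node):
        self.children.append(node)

    def add_attribute(self, name, value):
        if self.children:
            # The element already has content, so this attribute is an error.
            if self.strict:
                raise OutputError(
                    "attribute '%s' added to <%s> after children" % (name, self.name)
                )
            return  # lenient recovery: drop the attribute silently
        self.attributes[name] = value


# Lenient mode behaves like the "ignore" policy: the attribute just vanishes.
lenient = ElementBuilder("p", strict=False)
lenient.add_child("some text")
lenient.add_attribute("id", "late")

# Strict mode behaves like the "throw" policy.
strict = ElementBuilder("p", strict=True)
strict.add_child("some text")
try:
    strict.add_attribute("id", "late")
    strict_raised = False
except OutputError:
    strict_raised = True
```

In the lenient case the stylesheet author gets no hint that the attribute was lost, which is the portability problem discussed next.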

So which is better? I believe XslCompiledTransform is, because if you bring your stylesheet to another platform, it might be rejected there as an erroneous stylesheet. With XslTransform, there is no way to check whether your stylesheet is sane.

Space stripping

If there is an xsl:strip-space in the stylesheet, XslCompiledTransform rejects an IXPathNavigable as Transform() input, saying:

System.Xml.Xsl.XslTransformException: Whitespace cannot be stripped from input documents that have already been loaded. Provide the input document as an XmlReader instead.

The reason is that it is much easier and more efficient for an XSL transformation engine if the whitespace-only text nodes in the elements listed in xsl:strip-space are excluded from the input document from the start (in the transformation process, they are totally ignored). So there must be a filtering XmlReader that skips those whitespace nodes (otherwise the IXPathNavigable implementation would have to do that itself).

If you can't change the input source from IXPathNavigable to XmlReader, you could still use the new XPathNavigator.ReadSubtree() method, which exposes the navigator's content as an XmlReader.
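As a rough illustration of what such a filtering reader does, here is a minimal sketch using Python's standard xml.sax module. It is not the .NET implementation; StripSpaceFilter and strip_space are hypothetical names. It assumes element names without namespaces, takes the xsl:strip-space list as a plain set of names, and ignores xml:space and comments, which a real engine has to handle.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class StripSpaceFilter(XMLFilterBase):
    """Drops whitespace-only text nodes whose parent element is listed."""

    def __init__(self, parent, strip_elements):
        super().__init__(parent)
        self._strip = set(strip_elements)
        self._stack = []  # names of the currently open elements
        self._text = []   # buffered chunks of the current text node

    def _flush(self):
        # SAX may deliver one text node in several chunks, so decide only
        # once the whole node has been collected.
        text = "".join(self._text)
        self._text = []
        if not text:
            return
        parent = self._stack[-1] if self._stack else None
        if parent in self._strip and not text.strip(" \t\r\n"):
            return  # whitespace-only text inside a stripped element: skip
        super().characters(text)

    def startElement(self, name, attrs):
        self._flush()
        self._stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        self._stack.pop()
        super().endElement(name)

    def processingInstruction(self, target, data):
        self._flush()
        super().processingInstruction(target, data)

    def characters(self, content):
        self._text.append(content)


def strip_space(xml_text, strip_elements):
    """Re-serialize xml_text with the listed elements' whitespace removed."""
    out = io.StringIO()
    reader = StripSpaceFilter(xml.sax.make_parser(), strip_elements)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


result = strip_space("<doc> <item> a </item> <item/> </doc>", {"doc"})
```

Because the filter sits between the parser and the consumer, the whitespace nodes never reach the tree at all. That is the cheap place to drop them, and it is not available once the document has already been loaded.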

Other than these, there are some minor changes (such as treating Roman numbering as a usual format, and avoiding run-time prefix evaluation on "name" attributes in xsl:attribute and xsl:element), but in general it looks good.

As for performance, node iterators seem to have been changed to structs. That means the engine does not have to worry so much about extraneous object creation. I believe that must have resulted in significant performance improvements.

So now I am tempted to throw away the existing XslTransform and implement a new XSLTC-like transformation engine, but I am still not sure. With the stylesheet used in corcompare, XslCompiledTransform gave only about a 1.5x - 2x boost.
