Apr 25, 2011
 

XSLT/XPath 2.0/3.0 are powerful technologies. But sometimes they’ll drive you nuts. A large share of issues falls into the category of “why doesn’t my template match?” The reasons are manifold. Here are some typical traps:

  • not including the namespace prefix
  • typo in the namespace URI
  • processing the document in a different mode than the one the template is supposed to match in. A very subtle example: accidentally closing the xsl:template start tag too early, which cuts off the existing mode declaration (this actually happened to me):
<xsl:template match="HyperlinkTextSource[…]">
    mode="idml2xml:ConsolidateParagraphStyleRanges-remove-empty" priority="4">
  • other typos (in predicates, element names, modes)
  • skipping intermediate elements, e.g., formulating predicates for <td>s when the template is supposed to match <tr>
  • other templates have higher priority
  • import precedence: this template is an imported one, and there is a matching template of whatever priority in the importing stylesheet, or in a stylesheet imported after the one that contains the template in question
  • logical misconceptions in the predicates
  • if the template is supposed to match a result of a previous transformation: the output of the previous transformation is not as expected
  • processing some surrounding element with xsl:copy-of instead of xsl:apply-templates
  • not looking at the actual output document, or looking at another part of the document, while your template indeed matched (thanks, @fbuehring)
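
To illustrate the first trap, here is a minimal sketch. It assumes an XHTML input document; the html prefix and the messages are only my choices for the illustration. An unprefixed name in a pattern means “no namespace” unless xpath-default-namespace is set, so the first template stays silent on XHTML input:

```xslt
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml">

  <!-- Does not match td elements in the XHTML namespace:
       the unprefixed name 'td' selects elements in no namespace. -->
  <xsl:template match="td">
    <xsl:message>unprefixed td matched</xsl:message>
    <xsl:next-match/>
  </xsl:template>

  <!-- Matches td elements in the XHTML namespace. -->
  <xsl:template match="html:td">
    <xsl:message>html:td matched</xsl:message>
    <xsl:next-match/>
  </xsl:template>

</xsl:stylesheet>
```

Declaring xpath-default-namespace="http://www.w3.org/1999/xhtml" on xsl:stylesheet would be the other way to make the unprefixed pattern match.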

My tactics for debugging these cases include

  • making the non-matching template increasingly less specific (dropping terms in the predicate),
  • increasing the priority to insane values,
  • making the template output debugging info such as “I’m here”, or count(*) and count(B) in the context of A (first make sure that the template matches A: ‘I’m A’),
  • writing intermediate processing steps to a file (<xsl:result-document href="_debug.35.fooify-bar.xml"><xsl:copy-of select="$foo"/></xsl:result-document>).
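
The first three tactics can be combined into one throwaway template. This is only a sketch: A and B are the placeholder names from the list above, the priority value is arbitrary, and mode="#all" is my addition to rule out a mode mismatch at the same time.

```xslt
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Deliberately unspecific pattern with an absurd priority.
       If even this stays silent, A is never reached by apply-templates
       (or the name/namespace is wrong). -->
  <xsl:template match="A" mode="#all" priority="1000">
    <xsl:message>I'm A; count(*) = <xsl:value-of select="count(*)"/>, count(B) = <xsl:value-of select="count(B)"/></xsl:message>
    <!-- Hand over to whatever template would have matched otherwise. -->
    <xsl:next-match/>
  </xsl:template>

</xsl:stylesheet>
```

Once the message shows up, add the original predicates back one at a time until it disappears again; the last predicate added is the culprit.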

Schema-aware processing will probably help detect many of these issues, too. But I have to admit that I don’t usually go to the effort of creating schemas for intermediate transformation results.

Templates that match too much

This might also happen, and in fact a template that matched unexpectedly many nodes was the reason that I wrote this post.

Last night I spent hours debugging this template:

<xsl:template match="ParagraphStyleRange
                       [count(XMLElement) eq 1]
                       [every $x in descendant::XMLElement
                                      [ancestor::ParagraphStyleRange[1] is current()/..]
                        satisfies (
                          not($x/XMLAttribute[@Name = 'aid:pstyle'])
                        )
                       ]
                       [every $c in CharacterStyleRange satisfies (matches($c, '^$'))]
                     /XMLElement
                       [not(every $c in * satisfies ($c/self::Table or $c/self::XMLAttribute))]"
              mode="idml2xml:GenerateTagging">
  <xsl:copy>
    <xsl:copy-of select="@*" />
    <xsl:apply-templates mode="#current" />
    <XMLAttribute Name="xmlns:idml2xml" Value="http://www.le-tex.de/namespace/idml2xml" />
    <XMLAttribute Name="idml2xml:AppliedParagraphStyle" Value="{replace(../@AppliedParagraphStyle, '^ParagraphStyle/', '')}" />
    <XMLAttribute Name="idml2xml:reason" Value="gp3" />
  </xsl:copy>
</xsl:template>

This is an example from an IDML processing pipeline as described in the ropes of sand post. After splitting the text at every paragraph break, I want to attach pstyle information to elements that span a whole paragraph and that don’t already carry an aid:pstyle attribute. But there may already be XMLElements carrying a pstyle that are not immediately below the ParagraphStyleRange element. So in the second predicate (the first “every … satisfies” clause), I’m restricting the template to non-pstyled XMLElements.

Caveat: there may be embedded ParagraphStyleRanges further down, for example, when there’s a Table or another Story contained in this range. So I’ll have to exclude XMLElements that are located in embedded ParagraphStyleRanges. And that’s what I originally wanted to check by applying the condition that every XMLElement[ancestor::ParagraphStyleRange[1] is current()] doesn’t have a pstyle.

current() was working well as long as the template matched ParagraphStyleRanges. But then I modified it to have the current form, to match ParagraphStyleRange/XMLElement. Then suddenly it matched many more XMLElements, including ones that had a pstyle attached to them. Why?

The reason is that by modifying the pattern from A[…] to A[…]/B (where A stands for ParagraphStyleRange, etc.), the term current() in A’s predicate no longer referred to A, but to B. In a pattern, current() is the node that is being matched against the whole pattern, and that node is now a B. Therefore the condition [every $x in descendant::B[ancestor::A[1] is current()] satisfies …] was always true, because under no circumstances can an ancestor::A be identical to a B. And if the sequence after “every $x in” is empty, the “every … satisfies” clause is always true. Therefore my filter removed too few of the ParagraphStyleRanges, so that my template matched too many XMLElements. Changing current() to current()/.. in that predicate improved the situation.
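
Here is the effect boiled down to a sketch with the placeholder names A and B. The pstyle attribute, the id attribute and the two mode names are simplifications I made up for the illustration; the real pipeline uses XMLAttribute child elements instead. Run it against an input such as <A><B id="1" pstyle="p"/></A>:

```xslt
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Broken: current() is the B being matched, never an A.
       The filtered sequence is therefore empty, and
       "every ... satisfies" over an empty sequence is true. -->
  <xsl:template mode="broken"
    match="A[every $x in descendant::B[ancestor::A[1] is current()]
             satisfies not($x/@pstyle)]/B">
    <xsl:message>broken pattern matched B <xsl:value-of select="@id"/></xsl:message>
  </xsl:template>

  <!-- Fixed: current()/.. is the parent A of the B being matched,
       so the B children of this A are actually inspected. -->
  <xsl:template mode="fixed"
    match="A[every $x in descendant::B[ancestor::A[1] is current()/..]
             satisfies not($x/@pstyle)]/B">
    <xsl:message>fixed pattern matched B <xsl:value-of select="@id"/></xsl:message>
  </xsl:template>

  <!-- Driver: push every B through both modes. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//B" mode="broken"/>
    <xsl:apply-templates select="//B" mode="fixed"/>
  </xsl:template>

</xsl:stylesheet>
```

With that input, the broken pattern should report B 1 although it carries a pstyle, while the fixed pattern should stay silent. Remove the pstyle attribute from the input and both should report the element.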

Subtle, subtle. If I had to categorize it, I’d put it into the “logical misconceptions in the predicates” basket.
