Understanding XPATH

1. Complex Path Checking
2. XPATH expressions
3. xpath to any node

1.

Complex Path Checking

Jeni Tennison


> My problem is, I want to check the content 
>of the first <node2> (with "some content") before I decide if I 
>display the content of the other <node2> (with "other content"). I
>need to do this through the whole large document.
>
>Now, I'm not exactly experienced when it comes to XPath. I tried the 
>following, but as I already expected ;) , it didn't work.
>
> <xsl:template match="//node2">
    

I'm just going to go over your XPaths to make sure that you understand what they're doing before helping you with your actual problem. The syntax '//' within an XPath is short for '/descendant-or-self::node()/', so this expands to:

  /descendant-or-self::node()/node2

If you were using this expression to *select* nodes (e.g. in xsl:apply-templates or xsl:for-each), then it would select: all the 'node2' elements that are a child of any node that is a descendant of (or is itself) the root node

When you use an XPath to *match* nodes, as you are here (or in the match attribute of xsl:key), it will match: any 'node2' element that is a child of any node that is a descendant of (or is itself) the root node

In other words, this matches any node2 element in the document. Another XPath that matches any node2 element in the document is simply 'node2', so you may as well say:

<xsl:template match="node2">
  ...
</xsl:template>

It's just a small thing, but it took me ages to get my head around the fact that the 'match' attributes don't *select* nodes, they test a node you've already selected, which means they can usually be fairly simple.
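
A minimal sketch of the select/match distinction. It assumes an input whose document element is 'root', with 'node1' children that hold the 'node2' elements; those names are borrowed from the templates further down, and the exact input shape is an assumption:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- 'select' decides which nodes get processed next:
       every node2 child of every node1 child of root -->
  <xsl:template match="root">
    <xsl:apply-templates select="node1/node2"/>
  </xsl:template>

  <!-- 'match' only tests a node that has already been selected;
       match="//node2" would match exactly the same nodes -->
  <xsl:template match="node2">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The work of finding the nodes is done by the select in the 'root' template. The pattern in the second template only has to recognise a node2 once it arrives.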

> <xsl:if 
>test="contains(ancestor::following-sibling::child::node(),'X')">

You're going wrong here because you're forgetting to separate your steps properly. Each step is made up of an axis (like ancestor, following-sibling or child) and a node test. The node test usually gives the name of the node. Each step is separated from the next step using a '/'. I think you were trying for something like:

  ancestor::*/following-sibling::*/child::node()

This will select: any node (of any type) that is a child of any element that is a sibling that follows any element that is an ancestor of the current node

I doubt that you are really after any node of any type - are you really interested in whether a comment or a processing instruction contains an 'X'? (Attributes are never on the child axis, so they are not selected either way.) If you are just after elements, then it would help to say so. You can also lose the child:: axis if you want - it's assumed by default. So try:

  <xsl:if test="contains(ancestor::*/following-sibling::*/*, 'X')">
    ...
  </xsl:if>
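
To see what that test evaluates to, here is a small sketch. The sample input in the comment is hypothetical. One thing to keep in mind: in XPath 1.0, contains() converts a node-set to the string value of its first node in document order, so only the first element selected by the path is actually checked.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input, not taken from the original question:
     <root>
       <node1><node2>some content</node2></node1>
       <node1><node2>X marks the spot</node2></node1>
     </root>
-->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="root/node1/node2"/>
  </xsl:template>

  <xsl:template match="node2">
    <xsl:value-of select="."/>
    <xsl:text>: </xsl:text>
    <!-- Three steps, each an axis plus a node test, separated by '/':
         ancestor::*  then  following-sibling::*  then  child::* -->
    <xsl:value-of
      select="contains(ancestor::*/following-sibling::*/*, 'X')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With that input, the first node2 reports "some content: true" (its node1 ancestor has a following sibling whose child contains an 'X'), and the second reports "X marks the spot: false" (none of its ancestors has a following sibling, so the node-set is empty).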

Anyway, your actual problem was that you wanted to check the content of the first node2 element to see whether you should process the second node2 element.

Within XSLT, processing flows from the top of the *tree* to the bottom of the tree rather than from the top of the *document* to the bottom of the document. So, the children of the root node are processed, and their processing involves the processing of their children, which involves the processing of their children and so on.

This means that the right place to decide whether to process a particular node is higher up the *tree* rather than higher up the *document*. Another approach is to process the node, but not produce any output unless certain conditions are met.

Here's an example that always processes the first node2, but only processes the second node2 if there's an ancestor of the first node2 which has a following sibling which has a child that contains an 'X':

<xsl:template match="root">
  <xsl:apply-templates select="node1/node2[1]" />
  <xsl:if
      test="contains(node1/node2[1]/ancestor::*/following-sibling::*/*, 'X')">
    <xsl:apply-templates select="node1/node2[2]" /> 
  </xsl:if>
</xsl:template>

Here's another example which decides whether to produce any output within the node2-matching template:

<xsl:template match="root">
  <xsl:apply-templates select="node1/node2" />
</xsl:template>

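<xsl:template match="node2">
  <!-- The context node here is a node2, so the first node2 is reached
       through the parent rather than through root's node1 children -->
  <xsl:if
    test="position() = 1 or
          contains(../node2[1]/ancestor::*/following-sibling::*/*, 'X')">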
    ...produce some output...
  </xsl:if>
</xsl:template>

2.

XPATH expressions

David Allouche

> <box>
>    <category name="someType">
>       <header>
>         <self>
>            <host>myhost</host>
>            <instance>9</instance>
>         </self>
>        <ref>
>           <host>thathost</host>
>           <instance>1010101</instance>
>        </ref>
> </header>
> 
> And the value of the header instance is 9 (passed from a web page to a
> servlet)
> What expression can I use to get the ref element under the same header
> parent?

/box/category[@name='someType']/header is a nice start.

Just extend your XPath until you reach what you want to test:

/box/category[@name='someType']/header/self[instance=$parameter]

Now use the parent axis to go up to the header element

/box/category[@name='someType']/header/self[instance=$parameter]/parent::header

Then go on as usual using the implied child axis

/box/category[@name='someType']/header/self[instance=$parameter]/parent::header/ref

It's done, but it needs to be shortened a bit to stay readable. First, as you know that the parent of the self element is a header element you can use a wildcard element name without changing the meaning of the XPath:

/box/category[@name='someType']/header/self[instance=$parameter]/parent::*/ref

Actually, using the parent axis with an explicit element name is only useful to test that a parent has a given name. Then, if your file structure is regular enough, you can put in more wildcards without changing the meaning:

/*/category[@name='someType']/*/self[instance=$parameter]/parent::*/ref

I guess you know enough of XSLT to use the context to get rid of any unnecessary steps at the beginning of the XPath.
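
As a sketch of that last point: once the context node is the header, only the tail of the path is needed. This assumes the fragment from the question is completed with closing category and box tags, and that the value arrives as a top-level parameter called 'parameter' (the name used in the paths above; the default value is only for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Top-level parameter; the default of 9 is for illustration only.
       How a value is supplied depends on the processor or API in use. -->
  <xsl:param name="parameter" select="9"/>

  <xsl:template match="/">
    <xsl:apply-templates select="box/category[@name='someType']/header"/>
  </xsl:template>

  <!-- The context node is already a header, so the leading steps
       of the absolute path are no longer needed -->
  <xsl:template match="header">
    <xsl:for-each select="self[instance = $parameter]/parent::*/ref">
      <xsl:value-of select="host"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against the completed fragment, this should print "thathost", the host of the ref that sits under the same header as the matching self element.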

There are still other approaches, making heavier use of predicates, and possibly slower. But it's mainly a matter of programming style.

//category[@name='someType']/*/ref[parent::*/self[instance=$parameter]]

//ref[ancestor::category[@name='someType'] and
                                  parent::*/self[instance=$parameter]]
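
One way to convince yourself that these variants pick the same node is to count their union. This is only a sketch, under the same assumptions as before: the question's fragment completed with its closing tags, and a top-level parameter named 'parameter' whose default is for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>
  <xsl:param name="parameter" select="9"/>

  <xsl:template match="/">
    <xsl:variable name="a" select="
      /*/category[@name='someType']/*/self[instance=$parameter]/parent::*/ref"/>
    <xsl:variable name="b" select="
      //category[@name='someType']/*/ref[parent::*/self[instance=$parameter]]"/>
    <xsl:variable name="c" select="
      //ref[ancestor::category[@name='someType'] and
            parent::*/self[instance=$parameter]]"/>
    <!-- A union drops duplicates. If all four counts agree, the three
         expressions selected exactly the same nodes. -->
    <xsl:value-of select="count($a)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="count($b)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="count($c)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="count($a | $b | $c)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For the sample in the question this should print "1 1 1 1": each expression finds one ref, and the union still holds just one node.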

Or even the most perverse and obfuscated:

ancestor-or-self::node()[boolean(count(.|/)-2)]/node()[ancestor::*
[not(descendant-or-self::*/parent::category)]/child::self[instance=
$parameter]][self::ref][generate-id(self::ref)=generate-id(//ref
[ancestor::category[@name='someType'])]

(if someone can find a bug in this one I'll buy him/her a beer) :-)

3.

xpath to any node

Wendell Piez

And here's some generic XSLT 1.0 that will return an XPath for any node on request. Just apply templates in the "xpath" mode to the node for which you want the path.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

  <!-- node() alone does not match attributes, so @* is listed explicitly -->
  <xsl:template match="node()|@*" mode="xpath">
    <xsl:apply-templates select="ancestor::*|." mode="xpath-step"/>
  </xsl:template>

  <xsl:template match="/" mode="xpath-step">
    <xsl:text>/</xsl:text>
  </xsl:template>

  <xsl:template match="*" mode="xpath-step">
    <xsl:text>/</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:if test="count(../*[name()=name(current())]) > 1">
      <xsl:text>[</xsl:text>
      <xsl:value-of select="count(
	  preceding-sibling::*[name()=name(current())]) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:if>
  </xsl:template>

  <xsl:template match="text()" mode="xpath-step">
    <xsl:text>/text()</xsl:text>
    <xsl:if test="count(../text()) > 1">
      <xsl:text>[</xsl:text>
      <xsl:value-of select="count(preceding-sibling::text()) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:if>
  </xsl:template>

  <xsl:template match="processing-instruction()" mode="xpath-step">
    <xsl:text>/processing-instruction()</xsl:text>
    <xsl:if test="count(../processing-instruction()) > 1">
      <xsl:text>[</xsl:text>
      <xsl:value-of select="count(
	  preceding-sibling::processing-instruction()) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:if>
  </xsl:template>

  <xsl:template match="comment()" mode="xpath-step">
    <xsl:text>/comment()</xsl:text>
    <xsl:if test="count(../comment()) > 1">
      <xsl:text>[</xsl:text>
      <xsl:value-of select="count(preceding-sibling::comment()) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:if>
  </xsl:template>

  <xsl:template match="@*" mode="xpath-step">
    <xsl:text>/@</xsl:text>
    <xsl:value-of select="name()"/>
  </xsl:template>

</xsl:stylesheet>
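
A possible driver, as a sketch only. It assumes the stylesheet above has been saved as xpath.xsl (a hypothetical file name) and prints one path per element of the input:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input:
     <doc><item/><item/><note/></doc>
-->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumes the stylesheet above was saved as xpath.xsl (hypothetical name) -->
  <xsl:import href="xpath.xsl"/>

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//*">
      <!-- Ask for the path of the current element -->
      <xsl:apply-templates select="." mode="xpath"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the input in the comment this should give /doc, /doc/item[1], /doc/item[2] and /doc/note, one per line. The two item elements get a positional predicate because they share a name; doc and note do not.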

DC adds

The Schematron sources have several variants of this, targeting different use cases. Using name() is fine for human-oriented paths in error reporting etc., but if you need to execute the paths, the problem is that it uses namespace prefixes from the source file, so you need to bind the same prefixes in the context in which the XPath is executed.

   <xsl:template match="*" mode="xpath-step">
     <xsl:text>/*</xsl:text>
     <xsl:if test="count(../*) > 1">
       <xsl:text>[</xsl:text>
       <xsl:value-of select="count(preceding-sibling::*) + 1"/>
       <xsl:text>]</xsl:text>
     </xsl:if>
   </xsl:template>

makes XPaths that are a bit less informative to a human reader (/*[33] instead of /xhtml:table[2]) but that don't depend on the namespace context.

(variants of this generate /*[local-name()='table'][2] or (for XPath 2) /*:table[2] if you want a version which shows the element name but without namespace dependency)
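
A sketch of what the local-name() variant might look like. This is not the Schematron code, just one way to produce paths of that shape; the driver template and the sample input are assumptions added for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input:
     <x:doc xmlns:x="urn:example"><x:table/><x:p/><x:table/></x:doc>
-->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Driver: print one path per element -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:apply-templates select="ancestor-or-self::*" mode="xpath-step"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- One step: the local name as a predicate, plus a position among
       the siblings that share that local name -->
  <xsl:template match="*" mode="xpath-step">
    <xsl:text>/*[local-name()='</xsl:text>
    <xsl:value-of select="local-name()"/>
    <xsl:text>']</xsl:text>
    <xsl:if test="count(../*[local-name()=local-name(current())]) > 1">
      <xsl:text>[</xsl:text>
      <xsl:value-of select="count(
          preceding-sibling::*[local-name()=local-name(current())]) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

For the second table in the sample input this should produce /*[local-name()='doc']/*[local-name()='table'][2]. Siblings are counted by local name only, whatever their namespace, which matches how the generated path itself is evaluated.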

Florent Georges adds

A few remarks. 1) As it stands, the test means that the first element with a particular name never gets a positional predicate, while the following ones always do: "elem", then "elem[2]", "elem[3]", etcetera. I would rather say that if there is only one element of a particular name, you can drop the predicate; if not, you should add "[1]".

2) The test for reporting the attribute is wrong if one asks for the path to something other than an element or an attribute.

3) This has been developed for XSLT 1.0 initially, but even in the XSLT 2.0 version it uses name() instead of node-name().

So for an XSLT 2.0 version, to generate paths targeted at human beings, I propose the following. Any comment?

    <!-- for any kind of node, report the elements path -->
    <xsl:template match="node() | @*" priority="-2" mode="path">
       <xsl:for-each select="ancestor-or-self::*">
          <xsl:text>/</xsl:text>
          <xsl:value-of select="name(.)"/>
       </xsl:for-each>
    </xsl:template>

    <!-- add a positional predicate if there are other siblings of the
         same name -->
    <xsl:template match="*" mode="path">
       <xsl:next-match/>
       <xsl:if test="
           count(../*[node-name(.) eq node-name(current())]) gt 1">
          <xsl:text>[</xsl:text>
          <xsl:value-of select="
              count(preceding-sibling::*[
                        node-name(.) eq node-name(current())
                      ])
                + 1"/>
          <xsl:text>]</xsl:text>
       </xsl:if>
    </xsl:template>

    <!-- add a positional predicate if there are other siblings of the
         same name -->
    <xsl:template match="processing-instruction()" mode="path">
       <xsl:next-match/>
       <xsl:if test="count(../processing-instruction()[
                                  node-name(.) eq node-name(current())
                                ]) gt 1">
          <xsl:text>[</xsl:text>
          <xsl:value-of select="
              count(preceding-sibling::processing-instruction()[
                        node-name(.) eq node-name(current())
                      ])
                + 1"/>
          <xsl:text>]</xsl:text>
       </xsl:if>
    </xsl:template>

    <!-- add the attribute step -->
    <xsl:template match="@*" mode="path">
       <xsl:next-match/>
       <xsl:text>/@</xsl:text>
       <xsl:value-of select="name(.)"/>
    </xsl:template>

    <!-- add the string text() and maybe a positional predicate -->
    <xsl:template match="text()" mode="path">
       <xsl:next-match/>
       <xsl:text>/text()</xsl:text>
       <xsl:if test="count(../text()) gt 1">
          <xsl:text>[</xsl:text>
          <xsl:value-of select="count(preceding-sibling::text()) + 1"/>
          <xsl:text>]</xsl:text>
       </xsl:if>
    </xsl:template>

    <!-- add the string comment() and maybe a positional predicate -->
    <xsl:template match="comment()" mode="path">
       <xsl:next-match/>
       <xsl:text>/comment()</xsl:text>
       <xsl:if test="count(../comment()) gt 1">
          <xsl:text>[</xsl:text>
          <xsl:value-of select="
              count(preceding-sibling::comment()) + 1"/>
          <xsl:text>]</xsl:text>
       </xsl:if>
    </xsl:template>