
Axes

1. Forward or Reverse axis?
2. How to retain the lineage of the selected node.
3. Understanding Axes
4. Terminology ::
5. Preceding axis
6. Immediacy of preceding-siblings
7. Picturing the axes
8. Nearest ancestor
9. following sibling question
10. following-sibling problems
11. Axis clarification
12. Finding the nearest ancestor, one from two
13. Finding the nearest ancestor, one from two
14. Testing the current node
15. Context when processing in sibling axis
16. Checking the name
17. Descendent or //

1.

Forward or Reverse axis?

Evan Lenz

Dave Pawson requested that I provide a "Jeni Tennison explanation" of this business about XPath 1.0 positional predicates and forward and reverse axes, for inclusion in the FAQ. Not to be outdone, here goes:

Positional predicates in XPath 1.0 can be confusing. If you see "[3]" in an expression, it might select the third node in *document order*, or it might select the third node in *reverse document order*. How do you know which? Let's try a brief quiz. For each of the expressions below, does "[3]" select the third node in (forward) document order, or reverse document order?

foo[3]
ancestor::foo[3]
following::foo[3]
preceding::foo[3]

If you answered forward/reverse/forward/reverse, then you'd be right! The (default) child axis is used in "foo[3]" and since child is a forward axis, "[3]" selects the third node in document order. The ancestor axis in "ancestor::foo[3]" is a reverse axis; consequently, that expression selects the third node in *reverse document order*. The same principle applies for the following (forward) and preceding (reverse) axes.

Now try these:

$var[3]
(foo | bar)[3]
(ancestor::foo)[3]
id('foo')[3]

Hmm, how do we know which axes to use (forward or reverse) for these? The answer may surprise you. It is: ALWAYS FORWARD. Why? Because these examples represent a different kind of predicate! In XPath 1.0, predicates can be part of a location step (as in the first quiz's examples) and they can also be applied to any expression (as in the second quiz's examples). When a positional predicate is applied to an *expression* (as opposed to being part of a location step), it is always evaluated with respect to forwardness, which is to say that it always filters nodes with respect to document order.

This rule applies even in the third example above:

(ancestor::foo)[3]

But, you might say, the ancestor axis is a reverse axis! And you'd be right, but it doesn't matter in this case! The predicate doesn't care what axis was used here because it is not part of the location step; rather it filters an expression. The parentheses ensure that "ancestor::foo" is first evaluated as an expression in its own right (without a predicate), yielding an unordered node-set. Only after that is the predicate applied to the result of that expression, selecting the third node in document order. Such a big difference those little parentheses can make!
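To see both kinds of predicate side by side, here is a minimal sketch. The input document and its element names (a, b, c, d) are assumptions made purely for illustration. Given

  <a><b><c><d/></c></b></a>

the following stylesheet prints the name of the node chosen by each form:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/a/b/c/d">
    <!-- predicate inside the step: counted along the reverse ancestor axis -->
    <xsl:value-of select="name(ancestor::*[1])"/>
    <xsl:text> </xsl:text>
    <!-- predicate on a parenthesized expression: counted in document order -->
    <xsl:value-of select="name((ancestor::*)[1])"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

The output should be "c a": the step form picks the nearest ancestor, the parenthesized form picks the outermost one.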

Hopefully the above examples have helped clear up some confusion, but if you'd like a slightly more technical explanation (and review) of how this all works, keep on reading.

The relevant prose in the XPath 1.0 spec can be found at [URI1] and [URI2].

"2.4 Predicates"[URI1] explains the difference between "forward" and "reverse" axes and how predicates are always evaluated with respect to a forward or reverse axis. The distinction only makes a difference to the expression result when the predicate is a positional predicate, i.e. when the predicate expression evaluates to a number.

The indented "NOTE" in "3.3 Node-sets"[URI2] is helpful as it contains an example that highlights the difference between the following two expressions:

preceding::foo[1]

(preceding::foo)[1]

The former selects the first node in *reverse document order*. The latter selects the first node in *document order*. This is determined by the axis with respect to which the predicate is evaluated, i.e. whether it is a forward or reverse axis.

While predicates are defined semantically in one way (and in one place[URI1]), they appear syntactically (i.e. in the XPath grammar productions) in two places: 1) as part of the Step production[URI3], and 2) as part of the FilterExpr production[URI4].

The predicate in the first example above is part of the Step itself, which is to say that it is tightly bound to the preceding::foo step. Therefore, the axis used to evaluate the positional predicate is a reverse axis, because the preceding axis is a reverse axis. Consequently, the predicate filters out all nodes but the first node in *reverse document order*.

The predicate in the second example is not part of the Step, but is part of a more general FilterExpr. In XPath 1.0, a predicate may follow any kind of expression; in this case, it follows a parenthesized expression. The parentheses render preceding::foo an expression in its own right (without a predicate), yielding an opaque node-set. A node-set never retains information about what axis was used to select it. That node-set result is subsequently filtered with a predicate. When a predicate applies to an expression, the XPath spec says, it is evaluated with respect to the child axis. (The child axis is arbitrarily chosen because it's an example of a forward axis.) Consequently, the predicate filters out all nodes but the first node in *document order*.

The Step production[URI3] is as follows:

[4]    Step    ::=    AxisSpecifier NodeTest Predicate*
                      | AbbreviatedStep

Note the "Predicate*" part above. This denotes that multiple predicates can be part of a single location step. Understanding this will dispel any confusion about how an expression like the following is evaluated:

preceding::foo[@bar][1]

"[1]" selects the first node in *reverse document order*, because the positional predicate, even though it's not the first predicate, is still tightly bound to, i.e. a part of, the location step.

The FilterExpr production[URI4] is as follows:

[20]    FilterExpr    ::=    PrimaryExpr
                             | FilterExpr Predicate

An example instance of this production can be seen by slightly modifying the last example:

(preceding::foo[@bar])[1]

Here, "[1]" selects the first node in document order.

Unless a predicate is bound (without intervening parentheses) to a node test, it is always evaluated with respect to a "forward" axis (the XPath spec arbitrarily chooses the child axis).
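A small sketch of the two productions at work. The input is made up: the element named here and the id attributes are assumptions, added only so that the selected node can be identified.

  <doc>
    <foo bar="1" id="first"/>
    <foo id="second"/>
    <foo bar="2" id="third"/>
    <here/>
  </doc>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/doc/here">
    <!-- Step production: both predicates belong to the preceding:: step -->
    <xsl:value-of select="preceding::foo[@bar][1]/@id"/>
    <xsl:text> </xsl:text>
    <!-- FilterExpr production: [1] filters the parenthesized node-set -->
    <xsl:value-of select="(preceding::foo[@bar])[1]/@id"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "third first": the nearest preceding foo with a bar attribute, then the first such foo in document order.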

So far, so good.

 ancestor-or-self::*[@source][last()]

*Both* predicates in the above example are tightly bound to the node test (and thus the reverse axis), because the Step production is as follows:

[4]    Step    ::=    AxisSpecifier NodeTest Predicate*
   | AbbreviatedStep

where the Step itself may have more than one predicate. So [last()] will select the last node in *reverse* document order (or first in document order), not the other way around! To get the intended result (getting the "closest" ancestor), you would either write:

ancestor-or-self::*[@source][1]

or

(ancestor-or-self::*[@source])[last()]

In this second example, the [last()] predicate is no longer tightly bound to the node test as part of the Step production; the parentheses render it as a predicate of the second kind--as part of the FilterExpr production:

[20]    FilterExpr    ::=    PrimaryExpr
   | FilterExpr Predicate

And thus [last()] is applied to the result of the expression to the left with respect to document order, selecting the last node in document order.

Multiple predicates in a reverse step with the last being positional--that has got to rank pretty highly on the confuse-the-experts list of XPath expressions.

2.

How to retain the lineage of the selected node.

G. Ken Holman

How to return a document fragment that includes the necessary parent nodes for a selected node, so that the lineage of the selected node is maintained?

E.g. From
 <vendor name="james">
  <product id="1234">
   <material>SiO2</material>
  </product>
  <product id="5678">
   <material>CO2</material>
  </product>
 </vendor>

to return the fragment

 <vendor name="james">
  <product id="1234">
   <material>SiO2</material>
  </product>
 </vendor>

for element SiO2

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version="1.0">

<xsl:output method="xml"/>

<xsl:template match="/">
   <xsl:for-each select="/vendor/product/material
    [.='SiO2']">
     <xsl:call-template name="show-context"/>
   </xsl:for-each>
</xsl:template>

<xsl:template name="show-context" match="*" 
   mode="show-context">
   <xsl:param name="node-ids" 
       select="' '"/>
   <xsl:choose>      
<!--walk up tree until at document element-->
     <xsl:when test="ancestor::*">
       <xsl:apply-templates 
          mode="show-context" 
          select="..">
         <xsl:with-param name="node-ids"
               select="concat( $node-ids, generate-id(.), 
           ' ' )"/>
       </xsl:apply-templates>
     </xsl:when>
     <xsl:otherwise>            
    <!--must be at document element-->
       <xsl:apply-templates 
        mode="show-context-members"
        select=".">     <!--walk down tree -->
         <xsl:with-param name="node-ids"
               select="concat( $node-ids, 
            generate-id(.), ' ' )"/>
       </xsl:apply-templates>
     </xsl:otherwise>
   </xsl:choose>
</xsl:template>

<xsl:template match="*" 
        mode="show-context-members">
   <xsl:param name="node-ids"/>   
         <!--lineage of generate-id(.)-->
   <xsl:choose>            
             <!--first in list is last in descent-->
     <xsl:when test="starts-with($node-ids,
     concat( ' ', generate-id(.), ' '))">
       <xsl:copy-of select="."/>
     </xsl:when>
 <!--others reflect hierarchy-->
     <xsl:when test="contains($node-ids,
   concat( ' ', generate-id(.), ' '))">
       <xsl:copy>
         <xsl:copy-of select="@*"/>
         <xsl:apply-templates 
                mode="show-context-members"
                select="*">
           <xsl:with-param 
                  name="node-ids" 
                  select="$node-ids"/>
         </xsl:apply-templates>
       </xsl:copy>
     </xsl:when>
   </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Gives output of

<vendor name="james"><product 
id="1234"><material>SiO2</material>
</product></vendor>

David Carlisle also offered:

<?xml version="1.0"?>
<xsl:stylesheet 
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version="1.0">

<xsl:output method="xml"/>

<xsl:variable name="x" 
              select="generate-id(/vendor/product/material
          [.='SiO2'])"/>

<xsl:template match="*">
<xsl:if 
  test="descendant-or-self::*[generate-id(.)=$x]">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:if>
</xsl:template>

</xsl:stylesheet>

to give:

<vendor name="james">
 <product id="1234">
  <material>SiO2</material>
 </product>
 
</vendor>


            

3.

Understanding Axes

Mike Kay

An XPath expression says where to find the members of a node set by identifying the locations of nodes *relative* to some initial set of nodes. It's a "path" because it says, essentially, that to find all the nodes in the set you want, "start at the nodes that match this pattern, then go in this direction (along an axis) to all nodes that match this other pattern, then go in this other direction (along another axis) to the nodes that match this other pattern..." ad infinitum.

An axis very broadly identifies a set of nodes relative to a context node. The self axis is the context node itself, the child axis is all the child nodes of the context node, the parent axis is the node one level "above" the context node, and so on.

A predicate is a filter for a pattern, reducing the set of nodes matching the pattern to only those for which the expression in the predicate is true. Each combination of axis::pattern[predicate] is a location step, and location steps are separated by /.

For example, when you are following the path along the attribute axis from one node (hopefully an element, since other types of nodes won't have attributes) to some other nodes matching the pattern "foo" (i.e., attributes named foo), you can use a predicate to say that of those nodes, you really only want the ones for which the string-value is 'bar':

  attribute::foo[. = 'bar']

This example is pretty straightforward because the predicate isn't filtering anything that is dependent on the axis.
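In a stylesheet the same step is usually written with the @ abbreviation for the attribute axis. A minimal sketch, assuming a hypothetical input of item elements with an optional foo attribute:

  <list>
    <item foo="bar">one</item>
    <item foo="baz">two</item>
    <item>three</item>
  </list>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- unabbreviated: axis::nodetest[predicate] -->
  <xsl:value-of select="count(/list/item/attribute::foo[. = 'bar'])"/>
  <xsl:text> </xsl:text>
  <!-- abbreviated: @ stands for attribute:: -->
  <xsl:value-of select="count(/list/item/@foo[. = 'bar'])"/>
</xsl:template>

</xsl:stylesheet>

Both counts should be 1, since only the first item carries foo="bar".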

The note you quoted, though, gives an example of how the semantics of a predicate can be influenced by the axis. It shows that a predicate can identify a proximity position of a node relative to a context node. This concept is perhaps better explained in the first paragraph of section 2.4. The context node is the node you are at when you are embarking on the next step.

So let's say you want to go along the preceding-sibling axis to the node with proximity position 5. How do you know which node that is? It wouldn't make much sense for it to be, of all the nodes in the preceding-sibling axis, the 5th one in forward document order, because that means the node's position is 5 nodes away from the beginning of the document, rather than 5 nodes away from the context node. But if you say it's 5 nodes away in reverse document order, it will be proximal to the end of the document, which is what you want since the preceding-sibling axis won't contain nodes that come between the context node and the end of the document; in effect it is 5 nodes away from the context node.

So that's my interpretation of why these statements are in the spec, and also why there are no axes that simultaneously identify nodes that precede and nodes that follow the context node.

4.

Terminology ::

David Carlisle



> how to explain to a non-programmer the "verbal sense" conveyed/represented
> by the "::"
    
read x::y as

all y that are a(n) x

select="ancestor-or-self::xxx"

select all xxx nodes that are an ancestor of, or are, the current node.

> (preceding::foo)[1] example, I can't explain

That's because the spec defines this by sleight of hand.

The [4] syntax and more generally position() needs to know whether the node set is in document or reverse document order. By definition the ordering depends on the axis.

For something like following::* it is clear what the natural ordering of the resulting node list should be but for a general expression

(following::xx | child::y | ancestor::*)

there isn't a natural order to the resulting node set as it might be constructed from several axes,

So what does (following::xx | child::y | ancestor::*)[1] mean? They could have defined that such expressions were ordered in document order, but instead the spec says:

> The Predicate filters the node-set with respect to the child axis.

This has the effect that position() counts in document order because child is a forward axis. But that is, I think, the _only_ effect of this statement, so it might have been clearer to just have said that, rather than introduce the confusing reference to the child axis.
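A minimal sketch of that effect, on an assumed input in which the context node is the element named here (all element names other than xx and y are made up for the example):

  <root><a/><here><y/></here><xx/></root>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/root/here">
    <!-- the union mixes forward and reverse axes; the predicate
         simply counts the combined node-set in document order -->
    <xsl:value-of
        select="name((following::xx | child::y | ancestor::*)[1])"/>
    <xsl:text> </xsl:text>
    <xsl:value-of
        select="name((following::xx | child::y | ancestor::*)[last()])"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "root xx": the ancestor comes first in document order and the following element comes last, whatever axis selected them.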

5.

Preceding axis

>I would like to know why 
><xsl:when test="customer = preceding::row[position()=1]/customer"> 
>behaves as it does?

Jeni Tennison

Your test returns true when the value of the child customer element of the current node (a row) is the same as the value of the first preceding row's child customer element. In other words, if the customer for this row is the same as the customer for the row before.

When you use an axis that works backwards (a 'reverse axis') like preceding, preceding-sibling, ancestor or ancestor-or-self, then the node list is given in reverse document order until you take the next step in the XPath. So, preceding::row[1] (which is exactly the same as preceding::row[position()=1]) gets the row immediately before the current node.

Mike Kay adds an example of use: To test if an <item> element is the first <item> child of its parent:

<xsl:if test="not(preceding-sibling::item)">This is the first</xsl:if>
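The test from the question is typically used to print a heading only when the customer changes. A minimal sketch, assuming a hypothetical table of rows that is already sorted by customer (the table and item element names are assumptions):

  <table>
    <row><customer>Smith</customer><item>nuts</item></row>
    <row><customer>Smith</customer><item>bolts</item></row>
    <row><customer>Jones</customer><item>nails</item></row>
  </table>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>
<xsl:strip-space elements="*"/>

<xsl:template match="row">
  <!-- preceding::row[1] is the row immediately before this one;
       for the first row it is empty, so the comparison is false -->
  <xsl:if test="not(customer = preceding::row[1]/customer)">
    <xsl:value-of select="customer"/>
    <xsl:text>:&#xA;</xsl:text>
  </xsl:if>
  <xsl:text>  </xsl:text>
  <xsl:value-of select="item"/>
  <xsl:text>&#xA;</xsl:text>
</xsl:template>

</xsl:stylesheet>

The result should list "Smith:" once, followed by nuts and bolts, then "Jones:" followed by nails.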

6.

Immediacy of preceding-siblings

Warren Hedley


> Does "preceding-sibling::*[1]" select the immediately preceding sibling, or
> would that be "preceding-sibling::*[last()]" ?

You want to have a look at section 2.4 of the XPath spec.

Basically, the preceding-sibling axis is a reverse axis, and position() returns the proximity of the node in the node-set with respect to the context node. So preceding-sibling::*[1] should return the immediately preceding sibling.

I just did a quick test and got reasonably confused because of the following expression:

<xsl:for-each select="preceding-sibling::*"> ...

It should be noted that <xsl:for-each> always iterates through nodes in document order regardless of the direction of the axis of the selection path. I can't say I've ever run into this before, but I'm not sure if this is obvious or desirable behaviour.

Mike Brown elaborates:

Let's say you have

<parent>
  <sibling num="1"/>
  <sibling num="2"/>
  <sibling num="3"/>
</parent>

and the current node is sibling number 3.

preceding-sibling:: will select siblings 1 and 2, and the whitespace-only text nodes in between.

* will reduce that set down to just those that are elements (siblings 1 and 2)

[1] will reduce that set down to just the one at position 1 in that set. The position ordering depends on the directionality of the axis. For reverse axes such as preceding-sibling, reverse document order is used, so sibling 2 is at position 1, and sibling 1 is at position 2.

[last()] will instead reduce that set down to just the one at the last position in the set. This will be sibling 1. Actually the function will be evaluated first, so saying [last()] is the same as [2].
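A sketch that runs against the parent/sibling document above and contrasts the positional predicate with xsl:for-each. The xsl:sort on position() is one common idiom for visiting the nodes nearest-first; it is offered here as an illustration, not as something proposed in the thread.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/parent/sibling[@num = '3']">
    <!-- inside the step, [1] counts backwards from the context node -->
    <xsl:value-of select="preceding-sibling::*[1]/@num"/>
    <xsl:text> | </xsl:text>
    <!-- for-each visits the selected nodes in document order -->
    <xsl:for-each select="preceding-sibling::*">
      <xsl:value-of select="@num"/>
    </xsl:for-each>
    <xsl:text> | </xsl:text>
    <!-- sorting on position() in descending order reverses the visit -->
    <xsl:for-each select="preceding-sibling::*">
      <xsl:sort select="position()" data-type="number"
                order="descending"/>
      <xsl:value-of select="@num"/>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "2 | 12 | 21".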

7.

Picturing the axes

Dave Hartnoll

Perhaps you'll find a picture of all the axes useful. You'll find it at Ken Holman's site.

[Figure: picture of all the axes]

8.

Nearest ancestor

Mike Kay




 > (And just checking ... XPath experts ... 
 > ancestor-or-self::*[@source][last()] will give me the 
 > *closest* 'source' 
 > attribute on an ancestor or self ... not the most distant, 
 > won't it?)
 

It will give you the most distant. To get the innermost one, use

 ancestor-or-self::*[@source][1]
 
 or
 
 (ancestor-or-self::*[@source])[last()]
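The three forms can be compared on a small assumed input (the element names a, b, c and the attribute values are made up):

  <a source="outer"><b source="inner"><c/></b></a>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/a/b/c">
    <!-- last() within the reverse step: the most distant match -->
    <xsl:value-of select="ancestor-or-self::*[@source][last()]/@source"/>
    <xsl:text> </xsl:text>
    <!-- [1] within the reverse step: the closest match -->
    <xsl:value-of select="ancestor-or-self::*[@source][1]/@source"/>
    <xsl:text> </xsl:text>
    <!-- last() on the parenthesized node-set: last in document order -->
    <xsl:value-of select="(ancestor-or-self::*[@source])[last()]/@source"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "outer inner inner".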

9.

following sibling question

Jeni Tennison


> I'm trying to determine when the current node
> has no following siblings. One expression that seems to work is:
>
>     not(following-sibling::*)

Yes. This gathers together all the following sibling elements of the context node into a node set. The not() function converts that node set to a boolean -- true if there are any nodes in a node set and false if there aren't -- and then negates that boolean. So you get true if you don't manage to find any following sibling elements.

> but the following expressions don't "work", and I thought they would
>
>     following-sibling::* = ''

This tests whether any following sibling element has a string value of an empty string. It gathers together a node set of the following sibling elements and compares them each to the empty string. If any of them are the empty string, then it returns true. If there aren't any following sibling elements, or if none of them are empty strings, then it returns false.


>     following-sibling::* = '/..'

This tests whether any following sibling element has a string value of '/..' (you put quotes around it, so it's a string). As above, if there aren't any following sibling elements, or if none of them have the string value '/..', then it returns false.

Possibly what you were trying to test here was:

  following-sibling::* = /..

But = doesn't compare node sets as a whole (i.e. it won't say "the node set of following sibling elements is the same as the empty node set"). Instead it compares pairs of nodes from the two node sets -- it's the same as "is there a node in the node set of following siblings with the same value as one of the nodes in the empty node set?". Since there's nothing in the second node set, the answer is always false.
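A typical use of the working test is to choose between a separator and a terminator. A minimal sketch, assuming a hypothetical list of item elements:

  <list><item>a</item><item>b</item><item>c</item></list>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>
<xsl:strip-space elements="*"/>

<xsl:template match="item">
  <xsl:value-of select="."/>
  <xsl:choose>
    <!-- empty node-set converts to false, so not() gives true
         only for the last element among its siblings -->
    <xsl:when test="not(following-sibling::*)">.</xsl:when>
    <xsl:otherwise>, </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

The output should be "a, b, c.".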

10.

following-sibling problems

Jeni Tennison



> Given an XML fragment like:
>
> <outer><inner>blah</inner> following text.</outer>
>
> Before I can finish processing the "inner" element, I need to
> examine the next sibling node (following-sibling::*[1]?) to
> determine if it is a text node whose contents begin with a
> non-whitespace character.

You're close. following-sibling::*[1] gets the next sibling *element*. You want to get the next sibling node:

  following-sibling::node()[1]

then test whether it's a text node:


  following-sibling::node()[1][self::text()]

Having found it, you can then test whether it starts with a whitespace character:

  contains(' &#x9;&#xA;&#xD;',
           substring(following-sibling::node()[1][self::text()], 1, 1))

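This test will return true if the next sibling node is a text node that starts with a whitespace character. Be aware that it also returns true if there isn't a following sibling node or if it's not a text node, because substring() then returns the empty string and contains() is true for an empty second argument. It returns false only when the next sibling is a text node that starts with a non-whitespace character.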

Michael Kay adds

> * Is my next sibling a text node?
> * If so, does its content begin with a non-whitespace character?

<xsl:if test="following-sibling::node()[1][self::text()]
  [not(translate(substring(.,1,1), ' &#x9;&#xa;&#xd;', '')='')]">
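Put together in a template, the check might look like the following sketch. The bracketed marker strings and the empty text() template are assumptions added so that the result is easy to read; the input is the fragment from the question.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- suppress the built-in copying of text so only the markers appear -->
<xsl:template match="text()"/>

<xsl:template match="inner">
  <!-- the next sibling node, kept only if it is a text node -->
  <xsl:variable name="next"
      select="following-sibling::node()[1][self::text()]"/>
  <xsl:choose>
    <xsl:when test="$next and
        translate(substring($next, 1, 1), ' &#x9;&#xA;&#xD;', '') != ''">
      <xsl:text>[text without leading whitespace]</xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>[leading whitespace, or no text sibling]</xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

For <outer><inner>blah</inner> following text.</outer> the second marker should be printed, because the text node begins with a space.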

11.

Axis clarification

David Carlisle and Mike Kay


I created in an XSLT stylesheet a variable that contains presentation
slides. Each of the slides is a '<div>' element with a 'class' attribute
with the value 'slide' and all these elements have an 'id' attribute with a
unique value. All slides exist at the same level in the hierarchy of the
document.

I wanted to add a 'previous' and 'next' button to each slide that (of
course) links to the previous and next slide in the presentation.

to be able to create the link to the next slide, I used the following lines:

<a>
<xsl:attribute name="href">
#<xsl:value-of select="following-sibling::div[@class =
'slide']/@id"/>
</xsl:attribute>
<img src="images/icons/next.gif" border="0"/>
</a>

This works perfectly; however, for the previous slide I wanted to use:

<a>
<xsl:attribute name="href">
#<xsl:value-of select="preceding-sibling::div[@class =
'slide']/@id"/>
</xsl:attribute>
<img src="images/icons/back.gif" border="0"/>
</a>

This always results in a link to the first slide (msxml). A bell started to
ring and I thought that the listing of the elements probably would be
reversed, so I tried the following:

<a>
<xsl:variable name="pos" select="position()" />
<xsl:attribute name="href">
#<xsl:value-of select="preceding-sibling::div[@class =
'slide' and
position() = $pos -1]/@id"/>
</xsl:attribute>
<img src="images/icons/back.gif" border="0"/>
</a>

This also results in a link to the first slide. Now I have the following
that works fine, but I don't really understand why! Can anybody please
explain this to me?

<a>
<xsl:attribute name="href">
#<xsl:value-of select="preceding-sibling::div[@class =
'slide' and
position() = 1]/@id"/>
</xsl:attribute>
<img src="images/icons/back.gif" border="0"/>
</a>

Answer

You just want

<xsl:value-of select="preceding-sibling::div[@class = 'slide'][1]/@id"/>

to get the nearest previous sibling with slide class.

What you have,

<xsl:value-of select="preceding-sibling::div[@class = 'slide' and
position() = 1]/@id"/>

selects the nearest sibling if it has a slide class and nothing otherwise. If all your div elements have class="slide" then these are equivalent, otherwise not.

In the step preceding-sibling::div you are in a reverse axis, so all the selected nodes are numbered in reverse document order.

So your filter

[@class = 'slide' and position() = 1]

selects all (at most one) of those nodes that satisfy the stated condition.

However, if you do it with two filters, first

preceding-sibling::div[@class = 'slide']

is evaluated. Now the resulting node set is renumbered (still in reverse order, as you are still in the same step), so the [1], or equivalently [position() = 1], selects the first of _these_ elements, i.e. the first element (looking back) with class slide.

In response to a clarification question, David went on:

> I'm still a bit perplexed. 

that's usually the way with my answers until Jeni re-words them (I don't know why I bother:-)

> If you evaluate a nodeset in a string context
> then only the first
> node in the nodeset is considered. 

note that node sets are sets so unordered.

If you evaluate in a string context then the first node in document order is taken.

However, within a step, position refers to the order in the direction specified by the axis that started the step.

> following-sibling::div[@class = 'slide']/@id is the same as
> following-sibling::div[@class = 'slide'][1]/@id.

these are the same, yes.

On the other hand, 
>  preceding-sibling::div[@class = 'slide']/@id is the same as
>  preceding-sibling::div[@class = 'slide'][last()]/@id.

Yes, that [last()] is part of a step using a reverse axis. The "first node" semantics is the same as using (...)[1].

preceding-sibling::div[@class = 'slide']/@id if used as a string selects more than one node (potentially) so it is first evaluated as


(preceding-sibling::div[@class = 'slide']/@id)[1]

which takes the first node from the set in document order.

in this case, that happens to be the same as


(preceding-sibling::div[@class = 'slide'])[1]/@id

as every div which has a class=slide also has an id attribute, so taking the first id attribute is the same as taking the first div and then taking its id.

With the (), the [1] is acting on the node set, so uses document order, but in preceding-sibling::div[@class = 'slide'][1] the [1] is part of the step and so refers to the order specified by the axis of the step. Note that [] appearing in steps and [] being predicates on node sets are in fact completely separate parts of the XPath grammar; they just happen to use the same concrete syntax.

Mike Kay then came in, laser like as ever!

> So it seems as if the reversed axis doesn't come into play - 
> of the entire nodeset, only the first is considered, but it's 
> the first in *document order*. Why?

A positional predicate in a step of a path expression considers the nodes in axis order: so for a reverse axis, [1] selects the last node in document order.

XSLT always processes node-sets in document order, and the conversion of a node-set to a string always uses the node that is first in document order; the axis that was used to select the nodes is irrelevant.
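Pulling the answers together, a previous/next pair might be generated as in the following sketch. The input, the nav and p wrapper elements, and the div with class "note" are assumptions made for the example; the note is there to show that non-slide siblings are skipped.

  <body>
    <div class="slide" id="s1"/>
    <div class="note" id="n1"/>
    <div class="slide" id="s2"/>
    <div class="slide" id="s3"/>
  </body>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/">
  <nav>
    <xsl:apply-templates select="body/div[@class = 'slide']"/>
  </nav>
</xsl:template>

<xsl:template match="div">
  <!-- [1] as a second predicate: the nearest slide in each direction -->
  <xsl:variable name="prev"
      select="preceding-sibling::div[@class = 'slide'][1]"/>
  <xsl:variable name="next"
      select="following-sibling::div[@class = 'slide'][1]"/>
  <p id="{@id}">
    <xsl:if test="$prev">
      <a href="#{$prev/@id}">previous</a>
    </xsl:if>
    <xsl:if test="$next">
      <a href="#{$next/@id}">next</a>
    </xsl:if>
  </p>
</xsl:template>

</xsl:stylesheet>

Slide s2 should link back to #s1 and forward to #s3; s1 gets only a next link (to #s2, skipping the note) and s3 only a previous link.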

12.

Finding the nearest ancestor, one from two

David Carlisle (in JT mode)


Given xml such as:
  define
      choice
        define
          ref

is the nearest ancestor a choice or a define element?

perhaps the most direct way of saying nearest (choice or define) ancestor is a choice would be

    
     ancestor::*[self::define or self::choice][1][self::choice]

ancestor::*

is all the ancestor element nodes, and within this step (only) they will be ordered in reverse document order, as ancestor is a reverse axis. "This step" means up to the end of the expression or to the next /; predicates in [] are part of the step.

[self::define or self::choice] This is a boolean valued predicate, so after evaluating this, the current node list just consists of nodes for which the predicate is true, i.e. define or choice elements.

[1] This is a numeric valued predicate so is [position()=1], and as we are in a reverse axis step this is true for just the innermost element in the current node list, i.e. the innermost define or choice.

[self::choice] This is a node-set predicate so is true just if the node set is non-empty: it will be non-empty if the single node which got past the [1] is a choice and false otherwise, so it is true just if the innermost define-or-choice element is a choice.

David (Jeni) Carlisle :-)

13.

Finding the nearest ancestor, one from two

David Carlisle (in JT mode)


Given XML with this structure, and ref as the current node

  define
      choice
        define
          ref

is the nearest ancestor a choice or a define element?

Perhaps the most direct way of saying nearest (choice or define) ancestor is a choice would be

     ancestor::*[self::define or self::choice][1][self::choice]

ancestor::*

is all the ancestor element nodes, and within this step (only) they will be ordered in reverse document order, as ancestor is a reverse axis. "This step" means up to the end of the expression or to the next /; predicates in [] are part of the step.

  [self::define or self::choice]

this is a boolean valued predicate so after evaluating this, the current node list just consists of nodes for which the predicate is true, ie define or choice elements.

  [1]

This is a numeric valued predicate so is [position()=1], and as we are in a reverse axis step this is true for just the innermost element in the current node list, i.e. the innermost define or choice.

  [self::choice]

This is a node-set predicate so is true just if the node set is non-empty: it will be non-empty if the single node which got past the [1] is a choice and false otherwise, so it is true just if the innermost define-or-choice element is a choice.
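A quick way to try the expression is the following sketch, which assumes the outline above written out as nested elements:

  <define><choice><define><ref/></define></choice></define>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="//ref">
    <!-- name of the innermost define-or-choice ancestor -->
    <xsl:value-of
        select="name(ancestor::*[self::define or self::choice][1])"/>
    <xsl:text> </xsl:text>
    <!-- is that innermost ancestor a choice? -->
    <xsl:value-of select="boolean(
        ancestor::*[self::define or self::choice][1][self::choice])"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

For this input the result should be "define false", since the innermost of the two kinds of ancestor is the define that directly contains ref.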

Jeni

14.

Testing the current node

Jeni Tennison



> How can I define a template for the "anchor" so that I can test if the 
> first "anchor" inside the immediate ancestral "section" is the current 
> "anchor" node?

You can find the first <anchor> inside the immediate ancestral <section> element with:

  ancestor::section[1]/descendant::anchor[1]

You can test whether that <anchor> element is the same as the current <anchor> element with:

  generate-id(current()) =
  generate-id(ancestor::section[1]/descendant::anchor[1])
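A minimal sketch of the test in use. The input, the id attributes and the "first"/"other" labels are assumptions made for the example.

  <section>
    <anchor id="a1"/>
    <section>
      <para><anchor id="a2"/></para>
      <anchor id="a3"/>
    </section>
  </section>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:apply-templates select="//anchor"/>
</xsl:template>

<xsl:template match="anchor">
  <xsl:value-of select="@id"/>
  <xsl:text>:</xsl:text>
  <xsl:choose>
    <!-- same node if and only if the generated ids are equal -->
    <xsl:when test="generate-id(current()) =
        generate-id(ancestor::section[1]/descendant::anchor[1])">
      <xsl:text>first </xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>other </xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Here a1 and a2 should be reported as first (each is the first anchor descendant of its nearest section) and a3 as other.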

15.

Context when processing in sibling axis

J.Pietschmann



> the context of the following-sibling following a preceding-sibling 
> should result you in the context of the current sibling?

It depends. Note:

> The short of it is, I am comparing the preceding-sibling with the 
> current.  The preceding-sibling must contain the code "XYZ" when the 
> current contains the code "WXY".

If you talk about "content", be aware that looking up elements in either the preceding-sibling or the following-sibling axis is likely to produce a node set with several nodes, and the stringification of a node set will result in the string value of the first element in document order. An example XML:

  <foo>
    <bar>1</bar>
    <bar>2</bar>
    <bar id="3">3</bar>
  </foo>

The statement

  <xsl:value-of select="
    /foo/bar[@id='3']/preceding-sibling::bar
    /following-sibling::bar"/>

Will get you a 2, not a 3 as you might expect. If in doubt, use a position predicate

  <xsl:value-of select="
    /foo/bar[@id='3']/preceding-sibling::bar[1]
    /following-sibling::bar[1]"/>

16.

Checking the name

Michael Kay



>
>     .[self::firstname or self::lastname or self::email]
>

Actually you can't write .[...] in XPath 1.0 (a peculiar accident of the syntax rules). But you can just write:

<xsl:if test="self::firstname or self::lastname or self::email">

or if you prefer

<xsl:if test="self::firstname|self::lastname|self::email">

Note that the self::xxx form is a better way of testing the element name than name()='xxx', because (a) it's namespace-aware, and (b) it's easier to optimize. The two reasons are related - the only reason that the test name()='xxx' can't be rewritten as self::xxx is that the former expression has to check the prefix rather than the namespace URI. Implementations are likely to be optimized for checking the URI, because that's what happens most of the time - there are very few operations in XPath that require access to the prefix.
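A sketch of the namespace point. The namespace URI http://example.org/person and the prefixes p and x are hypothetical; what matters is that the source document and the stylesheet use different prefixes for the same URI.

  <p:person xmlns:p="http://example.org/person">
    <p:firstname>Ann</p:firstname>
    <p:age>42</p:age>
  </p:person>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:x="http://example.org/person"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="/*/*">
    <!-- matches on namespace URI plus local name, so the x prefix
         here finds elements written with the p prefix in the source -->
    <xsl:if test="self::x:firstname or self::x:lastname or self::x:email">
      <xsl:value-of select="local-name()"/>
    </xsl:if>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "firstname". A test such as name()='x:firstname' would not succeed here, because name() reports the element's name with the prefix used in the source document.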

17.

Descendent or //

David Carlisle, Andrew Welch



>> //doc//pb[1]
>> will return the first <pb> descendant of all <doc> elements
>
> that would be //doc/descendant::pb[1]
>
>  //doc//pb[1]
>
> returns all pb elements that are descendants of a doc element, that have
> no earlier siblings of the same name.

Well, I had to read that a few times (and do a few tests), but got there in the end... that's a real gotcha.

//doc//pb[1]

will return the first <pb> child of any descendant of <doc>, so you could get many <pb>s for a single <doc>

It's too easy, as it's explicitly highlighted in the spec :-)

http://www.w3.org/TR/xpath20/#abbrev

Note:

The path expression //para[1] does _not_ mean the same as the path expression /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their respective parents.

So is it correct to say that:

(//para)[1]

and

/descendant::para[1]

are equivalent?

yes, although of course the real expansion is

(/descendant-or-self::node()/child::para)[1]

but it comes to the same thing in this case.

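The relative form behaves the same way:

(.//para)[1]

is equivalent to

./descendant::para[1]

and not to

./descendant-or-self::para[1]

which can also select the context node itself, something .//para never does.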
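The difference between the abbreviated and unabbreviated forms is easy to check with a small sketch. The input, including the div wrapper and the n attributes, is an assumption made for the example.

  <doc>
    <div><pb n="1"/><pb n="2"/></div>
    <div><pb n="3"/></div>
  </doc>

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- every pb that is the first pb child of its parent -->
  <xsl:for-each select="//doc//pb[1]">
    <xsl:value-of select="@n"/>
  </xsl:for-each>
  <xsl:text> | </xsl:text>
  <!-- the first pb descendant of each doc -->
  <xsl:for-each select="//doc/descendant::pb[1]">
    <xsl:value-of select="@n"/>
  </xsl:for-each>
  <xsl:text> | </xsl:text>
  <!-- the first node, in document order, of the whole selection -->
  <xsl:for-each select="(//doc//pb)[1]">
    <xsl:value-of select="@n"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This should print "13 | 1 | 1": the abbreviated form returns one pb per parent, while the other two return a single node.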