xslt position function

Position

1. Understanding position()
2. position() shows wrong value
3. position question
4. position function and select all elements
5. Is one tag directly preceding by another
6. Position function not behaving due to context change
7. Position of parent
8. Using position with sorted node
9. position() and last()
10. Position() calculation
11. position() question
12. Understanding positional predicates

1.

Understanding position()

Wendell Piez

Position() can be tricky to use except in very clearly controlled situations. (It's sometimes recommended as an alternative to more expensive operations, but it has its own traps to watch out for.)

The function returns the "context position" of the node being processed. This is tricky because the context position is the position of the node among the nodes it is picked up with, when it's picked up, i.e. the "current node list". And what, exactly, the current node list is, is sometimes a bit hard to see -- especially when things like whitespace-only nodes (which are easy not to see) get involved.

>I am trying to work through an example in Khun Yee Fung's XSLT book.  It is
>not giving me the solution he claims it should, but I do not understand why
>not.
>
>Here is the sample XML:
>[snipped]
>And here is the stylesheet:
>[more snipped]
><xsl:template match='warehouse'>
>   <storage>
>     <xsl:apply-templates select='item/country'/>
>   </storage>
></xsl:template>
>
><xsl:template match='country'>
>   <xsl:copy-of select='.'/>
></xsl:template>
>
><xsl:template match='country[1]'>
>   <first-country>
>     <xsl:copy-of select='.'/>
>   </first-country>
></xsl:template>
>
></xsl:stylesheet>
>
>The output I am getting [using Saxon] is:
><?xml version="1.0" encoding="utf-8"?>
><storage>
>    <first-country>
>       <country>US</country>
>    </first-country>
>    <first-country>
>       <country>Canada</country>
>    </first-country>
></storage>
>
>i.e., the first-country template is getting matched both times, even though
>the XPath expression uses country[1].

This is correct behavior. Each time the country element in your source is selected, it is the first country element among country element children of its parent. So both of them match the expression "child::country[position()=1]" (which is the unabbreviated form of the expression "country[1]").

>   What seems strange to me is that when
>I include the line <xsl:copy-of select='position()'/> at the beginning of
>the template matching country[1], I get a 1 and a 2.

This is also correct (though perhaps a bit arguable). Both 'country' elements were selected by the <xsl:apply-templates select="item/country"/> instruction in the template matching the 'warehouse' element. This is short for select="child::item/child::country". That selection contains both 'country' elements, so it is the current node list, and the two elements get position()=1 and position()=2 within it. The match pattern country[1], by contrast, is tested against each element's own siblings, which is why both elements still match it. (The reason I say it's arguable is that I can see someone saying "but isn't the current node list the list of country children of item children?" But I'm not quite willing to dig into the spec this second and hash out why it's not.)

>I believe I am supposed to get:
><storage>
><first-country>
>   <country>US</country>
></first-country>
><country>Canada</country>
></storage>
>
>Am I doing something wrong, or is the example in Fung's book (Chapter 6, p.
>149) wrong?  How should the stylesheet be structured to get the intended
>output?

I can't speak to Fung's book since I don't have a copy. :-(

But a more robust match to do what you want would be

<xsl:template match="country[not(preceding::country)]" >

or, if you're concerned about performance (since this match will be slow on large documents), you can try

<xsl:template match="country[not(preceding::country[1])]" >

which will allow some processors to optimize the test somewhat.
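
Putting the first suggestion into a complete stylesheet gives a minimal sketch like the one below. The input shape shown in the comment is an assumption, since the original sample was snipped from the question.

```xml
<?xml version="1.0"?>
<!-- Assumed input (hypothetical, the original sample was snipped):
     <warehouse>
       <item><country>US</country></item>
       <item><country>Canada</country></item>
     </warehouse> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="warehouse">
    <storage>
      <xsl:apply-templates select="item/country"/>
    </storage>
  </xsl:template>

  <!-- Default priority 0: used for every country that the
       more specific template below does not claim. -->
  <xsl:template match="country">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Default priority 0.5: matches only the country that has no
       country anywhere before it in document order. -->
  <xsl:template match="country[not(preceding::country)]">
    <first-country>
      <xsl:copy-of select="."/>
    </first-country>
  </xsl:template>

</xsl:stylesheet>
```

With the assumed input, only the US element has no preceding country, so it alone should be wrapped in first-country.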

2.

position() shows wrong value

Jeni Tennison

Consider this input xml:

<uml>
  <package id="1"/>
  <package id="2"/>
  <package id="3">
    <package id="31"/>
    <package id="32"/>
  </package>
  <package id="4">
    <package id="41">
      <package id="411"/>
      <package id="412"/>
      <package id="413"/>
    </package>
  </package>
  <package id="5"/>
</uml>

Now I want to print the position of every 'package' element:

  <xsl:template match="package">
    <xsl:copy>
      <xsl:copy-of select="@id"/>
      <xsl:attribute name="position"><xsl:value-of
select="position()"/></xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>    
  </xsl:template>

And the result under Saxon and Xalan is:

  <result>
    <package id="1" position="1"/>
    <package id="2" position="2"/>
  <package id="3" position="3">

> please take a look at this xsl fragment, maybe you find a simple
> explanation to strange behaviour of position() function for nested
> elements.

Answer.

When an XSLT processor builds a tree from an XML document, all the text in the document is included in the tree. This includes "insignificant whitespace" that appears between element nodes (purely to make the document more readable).

So when your document is converted to a tree, the tree actually looks like:

  uml
   +- text: "&#xA;  "
   +- package
   +- text: "&#xA;  "
   +- package
   +- text: "&#xA;  "
   +- package
   |   +- text: "&#xA;    "
   |   +- package
   |   +- text: "&#xA;    "
   |   +- ...
   +- ...

When you apply templates with just <xsl:apply-templates /> then the processor selects *all* the children of the current node: that includes all the whitespace. If you number the children, 1 is a text node (whitespace), 2 is a package element, 3 is another text node, 4 is a package element and so on. This is why you only get even numbers when you use <xsl:apply-templates />.

When you apply templates with <xsl:apply-templates select="package" /> then the processor selects only the package elements. If you number the children, 1 is the first child package element, 2 is the second child package element and so on. This is why you get the correct numbering in this case.

There are three solutions to your problem.

First, you could always select the nodes that you actually want to process (the package elements), so that you get numbering amongst those nodes.

Second, you could strip out the insignificant whitespace from the uml and package elements using the <xsl:strip-space> declaration. At the top level of your stylesheet, put:

<xsl:strip-space elements="uml package" />

This tells the processor to strip out all the whitespace-only text nodes that are children of uml or package elements. Then, when you use <xsl:apply-templates> without a select attribute, it will still select all the children of the uml or package element, but the tree won't contain any whitespace so the only children selected will be package elements.

Third, rather than using the position() function to get numbers for the package elements, you could use the <xsl:number> instruction:

  <xsl:attribute name="position">
    <xsl:number />
  </xsl:attribute>

By default, the <xsl:number> element counts only those nodes that are of the same kind (and have the same name, if they're elements) as the node you're numbering, so it will ignore the text nodes in the tree. If you use this option, the numbering will always reflect the position of the elements in the source tree, so might not give you the numbering you expect if you sort the packages into some other order to process them.

The second option is probably the best because as well as making your code simpler, it also makes the tree smaller, which cuts down on the amount of memory the processor requires, which probably speeds up the processing (though probably unnoticeably unless your document is huge).
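
As a rough sketch of the second option, the stylesheet below strips the whitespace and then uses a plain <xsl:apply-templates/>. It also writes an <xsl:number> value next to position() for comparison. The 'result' wrapper comes from the output shown in the question; the 'number' attribute name is made up for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Remove whitespace-only text nodes under uml and package, so
       their only children are package elements. -->
  <xsl:strip-space elements="uml package"/>

  <xsl:template match="uml">
    <result>
      <xsl:apply-templates/>
    </result>
  </xsl:template>

  <xsl:template match="package">
    <xsl:copy>
      <xsl:copy-of select="@id"/>
      <!-- Position within the node list selected by the caller. -->
      <xsl:attribute name="position">
        <xsl:value-of select="position()"/>
      </xsl:attribute>
      <!-- Position among package siblings in the source tree
           ('number' is a hypothetical attribute name). -->
      <xsl:attribute name="number">
        <xsl:number/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the sample document the two attributes should agree on every element (1, 2, 3, ... at each level). They would start to differ if the packages were sorted before processing.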

3.

position question

Michael Kay

 
> I'm a little bit confused about the return value of position().
> Given the following xml fragment: 
<list> 
  <item name="one"/>
  <item name="two"/> 
  <item name="three"/> 
</list> 
    

The <list> element has seven children, four of which are text nodes containing whitespace only. These are included in the numbering if they are included in the selection. You can either eliminate the whitespace nodes using <xsl:strip-space>, or you can select only the element children (or only the <item> children) for numbering.
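
The counts are easy to check with a small stylesheet. This is an illustrative sketch that assumes the parser hands the whitespace-only text nodes to the processor, which some configurations do not.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="list">
    <!-- All children, including whitespace-only text nodes. -->
    <xsl:text>all children: </xsl:text>
    <xsl:value-of select="count(node())"/>
    <!-- Element children only. -->
    <xsl:text>&#xA;element children: </xsl:text>
    <xsl:value-of select="count(*)"/>
    <!-- Text node children only. -->
    <xsl:text>&#xA;text children: </xsl:text>
    <xsl:value-of select="count(text())"/>
    <xsl:text>&#xA;</xsl:text>

    <!-- Selecting only the items numbers them 1, 2, 3. -->
    <xsl:for-each select="item">
      <xsl:value-of select="concat(position(), ': ', @name, '&#xA;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```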

4.

position function and select all elements

Gary L Peskin

Someone explain this to me:

<xsl:template match="/">

	<xsl:variable name="all_elements" select="//*"/>
	<xsl:variable name="first_3_elements" 
    select="//*[position() &lt; 3]"/>

	<xsl:text>&#xA;</xsl:text>
	<xsl:comment> selecting as $first_3_elements </xsl:comment>
	<xsl:for-each select="$first_3_elements">
		<xsl:value-of 
    select="concat('&#xA;node ',position()
     ,'=',name(),': ',normalize-space(text()))"/>
	</xsl:for-each>

	<xsl:text>&#xA;</xsl:text>
	<xsl:comment> 
 selecting as $all_elements[position() &lt; 3] </xsl:comment>
	<xsl:for-each select="$all_elements[position() &lt; 3]">
		<xsl:value-of 
         select="concat('&#xA;node ',position(),'=',
             name(),': ',normalize-space(text()))"/>
	</xsl:for-each>

	<xsl:text>&#xA;</xsl:text>
	<xsl:comment> selecting directly 
     as //*[position() &lt; 3] </xsl:comment>
	<xsl:for-each 
          select="//*[position() &lt; 3]">
		<xsl:value-of 
            select="concat('&#xA;node ',position(),'=',
                  name(),': ',normalize-space(text()))"/>
	</xsl:for-each>

</xsl:template>

The first and 3rd methods return all elements, regardless of position. The second method returns only the first 2 elements.

Answer

This points out the difference between using a predicate as part of a location step (Xpath production [4]) and using a predicate as part of a FilterExpr (Xpath production [20]).

In the case of //*[position() &lt; 3], 
this is an abbreviation for
/descendant-or-self::node()/child::*[position() &lt; 3]

In other words, this will select any element that is one of the first two element children of its parent. This is how a predicate is used as part of a location step.

In the case of

$all_elements[position() &lt; 3]

which is equivalent to

(//*)[position() &lt; 3]

, this will select the first two items in the node-set $all_elements.

As Mike Kay mentions in his excellent book (at p. 382 at the bottom), [] binds more tightly than /.

>The first and 3rd methods return all elements, regardless of position.

Well, not really, depending on your input XML. They match any node that is one of the first two children of another node. If none of your nodes have more than two children, the first and 3rd methods will return all elements. Try adding a third child to one of your nodes and you'll see that it won't be returned.
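
A small sketch that shows the two forms side by side. The input in the comment is a hypothetical example chosen so that one element has three children.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input: <r><a/><b/><c/></r> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Predicate inside the location step: position() is counted
         separately among the element children of each parent. -->
    <xsl:text>step predicate: </xsl:text>
    <xsl:value-of select="count(//*[position() &lt; 3])"/>

    <!-- Predicate on the parenthesised expression: position() is
         counted once, across the whole node-set in document order. -->
    <xsl:text>&#xA;filter predicate: </xsl:text>
    <xsl:value-of select="count((//*)[position() &lt; 3])"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For the input in the comment, the step form selects r, a and b (three elements, c being a third child), while the filter form selects only r and a.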

5.

Is one tag directly preceding by another

Mike Brown

> Simplified, part of my XML looks like this:
> 
>    <para>here is some content<button.../> more optional content 
>    <link... /></para>
> 
> My problem is that if there is nothing between button and link (i.e.,
> <button.../><link... />) then I need to do something 
> marginally different
> than if there is intervening content.  Is there any way to identify when
> processing the link template whether it was directly preceded by button.
> Thanks  Eric

'something' intervening = a text node is the 'link' element's preceding sibling. 'nothing' intervening = an element node ('button') is the link element's preceding sibling.

so if you have a template that matches "link" elements, you could probably do something like

<xsl:if test="generate-id(preceding-sibling::node()[1]) =
generate-id(preceding-sibling::*[1])">
  ...
</xsl:if>

to test for the first preceding sibling node and the first preceding element having the same internal ID (and thus being the same node). The predicate [1] on each is needed: without it, generate-id() would look at the node in each set that comes first in document order, which on the preceding-sibling axis is the farthest sibling, whereas [1] picks the nearest one.

Untested code.
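
If the requirement is specifically "the node immediately before link is a button element", a self:: test on the nearest preceding sibling is another way to write it. This is an alternative sketch rather than the code above, it is equally untested, and the two output element names are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="link">
    <xsl:choose>
      <!-- The nearest preceding sibling, of any node kind,
           is a button element: nothing intervenes. -->
      <xsl:when test="preceding-sibling::node()[1][self::button]">
        <adjacent-link/>
      </xsl:when>
      <!-- Text, another element, or nothing at all comes first. -->
      <xsl:otherwise>
        <separated-link/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Note that whitespace between the two tags is still a text node, so it counts as intervening content unless it is stripped.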

6.

Position function not behaving due to context change

Mike Brown



> <xsl:template name="comma-block">
> <xsl:for-each select=".">
>         <xsl:value-of select="."/>
> <xsl:if test="not(position()=last())">, </xsl:if>
> </xsl:for-each>
> </xsl:template>
> 
> <xsl:template match="preferred-locations">
> <xsl:element name="p">
> Preferred Locations:
> <xsl:call-template name="comma-block"/>
> </xsl:element>
> </xsl:template>
> 
> now when I call the template it correct outputs the correct nodes but
> the xsl:if test doesn't work  .... position==last for all nodes.
> 
> When I inline the above code instead of using call-template the
> xsl:if test works as expected.
	

You are probably assuming that the current node list is defined by the template match pattern. Sections 1 (starting around "A template is instantiated for...") and 5.1 (one paragraph) of the XSLT spec explain the actual processing model.

The node-set that has been selected for processing is the current node list. There is always a current node list, and from that list there is always a current node being processed. You don't have direct access to the list, but you can get some information about it via functions like position(), last(), and count(). You do have access to the current node ('.' in your patterns and expressions).

The current node list starts out as being just the root node. The best matching template for that node is instantiated, and processing ends. For further node processing to occur, the template must contain an xsl:apply-templates or xsl:for-each instruction, either of which will select a new set of nodes for processing. In the case of apply-templates, the best matching template for each node is determined, and the instructions therein are executed. In the case of for-each, the content of the for-each element is the template used for each node.

Each selected node is processed one by one, each becoming the current node while the best template is instantiated, and the current node list staying the same (the set that was originally selected). When you do a call-template, variables from the calling-template go out of scope, but the current node and current node list stay the same.

Somehow you are getting to a point where you have at least one preferred-locations element being processed. How are you getting there? Elsewhere in your stylesheet there must be an xsl:apply-templates that is selecting preferred-locations elements. It is there that the current node list is being established. It is relative to this list that position() and last() will operate outside of that xsl:for-each in your named template. Inside the xsl:for-each, you've reset the current node list to the current node only, so position() and last() are both going to be 1 no matter how many times you call that template.

At the point where you are selecting the preferred-locations elements for processing, you'll want to put

<xsl:element name="p">
  <xsl:text>Preferred Locations: </xsl:text>
  <xsl:call-template name="comma-block">
    <xsl:with-param name="nodes" select="preferred-locations"/>
  </xsl:call-template>
</xsl:element>

and then have

<xsl:template name="comma-block">
  <xsl:param name="nodes"/>
  <xsl:for-each select="$nodes">
    <xsl:value-of select="."/>
    <xsl:if test="position() != last()">, </xsl:if>
  </xsl:for-each>
</xsl:template>

There is no need for a template to match the preferred-locations elements because you're using the contents of the for-each as the template for them.
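
To watch the two node lists side by side, a throwaway stylesheet along these lines may help. It is only a sketch: the template name 'show' is made up, and it assumes the input contains several preferred-locations elements somewhere in the document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- This for-each establishes the current node list. -->
    <xsl:for-each select="//preferred-locations">
      <xsl:call-template name="show"/>
    </xsl:for-each>
  </xsl:template>

  <xsl:template name="show">
    <!-- call-template keeps the caller's node list, so these
         values run from 1 up to the number of elements selected. -->
    <xsl:value-of
        select="concat('outer: ', position(), ' of ', last(), '&#xA;')"/>
    <!-- select="." makes a new one-node list, so this always
         reports 1 of 1. -->
    <xsl:for-each select=".">
      <xsl:value-of
          select="concat('inner: ', position(), ' of ', last(), '&#xA;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```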

7.

Position of parent

Joe English


> Can anyone provide me with the syntax for getting the position() value of
> the current nodes' parent node (and the parent parent etc. position()
> value).  I seem only able to return the current position().
	

The position() is not an intrinsic property of a node -- it only makes sense to ask for the position() of a node _in the context of some node list_. (I'm using the term "node list" to mean "a node set with an associated order", e.g., proximity position order, document order, an <xsl:sort...>-defined order, etc.)

Most likely you want the position of the parent node with respect to its siblings, in document order; you can compute this with:

	1 + count(parent::*/preceding-sibling::*)
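
The same counting idea extends one level further up by adding another parent::* step. The following is a minimal sketch that reports the three values for every element in a document; the output labels are arbitrary.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="*">
    <xsl:value-of select="name()"/>
    <!-- Position of this element among its sibling elements. -->
    <xsl:text>: own=</xsl:text>
    <xsl:value-of select="1 + count(preceding-sibling::*)"/>
    <!-- Position of the parent among its sibling elements. -->
    <xsl:text> parent=</xsl:text>
    <xsl:value-of select="1 + count(parent::*/preceding-sibling::*)"/>
    <!-- Position of the grandparent among its sibling elements. -->
    <xsl:text> grandparent=</xsl:text>
    <xsl:value-of
        select="1 + count(parent::*/parent::*/preceding-sibling::*)"/>
    <xsl:text>&#xA;</xsl:text>
    <xsl:apply-templates select="*"/>
  </xsl:template>

</xsl:stylesheet>
```

Where an element has no parent or grandparent element, the node-set is empty and the expression simply yields 1, so treat the value as meaningful only when that ancestor exists.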

8.

Using position with sorted node

Michael Kay

> sorting the node list
> makes accessing the preceding nodes rather tricky since the
> preceding and
> preceding-sibling axes work on the document and not the
> sorted node list.

I think the only way of accessing the preceding node in sorted order is to create a sorted copy of the original data and then process this using the node-set() extension function. Either that, or find a different solution to the requirement.
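
A sketch of the sorted-copy approach, under two assumptions: that the processor supports the EXSLT node-set() function in the http://exslt.org/common namespace (other processors offer an equivalent under their own namespace), and that the input is a hypothetical list of item elements with a name attribute.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:template match="list">
    <!-- Build a sorted copy of the items as a result tree fragment. -->
    <xsl:variable name="sorted">
      <xsl:for-each select="item">
        <xsl:sort select="@name"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:variable>

    <!-- Convert the fragment to a node-set. In the copy, the items
         are siblings in sorted order, so preceding-sibling now
         follows the sort order. -->
    <xsl:for-each select="exsl:node-set($sorted)/item">
      <xsl:value-of select="@name"/>
      <xsl:text> comes after: </xsl:text>
      <xsl:value-of select="preceding-sibling::item[1]/@name"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the first item in sorted order there is no preceding sibling, so the second value is empty.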

9.

position() and last()

Michael Kay



> I got several problems with position() and last(). The result of 
> position() is twice of the real position and the result of last() is 
> twice of  value or twice of the real value +1. 

When you do <xsl:apply-templates/> you don't only select the child elements of the context node, you also select the whitespace text nodes that lie between the elements. Usually the elements are in positions 2,4,6, and the whitespace text nodes are in positions 1,3,5,7.

If the whitespace isn't significant, get rid of it using <xsl:strip-space elements="*"/>

Alternatively use <xsl:apply-templates select="*"/> to select only the element children.

10.

Position() calculation

Michael Kay



>  For example,
>  
> <element>
>  <A/>
>  <B/>
>  <A/>
>  <C/>
>  <C/>
>  <D/>
> </element>
>  
> I want to get the position first element whoes name is not A and B, 
> but C & D. So it's 4 for this tree segment.
> How can I write the xslt to figure out this ?

count((C|D)[1]/preceding-sibling::*) + 1

alternatively:

<xsl:for-each select="(C|D)[1]">
  <xsl:number count="*"/>
</xsl:for-each>
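
As a complete stylesheet, the first expression might be wrapped as in the sketch below. The guard is an addition of mine: without it, an element with no C or D child would report 1, because the count of an empty node-set is 0.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="element">
    <!-- Only report a position when a C or D child exists. -->
    <xsl:if test="C | D">
      <!-- (C|D)[1] is the first C or D child in document order;
           its position is the number of preceding sibling
           elements plus one. -->
      <xsl:value-of
          select="count((C | D)[1]/preceding-sibling::*) + 1"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

For the sample in the question, three elements (A, B, A) precede the first C, so the result is 4.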

11.

position() question

Michael Kay



> I want to get the position first element whoes name is not A and B, 
> but C & D. So it's 4 for this tree segment.

>  example,
>  
> <element>
>  <A/>
>  <B/>
>  <A/>
>  <C/>
>  <C/>
>  <D/>
> </element>
>  

count((C|D)[1]/preceding-sibling::*) + 1

alternatively:

<xsl:for-each select="(C|D)[1]">
  <xsl:number count="*"/>
</xsl:for-each>

12.

Understanding positional predicates

Josh Canfield and Michael Kay

Take a look at the W3C XPath 1.0 Recommendation, section 2.1 (Location Steps):

"The initial node-set is filtered by the first predicate to generate a new node-set; this new node-set is then filtered using the second predicate, and so on. The final node-set is the node-set selected by the location step. The axis affects how the expression in each predicate is evaluated and so the semantics of a predicate is defined with respect to an axis. See [2.4 Predicates]."

This clearly lays out how multiple predicates are evaluated, and what their node-set context is.

Try thinking about it as layers of filters.

The first predicate [something] returns a node-set containing only nodes which have a child element named "something". The second predicate is operating on this new node-set, so [2] returns the second node in that list.

Consider the following xml:

<nodes>
  <b/>
  <b id="1"><something id="1.1"/></b>
  <b id="2"/>
  <b id="3"><something id="3.1"/></b>
</nodes>

this xsl:

<xsl:copy-of select="/nodes/b[something]"/>

would output

<b id="1"><something id="1.1"/></b>
<b id="3"><something id="3.1"/></b>

this xsl:

<xsl:copy-of select="/nodes/b[something][2]"/>

would output

<b id="3"><something id="3.1"/></b>

MK offers

> b[something][2] - match the 'b' which is the second sibling 'b' if 
> 'something' evaluates to true. 

No, it is 'match the second sibling b for which "something" is true'.

1. Take all the b's

2. Remove those for which predicate 1 is false

3. Take the 2nd of those that remain
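
Because the predicates are applied left to right, swapping them changes the result. The sketch below runs both orders against the nodes example from this section; the wrapper element names are invented for the illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <out>
      <!-- Keep the b elements that have a 'something' child,
           then take the second of those. -->
      <filter-then-index>
        <xsl:copy-of select="/nodes/b[something][2]"/>
      </filter-then-index>
      <!-- Take the second b, then keep it only if it has a
           'something' child. -->
      <index-then-filter>
        <xsl:copy-of select="/nodes/b[2][something]"/>
      </index-then-filter>
    </out>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input, the first expression copies the b with id="3", while the second copies the b with id="1", which is the second b overall and does have a 'something' child.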