xslt2 functions

1. tokenize over multiple elements
2. Spurious spaces in function output
3. xsl:function, user defined functions
4. Sorting
5. Including CSS files
6. calculate depth of an xml-tree
7. Index-of, using nodes
8. Multiple string replacements
9. Insert a character every nth character
10. Counting characters
11. Match an element and last two words of preceding element content
12. Find Word frequency [tokenise]
13. Configuration file as unparsed-text()
14. index-of function
15. sequence()
16. Escaping quotes
17. Fast node comparison
18. URI Escaping
19. Data types in functions
20. Document Crossreferences. Keys?
21. doc-available()
22. Find longest row (max function)
23. Hex to decimal conversion
24. intersect function
25. How to use generate-id() inside an xsl:function without a node available?
26. Understanding position()
27. Count all caution elements
28. Document available?
29. for-each context
30. Function with variable number of arguments
31. Check for duplicate ID values across files
32. Intersection 2.0

1.

tokenize over multiple elements

Andrew Welch


I need to tokenize a string that may span
multiple text nodes or elements.  The tokenize() function won’t take a
sequence of more than one as its first argument, and I can’t figure out
how to concatenate the values of the nodes in the set in situ (I’ve even
tried a FLWR expression!).

 tokenize( current-group()[position() > 1], '\s*;\s*' )

is what I have right now.  I’ve tried wrapping the sequence in concat()
(which wants more than one argument), in string-join(), and a FLWR that
just resulted in a sequence of strings rather than a concatenation.

How about:

current-group()[position() > 1]/tokenize(. , '\s*;\s*' )
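Note that the expression above tokenizes each item separately. If a single token can straddle a node boundary, another option is to join the items first and tokenize the result. The following is only a sketch: the $items sequence is made-up data standing in for current-group()[position() > 1], and it relies on string-join() taking a separator as its second argument in XPath 2.0.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical data: the token "b c" is split across the first two items -->
  <xsl:variable name="items" select="('a; b', ' c; d', '; e')"/>

  <xsl:template match="/">
    <!-- Join with an empty separator, then tokenize the single string -->
    <xsl:for-each select="tokenize(string-join($items, ''), '\s*;\s*')">
      <xsl:value-of select="concat('[', ., ']')"/>
    </xsl:for-each>
    <!-- outputs [a][b c][d][e] -->
  </xsl:template>

</xsl:stylesheet>
```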

2.

Spurious spaces in function output

Michael Kay

Demonstrated using this example.

 
<xsl:stylesheet
   version="2.0"
   xmlns:d="data:,dpc"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   exclude-result-prefixes="d">

<xsl:function name="d:chars">
[<xsl:value-of select="'x'"/>]
</xsl:function>


<xsl:template match="/">
<xsl:value-of select="d:chars()"/>
[<xsl:value-of select="'x'"/>]
</xsl:template>

</xsl:stylesheet>

Running the above stylesheet on itself:

bash-2.05b$ saxon8 bugchar.xsl bugchar.xsl 

produces the following output:

<?xml version="1.0" encoding="UTF-8"?> 
[ x ]

[x]

Note the d:chars function produces [ x ] with spaces around the x.

MK replies

This isn't in fact a bug, it's just a surprising consequence of the current language specification. If a function produces as its result a sequence of text nodes, and this sequence is then displayed using xsl:value-of, the xsl:value-of instruction atomizes the sequence of text nodes into a sequence of strings, and the strings are then space-separated.

The explanation of this effect is:

The function returns a sequence of three text nodes, whose contents are '#[', 'x', and ']#' where # represents a newline.

The template uses xsl:value-of with a select expression that selects this sequence of text nodes. By default, xsl:value-of atomizes the value of the select expression, and then inserts spaces between adjacent items. Atomization produces a sequence of three strings, and after adding spaces the result is a text node containing '#[_x_]#' where # represents newline and _ represents space. Using xsl:copy-of or xsl:sequence in place of xsl:value-of would solve the problem (because when three text nodes are added to a document node, they are combined without any separator); alternatively use xsl:value-of separator="".

You can force the text nodes to be concatenated by writing the function as:

<xsl:function name="d:chars" as="xs:string">
[<xsl:value-of select="'x'"/>]
</xsl:function>

Alternatively, use separator="" on the xsl:value-of instruction.
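To see the three behaviours side by side, here is a small sketch. The function body is written with xsl:text so that no newlines get into the text nodes; the wrapper elements out, a, b and c are invented for the illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:d="data:,dpc"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="d">

  <!-- Returns three separate text nodes: "[", "x" and "]" -->
  <xsl:function name="d:chars">
    <xsl:text>[</xsl:text>
    <xsl:value-of select="'x'"/>
    <xsl:text>]</xsl:text>
  </xsl:function>

  <xsl:template match="/">
    <out>
      <!-- default separator is a single space: [ x ] -->
      <a><xsl:value-of select="d:chars()"/></a>
      <!-- empty separator: [x] -->
      <b><xsl:value-of select="d:chars()" separator=""/></b>
      <!-- text nodes added directly to the result tree are merged: [x] -->
      <c><xsl:sequence select="d:chars()"/></c>
    </out>
  </xsl:template>

</xsl:stylesheet>
```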

3.

xsl:function, user defined functions

Jeni Tennison




> Would someone please give me a simple example of creating a user
> defined function using <xsl:function>
>
> I'm having a really hard time finding complete examples for some
> reason.

I suspect that's because <xsl:function> was only introduced in XSLT 2.0, which isn't even a Last Call Working Draft yet and has very few implementations.

<xsl:function> works in roughly the same way as <func:function> as defined in EXSLT (http://www.exslt.org/func/elements/function). You can find lots of examples of <func:function> on the EXSLT site -- most of the functions defined there have a <func:function> implementation.

An example is the following fairly useless function that adds two things together:

<xsl:function name="my:add">
  <xsl:param name="val1" />
  <xsl:param name="val2" />
  <!-- early XSLT 2.0 drafts used xsl:result here; the final
       Recommendation uses xsl:sequence -->
  <xsl:sequence select="$val1 + $val2" />
</xsl:function>

All functions you define with <xsl:function> have to be in some namespace, which means that their names are always qualified. In this example, you have to have the 'my' prefix associated with a namespace at the top of your stylesheet.

You can call the function with, for example:

  <xsl:value-of select="my:add(1, 3)" />

to get the value 4.

If you want, you can constrain the types of the parameters to the function and declare the type of the result using 'as' attributes. This will enable/force the implementation to raise type errors if the function is passed the wrong type of arguments or used somewhere that expects something other than a number. For example, to create a my:add() function that will only work with integers:

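<xsl:function name="my:add" as="xs:integer">
  <xsl:param name="val1" as="xs:integer" />
  <xsl:param name="val2" as="xs:integer" />
  <!-- the result type is declared on xsl:function itself; early
       drafts put it on an xsl:result element instead -->
  <xsl:sequence select="$val1 + $val2" />
</xsl:function>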

Note again that the 'xs' prefix has to be associated with the 'http://www.w3.org/2001/XMLSchema' namespace at the top of your stylesheet.

If you're after concrete examples of user-defined functions in use, I used quite a few in some stylesheets I wrote over the weekend, which are available at:

  http://www.lmnl.org/projects/LMNLCreator/LMNLCreator.xsl
  http://www.lmnl.org/projects/LMNLSchema/LMNLNester.xsl

The stylesheets are not run-of-the-mill, but they do use XSLT 2.0 features, including <xsl:function>, quite heavily.

4.

Sorting

Michael Kay

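Early XSLT 2.0 working drafts also introduced a sort() function that takes a named sort key as a parameter, which can be determined at run-time. It was dropped before the final Recommendation, so read the following as draft syntax: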

<xsl:for-each select="sort($x, if (condition1) then 'sortkey1' else
'sortkey2')">
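In the final XSLT 2.0 Recommendation, one way to pick the sort key at run-time is to compute it inside xsl:sort. This is a minimal sketch only; the $sort-by parameter and the items/item/name/price input shape are assumptions made for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter naming the field to sort on -->
  <xsl:param name="sort-by" select="'price'"/>

  <!-- Assumes input like
       <items><item><name>..</name><price>..</price></item>...</items> -->
  <xsl:template match="items">
    <items>
      <xsl:perform-sort select="item">
        <!-- The condition is the same for every item, so all sort key
             values have the same type (all numbers or all strings) -->
        <xsl:sort select="if ($sort-by = 'price')
                          then number(price)
                          else string(name)"/>
      </xsl:perform-sort>
    </items>
  </xsl:template>

</xsl:stylesheet>
```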

5.

Including CSS files

Michael Kay



> I'm parameterising a CSS file name, and want to include it
> via <style/> tags or externally.

> If its included, I can't use an entity or the document() 
> function, since its not an xml file :-)

You can use the unparsed-text() function in XSLT 2.0.
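A sketch of both options from the question, assuming HTML output. The parameter names css-file and embed-css are invented here; a relative URI passed to unparsed-text() is resolved against the base URI of the stylesheet.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Hypothetical parameters: the CSS file and whether to embed it -->
  <xsl:param name="css-file" select="'style.css'"/>
  <xsl:param name="embed-css" select="true()"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Example</title>
        <xsl:choose>
          <xsl:when test="$embed-css">
            <!-- Copy the text of the CSS file into a style element -->
            <style type="text/css">
              <xsl:value-of select="unparsed-text($css-file)"/>
            </style>
          </xsl:when>
          <xsl:otherwise>
            <!-- Or just link to it -->
            <link rel="stylesheet" type="text/css" href="{$css-file}"/>
          </xsl:otherwise>
        </xsl:choose>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```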

6.

calculate depth of an xml-tree

Michael Kay

<xsl:for-each select="//*">
  <xsl:sort select="count(ancestor-or-self::*)" data-type="number"/>
  <xsl:if test="position()=last()">
    <xsl:value-of select="count(ancestor-or-self::*)"/>
  </xsl:if>
</xsl:for-each>

Or in 2.0:

max(for $n in //* return count($n/ancestor-or-self::*))

You can possibly speed it up a little by excluding non-leaf elements, or by using a recursive template that supplies the depth of a node as a parameter, avoiding the need to count ancestors of every node.
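The recursive idea can also be written as a 2.0 stylesheet function that computes the depth bottom-up, so no ancestors are counted at all. This is a sketch under my own naming: the f prefix, its namespace URI and the function name f:depth are all made up.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Depth of an element = 1 + the greatest depth among its child
       elements; the leading 0 covers elements with no children -->
  <xsl:function name="f:depth" as="xs:integer">
    <xsl:param name="e" as="element()"/>
    <xsl:sequence select="1 + max((0, for $c in $e/* return f:depth($c)))"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Depth of the whole tree, starting from the document element -->
    <xsl:value-of select="f:depth(*)"/>
  </xsl:template>

</xsl:stylesheet>
```

The leaf-only variant of the one-liner would be max(//*[not(*)]/count(ancestor-or-self::*)).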

7.

Index-of, using nodes

Mike Kay


> So I can't find index-of (node-set, node) ?
>   For this instance it would have been useful.

You can write it yourself as:

<xsl:function name="my:index-of-node" as="xs:integer*">
  <!-- stylesheet functions must have a prefixed name; "my" stands for
       any prefix bound to your own namespace -->
  <xsl:param name="node-set" as="node()*"/>
  <xsl:param name="node" as="node()"/>
  <xsl:sequence select="
    for $i in 1 to count($node-set) return
       if ($node-set[$i] is $node) then $i else ()"/>
</xsl:function>

8.

Multiple string replacements

David Carlisle

If you need 8 replacements then nested invocations of replace should do it:

  replace(replace(replace(....)
                  ...
          '\{', '{day}{of}')

This is still a little bit messy, and of course you don't need to explicitly nest the replace functions: you can get the system to do it for you.

This defines an x:replace function that takes an input string and then a list of replacement pairs; it just recursively calls replace() until the list is done:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:x="data:,x" version="2.0">

<xsl:output method="text"/>

<xsl:function name="x:replace">
<xsl:param name="string"/>
<xsl:param name="list"/>
<xsl:value-of select="
  if(empty($list)) 
   then $string 
   else
x:replace(replace($string,$list[1],$list[2]),$list[position()&gt;2])"/>
</xsl:function>

<xsl:template match="/">
<xsl:value-of select="
  x:replace('one two three four',
  ('o', '@',
   'tw','TW',
   'e', '3'))
 "/>
</xsl:template>



</xsl:stylesheet>

$ saxon7 rep.xsl rep.xsl
@n3 TW@ thr33 f@ur

9.

Insert a character every nth character

Mike Kay / DaveP

E.g. insert a space every tenth character or, as in this example, insert a comma every 4th character. gpSz is the group size, s is the source string to be split.

Mike gave an untested solution, which I show here after testing it.

  <xsl:variable name="s"
      select="'A long string with commas inserted every 4th character'"/>
  <xsl:variable name="gpSz" select="4"/>

  <!-- ranging to (length - 1) idiv $gpSz avoids an empty final group,
       and so a trailing comma, when the length is a multiple of $gpSz -->
  <xsl:value-of select="string-join(
      for $i in 0 to (string-length($s) - 1) idiv $gpSz
      return substring($s, $i * $gpSz + 1, $gpSz), ',')"/>

10.

Counting characters

Michael Kay


> What I want to do is to count the number of characters in a  
> text node and all previous text nodes children of the current 
> text node's parent.

Well the XPath 2.0 solution is

sum(for $i in preceding-sibling::text() return string-length($i))

For XSLT 1.0 it's much more difficult, it's the classic problem of summing a calculated value over a node-set. There are several workable solutions:

- Construct a result tree fragment containing the computed values, then use the sum() and xx:node-set() functions to do the summation.
- Use a recursive template
- Use Dimitre's FXSL library.

11.

Match an element and last two words of preceding element content

Michael Kay

In XSLT 2.0 (actually XPath 2.0) there is a standard function tokenize() which does the job for you. You can then select the last two words using

<xsl:value-of 
    select="tokenize(preceding-sibling::node()[1][self::text()],
   '\s+')[position() > last() - 2]"/>

12.

Find Word frequency [tokenise]

Michael Kay David Carlisle


if I have:

<foo>
<blort> This is a <wibble>Test</wibble>, only a test!</blort>
<blort> This really is a <wibble>great big test</wibble>, only a test!</blort>
</foo>

I don't know that foo|wibble|blort  will be the element names.

But I want to produce a word frequency list:

a  -- 4
test  -- 4
only -- 2
is  -- 2
this  -- 2
big -- 1
great -- 1
really -- 1

Mike Kay offers

Firstly, taking the string value of the element gets rid of all the element markup, which doesn't seem to play any role in this problem.

Then you can tokenize using the tokenize() function, being as clever as you care about how to recognize word boundaries and inter-word space.

Then you can convert everything to lower case using the lower-case() function.

Then you can group using for-each-group.

Sorted by descending frequency:

<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:template match="/">
<frequencies>
<xsl:for-each-group group-by="." select="
   for $w in tokenize(string(.), '[\s.?!,]+')[.] return lower-case($w)">
  <xsl:sort select="count(current-group())" order="descending"/>
  <word><xsl:value-of select="current-grouping-key(), '  -  ', count(current-group())"/></word>
</xsl:for-each-group>
</frequencies>
</xsl:template>
</xsl:stylesheet>

(The predicate [.] eliminates the zero-length string.)

Here's the start of the output for othello.xml:

<?xml version="1.0" encoding="UTF-8"?>
<frequencies>
   <word>i   -   816</word>
   <word>and   -   794</word>
   <word>the   -   762</word>
   <word>to   -   591</word>
   <word>of   -   476</word>

David Carlisle offers another xslt 2.0 solution.

   
   <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="2.0">
   
   <xsl:output method="text"/>
   
   <xsl:template match="/">
   <xsl:for-each-group
   select="tokenize(lower-case(.),'(\s|[,.!:;])+')[string(.)]" group-by=".">
   <xsl:sort select="- count(current-group())"/>
   <xsl:value-of
   select="concat(.,' - ',string(count(current-group())),'&#10;')"/>
   </xsl:for-each-group>
   </xsl:template>
   
</xsl:stylesheet>

13.

Configuration file as unparsed-text()

Michael Kay


>I have a "*"-delimited config file for our internal
>app.  The syntax is as follows:
>
>param1*param2*value=3
>param1*param2*param3*value=4
>param1*value=1
>param2*value=3
>
>etc..
>
>What's the best way of doing this? 

This sort of conversion can be done very nicely in XSLT 2.0: I showed a very similar example in my tutorial at XML Europe.

1. read the file using the unparsed-text() function
2. split it into lines using the tokenize() function
3. analyze each line using the <xsl:analyze-string> instruction
4. group the resulting lines hierarchically using the <xsl:for-each-group> instruction, within a recursive template so you do the grouping repeatedly at each level of hierarchy.
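
The four steps might be sketched as follows. This is not the tutorial example mentioned above, just one way of reading the steps: the config-uri parameter, the config wrapper element and the template name group are invented, and it assumes every segment of a line is usable as an XML element name.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <!-- Hypothetical parameter: location of the "*"-delimited file -->
  <xsl:param name="config-uri" select="'app.conf'"/>

  <xsl:template match="/">
    <config>
      <xsl:call-template name="group">
        <!-- steps 1 and 2: read the file, split it into non-blank lines -->
        <xsl:with-param name="lines"
            select="tokenize(unparsed-text($config-uri), '\r?\n')[normalize-space()]"/>
      </xsl:call-template>
    </config>
  </xsl:template>

  <xsl:template name="group">
    <xsl:param name="lines" as="xs:string*"/>
    <!-- step 3: a line with no "*" left is a name=value leaf -->
    <xsl:for-each select="$lines[not(contains(., '*'))]">
      <xsl:analyze-string select="." regex="^([^=]+)=(.*)$">
        <xsl:matching-substring>
          <xsl:element name="{regex-group(1)}">
            <xsl:value-of select="regex-group(2)"/>
          </xsl:element>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:for-each>
    <!-- step 4: group the other lines by their first segment, then
         recurse on what follows the first "*" -->
    <xsl:for-each-group select="$lines[contains(., '*')]"
        group-by="substring-before(., '*')">
      <xsl:element name="{current-grouping-key()}">
        <xsl:call-template name="group">
          <xsl:with-param name="lines"
              select="for $l in current-group() return substring-after($l, '*')"/>
        </xsl:call-template>
      </xsl:element>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```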

14.

index-of function

Michael Kay



> The index-of function as specified in the XPath 2.0 spec says that if 
> the input string is for example "Hello John" and the search string is 
> "John" the function should return 2.  But  I get an 
> empty sequence as the result. 

You have misunderstood the spec. If the input is a sequence of two strings, ("Hello", "John"), then index-of will return 2. To convert the string "Hello John" into a sequence of two strings, use the tokenize() function.
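
A small sketch of the difference, using the strings from the question:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One string compared as a whole: it is not equal to 'John',
         so the result is the empty sequence and nothing is output -->
    <xsl:value-of select="index-of('Hello John', 'John')"/>
    <!-- A sequence of two strings: outputs 2 -->
    <xsl:value-of select="index-of(tokenize('Hello John', '\s+'), 'John')"/>
  </xsl:template>

</xsl:stylesheet>
```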

15.

sequence()

Michael Kay



>  Is there a function that's concerned about 
> sequence of nodes in Saxon? Like one which returns the orders of 
> nodes.

> For instance, assume a sequence
> "<a>AAA</a><b>BBB</b><c>CCC</c><d>DDD</d>"

> Then the function can return something like:

> <a>AAA</a> => 1;
> <b>BBB</b> => 2;
> <c>CCC</c> => 3;
> <d>DDD</d> => 4;

> If so, how can it be invoked

Appendix C.3 of the Functions and Operators spec shows how to write this as a user-written function:

XSLT implementation

<xsl:function name="eg:index-of-node" as="xs:integer*">
  <xsl:param name="sequence" as="node()*"/>
  <xsl:param name="srch" as="node()"/>
  <xsl:for-each select="$sequence">
    <xsl:if test=". is $srch">
       <xsl:sequence select="position()"/>
    </xsl:if>
  </xsl:for-each>
</xsl:function>

16.

Escaping quotes

Michael Kay

XPath 2.0 allows you to escape the delimiting quotes by doubling them, for example

 "He said: ""I don't"""

You can achieve this escaping using the XPath 2.0 replace() function.
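
For instance, a sketch that doubles every double quote in a string and wraps the result in quotes; the variable $s is made up for the illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical input string containing both kinds of quote -->
  <xsl:variable name="s">He said: "I don't"</xsl:variable>

  <xsl:template match="/">
    <xsl:text>"</xsl:text>
    <!-- replace each " with "" -->
    <xsl:value-of select="replace($s, '&quot;', '&quot;&quot;')"/>
    <xsl:text>"</xsl:text>
    <!-- outputs "He said: ""I don't""" -->
  </xsl:template>

</xsl:stylesheet>
```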

17.

Fast node comparison

Dimitre Novatchev


I'm checking if a number exists within a set of ranges.  The ranges are
stored as a variable:

<xsl:variable name="ranges">
  <range from="988" to="989"/>
  <range from="1008" to="1009"/>
  <range from="1014" to="1014"/>
  <range from="1025" to="1036"/>
  <range from="1038" to="1103"/>
  <range from="1105" to="1116"/>
  <range from="1118" to="1119"/>
  <range from="4150" to="4150"/>
  <range from="8194" to="8197"/>
  ...
</xsl:variable>

-Is there a better way of writing this?

-How efficient is the test?  Does it check each <range> element
sequentially in document order?

> If there are many ranges and you need it to go at better than linear 
> speed, you could code a binary-chop. I think Dimitre has done this in 
> the past, I don't know if it's available in packaged form.

Here are two XSLT 2.0 solutions: a DVC (Divide and Conquer) and BS (Binary Search):

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:f="http://fxsl.sf.net/"
 xmlns:t="http://fxsl.sf.net/test"
 exclude-result-prefixes="f xs t"
 >
  <xsl:output method="text"/>
  
  <xsl:variable name="vRanges" as="element()+">
    <range from="988" to="989"/>
    <range from="1008" to="1009"/>
    <range from="1014" to="1014"/>
    <range from="1025" to="1036"/>
    <range from="1038" to="1103"/>
    <range from="1105" to="1116"/>
    <range from="1118" to="1119"/>
    <range from="4150" to="4150"/>
    <range from="8194" to="8197"/>
  </xsl:variable>
  
  <xsl:template match="/">
    <xsl:value-of select="t:inRangeDVC($vRanges, 8195)"/>, <xsl:text/>
    <xsl:value-of select="t:inRangeBS($vRanges, 8195, 1, count($vRanges))"/>
  </xsl:template>
  
  <xsl:function name="t:inRangeDVC" as="xs:boolean">
    <xsl:param name="pRanges" as="element()*"/>
    <xsl:param name="pVal"/>
  
    <xsl:sequence select=
    "if(empty($pRanges))
       then false()
       else for $cnt in count($pRanges)
             return if($cnt = 1)
                   then $pVal ge xs:integer($pRanges[1]/@from)
                      and 
                        $pVal le xs:integer($pRanges[1]/@to)
                    else for $vHalf in $cnt idiv 2
                      return
                      if(t:inRangeDVC($pRanges[position() le $vHalf], $pVal))
                         then true()
                         else 
                          t:inRangeDVC($pRanges[position() gt $vHalf], $pVal)
                    "
    />
  </xsl:function>
  
  <xsl:function name="t:inRangeBS" as="xs:boolean">
    <xsl:param name="pRanges" as="element()*"/>
    <xsl:param name="pVal"/>
    <xsl:param name="pLow" as="xs:integer"/>
    <xsl:param name="pUp" as="xs:integer"/>

    <xsl:sequence select=
    "if($pLow gt $pUp)
       then false()
       else for $mid in ($pLow + $pUp) idiv 2,
                $v in $pRanges[$mid]
               return
                  if($pVal ge xs:integer($v/@from) 
                       and $pVal le xs:integer($v/@to))
                     then true()
                     else if($pVal lt xs:integer($v/@from))
                          then t:inRangeBS($pRanges, $pVal, $pLow, $mid - 1)
                          else t:inRangeBS($pRanges, $pVal, $mid+1, $pUp)
                        
    "/>
  </xsl:function>
</xsl:stylesheet>

18.

URI Escaping

Michael Kay

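Non-ASCII characters in a URI should be escaped using the %HH convention, rather than using XML escaping.

Drafts of XSLT 2.0 provided a function escape-uri() to achieve this; in the final XPath 2.0 function library the job is split between encode-for-uri(), iri-to-uri() and escape-html-uri().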

In 1.0, it happens automatically when you use the HTML serialization method if the URI appears in an attribute such as <a href="..."> that is known to require a URI as its value.
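
A minimal sketch of %HH escaping with encode-for-uri(), which escapes a single URI component such as a query value. The URL and the query text are made up for the illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Only the query value is escaped, not the whole URI -->
    <xsl:value-of select="concat('http://example.org/search?q=',
                                 encode-for-uri('café &amp; crème'))"/>
    <!-- outputs http://example.org/search?q=caf%C3%A9%20%26%20cr%C3%A8me -->
  </xsl:template>

</xsl:stylesheet>
```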

19.

Data types in functions

Michael Kay



> I'm having a few problems with an xsl:function.

> It's supposed to take a numerical parameter and return it multiplied by
> 1.2 and rounded:

> <xsl:function name="my:increase">
>     <xsl:param name="i" />
>     <xsl:value-of select="round(1.2 * $i)" />
> </xsl:function>

Firstly, if you're expecting a numerical parameter it's best to say so:

  <xsl:param name="i" as="xs:double"/>

and if you want to return a numerical result it's best to say so:

  <xsl:function name="my:increase" as="xs:double">

This might be enough to fix the problem (because it will force certain type conversions), and if it doesn't, it will give you error messages that point you closer to the answer.

Secondly, xsl:value-of creates a text node. You don't want a text node here, you want a number. So use xsl:sequence:

<xsl:function name="my:increase" as="xs:double">
     <xsl:param name="i" as="xs:double"/>
     <xsl:sequence select="round(1.2 * $i)" />
</xsl:function>

In most contexts, if you expect a number and provide an untyped text node, the number will be extracted from the text node. But it's better to return the number in the first place.

20.

Document Crossreferences. Keys?

Michael Kay


> Say I have some (grossly simplified, politically incorrect, and 
> exclusive of alternate lifestyles) XML which looks something like 
> this:


> <?xml version="1.0" encoding="UTF-8"?> <People>
>    <Man name="Bob" wife="Alice" birth="1960-08-15"/>
>    <Woman name="Alice" birth="1955-10-26"/> </People>


> To cut a long story short, I have an xsl template which scopes Woman, 
> and I want to set a variable to be that Woman's husband (ie the Man 
> for whom the Woman is the wife).

Keys are usually recommended for performance, but when you're handling cross-references they can also make your code simpler and more understandable.

<xsl:key name="man-by-name" match="Man" use="@name"/> 
<xsl:key name="woman-by-name" match="Woman" use="@name"/> 
<xsl:key name="man-by-wifes-name" match="Man" use="@wife"/>
<xsl:template match="Woman">
  <xsl:apply-templates select="key('man-by-wifes-name', @name)"/>
</xsl:template>

In 2.0 I often write stylesheet functions to encapsulate a relationship (the function name needs a namespace prefix, here my:):

<xsl:function name="my:get-husband" as="element(Man)">
  <xsl:param name="wife" as="element(Woman)"/>
  <!-- there is no context item inside a function, so the document to
       search is passed as the third argument of key() -->
  <xsl:sequence select="key('man-by-wifes-name', $wife/@name, root($wife))"/>
</xsl:function>

You can then use this in path expressions rather like a virtual axis:

<xsl:template match="Woman">
  <xsl:value-of select="my:get-husband(.)/my:get-children(.)/@date-of-birth"/>
</xsl:template>

21.

doc-available()

Michael Kay

> Does doc-available() do anything more than check for
> the files existence?  Using it seems to slow the processing down more
> than expected..

doc-available() checks that the file exists and that it contains well-formed XML (and valid XML if you are validating). It actually builds the tree in memory. When you subsequently call document() or doc(), this work won't be repeated. If you just want to check file existence, calling out to a Java method is going to be rather cheaper.
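
The usual pattern is to guard the doc() call with doc-available(). A minimal sketch; the extra-uri parameter and the result and missing elements are invented for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: a document that may or may not exist -->
  <xsl:param name="extra-uri" select="'extra.xml'"/>

  <xsl:template match="/">
    <result>
      <xsl:choose>
        <xsl:when test="doc-available($extra-uri)">
          <!-- The document is already loaded at this point -->
          <xsl:copy-of select="doc($extra-uri)/*"/>
        </xsl:when>
        <xsl:otherwise>
          <missing href="{$extra-uri}"/>
        </xsl:otherwise>
      </xsl:choose>
    </result>
  </xsl:template>

</xsl:stylesheet>
```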

22.

Find longest row (max function)

Andrew Welch


> We are trying to output tables in XSL-FO, but I do not seem to be able to
> easily find the table row with the maximum number of cells within a table,

If you are generating your XSL-FO using XSLT 1.0 then the usual way is to select all <tr>'s and sort them by the count of their <td>'s, and then pick the first (see the FAQ). If you are using XSLT 2.0 then you can use the max() function, e.g.:

<xsl:variable name="maxCells" select="max(//tr/count(td))"/>
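
If you need the row itself and not just the count, the maximum can be fed back into a predicate. A sketch that reports this per table; it assumes HTML-style table, tr and td elements in no namespace and no nested tables.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//table">
      <!-- Largest number of cells in any row of this table -->
      <xsl:variable name="maxCells" select="max(.//tr/count(td))"/>
      <!-- First row that has that many cells -->
      <xsl:variable name="longestRow"
          select="(.//tr[count(td) eq $maxCells])[1]"/>
      <xsl:value-of select="concat('table ', position(), ': ',
          $maxCells, ' cells, first reached in row ',
          count($longestRow/preceding-sibling::tr) + 1, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```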

23.

Hex to decimal conversion

Michael Kay

The hex-to-decimal conversion: here's a function I wrote to do this:

<xsl:function name="f:hex-to-char" as="xs:integer">
  <xsl:param name="in"/> <!-- e.g. 030C -->
  <xsl:sequence select="
  if (string-length($in) eq 1)
     then f:hex-digit-to-integer($in)
     else 16*f:hex-to-char(substring($in, 1, string-length($in)-1)) +
            f:hex-digit-to-integer(substring($in, string-length($in)))"/>
</xsl:function>

<xsl:function name="f:hex-digit-to-integer" as="xs:integer">
  <xsl:param name="char"/>
  <xsl:sequence 
	  select="string-length(substring-before('0123456789ABCDEF',
	  $char))"/>
</xsl:function>

24.

intersect function

David Carlisle



> Which makes me wonder in what scenario's 'intersect' can be useful.

When you want to know if a node is in two different sets. For example, suppose you have a key that returns some nodes, key('x','a'), and some more nodes, key('x','b'); now, which nodes are returned by both a and b? You can do this in xslt1 as key('x','a')[count(key('x','b'))=count(.|key('x','b'))] but it's rather more readable to say key('x','a') intersect key('x','b').

> (if automatic node-to-value were applied for intersect).

That would get very confusing, especially for text nodes (which many people use interchangeably with strings). You want it to be clear in the syntax whether you are doing identity-equality (so two nodes are only equal if they are the same node) or value-equality, where two items are equal if they have the same string value.

25.

How to use generate-id() inside an xsl:function without a node available?

Dimitre Novatchev



Have an auxiliary function, which creates a new node every time it is
evaluated, for example using:

<xsl:function name="pref:GetNode" as="element()">
<xsl:variable name="myNode" as="element()">
  <someNode/>
</xsl:variable>

 <xsl:copy-of select="$myNode"/>
</xsl:function>

Then in your code use:

 generate-id(pref:GetNode() )

It may even be possible to use only one node (and then immediately delete it as part of the closing of the scope of the function). Looking at the spec, however, I am not sure whether re-using the generated ID of a node which is no longer alive is allowed or not. If it is not allowed, then we have the following *cheap* implementation:

<xsl:function name="pref:myId" as="xs:string">
<xsl:variable name="myNode" as="element()">
   <someNode/>
</xsl:variable>

<xsl:variable name="vdynNode" as="element()">
  <xsl:copy-of select="$myNode"/>
</xsl:variable>

 <xsl:sequence select="generate-id($vdynNode)"/>

</xsl:function>

DC offers

surely you can lose the first variable and write that as

<xsl:function name="pref:myId" as="xs:string">
<xsl:variable name="myNode">x</xsl:variable>

 <xsl:sequence select="generate-id($myNode)"/>

</xsl:function>

MK comes back with

Yes, this is fine. But please get out of the habit of using xsl:value-of when you mean xsl:sequence, and please get into the habit of declaring the types of your functions! This should be

 <xsl:function name="pref:getId" as="xs:string">
   <xsl:sequence select="generate-id(pref:getNode())" />
 </xsl:function>

xsl:value-of is creating a text node, and because of the very identity issues we're discussing, it's very hard to optimize this away: if the return type is declared as xs:string the processor has some chance of recognizing that the text node is going to be atomized as soon as it's created, but really it's better not to create it in the first place.

The cheapest solution is probably a text or comment node rather than an element, something like:

 <xsl:function name="pref:getNode"><xsl:comment/></xsl:function>

 <xsl:function name="pref:getId">
   <xsl:value-of select="generate-id(pref:getNode())" />
 </xsl:function>

Remember that an LRE like <node/> might be creating a lot of namespaces...

26.

Understanding position()

Abel Braaksma


> <xsl:template match="/">
>  <xsl:sequence select="('a', 'b')[fn:position()]"/>
> </xsl:template>
>
> returns ('a', 'b').
>
> Why is this? Everything tells me the last expression should evaluate to 'a'. Did I miss something?
>

It's a classic, and every now and then I still get it wrong (check the archives, I have once or twice asked the same question). The function position() (you don't need to put "fn:" in front of it) returns the position of the context item. Within a predicate, the context changes to each item being tested in turn.

Thus, the expression

('a', 'b')[position()]

a) will test the first item in the sequence 'a' for the predicate [position()] which evaluates to the predicate [1], which is short for [position() = 1], which will return true because the current position is 1.

b) will test the second item in the sequence 'b' for the predicate [position()] which evaluates to the predicate [2], which is short for [position() = 2], which will return true because the current position is 2 (we are at the second item, 'b', remember).

As a consequence, the following:

some-node[position()]

will always evaluate to true, and

(some-sequence)[position()]

will always return the whole sequence, because each separate item will always have the position in the sequence that equals the result of the predicate [position()].

What you want is the following:

<xsl:variable name="pos" select="position()" />
<xsl:sequence select="('a', 'b')[$pos]" />

27.

Count all caution elements

Mike Kay



I have the following XML structure

<book>
<chapter>
<caution/>
<caution/>
<caution/>
</chapter>

<chapter>
<sect1>
<caution/>
</sect1>
<caution/>
<caution/>
</chapter>
</book>

>What I need is an XPATH statement that counts the number of preceding
>cautions in each chapter.  From any given chapter element.

In XPath 2.0,

count(preceding::caution intersect ancestor::chapter//caution)

In 1.0, you can simulate the intersect operator using the equivalence

A intersect B ==> A[count(.|B) = count(B)]

But you might be better off using

<xsl:number count="caution" level="any" from="chapter"/>
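
When spelling out the 1.0 equivalence, remember that the context changes inside the predicate, so ancestor::chapter there would be evaluated relative to each preceding caution. Binding B to a variable first avoids that. A minimal sketch for the sample document above; the variable name in-chapter is made up.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="caution">
    <!-- B: all cautions in the current caution's chapter -->
    <xsl:variable name="in-chapter" select="ancestor::chapter//caution"/>
    <!-- A[count(.|B) = count(B)] with A = preceding::caution -->
    <xsl:value-of select="count(preceding::caution
        [count(. | $in-chapter) = count($in-chapter)])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress whitespace text from the input -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

For the sample above this gives 0, 1, 2 for the first chapter and 0, 1, 2 again for the second.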

28.

Document available?

David Carlisle

Use doc-available() rather than fn:doc-available(). Earlier drafts of xslt2 used different namespaces, but all drafts define the default function namespace to be the correct namespace for that draft, so the unprefixed name will still work if your implementation implements an old draft of xslt2.

IE (and other browsers such as Mozilla and Opera) do not support XSLT2 (and are unlikely to support it for some years, one would assume).

To check if a file exists in xslt1 you can use

test="document('foo.xml')"

which will be false if the processor returns an empty node set for missing files (processors are also allowed to raise an error). Alternatively you can escape to an extension language (in IE but not in Mozilla): msxml, for example, allows you to define functions in javascript, so assuming that you are in a situation where browser security allows access to the filesystem at all, you can test in javascript (or any other ms scripting language).

29.

for-each context

Jay Bryant


> trying to figure out what the error with this fragment (XSLT 2) is:
>
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="2.0">
>
>   <xsl:template match="src">
>     <xsl:for-each select="tokenize( 'a c', '[ ]')">
>       <xsl:apply-templates select="/doc/*[current() eq string(@id)]" />
>     </xsl:for-each>
>   </xsl:template>
>
> </xsl:stylesheet>
>
>
> This gives an error, "Cannot select a node here: the context item is an
> atomic value".
>
> I see that the xsl:for-each iterates over a sequence of atomic values
> (xs:string). So I assume that current() results in an xs:string, which
> should be comparable to the string(@id) expression. The select="..."
> expression does not depend on a context node, as it is an absolute path
> from the document root. xsl:apply-templates results in a sequence and
> therefore should be no problem in the xsl:for-each body. It's certainly
> trivial - I just don't see it right now... :-/
>
> An input for the above would e.g. be:
>
>
> <doc>
>     <src />
>     <elem id="a">A</elem>
>     <elem id="b">B</elem>
>     <elem id="c">C</elem>
>     <elem id="d">D</elem>
> </doc>
>
>
> The desired output would (probably...) be:
>
> AC

By using tokenize inside a for-each, you've set the context to a string that has no relationship to your input document.

To fix it, use a variable that contains the root node of the input document (I call those "anchor variables"), thus:

 <xsl:template match="src">
   <xsl:variable name="root" select="/"/>
   <xsl:for-each select="tokenize( 'a c', '[ ]')">
     <xsl:apply-templates select="$root/doc/*[current() eq string(@id)]" />
   </xsl:for-each>
 </xsl:template>

That way, your apply-templates instruction finds the context you need.

30.

Function with variable number of arguments

David Carlisle et al



>Is there a way to encode a function 
>that takes a variable number of arguments like the concat function?

Andrew offers, Why not just have the function take one argument that is a sequence?

DC follows up with

Also, without using any extension at all, you can make a function that takes a sequence (like string-join) rather than an arbitrary number of arguments (like concat); the only practical difference to the end user is that you have to double the brackets.

Define local:function to take xs:integer* as argument, and you can do

local:function((1,2))

local:function((1,2,3,4,5,6))

Dimitre expands this with

Another approach, limited but pragmatic, is the one taken by FXSL to use many overloads for the function, one for a given allowed number of arguments.

Thus, if I would expect no more than, say, 10 arguments to be specified, I would define the following overloads:

x:fnName(arg1),
x:fnName(arg1,arg2),
x:fnName(arg1,arg2,arg3),
x:fnName(arg1,arg2,arg3,arg4),
x:fnName(arg1,arg2,arg3,arg4,arg5),
x:fnName(arg1,arg2,arg3,arg4,arg5,arg6),
x:fnName(arg1,arg2,arg3,arg4,arg5,arg6,arg7),
x:fnName(arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8),
x:fnName(arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9),
x:fnName(arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9,arg10)

Certainly, the common code implementing these overloads can typically be put into a single auxiliary function, so that redundancy would be avoided.

Also, the above overloads can be generated programmatically with another transformation :)
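
Both approaches fit in one short sketch: a function taking a single sequence argument, and a couple of fixed-arity overloads that delegate to it. The names local:total and local:add and the namespace URI are invented for the illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <xsl:output method="text"/>

  <!-- One argument that is a sequence of any length -->
  <xsl:function name="local:total" as="xs:integer">
    <xsl:param name="values" as="xs:integer*"/>
    <xsl:sequence select="sum($values)"/>
  </xsl:function>

  <!-- Overloads by arity, each delegating to the sequence version -->
  <xsl:function name="local:add" as="xs:integer">
    <xsl:param name="a" as="xs:integer"/>
    <xsl:param name="b" as="xs:integer"/>
    <xsl:sequence select="local:total(($a, $b))"/>
  </xsl:function>

  <xsl:function name="local:add" as="xs:integer">
    <xsl:param name="a" as="xs:integer"/>
    <xsl:param name="b" as="xs:integer"/>
    <xsl:param name="c" as="xs:integer"/>
    <xsl:sequence select="local:total(($a, $b, $c))"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Note the doubled brackets on the sequence call; outputs 21 3 6 -->
    <xsl:value-of select="local:total((1,2,3,4,5,6)),
                          local:add(1,2),
                          local:add(1,2,3)"/>
  </xsl:template>

</xsl:stylesheet>
```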

31.

Check for duplicate ID values across files

Mike Kay



I have a project where I take about 800 files and transform them into one .fo
document (create a pdf). The files are authored independently of each other
(stand alone) and are validated against a schema (independently). If there
is a duplicate @id attribute, the validator will tell me. However, the @ids
need to be unique across all files processed by the collection function. If
they are not, my resulting .fo will have duplicate @ids, which cause FOP to
halt.

I am looking for a way (query or xslt 2.0) to check for duplicate @id
values across all files with the collection function. This would be a
pre-production check.

<xsl:for-each-group select="collection(...)//@id" group-by=".">
  <xsl:if test="count(current-group()) ne 1">
    <xsl:message>Id value <xsl:value-of select="current-grouping-key()"/> is duplicated in files
      <xsl:value-of select="current-group()/document-uri(/)" separator=" and "/></xsl:message>
  </xsl:if>
</xsl:for-each-group>

32.

Intersection 2.0

David Carlisle


> I wonder if you have references to practical examples of making an
>  intersect set operation with xpath ver 1.0

if $a and $b are two node sets then

$a[count(.|$b)=count($b)]

is the intersection of $a and $b, that is

$a intersect $b

in xpath2.