xslt strings containing special characters

Strings

1. Quoting of special characters within XPath string
2. URL decomposition
3. Replacing a string by another which is found by reference in the same XML document
4. Apostrophe problems
5. Quotes in a string
6. splitting lines at n characters
7. String split into elements
8. Replacing newline with break.
9. CRLF to BR
10. Trim or pad a string to a fixed length
11. Computing string-length of nodesets
12. Substituting substrings in an element's text for HTML output
13. Substring-after, last occurrence
14. lastIndexOf('char')
15. count the number of specific characters in a string.
16. Reverse a string
17. ASCII to Hex conversion
18. Testing for an empty string.
19. Last occurrence of a string
20. String to numbers
21. String padding
22. Single quote in select expression
23. Count the number of tokens in a string
24. How to do multiple string replacements
25. How to count words
26. Split a line into fixed lengths
27. Access to individual characters in a string
28. How to concatenate the contents of similar elements
29. Strip non Alpha-numeric characters
30. Word highlighting

1.

Quoting of special characters within XPath string

David Carlisle

>  How do I have to escape
>  quotes to get a well-formed XPath string?

First just consider the xpath syntax.

You can use " or ' to delimit a string literal, so if you only want one then you can delimit with the other.

  "'"  or '"'

If you want both then you cannot do it directly in a single XPath 1.0 string literal, but you can construct the string '" using

translate('a"','a',"'")
or
concat("'",'"')

or if you drop out of xpath, to xslt

<xsl:variable name="x">'"</xsl:variable>

then use $x as this result tree fragment will coerce to a string.

Then you need to get one of those expressions into an XML attribute. If you use " to delimit the attribute value then you need to quote ", so you end up with

select="translate('a&quot;','a',&quot;'&quot;)"

which looks a bit odd but the XML parser eats that and gives the xpath system translate('a"','a',"'") which takes the string a" and replaces the a by '.
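
The quoting above can be tried in a small stylesheet. This is only a sketch: the text output method and the match on the document root are assumptions made for the demonstration and are not part of the original answer.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- After XML parsing, the XPath engine sees: concat("'", '"') -->
    <xsl:value-of select="concat(&quot;'&quot;, '&quot;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML document, this should write the two characters '" to the output.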

2.

URL decomposition

Juliane Harbarth

Can you think of a way to decompose a URL? I want to take something like: http://www.agilic.com/purchase.htm and end up with purchase.htm. ...

You can do this by using a recursive named template, e.g.

 <xsl:template name="filename">
   <xsl:param name="x"/>
   <xsl:choose>
     <xsl:when test="contains($x,'/')">
       <xsl:call-template name="filename">
         <xsl:with-param name="x" select="substring-after($x,'/')"/>
       </xsl:call-template>
     </xsl:when>
     <xsl:otherwise>
       <xsl:value-of select="$x"/>
     </xsl:otherwise>
   </xsl:choose>
 </xsl:template>

and call it somehow like :

 <xsl:call-template name="filename">
  <xsl:with-param name="x" select="."/>
 </xsl:call-template>

Jens Lautenbacher completes the picture with

This template gets a filename and gives back the directory part of it. Giving back the filename part is achieved along the same lines.

You call it somehow like

<xsl:call-template name="strip">
  <xsl:with-param name="relfile">foo/bar/baz.xml</xsl:with-param>
</xsl:call-template>

  <xsl:template name="strip">
    <xsl:param name="reldir"/>
    <xsl:param name="relfile"/>
    <xsl:choose>
      <xsl:when test="contains($relfile, '/')">
      <xsl:call-template name="strip">
        <xsl:with-param name="relfile">
          <xsl:value-of select="substring-after($relfile,'/')"/>
        </xsl:with-param>
        <xsl:with-param name="reldir">
          <xsl:value-of 
    select="concat($reldir, substring-before($relfile,'/'), '/')"/>
        </xsl:with-param>
      </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$reldir"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

3.

Replacing a string by another which is found by reference in the same XML document

Jeni Tennison

Q expansion

>My source XML contains something like:
> <Root>
> <Class
>      Name="MyClass"
>      Uuid="80A2B3BD-0000-520C-383BE4980006BE67"
>      TargetRef="80A2B3BD-0000-520C-383BE4980006A75A">
></Class>
>
><AnotherObject
>Name="MyObject"
>Uuid="80A2B3BD-0000-520C-383BE4980006B687">
></AnotherObject>
></Root>
>
>I have an XSL style sheet to display it on IE5 with appropriate style. I
>would like to get as output:
>
>Class:
>Name="MyClass"
>TargetRef="MyObject"
>
>Where the TargetRef string has been replaced by the value of the string
>with the same Uuid.


 

This seems to me to be a good instance to use xsl:key to identify the nodes that are uniquely identified through the 'Uuid' attribute. First, set up the key:

* name  - a name for the key, anything you like
* match - an XPath matching the nodes that you want to identify
* use   - an XPath (relative to the 'match' node) that identifies the node

In your case:

<xsl:key name="objects" match="*[@Uuid]" use="@Uuid" />

Note that I haven't named the (element) nodes that are identified by the key because it isn't clear to me whether your 'Class' and 'AnotherObject' elements are indicative of a whole range of possible element names in your input, but we can guarantee at least that they will have a 'Uuid' attribute if they're worth identifying!

Then you can access a particular node through its 'Uuid' attribute using the key() function, so try:

<xsl:template match ="Class">
  Class:
  Name="<xsl:value-of select="@Name" />"
  TargetRef="<xsl:value-of select="key('objects', @TargetRef)/@Name" />"
</xsl:template>
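
Put together, the key and the template might sit in a stylesheet like the sketch below. The text output method, the explicit newlines and the empty text() template are assumptions added for the demonstration. Note also that in the sample as posted the TargetRef value does not actually equal the Uuid of AnotherObject, so the lookup only yields MyObject when the two values really match.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every element that carries a Uuid attribute by that value -->
  <xsl:key name="objects" match="*[@Uuid]" use="@Uuid"/>

  <xsl:template match="Class">
    <xsl:text>Class:&#xA;Name="</xsl:text>
    <xsl:value-of select="@Name"/>
    <xsl:text>"&#xA;TargetRef="</xsl:text>
    <!-- Look up the element whose Uuid equals this TargetRef, then take its Name -->
    <xsl:value-of select="key('objects', @TargetRef)/@Name"/>
    <xsl:text>"&#xA;</xsl:text>
  </xsl:template>

  <!-- Suppress stray whitespace text from the source (demonstration only) -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

If no element has a matching Uuid, key() returns an empty node-set and the TargetRef value is written as an empty string.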

4.

Apostrophe problems

David Carlisle

> In each case it acts like there are missing end parenthesis, no doubt 
> because the odd number of single quotes act like one quote.

No (unless you are using XPath 2, which I suspect isn't the case). All three of the things you tried are equivalent to ''' (entity and character references are expanded before the XPath is parsed), so each is an empty string followed by a trailing ', which is a syntax error.

The FAQ has entries on this but basically, start from the Xpath you need and worry about xml quoting later.

You want the string literal "'" (you have to use " to delimit a string literal involving a ' in Xpath 1) so you want the XPath

string-length(substring-before($myValue,"'"))>0

(or, almost equivalently, contains($myValue,"'"), which differs only when the apostrophe is the first character of the string)

As far as XML is concerned, the above may as well be jhgkjg"nbfjh', just some random string involving " and '. To get that into an attribute you always have two choices: delimit with " and quote any ", or delimit with ' and quote any '.

so

test="string-length(substring-before($myValue,&quot;'&quot;))>0"

or

test='string-length(substring-before($myValue,"&apos;"))>0'

If you are using Xpath2 then a doubled ' (or ") counts as a single ' in a string literal so you could then do

test="string-length(substring-before($myValue,''''))>0"

which avoids using xml quotes at all, although I'm not sure it's any more readable.

Incidentally, in the file it is properly represented with an internal entity. The XML looks like <LogicalFieldName>Broker&apos;s ZIP</LogicalFieldName>

You could of course use ' rather than &apos; in the source, although that wouldn't change the XSLT required.

5.

Quotes in a string

Mike Brown

How do you put ' or " into a string?

<xsl:variable name="apos">'</xsl:variable>
<xsl:variable name="quot">"</xsl:variable>
<xsl:variable name="foo" select="concat($quot,'Hello, world!',$quot)"/>

How do you test for the presence of ' or " in a string?

<!-- same $apos and $quot assignments, then... -->
<xsl:if test="contains($foo,$apos) or contains($foo,$quot)">
  ...
</xsl:if>

Jeni Tennison rounds out the discussion..

Good question! XML defines entities for ' and " (&apos; and &quot;, somewhat unsurprisingly). In certain situations, it is possible to use these. Your first example, for instance, could also be given as:

<xsl:variable name="foo" select="'&quot;Hello, world!&quot;'"/>

When this is parsed by the XML parser, the value of the 'select' attribute is set to (no extra quotes included): '"Hello, world!"'

When the XSLT Processor sees this, it recognises the external quotes as designating a string value, and so sets the variable $foo to the string (no extra quotes included): "Hello, world!"

The thing to remember is that you are escaping the " and ' *for the XML parser* and not for the XSLT processor. So your second example:

>How do you test for the presence of ' or " in a string?
>
><!-- same $apos and $quot assignments, then... -->
><xsl:if test="contains($foo,$apos) or contains($foo,$quot)">
>  ...
></xsl:if>

can be escaped as:

<xsl:if test="contains($foo, &quot;'&quot;) or contains($foo, '&quot;')">
...
</xsl:if>

As there are no unescaped "s within the attribute value, the XML parser can parse this and emerges with the value of the 'test' attribute as:

  contains($foo, "'") or contains($foo, '"')

The XSLT processor can again recognise that "'" designates a string with the value of the single character ' and that '"' designates a string with the value of the single character ".

Similarly, if you wanted single quotes rather than double quotes around your Hello, World!, then you should do:

<xsl:variable name="foo" select="&quot;'Hello, world!'&quot;" />

(-> "'Hello, world!'" att. value -> 'Hello, world!' string)

[Or, alternatively:

<xsl:variable name="foo" select='"&apos;Hello, world!&apos;"' />

(-> "'Hello, world!'" att. value -> 'Hello, world!' string)]

So, for fairly simple situations like this, it is enough to use the normal XML escaping to get the XSLT processor to see something that it can understand. However, difficulties arise when the quote nesting goes deeper than this. For example, if you wanted to see whether a string contains the string (no extra quotes): "You're here"

There is no way to wrap quotes around that string, and no way that I know of within XSLT/XPath to escape internal quotes like this (the XSLT processor is not an XML parser - it won't detect and recognise &quot;/&apos; itself). In these cases, your method, using variables, is the only solution.

(BTW, I'd personally declare the variables as:

<xsl:variable name="apos" select='"&apos;"' />
<xsl:variable name="quot" select="'&quot;'" />

so that they are set as strings rather than result tree fragments.)
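
For the harder case mentioned above, a string containing both kinds of quote, the two variables can be combined with concat(). The sketch below is an illustration only: the variable name phrase, the text output and the test against the string value of the whole document are assumptions, not part of the original thread.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:variable name="apos" select='"&apos;"'/>
  <xsl:variable name="quot" select="'&quot;'"/>

  <!-- Builds the string "You're here", including the surrounding double quotes -->
  <xsl:variable name="phrase"
                select="concat($quot, 'You', $apos, 're here', $quot)"/>

  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="contains(string(.), $phrase)">found</xsl:when>
      <xsl:otherwise>not found</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Keeping each quote character in its own variable means the XPath expression itself never needs a literal that contains both.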

6.

splitting lines at n characters

Paul Tchistopolskii

To me that was not rudimentary ( taking into account that there could be exotic situations when some long word is really > 60, so space-scanning could fail ). Anyway.

The snippet below takes whitespace into account so that a word is not split in the middle: it looks back to the nearest preceding space to prevent that. If it finds none, it just splits at the given width. The tune-width template does this.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

 <xsl:template match="/doc">
 <HTML><BODY><PRE>
    <xsl:call-template name="format">
    <xsl:with-param select="normalize-space(para)" name="txt" /> 
     <xsl:with-param name="width">30</xsl:with-param> 
     </xsl:call-template>
  </PRE></BODY></HTML>
  </xsl:template>

  <xsl:template name="format">
   <xsl:param name="txt" /> 
   <xsl:param name="width" /> 

  <xsl:if test="$txt">
   <xsl:variable name="real-width">
    <xsl:call-template name="tune-width">
     <xsl:with-param select="$txt" name="txt" /> 
       <xsl:with-param select="$width" name="width" /> 
       <xsl:with-param select="$width" name="def" /> 
  </xsl:call-template>
   </xsl:variable>

   <xsl:value-of select="substring($txt, 1, $real-width)" /> 

<xsl:text>
</xsl:text> 

   <xsl:call-template name="format">
    <xsl:with-param select="substring($txt,$real-width + 1)" name="txt" /> 
    <xsl:with-param select="$width" name="width" /> 
   </xsl:call-template>

  </xsl:if>
  </xsl:template>


  <xsl:template name="tune-width">
  <xsl:param name="txt" /> 
  <xsl:param name="width" /> 
  <xsl:param name="def" /> 

  <xsl:choose>
  <xsl:when test="$width = 0">
  <xsl:value-of select="$def" /> 
  </xsl:when>
  <xsl:otherwise>
  <xsl:choose>
  <xsl:when test="substring($txt, $width, 1 ) = ' '">
  <xsl:value-of select="$width" /> 
  </xsl:when>
  <xsl:otherwise>
  <xsl:call-template name="tune-width">
  <xsl:with-param select="$txt" name="txt" /> 
  <xsl:with-param select="$width - 1" name="width" /> 
  <xsl:with-param select="$def" name="def" /> 
  </xsl:call-template>
  </xsl:otherwise>
  </xsl:choose>
  </xsl:otherwise>
  </xsl:choose>
  </xsl:template>

  </xsl:stylesheet>

Input:

<doc>

<para>
 123456 2345 343434 545454 43434 343 
 12345 343434 545454 43434 343 
 32345645 343434 545454 43434 343 
 3422222225 343434 545454 43434 343 
 llllllllllllllllllllllooooooooooooooonnnnnnnnnnnggggggggg
 345 343434 545454 43434 343 
</para>

</doc>

Output:

<HTML>
<BODY>
<PRE>123456 2345 343434 545454 
43434 343 12345 343434 545454 
43434 343 32345645 343434 
545454 43434 343 3422222225 
343434 545454 43434 343 
lllllllllllllllllllllloooooooo
ooooooonnnnnnnnnnnggggggggg 
345 343434 545454 43434 
343
</PRE>
</BODY>
</HTML>

7.

String split into elements

Jarno Elovirta

I had need to split out an element content into cross references

<doc>
  <elem>5,6,7</elem>
</doc>

to produce <a href="#id5">5</a> etc.


<xsl:template match="doc/elem">
<body>
  <xsl:call-template name="links">
    <xsl:with-param name="str" select="."/>
  </xsl:call-template>
  </body>
</xsl:template>

<xsl:template name="links">
  <xsl:param name="str"/>
  <xsl:choose>
    <xsl:when test="contains($str,',')">
      <a href="#id{substring-before($str,',')}"><xsl:value-of
select="substring-before($str,',')"/></a>
      <xsl:text>&#xA0;</xsl:text>
      <xsl:call-template name="links">
        <xsl:with-param name="str" select="substring-after($str,',')"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <a href="#id{$str}"><xsl:value-of select="$str"/></a>
      <xsl:text>&#xA0;</xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

8.

Replacing newline with break.

Norman Walsh

These templates insert "<br/>"s between the lines of an address. Adapt at will :-)

Test input

   <address>
   some text on 
   multiple lines
   which is required to be split up
   into verbatim lines, for html output
  </address>

Stylesheet.

   
   
   
   
   <xsl:template match="address//text()">
     <xsl:call-template name="make-verbatim">
       <xsl:with-param name="text" select="."/>
     </xsl:call-template>
   </xsl:template>
   
   <xsl:template name="make-verbatim">
     <xsl:param name="text" select="''"/>
   
     <xsl:variable name="starts-with-space"
                   select="substring($text, 1, 1) = ' '"/>
   
     <xsl:variable name="starts-with-nl"
                   select="substring($text, 1, 1) = '&#xA;'"/>
   
     <xsl:variable name="before-space">
       <xsl:if test="contains($text, ' ')">
         <xsl:value-of select="substring-before($text, ' ')"/>
       </xsl:if>
     </xsl:variable>
   
     <xsl:variable name="before-nl">
       <xsl:if test="contains($text, '&#xA;')">
         <xsl:value-of 
	  select="substring-before($text, '&#xA;')"/>
       </xsl:if>
     </xsl:variable>
   
     <xsl:choose>
       <xsl:when test="$starts-with-space">
         <xsl:text>&#160;</xsl:text>
         <xsl:call-template name="make-verbatim">
           <xsl:with-param name="text" 
	  select="substring($text,2)"/>
         </xsl:call-template>
       </xsl:when>
   
       <xsl:when test="$starts-with-nl">
         <br/><xsl:text>&#xA;</xsl:text>
         <xsl:call-template name="make-verbatim">
           <xsl:with-param name="text" 
	  select="substring($text,2)"/>
         </xsl:call-template>
       </xsl:when>
   
       <!-- if the string before a space 
	  is shorter than the string before
            a newline, fix the space...-->
       <xsl:when test="$before-space != ''
                       and ((string-length($before-space)
                             &lt; string-length($before-nl))
                             or $before-nl = '')">
         <xsl:value-of select="$before-space"/>
         <xsl:text>&#160;</xsl:text>
         <xsl:call-template name="make-verbatim">
           <xsl:with-param name="text"
	  select="substring-after($text, ' ')"/>
         </xsl:call-template>
       </xsl:when>
   
       <!-- if the string before a newline 
	  is shorter than the string before
            a space, fix the newline...-->
       <xsl:when test="$before-nl != ''
                       and ((string-length($before-nl)
                             &lt; string-length($before-space))
                             or $before-space = '')">
         <xsl:value-of select="$before-nl"/>
         <br/><xsl:text>&#xA;</xsl:text>
         <xsl:call-template name="make-verbatim">
           <xsl:with-param name="text" 
	  select="substring-after($text, '&#xA;')"/>
         </xsl:call-template>
       </xsl:when>
   
       <!-- the string before the newline and the string before the
            space are the same; which means they must both be empty -->
       <xsl:otherwise>
         <xsl:value-of select="$text"/>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>

9.

CRLF to BR

Jarno Elovirta

test.xml

<?xml version="1.0"?>
<Text>
    This is where the actual article starts. The article contains
    several paragraphs.

    Paragaraphs are separated by Carriage returns or Linefeeds, not by Tags
</Text>

test.xsl

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" />

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="text()">
   <xsl:call-template name="break" />
</xsl:template>

<xsl:template name="break">
 <xsl:param name="text" select="."/>
 <xsl:choose>
   <xsl:when test="contains($text, '&#xA;')">
     <xsl:value-of select="substring-before($text, '&#xA;')"/>
     <br/>
     <xsl:call-template name="break">
       <xsl:with-param name="text" select="substring-after($text,'&#xA;')"/>
     </xsl:call-template>
   </xsl:when>
   <xsl:otherwise>
           <xsl:value-of select="$text"/>
   </xsl:otherwise>
 </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Output.

<?xml version="1.0" encoding="utf-8"?><Text><br/>    
This is where the actual article starts. The article contains<br/>    
several paragraphs.<br/>
<br/>    
Paragaraphs are separated by Carriage returns or Linefeeds, not by Tags<br/></Text>

10.

Trim or pad a string to a fixed length

Wendell Piez

>Is there a way to force string length? For example, my 
>output contains a string that must contain 10 characters but the 
>XML source is not guaranteed to supply that number of characters. 

In order to trim a long string to 10, or pad a short one with spaces:

substring(concat(string(.), '          '), 1, 10)
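
The same expression can be wrapped in a named template so that the width becomes a parameter. This is a sketch under stated assumptions: the template name fixed-width and its parameters str, width and pad are hypothetical, and the padding string must be at least as long as the largest width you intend to ask for.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template name="fixed-width">
    <xsl:param name="str"/>
    <xsl:param name="width" select="10"/>
    <!-- A run of spaces; it must contain at least $width characters -->
    <xsl:param name="pad" select="'                    '"/>
    <!-- Append the padding, then keep only the first $width characters -->
    <xsl:value-of select="substring(concat($str, $pad), 1, $width)"/>
  </xsl:template>

  <xsl:template match="/">
    <xsl:text>[</xsl:text>
    <xsl:call-template name="fixed-width">
      <xsl:with-param name="str" select="'abcdefghijklmno'"/>
    </xsl:call-template>
    <xsl:text>]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the default width of 10 the fifteen-character input is trimmed, so the output is [abcdefghij]. A shorter input such as 'abc' would come out followed by seven spaces.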

11.

Computing string-length of nodesets

David Carlisle


<xsl:variable name="x"><xsl:copy-of select="[nodeset]"/></xsl:variable>
<xsl:value-of select="string-length(string($x))"/>
	

probably does what you want.

> So is there some way to construct a equivalent of sum(), but one that works
> on string values of a nodeset?

In simple cases you can get by as above, but usually you have to use a node-set extension function for this sort of thing (until XSLT 1.1).

For instance, if you wanted to apply normalize-space to each of the nodes in the node set before computing your average, you'd do something like

<xsl:variable name="x">
  <xsl:for-each  select="[nodeset]" >
   <x><xsl:value-of select="string-length(normalize-space(.))"/></x>
  </xsl:for-each>
</xsl:variable>
<xsl:value-of select="sum(xt:node-set($x)/x)"/>
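
The snippet above uses the xt: extension prefix without showing its declaration. As a sketch of the same idea in a complete stylesheet, the version below swaps in the EXSLT node-set function instead; that substitution, the element name item and the text output are assumptions, and the processor must support EXSLT common for it to run.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Build a result tree fragment holding one length per node -->
    <xsl:variable name="lengths">
      <xsl:for-each select="//item">
        <len><xsl:value-of select="string-length(normalize-space(.))"/></len>
      </xsl:for-each>
    </xsl:variable>
    <!-- Convert the fragment to a node-set so that sum() can walk it -->
    <xsl:value-of select="sum(exsl:node-set($lengths)/len)"/>
  </xsl:template>
</xsl:stylesheet>
```

Dividing the sum by count(//item) would give the average length, provided there is at least one item.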

12.

Substituting substrings in an element's text for HTML output

Steve Muench



| I have an element that looks like this:
| 
| <xyz>
|   this is the first line
|   this is the second line
|   this is the third line
| </xyz>
| 
| I'd like to transform it so that it could be outputted to html with
| the same line breaks, i.e. change all \n to <br />
	

Here are a couple of templates I use for this purpose. The "br-replace" template replaces newlines with <br/> tags. The "sp-replace" template replaces *pairs* of leading spaces with *pairs* of leading non-breaking spaces. By combining the two in series, you can achieve the effect of keeping code listings (e.g. XML or Java source code examples in a book) properly formatted without using the <pre> tag, which tends to mess up the formatting of table cells (often pushing them wider than you'd like).

To use the templates, add them (or import them) into the stylesheet you're building, then at the right moment, just do:

    <!-- Call the "br-replace" template -->
    <xsl:call-template name="br-replace">

      <!-- 
       | Passing the result of calling the sp-replace template
       | as the value of the parameter named "text"
       +-->
      <xsl:with-param name="text">

        <!-- Call the "sp-replace" template -->
        <xsl:call-template name="sp-replace">

          <!-- Passing the value of the current node -->
          <xsl:with-param name="text" select="."/>
        </xsl:call-template>
      </xsl:with-param>
    </xsl:call-template>

Here are the templates...

  <!-- Replace new lines with html <br> tags -->
  <xsl:template name="br-replace">
    <xsl:param name="text"/>
    <xsl:variable name="cr" select="'&#xa;'"/>
    <xsl:choose>
      <!-- If the value of the $text parameter contains a newline
           (despite its name, $cr holds a line feed) -->
      <xsl:when test="contains($text,$cr)">
        <!-- Return the substring of $text before the newline -->
        <xsl:value-of select="substring-before($text,$cr)"/>
        <!-- And construct a <br/> element -->
        <br/>
        <!--
         | Then invoke this same br-replace template again, passing the
         | substring *after* the newline as the new "$text" to
         | consider for replacement
         +-->
        <xsl:call-template name="br-replace">
          <xsl:with-param name="text" select="substring-after($text,$cr)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
   </xsl:choose>
  </xsl:template>

  <!-- Replace two consecutive spaces w/ 2 non-breaking spaces -->
  <xsl:template name="sp-replace">
    <xsl:param name="text"/>
    <!-- NOTE: There are two spaces   ** here below -->
    <xsl:variable name="sp"><xsl:text>  </xsl:text></xsl:variable>
    <xsl:choose>
      <xsl:when test="contains($text,$sp)">
        <xsl:value-of select="substring-before($text,$sp)"/>
        <xsl:text>&#160;&#160;</xsl:text>
        <xsl:call-template name="sp-replace">
          <xsl:with-param name="text" select="substring-after($text,$sp)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

13.

Substring-after, last occurrence

Paul Brown

> I read about the substring-after() function, which can extract a substring
> after the FIRST occurrence of a specified string.
>
> My problem: Have you got an idea how I can get the substring-after of a
> string after the LAST occurrence of a specified substring?
>
> For example: I want the substring after the last ".".
>
> String: "A.B.C.0.1.1.hgk"
>
> The solution should return: "hgk"
	

The answer is recursion:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template name="substring-after-last">
<xsl:param name="input" />
<xsl:param name="marker" />

<xsl:choose>
  <xsl:when test="contains($input,$marker)">
    <xsl:call-template name="substring-after-last">
      <xsl:with-param name="input" 
          select="substring-after($input,$marker)" />
      <xsl:with-param name="marker" select="$marker" />
    </xsl:call-template>
  </xsl:when>
  <xsl:otherwise>
   <xsl:value-of select="$input" />
  </xsl:otherwise>
 </xsl:choose>

</xsl:template>

<xsl:template match="FOO">
 <xsl:call-template 
  name="substring-after-last">
 <xsl:with-param name="input" select="'1.2.3.4.5.6.7'" />
 <xsl:with-param name="marker" select="'.'" />
 </xsl:call-template>
</xsl:template>

</xsl:stylesheet>

14.

lastIndexOf('char')

Jeni Tennison


> i'm looking for a xslt method to identify the last iteration of a
> char into a string. For example, to extract automatically the name
> of the html page into the url.
>
> string : "h ttp://www.thesite.com/directory1/dir2/dir3../pageindex.htm"
>
> there are the functions substring-before() and substring-after(),
> but they work on the first occurrence of the marker-string. Is there
> an XSLT function which gives the last occurrence of a marker-string
> (like lastIndexOf('/',"string")) in a string?
	

No, there isn't.

You can achieve what you want through recursion. Walk through the string, taking bits off the front of it until you get to a string which has no '/' in it whatsoever.

<!-- define a lastIndexOf named template -->
<xsl:template name="lastIndexOf">
   <!-- declare that it takes two parameters 
	  - the string and the char -->
   <xsl:param name="string" />
   <xsl:param name="char" />
   <xsl:choose>
      <!-- if the string contains the character... -->
      <xsl:when test="contains($string, $char)">
         <!-- call the template recursively... -->
         <xsl:call-template name="lastIndexOf">
            <!-- with the string being the string after the character
                 -->
            <xsl:with-param name="string"
                            select="substring-after($string, $char)" />
            <!-- and the character being the same as before -->
            <xsl:with-param name="char" select="$char" />
         </xsl:call-template>
      </xsl:when>
      <!-- otherwise, return the value of the string -->
      <xsl:otherwise><xsl:value-of select="$string" />
	  </xsl:otherwise>
   </xsl:choose>
</xsl:template>

To get the filename of a URL held in the URL child of the current node, you can call this template like:

  <xsl:call-template name="lastIndexOf">
     <xsl:with-param name="string" select="URL" />
     <xsl:with-param name="char" select="'/'" />
  </xsl:call-template>

It's pretty verbose, but I'm afraid that's the only way to do it in XSLT at the moment.
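
That remark applies to XSLT 1.0. As a hedged aside, if your processor supports XSLT 2.0 (an assumption, and not what the template above targets), the XPath 2.0 tokenize() function gives the same result in one expression. The URL in this sketch is an illustrative value.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Split on '/' and keep the last piece -->
    <xsl:value-of
      select="tokenize('http://www.thesite.com/directory1/dir2/pageindex.htm', '/')[last()]"/>
  </xsl:template>
</xsl:stylesheet>
```

This writes pageindex.htm. The second argument of tokenize() is a regular expression, so a marker such as '.' would have to be written as '\.'.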

15.

count the number of specific characters in a string.

Michael Kay

> If I can somehow count the number of periods "." in
> the string "id", then I can determine what level the
> <tab> element is at...anyone have any ideas how to
> count the number of occurrences in a string with XSL?

Try:

string-length($x) - string-length(translate($x, '.', ''))
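
A worked instance may make the arithmetic clearer. The sample value '1.2.3', the text output and the match on the document root are assumptions chosen for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="x" select="'1.2.3'"/>
    <!-- string-length('1.2.3') is 5; translate() deletes the dots,
         leaving '123' with length 3; the difference is the dot count -->
    <xsl:value-of
      select="string-length($x) - string-length(translate($x, '.', ''))"/>
  </xsl:template>
</xsl:stylesheet>
```

This writes 2. The trick works for single characters only, because translate() maps characters one at a time and cannot delete a multi-character substring.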

16.

Reverse a string

Jeni Tennison

 <xsl:template name="reverse3">
    <xsl:param name="theString" />
    <xsl:param name="reversedString" />
    <xsl:choose>
       <xsl:when test="$theString">
          <xsl:call-template name="reverse3">
             <xsl:with-param name="theString"
                             select="substring($theString, 2)" />
             <xsl:with-param name="reversedString"
select="concat(substring($theString, 1, 1),
$reversedString)" />
          </xsl:call-template>
       </xsl:when>
       <xsl:otherwise>
          <xsl:value-of select="$reversedString" />
       </xsl:otherwise>
    </xsl:choose>
 </xsl:template>

Dimitre Novatchev offers

<xsl:template name="reverse">
 <xsl:param name="theString"/>
 <xsl:variable name="thisLength" select="string-length($theString)"/>
 <xsl:choose>
  <!-- "less than or equal" so that an empty string also ends the recursion -->
  <xsl:when test="$thisLength &lt;= 1">
   <xsl:value-of select="$theString"/>
  </xsl:when>
  <xsl:otherwise>
 <xsl:variable name="restReverse">
   <xsl:call-template name="reverse">
     <xsl:with-param name="theString"
select="substring($theString,  1, $thisLength -1)"/>
   </xsl:call-template>
 </xsl:variable>
 <xsl:value-of 
select="concat(substring($theString,$thisLength, 1) ,$restReverse)"/>
   </xsl:otherwise>
   </xsl:choose>
  </xsl:template>
  
  
  <xsl:template name="reverse2">
    <xsl:param name="theString"/>
    <xsl:variable name="thisLength" select="string-length($theString)"/>
   <xsl:choose>
   <!-- "less than or equal" so that an empty string also ends the recursion -->
   <xsl:when test="$thisLength &lt;= 1">
       <xsl:value-of select="$theString"/>
   </xsl:when>
   <xsl:otherwise>
       <xsl:variable name="length1" 
  select="floor($thisLength div 2)"/>
       <xsl:variable name="reverse1">
   <xsl:call-template name="reverse2">
       <xsl:with-param name="theString"
   select="substring($theString, 
  1, $length1)"/>
   </xsl:call-template>
       </xsl:variable>
       <xsl:variable name="reverse2">
   <xsl:call-template name="reverse2">
       <xsl:with-param name="theString"
   select="substring($theString, 
                                                    $length1+1, 
                                     $thisLength - $length1
                                                    )"/>
   </xsl:call-template>
       </xsl:variable>
       <xsl:value-of select="concat($reverse2, 
  $reverse1)"/>
  
   </xsl:otherwise>
   </xsl:choose>
  </xsl:template>
  

I compared times from the three templates (reverse is the simple one, reverse2 the least recursive and reverse3 the tail recursive) on an 800MHz Pentium with 128Mb RAM, running each test 10 times, averaging the times reported by MSXML run from the command line, and rounding to the nearest millisecond. Here are the results (times in milliseconds):

  Length    Simple    Least Recursive    Tail Recursive
  ------------------------------------------------------
     100        22                 36                 5
     200        41                 61                11
     400        95                124                24
     800       241                249                77
    1600       650                485               220
    3200      3465                975              1369

The tail recursive template is always substantially faster than the simple algorithm, but it suffers from the same problem in the end: the time taken grows much faster than linearly with the length of the string, so for really long strings the least recursive algorithm works best. I haven't taken detailed timings, but there's a similar pattern in Saxon (although Saxon bugs out with the simple algorithm and long strings, I guess a stack overflow). A processor that doesn't optimise tail recursion would probably have similar performance from both the simple and tail-recursive templates.

17.

ASCII to Hex conversion

Mike Brown





> Is there a function that can convert ASCII coded characters to 
> ASCII coded hex data.

Using pure XSLT, and assuming you really meant printable ASCII (characters 32-126), here is a demonstration of a way to do it:


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- the next line is all on one line and the character 
     before the ! is a space -->
  <xsl:variable 
name="ascii"> !"#$%&amp;'()*+,-./0123456789:;&lt;=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</xsl:variable>
  <xsl:variable name="hex" >0123456789ABCDEF</xsl:variable>

  <xsl:template match="/">

    <xsl:variable name="foo" select="'I have $1,001.'"/>

    <result>
      <string>
        <xsl:value-of select="$foo"/>
      </string>
      <hex>
        <xsl:call-template name="recurse-over-string">
          <xsl:with-param name="str" select="$foo"/>
        </xsl:call-template>
      </hex>
    </result>

  </xsl:template>

  <xsl:template name="recurse-over-string">
    <xsl:param name="str"/>   
    <xsl:if test="$str">
      <xsl:variable name="first-char" 
  select="substring($str,1,1)"/>
      <xsl:variable name="ascii-value" 
  select="string-length(substring-before($ascii,$first-char)) + 32"/>
      <xsl:variable name="hex-digit1" 
  select="substring($hex,floor($ascii-value div 16) + 1,1)"/>
      <xsl:variable name="hex-digit2" 
  select="substring($hex,$ascii-value mod 16 + 1,1)"/>
      <xsl:value-of select="concat($hex-digit1,$hex-digit2)"/>
      <xsl:if test="string-length($str) > 1">
        <xsl:text> </xsl:text>
        <xsl:call-template name="recurse-over-string">
          <xsl:with-param name="str" select="substring($str,2)"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

The output is


<?xml version="1.0" encoding="utf-8"?>
<result>
   <string>I have $1,001.</string>
   <hex>49 20 68 61 76 65 20 24 31 2C 30 30 31 2E</hex>
</result>

18.

Testing for an empty string.

Mike Brown


 > <xsl:if test="string-length()='0'">
 

'0' when quoted like that is a string, and string-length() returns a number, so this will result in extra overhead as types are converted. Unquote the 0 so you are comparing a number to a number.

Better yet, just test for string() or normalize-space() -- the result will be an empty string if the string is empty, and an empty string evaluates to false.
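
A minimal sketch of that test, assuming a hypothetical input in which a title element may be empty or contain only whitespace (the element name is illustrative only):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <!-- 'title' is a hypothetical element name used only for illustration -->
  <xsl:template match="title">
    <xsl:choose>
      <!-- a non-empty string converts to true, an empty string to false -->
      <xsl:when test="normalize-space(.)">has text</xsl:when>
      <xsl:otherwise>empty or whitespace only</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Use test="string(.)" instead if a whitespace-only value should count as non-empty.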

19.

Last occurrence of a string

Jeni Tennison


  > Is there a function in XSLT which obtains the position of the last
  > occurrence of a string within another string?
  

No, there isn't. To get the substring after the last occurrence of a string, you need a recursive template, for example:

  <xsl:template name="substring-after-last">
    <xsl:param name="string" />
    <xsl:param name="delimiter" />
    <xsl:choose>
      <xsl:when test="contains($string, $delimiter)">
        <xsl:call-template name="substring-after-last">
          <xsl:with-param name="string"
            select="substring-after($string, $delimiter)" />
          <xsl:with-param name="delimiter" select="$delimiter" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise><xsl:value-of 
                  select="$string" /></xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  

For example, to get the extension of a file from its path you could use:

    <xsl:call-template name="substring-after-last">
      <xsl:with-param name="string" select="$file" />
      <xsl:with-param name="delimiter" select="'.'" />
    </xsl:call-template>

Getting the last index would be slightly more complicated, and it looks like you want to use it to get hold of the string after the last occurrence of something, so hopefully this is sufficient.
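
If the index itself is needed, one possible sketch derives it from the length of the tail that substring-after-last returns. The template name last-index-of is made up for this illustration, and the sketch assumes a non-empty delimiter (an empty one would recurse forever):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <!-- demo: position of the last '.' in a sample file name; prints 12 -->
  <xsl:template match="/">
    <xsl:call-template name="last-index-of">
      <xsl:with-param name="string" select="'archive.tar.gz'"/>
      <xsl:with-param name="delimiter" select="'.'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- hypothetical helper: 1-based start of the last $delimiter, or 0 -->
  <xsl:template name="last-index-of">
    <xsl:param name="string"/>
    <xsl:param name="delimiter"/>
    <xsl:choose>
      <xsl:when test="contains($string, $delimiter)">
        <xsl:variable name="tail">
          <xsl:call-template name="substring-after-last">
            <xsl:with-param name="string" select="$string"/>
            <xsl:with-param name="delimiter" select="$delimiter"/>
          </xsl:call-template>
        </xsl:variable>
        <!-- everything before the tail, minus the delimiter, plus one -->
        <xsl:value-of select="string-length($string)
                              - string-length($tail)
                              - string-length($delimiter) + 1"/>
      </xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- the recursive template from above, repeated so the sketch runs alone -->
  <xsl:template name="substring-after-last">
    <xsl:param name="string"/>
    <xsl:param name="delimiter"/>
    <xsl:choose>
      <xsl:when test="contains($string, $delimiter)">
        <xsl:call-template name="substring-after-last">
          <xsl:with-param name="string"
            select="substring-after($string, $delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise><xsl:value-of select="$string"/></xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For 'archive.tar.gz' the tail is 'gz', so the result is 14 - 2 - 1 + 1 = 12.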

[Note that the XQuery/XPath operators document contains an ends-with() function so it's possible that in XPath 2.0 if you wanted to *test* the value of the string after the last '.' to see if it was 'xml' then you could do:

    ends-with($file, '.xml')

but that's just speculation at the moment.]

20.

String to numbers

Jeni Tennison

XPath can only convert a string to a number if it uses a '.' as a decimal separator and doesn't have any other non-decimal, non-whitespace characters (so input numbers can't have grouping separators). So the results of number() are *always*:

  <number>  12.5   </number>  =>  12.5
  <number>  foo    </number>  =>  NaN
  <number>  12,5   </number>  =>  NaN
  <number> 1,234.5 </number>  =>  NaN
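
If you know in advance that the input uses ',' as the decimal point and '.' for grouping, one common workaround is to rewrite the string with translate() before calling number(). A sketch under that assumption (the sample value is made up):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- assumed input convention: '.' groups digits, ',' is the decimal point -->
    <xsl:variable name="raw" select="'1.234,5'"/>
    <!-- translate maps ',' to '.' and deletes '.', giving '1234.5' -->
    <xsl:variable name="n" select="number(translate($raw, ',.', '.'))"/>
    <!-- arithmetic now works; this prints 2469 -->
    <xsl:value-of select="$n * 2"/>
  </xsl:template>

</xsl:stylesheet>
```

This only makes sense when the convention is known; the same call would corrupt a value that is already in the '.' decimal form.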

The point of the xsl:decimal-format element is solely to interpret the format pattern string that you use in format-number(). By default it uses '.' for decimal points and ',' as the grouping separator so:

  format-number(1234.5, '#,##0.00') => '1,234.50'

But you can override the default decimal format so that you can use a different decimal point and grouping separator:

<xsl:decimal-format
  decimal-separator=","
  grouping-separator="." />

If you want to use that decimal format, you have to change the format pattern in the format-number() function:

  format-number(1234.5, '#.##0,00') => '1.234,50'

If you want to chop and change between different numerical formats in different parts of the stylesheet, you should create named xsl:decimal-formats for each of the different formats you want to use:

<xsl:decimal-format name="German"
  decimal-separator=","
  grouping-separator="." />

<xsl:decimal-format name="AltGerman"
  decimal-separator=","
  grouping-separator="'" />

<xsl:decimal-format name="US"
  decimal-separator="."
  grouping-separator="," />

Then you can use different formats in different places:

  format-number(1234.5, '#,##0.00', 'US')        => '1,234.50'
  format-number(1234.5, '#.##0,00', 'German')    => '1.234,50'
  format-number(1234.5, "#'##0,00", 'AltGerman') => "1'234,50"

21.

String padding

Mike Kay



> I would like to output this as:
>
> 2002-05-02 ... This is the title ................ 18
> 2002-05-01 ... This is the second title ......... 12
> 2002-05-01 ... This is the third title .......... 5
>
> I've been unable to find a simple way in XSL to do the padding that
> I want.  

substring(concat(' ... ', title, ' ...................................'), 1, 25)

(or whatever the correct number is).

Dimitre adds:

Do have a look at the archives.

One can also use the string analog of the iter() function from FXSL in order to build a string of repeated patterns dynamically.

Your case seems quite simple -- you could use a static global xsl:variable containing a run of dots at least as long as the longest padding you will need, and cut it to size using the substring() function.
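
A sketch of the global-variable approach, assuming a hypothetical input of entry elements with date, title and hits children (those names, and the width of 32, are illustrative and not from the original question):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- a run of dots longer than any padding we expect to need -->
  <xsl:variable name="dots"
      select="'........................................'"/>

  <!-- assumed column width for title plus padding -->
  <xsl:param name="width" select="32"/>

  <!-- entry, date, title and hits are hypothetical element names -->
  <xsl:template match="entry">
    <xsl:value-of select="concat(date, ' ... ', title, ' ',
        substring($dots, 1, $width - string-length(title)),
        ' ', hits, '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

When a title is longer than the width, the third argument to substring() goes negative and the padding simply comes out empty.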

22.

Single quote in select expression

Jeni Tennison


> I have the following in my stylesheet that gives me errors:
>
> <xsl:param name="fjeestitle" 
> select="'NRC Research Press: Revue du génie et de la science de
> l&#x0027;environnement'" /> 
>
> Why is this not working? How to make it work?            

When this gets read by an XML parser, it reports the value of the select attribute to the XSLT (or rather XPath) processor to be:

  'NRC Research Press: Revue du génie et de la science de
   l'environnement'

In other words, it's the XML parser's job to resolve the &#x0027; character reference into the ' character; that's already been done by the time it gets to the XPath processor.

The XPath processor then tries to parse that value as an XPath. It pairs the first ' with the second ' and then tries to parse the part after the second ' as being an operator on that string, which is why you get the error.

The solution is to make sure that the XPath processor gets a string that it can parse as an XPath, and since your string has a ' in it, that means using " to delimit the string rather than ':


  "NRC Research Press: Revue du génie et de la science de
   l'environnement"

Then you can move that into the XML either by delimiting the attribute value with double quotes and escaping the double quotes in the XPath:

<xsl:param name="fjeestitle"
  select="&quot;NRC Research Press: Revue du génie et de la science de
   l'environnement&quot;" />

or by delimiting the attribute value by single quotes and escaping the single quote in the XPath:

<xsl:param name="fjeestitle"
  select='"NRC Research Press: Revue du génie et de la science de
   l&apos;environnement"' />
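
Another option, sketched here, is to avoid the XPath string literal altogether and put the text in the content of xsl:param. The value is then a result tree fragment, which converts to a string wherever a string is needed:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text" encoding="utf-8"/>

  <!-- no select attribute, so no XPath quoting: the apostrophe is plain text -->
  <xsl:param name="fjeestitle">NRC Research Press: Revue du génie et de la science de l'environnement</xsl:param>

  <xsl:template match="/">
    <xsl:value-of select="$fjeestitle"/>
  </xsl:template>

</xsl:stylesheet>
```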

23.

Count the number of tokens in a string

Oleg Tkachenko



> Is there an easy way to count
> the number of commas in a string? Like this:

> <ITEM cols="col1,col2,col3,col4"/>

What about string-length($str) - string-length(translate($str,',',''))
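
Applied to the ITEM element from the question, that expression could be used as in the sketch below. The token count is just the comma count plus one, which assumes the attribute is non-empty:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="ITEM">
    <!-- length with commas minus length without commas = number of commas -->
    <xsl:variable name="commas"
        select="string-length(@cols) - string-length(translate(@cols, ',', ''))"/>
    <!-- for cols="col1,col2,col3,col4" this prints: 3 commas, 4 tokens -->
    <xsl:value-of select="concat($commas, ' commas, ', $commas + 1, ' tokens')"/>
  </xsl:template>

</xsl:stylesheet>
```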

24.

How to do multiple string replacements

Jeni Tennison

Here is *a* way to do it. I'm not sure that it's the most efficient, but it's the principle that counts, and the principle is to XMLise the information about what you want to find and replace, or the characters that need to be escaped.

So, make up a namespace for this information and include it in your file. For the replacements that you're using, I created two sets of elements:

<foo:special_characters>
  <foo:char>_</foo:char>
  <foo:char>%</foo:char>
  <foo:char>$</foo:char>
  <foo:char>{</foo:char>
  <foo:char>}</foo:char>
  <foo:char>&amp;</foo:char>
</foo:special_characters>

<foo:string_replacement>
  <foo:search>
    <foo:find>±</foo:find>
    <foo:replace>$\pm$</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>°</foo:find>
    <foo:replace>$\degree$</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>©</foo:find>
    <foo:replace>\copyright</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>¶</foo:find>
    <foo:replace>$\mathbb{P}$</foo:replace>
  </foo:search>
</foo:string_replacement>

This separates out the data about the replacements that you want to make (the what). Now you want to specify the procedure about how to do those replacements (the how). I've called your existing templates to actually do the replacement, and focussed on identifying what you want to exchange.

First, then, the 'replace_special_characters' template. Basically, you want to first replace the characters in the $input_text, then replace the strings in the output from that:

<xsl:template name="replace_special_characters">
  <xsl:param name="input_text" />
  <xsl:variable name="replaced_text">
    <xsl:call-template name="replace_characters">
      <xsl:with-param name="input_text" select="$input_text" />
    </xsl:call-template>
  </xsl:variable>
  <xsl:call-template name="replace_strings">
    <xsl:with-param name="input_text" select="$replaced_text" />
  </xsl:call-template>
</xsl:template>

The two templates for replacing the characters and replacing the strings are much the same, so I'll only go through the one replacing the characters in detail.

<xsl:template name="replace_characters">
...
</xsl:template>

First, we need to declare a couple of parameters that we're going to use. One is the text that we need to replace and the other is one to keep track of where we are in the set of replacements that we need to make. I've done this second using an index number, defaulting it to 1 as the initial value.

  <xsl:param name="input_text" />
  <xsl:param name="char" select="1" />

Then we need to create the new string, with the replacements made. We do this by calling your put_slash_in_front_of template, with the $input_text that we already have and the $special_char that is identified by the index number. We get at the character by getting the nth foo:char within the current document (the stylesheet), i.e. document('')//foo:char[$char].

  <xsl:variable name="replaced_text">
    <xsl:call-template name="put_slash_in_front_of">
      <xsl:with-param name="input_text" select="$input_text" />
      <xsl:with-param name="special_char"
        select="document('')//foo:char[$char]" />
    </xsl:call-template>
  </xsl:variable>

Now the recursive part. If we haven't got to the end of the list of characters that need to be escaped, then we have to move on to the next one, calling this same template with the next index, and with the text that we've created (i.e. that's already been escaped). If we've run out of foo:char, then we just return the escaped text.

  <xsl:choose>
    <xsl:when test="$char &lt; count(document('')//foo:char)">
      <xsl:call-template name="replace_characters">
        <xsl:with-param name="input_text" select="$replaced_text" />
        <xsl:with-param name="char" select="$char + 1" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$replaced_text" />
    </xsl:otherwise>
  </xsl:choose>

And that's it. The other one in full is:

<xsl:template name="replace_strings">
  <xsl:param name="input_text" />
  <xsl:param name="search" select="1" />
  <xsl:variable name="replaced_text">
    <xsl:call-template name="latex_string_replace">
      <xsl:with-param name="input_text" select="$input_text" />
      <xsl:with-param name="find"
        select="document('')//foo:search[$search]/foo:find" />
      <xsl:with-param name="replace"
        select="document('')//foo:search[$search]/foo:replace" />
    </xsl:call-template>
  </xsl:variable>
  <xsl:choose>
    <xsl:when test="$search &lt; count(document('')//foo:search)">
      <xsl:call-template name="replace_strings">
        <xsl:with-param name="input_text" select="$replaced_text" />
        <xsl:with-param name="search" select="$search + 1" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$replaced_text" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

This is tested in SAXON and gives the same results as your original approach, but it is much easier to extend.

You could probably apply templates to the foo:char and foo:search nodes as an alternative approach - the important thing is to separate out the what from the how, not the how you do the how :)
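
The questioner's own put_slash_in_front_of and latex_string_replace templates are not shown here. As a stand-in, a generic recursive replace with the same parameter names might look like the sketch below; the name replace-all is made up for this illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <!-- demo: escape underscores for LaTeX; prints a\_b\_c -->
  <xsl:template match="/">
    <xsl:call-template name="replace-all">
      <xsl:with-param name="input_text" select="'a_b_c'"/>
      <xsl:with-param name="find" select="'_'"/>
      <xsl:with-param name="replace" select="'\_'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- hypothetical stand-in for the replacement templates called above -->
  <xsl:template name="replace-all">
    <xsl:param name="input_text"/>
    <xsl:param name="find"/>
    <xsl:param name="replace"/>
    <xsl:choose>
      <!-- guard against an empty search string, which would never terminate -->
      <xsl:when test="$find != '' and contains($input_text, $find)">
        <xsl:value-of select="substring-before($input_text, $find)"/>
        <xsl:value-of select="$replace"/>
        <!-- recurse on the remainder only, so replacements are not rescanned -->
        <xsl:call-template name="replace-all">
          <xsl:with-param name="input_text"
            select="substring-after($input_text, $find)"/>
          <xsl:with-param name="find" select="$find"/>
          <xsl:with-param name="replace" select="$replace"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$input_text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the recursion continues only on the text after each match, a replacement string that itself contains the search string does not cause a loop.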

25.

How to count words

Michael Kay

You can count the words without recursive processing:

$x := normalize-space(@attr)
$y := translate($x, ' ', '')
$wc := string-length($x) - string-length($y) +1

In XPath 2.0 of course you can have sequences of strings or numbers which makes this kind of thing very much easier.
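
Written out as XSLT 1.0 variables, the calculation might look like this sketch. It strips spaces from the normalized string and adds a guard for an empty value, which the bare formula would count as one word; the attribute name attr is taken from the pseudo-code above:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="*[@attr]">
    <!-- collapse runs of whitespace and trim both ends -->
    <xsl:variable name="x" select="normalize-space(@attr)"/>
    <!-- the same string with every remaining space removed -->
    <xsl:variable name="y" select="translate($x, ' ', '')"/>
    <xsl:choose>
      <!-- words = single spaces between them, plus one -->
      <xsl:when test="$x">
        <xsl:value-of select="string-length($x) - string-length($y) + 1"/>
      </xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```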

26.

Split a line into fixed lengths

Jarno Elovirta



> Given the following xml fragment

> <item>Len_5</item>
> <item>Len_5</item>
> <item>Length_8</item>
> <item>Length__9</item>
> <item>Length__10</item>
> <item>Len__6</item>
> <item>L_3</item>
> <item>Le_4</item>
> <item>L_3</item>
> ..

> I want to create the following output

> <line>Len_5 Len_5 Length_8</line>
> <line>Length__9 Length__10</line>
> <line>Len__6 L_3 Le_4 L_3 </line>

> In other words, I want to regroup my items into lines of fixed length.

Dimitre already gave you a solution that used a LINE FEED to break the lines, but if you need the line elements, e.g.

  <xsl:param name="max" select="20"/>
  <xsl:param name="delim" select="' '"/>
  <xsl:param name="pad" select="' '"/>
  <xsl:template match="*[item]">
    <xsl:apply-templates select="item[1]"/>
  </xsl:template>
  <xsl:template match="item">
    <xsl:variable name="count">
      <xsl:call-template name="coil"/>
    </xsl:variable>
    <line>
      <xsl:variable name="line">
        <xsl:for-each select=". | 
        following-sibling::item[position() &lt;=$count]">
          <xsl:if test="not(position() = 1)">
            <xsl:value-of select="$delim"/>
          </xsl:if>
          <xsl:value-of select="."/>
        </xsl:for-each>
      </xsl:variable>
      <xsl:value-of select="$line"/>
      <xsl:call-template name="fill">
        <xsl:with-param name="length" select="string-length($line)"/>
      </xsl:call-template>
    </line>
    <xsl:apply-templates 
        select="following-sibling::item[position() = $count + 1]"/>
  </xsl:template>
  <xsl:template name="coil">
    <xsl:param name="n" select="."/>
    <xsl:param name="l" select="0"/>
    <xsl:param name="i" select="0"/>
    <xsl:choose>
      <xsl:when test="$l + boolean($i)+ string-length($n) 
       &lt;= $max and $n">
        <xsl:call-template name="coil">
          <xsl:with-param name="n" select="$n/following-sibling::item[1]"/>
          <xsl:with-param name="l" select="$l + boolean($i) +
                    string-length($n)"/>
          <xsl:with-param name="i" select="$i + 1"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$i - 1"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template name="fill">
    <xsl:param name="length" select="0"/>
    <xsl:if test="$length &lt; $max">
      <xsl:value-of select="$pad"/>
      <xsl:call-template name="fill">
        <xsl:with-param name="length" select="$length + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

will get you there. Alternatively, you could tokenize the output from Dimitre's stylesheet to get the line elements and add padding.

27.

Access to individual characters in a string

David Carlisle


> How do I access each char of the string 'copy99' ? copy99[1], copy99[2]... ?

substring($copy99,5,1) is the 5th character of  the string
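
To visit every character in turn, XSLT 1.0 needs a recursive template. A minimal sketch, with the template name each-char and the output element names invented for the illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- demo: emits one char element per character of 'copy99' -->
  <xsl:template match="/">
    <chars>
      <xsl:call-template name="each-char">
        <xsl:with-param name="str" select="'copy99'"/>
      </xsl:call-template>
    </chars>
  </xsl:template>

  <!-- hypothetical helper: handle the first character, recurse on the rest -->
  <xsl:template name="each-char">
    <xsl:param name="str"/>
    <xsl:if test="$str">
      <char><xsl:value-of select="substring($str, 1, 1)"/></char>
      <xsl:call-template name="each-char">
        <xsl:with-param name="str" select="substring($str, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```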


            

28.

How concatenate the contents of similar elements

Wendell Piez



><section>
><p stylename="section" align="justify" fontname="TIMES" fontsize="19"
> bold="on">
>  <string fontname="TIMES" fontsize="19" bold="on">1.(1)Clause 8 (1) (a)
>   of the </string>
>  <string fontname="TIMES" fontsize="19" bold="on" italic="on">More
>   Money for All Amendment</string>
>  <string fontname="TIMES" fontsize="19" bold="on"> is deleted and the
>   following substituted:</string>
></p>
> ... <!-- more 'p' elements -->
></section>
...
>May I have suggestions as to how to concatenate content of the three into
>one, say, 'para' tag?



<xsl:template match="p">
  <para>  
    <xsl:value-of select="."/>
  </para>
</xsl:template>

or <xsl:value-of select="normalize-space()"/>

will get you what you want, since the string value of an element is the concatenation of all its descendant text nodes.

29.

Strip non Alpha-numeric characters

Mukul Gandhi



> I was wondering if there is any way possible of stripping any 
> non-alphanumeric characters from an attribute. ie keep anything that 
> is A-Z/0-9 and strip all other characters like ",*-+. etc etc?


<?xml version="1.0"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
  
<xsl:output method="xml" indent="yes" />
  
<xsl:variable name="allowedchars"
select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'" />
  
<xsl:template match="node() | @*">
  <xsl:copy>       
    <xsl:apply-templates select="node() | @*" />
  </xsl:copy>
</xsl:template>  
  
<xsl:template match="@*[string-length(translate(.,
$allowedchars, '')) &gt; 0]">
  <xsl:attribute name="{name()}">
    <xsl:value-of select="translate(., translate(., $allowedchars, ''), '')" />
  </xsl:attribute>
</xsl:template>
 
</xsl:stylesheet>


For example, when it is applied to this XML -

<?xml version="1.0"?>
<root>
  <a x="123ABC+-" />
  <b y="ABC12" />
  <c z="+-1" />
</root>

it produces output -

<?xml version="1.0"?>
<root>
  <a x="123ABC" />
  <b y="ABC12" />
  <c z="1" />
</root>
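
The nested translate() calls are the heart of the stylesheet. The sketch below splits them into two steps on a literal value to show what each one contributes; the variable names value and unwanted are made up for the illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:variable name="allowedchars"
      select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'"/>

  <xsl:template match="/">
    <xsl:variable name="value" select="'123ABC+-'"/>
    <!-- step 1: delete the allowed characters, leaving the unwanted ones: '+-' -->
    <xsl:variable name="unwanted" select="translate($value, $allowedchars, '')"/>
    <!-- step 2: delete exactly those unwanted characters; prints 123ABC -->
    <xsl:value-of select="translate($value, $unwanted, '')"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that with this list of allowed characters, lower-case letters are stripped as well; add them to the variable if they should be kept.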

30.

Word highlighting

Alexander Johannesen


>I'm trying to highlight a specific word or phrase in the text of
> a document.

I've got a word highlighter (it only bolds stuff, but you can put in whatever you need) that is case insensitive and unicode compatible (using str:to-lower function from the xsltsl project [http://xsltsl.sourceforge.net/string.html] : replace this with your own lowercaser transformation if you like).

Input is $text (your full text) and $what (what to highlight: terms, words, etc.):

   <xsl:template name="highlighter">
       <xsl:param name="text"/>
       <xsl:param name="what"/>
       <xsl:variable name="test-text">
           <xsl:call-template name="str:to-lower">
               <xsl:with-param name="text" select="$text" />
           </xsl:call-template>
       </xsl:variable>
       <xsl:variable name="test-what">
           <xsl:call-template name="str:to-lower">
               <xsl:with-param name="text" select="$what" />
           </xsl:call-template>
       </xsl:variable>
       <xsl:choose>
           <xsl:when test="contains($test-text, $test-what)">
               <xsl:variable name="before"
                   select="substring-before($test-text, $test-what)"/>
               <xsl:variable name="after"
                   select="substring-after($test-text, $test-what)"/>
               <xsl:variable name="real-before"
                   select="substring($text, 1, string-length($before))"/>
               <xsl:variable name="real-after"
                   select="substring($text,
                   string-length($before) + string-length($what) + 1)"/>
               <xsl:value-of select="$real-before"/>
               <b><xsl:value-of select="$what"/></b>
               <xsl:call-template name="highlighter">
                   <xsl:with-param name="text" select="$real-after"/>
                   <xsl:with-param name="what" select="$what"/>
               </xsl:call-template>
           </xsl:when>
           <xsl:otherwise>
               <xsl:value-of select="$text"/>
           </xsl:otherwise>
       </xsl:choose>
   </xsl:template>
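
If you would rather not depend on xsltsl, a simple stand-in for the lowercaser can be built on translate(). The sketch below handles only the 26 ASCII letters, so it gives up the Unicode coverage mentioned above; the template name to-lower-ascii is made up:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <!-- demo: prints xslt strings -->
  <xsl:template match="/">
    <xsl:call-template name="to-lower-ascii">
      <xsl:with-param name="text" select="'XSLT Strings'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- hypothetical ASCII-only replacement for str:to-lower -->
  <xsl:template name="to-lower-ascii">
    <xsl:param name="text"/>
    <xsl:value-of select="translate($text, $upper, $lower)"/>
  </xsl:template>

</xsl:stylesheet>
```

A one-to-one translate() never changes the length of the string, which matters here: the highlighter uses offsets measured in the lower-cased text to cut the original text.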