xslt, string replace

General String Replace

1. Multiple string replacements
2. How to escape or substitute unwanted characters
3. Parsing a character string
4. Replacing %1 with a parameter

1.

Multiple string replacements

Jeni Tennison

Here is *a* way to do it. I'm not sure that it's the most efficient, but it's the principle that counts, and the principle is to XMLise the information about what you want to find and replace, or the characters that need to be escaped.

So, make up a namespace for this information and include it in your file. For the replacements that you're using, I created two sets of elements:

<foo:special_characters>
  <foo:char>_</foo:char>
  <foo:char>%</foo:char>
  <foo:char>$</foo:char>
  <foo:char>{</foo:char>
  <foo:char>}</foo:char>
  <foo:char>&amp;</foo:char>
</foo:special_characters>

<foo:string_replacement>
  <foo:search>
    <foo:find>±</foo:find>
    <foo:replace>$\pm$</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>°</foo:find>
    <foo:replace>$\degree$</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>©</foo:find>
    <foo:replace>\copyright</foo:replace>
  </foo:search>
  <foo:search>
    <foo:find>¶</foo:find>
    <foo:replace>$\mathbb{P}$</foo:replace>
  </foo:search>
</foo:string_replacement>

This separates out the data about the replacements that you want to make (the what). Now you want to specify the procedure about how to do those replacements (the how). I've called your existing templates to actually do the replacement, and focussed on identifying what you want to exchange.
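As a rough sketch of where this data lives: the elements sit at the top level of the stylesheet, in a namespace of your own. The namespace URI below is made up for illustration, and the exclude-result-prefixes attribute is my own addition (it only matters if the stylesheet writes literal result elements). The lists are shortened here.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://www.example.com/foo"
    exclude-result-prefixes="foo">

  <!-- Top-level elements in a non-XSLT namespace are ignored by the
       processor, so they can carry data for document('') to read. -->
  <foo:special_characters>
    <foo:char>_</foo:char>
    <foo:char>&amp;</foo:char>
  </foo:special_characters>

  <foo:string_replacement>
    <foo:search>
      <foo:find>©</foo:find>
      <foo:replace>\copyright</foo:replace>
    </foo:search>
  </foo:string_replacement>

  <!-- the templates described below go here -->

</xsl:stylesheet>
```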

First, then, the 'replace_special_characters' template. Basically, you want to first replace the characters in the $input_text, then replace the strings in the output from that:

<xsl:template name="replace_special_characters">
  <xsl:param name="input_text" />
  <xsl:variable name="replaced_text">
    <xsl:call-template name="replace_characters">
      <xsl:with-param name="input_text" select="$input_text" />
    </xsl:call-template>
  </xsl:variable>
  <xsl:call-template name="replace_strings">
    <xsl:with-param name="input_text" select="$replaced_text" />
  </xsl:call-template>
</xsl:template>

The two templates for replacing the characters and replacing the strings are much the same, so I'll only go through the one replacing the characters in detail.

<xsl:template name="replace_characters">
...
</xsl:template>

First, we need to declare a couple of parameters that we're going to use. One is the text that we need to replace and the other keeps track of where we are in the set of replacements that we need to make. I've done the second using an index number, defaulting it to 1 as the initial value.

  <xsl:param name="input_text" />
  <xsl:param name="char" select="1" />

Then we need to create the new string, with the replacements made. We do this by calling your put_slash_in_front_of template, with the $input_text that we already have and the $special_char that is identified by the index number. We get at the character by getting the nth foo:char within the current document (the stylesheet), i.e. document('')//foo:char[$char].

  <xsl:variable name="replaced_text">
    <xsl:call-template name="put_slash_in_front_of">
      <xsl:with-param name="input_text" select="$input_text" />
      <xsl:with-param name="special_char"
        select="document('')//foo:char[$char]" />
    </xsl:call-template>
  </xsl:variable>
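The put_slash_in_front_of template itself isn't shown in this answer, since it belongs to the original poster. As a minimal sketch, assuming it takes the two parameters used in the call above and simply puts a backslash before every occurrence of the character, it might look something like this:

```xml
<!-- Hypothetical stand-in for the poster's put_slash_in_front_of. -->
<xsl:template name="put_slash_in_front_of">
  <xsl:param name="input_text" />
  <xsl:param name="special_char" />
  <xsl:choose>
    <!-- the emptiness check guards against endless recursion -->
    <xsl:when test="string($special_char) != '' and
                    contains($input_text, $special_char)">
      <xsl:value-of select="substring-before($input_text, $special_char)" />
      <xsl:text>\</xsl:text>
      <xsl:value-of select="$special_char" />
      <!-- carry on with whatever follows this occurrence -->
      <xsl:call-template name="put_slash_in_front_of">
        <xsl:with-param name="input_text"
          select="substring-after($input_text, $special_char)" />
        <xsl:with-param name="special_char" select="$special_char" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$input_text" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```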

Now the recursive part. If we haven't got to the end of the list of characters that need to be escaped, then we have to move on to the next one, calling this same template with the next index, and with the text that we've created (i.e. that's already been escaped). If we've run out of foo:char, then we just return the escaped text.

  <xsl:choose>
    <xsl:when test="$char &lt; count(document('')//foo:char)">
      <xsl:call-template name="replace_characters">
        <xsl:with-param name="input_text" select="$replaced_text" />
        <xsl:with-param name="char" select="$char + 1" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$replaced_text" />
    </xsl:otherwise>
  </xsl:choose>

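And that's it. The replace_strings template works the same way, except that it passes along the remaining foo:search elements rather than an index number. In full it is: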

<xsl:template name="replace_strings">
  <xsl:param name="input_text" />
  <xsl:param name="search"
      select="document('')/*/foo:string_replacement/foo:search" />
  <xsl:variable name="replaced_text">
    <xsl:call-template name="replace-substring">
      <xsl:with-param name="text" select="$input_text" />
      <xsl:with-param name="from" select="$search[1]/foo:find" />
      <xsl:with-param name="to" select="$search[1]/foo:replace" />
    </xsl:call-template>
  </xsl:variable>
  <xsl:choose>
    <xsl:when test="$search[2]">
      <xsl:call-template name="replace_strings">
        <xsl:with-param name="input_text" select="$replaced_text" />
        <xsl:with-param name="search" select="$search[position() > 1]" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$replaced_text" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

This is tested in SAXON and gives the same results as your original approach, but it is much easier to extend.

You could probably apply templates to the foo:char and foo:search nodes as an alternative approach - the important thing is to separate out the what from the how, not the how you do the how :)
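A sketch of that alternative for the string replacements, as I imagine it: each foo:search element does its own replacement and then hands the result to its next sibling. The template name replace_strings_by_matching and the mode name are invented here, replace-substring is still the poster's own template, and the sketch assumes there is at least one foo:search element.

```xml
<!-- Entry point: start with the first foo:search in the stylesheet. -->
<xsl:template name="replace_strings_by_matching">
  <xsl:param name="input_text" />
  <xsl:apply-templates mode="replace"
      select="document('')/*/foo:string_replacement/foo:search[1]">
    <xsl:with-param name="input_text" select="$input_text" />
  </xsl:apply-templates>
</xsl:template>

<!-- Apply one replacement, then pass the result to the next sibling. -->
<xsl:template match="foo:search" mode="replace">
  <xsl:param name="input_text" />
  <xsl:variable name="replaced_text">
    <xsl:call-template name="replace-substring">
      <xsl:with-param name="text" select="$input_text" />
      <xsl:with-param name="from" select="foo:find" />
      <xsl:with-param name="to" select="foo:replace" />
    </xsl:call-template>
  </xsl:variable>
  <xsl:choose>
    <xsl:when test="following-sibling::foo:search">
      <xsl:apply-templates mode="replace"
          select="following-sibling::foo:search[1]">
        <xsl:with-param name="input_text" select="$replaced_text" />
      </xsl:apply-templates>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$replaced_text" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

The recursion is the same as before; the only difference is that the position in the list is carried by the current node instead of by a parameter.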

2.

How to escape or substitute unwanted characters

Chris Bayes

This escapes each double quote in a string with a backslash (\). The same approach could equally be used for any other character.

<xsl:variable name="noQuote">
  <xsl:call-template name="cleanQuote">
    <xsl:with-param name="string" select="$noLF" />
  </xsl:call-template>
</xsl:variable>

<xsl:template name="cleanQuote">
  <xsl:param name="string" />
  <xsl:choose>
    <xsl:when test="contains($string, '&#x22;')">
      <xsl:value-of select="substring-before($string, '&#x22;')" />
      <xsl:text>\"</xsl:text>
      <xsl:call-template name="cleanQuote">
        <xsl:with-param name="string"
          select="substring-after($string, '&#x22;')" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$string" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

3.

Parsing a character string

Don Bruey

I'm trying to parse a simple string, identify its elements, and map those elements to a set of output strings. The input consists of seven-character strings that represent the days of the week. One such string looks like <<YNYNYYN>>. The first character tells me that, "Yes, the system is to send an alert on Sunday," the second tells me, "No, don't send an alert on Monday," and so on down through the days of the week.

Here is a solution that stores each day abbreviation in two characters.

Given the XML:

<?xml version="1.0"?>
<days>
<deliverOnDays>NYNYYYN</deliverOnDays>
</days>

The aim is to get _M_WThF_ from this input.

XSL:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

<!-- two characters per day: S, M, T, W, Th, F, Sa -->
<xsl:variable name="weekDays" select="'S M T W ThF Sa'" />

<xsl:template name="parseYNFlags">
  <xsl:param name="YNFlags" />
  <xsl:param name="dayAbbreviations" />
  <!-- first flag: output the first abbreviation, or _ for N -->
  <xsl:choose>
    <xsl:when test="substring($YNFlags, 1, 1) = 'Y'">
      <xsl:value-of select="substring($dayAbbreviations, 1, 2)" />
    </xsl:when>
    <xsl:otherwise>_</xsl:otherwise>
  </xsl:choose>

  <!-- recurse on the remaining flags and abbreviations -->
  <xsl:if test="string-length($YNFlags) &gt; 1">
    <xsl:call-template name="parseYNFlags">
      <xsl:with-param name="YNFlags" select="substring($YNFlags, 2)" />
      <xsl:with-param name="dayAbbreviations"
        select="substring($dayAbbreviations, 3)" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>

<xsl:template match="deliverOnDays">
  <xsl:variable name="temp">
    <xsl:call-template name="parseYNFlags">
      <xsl:with-param name="YNFlags" select="." />
      <xsl:with-param name="dayAbbreviations" select="$weekDays" />
    </xsl:call-template>
  </xsl:variable>

  <!-- now get rid of extra spaces from result -->
  <xsl:variable name="translatedTemp" select="translate($temp, ' ', '')" />
  <xsl:value-of select="$translatedTemp" />
</xsl:template>
</xsl:stylesheet>

4.

Replacing %1 with a parameter

Jeni Tennison




> In a fragment from an XML document, the contents of the
> <ErrorText> nodes have a value of "%1 %2 is not a valid date
> format.".
>
> My task (you guessed it) is to replace each %1, %2 etc. with a
> string built from the corresponding ErrorParameter (formatted nicely
> of course). Being relatively new to XSLT, I can't think of a way to
> accomplish this in a generic manner. Has anyone got any ideas? I
> can't think of an approach that will actually work!

It's a little tedious because XPath string-handling isn't the best (roll on regexps), but it's certainly possible. To work through the string, you need a recursive template; it needs to take the string in which you're replacing parameter references, and a list of parameters that you want to replace the references with:

<xsl:template name="substituteParameters">
  <xsl:param name="string" />
  <xsl:param name="parameters" select="/.." />
  ...
</xsl:template>

You need to work through the string a bit at a time. If the string doesn't contain any % characters, then you know that you can just output the string with no changes, so that's the basic test on which the template hinges:

<xsl:template name="substituteParameters">
  <xsl:param name="string" />
  <xsl:param name="parameters" select="/.." />
  <xsl:choose>
    <xsl:when test="contains($string, '%')">
      ...
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$string" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Now, given that you have a string with a % in it, you want to output:

- the string up to the %
- the parameter in the position indicated by the number after the %
- the result of calling this same template on the rest of the string (after the number)

The first bit's easy - you can use the substring-before() function to get the bit before the %:

<xsl:template name="substituteParameters">
  <xsl:param name="string" />
  <xsl:param name="parameters" select="/.." />
  <xsl:choose>
    <xsl:when test="contains($string, '%')">
      <xsl:value-of select="substring-before($string, '%')" />
      ...
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$string" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

The second bit's a little more complicated - you need to get the number after the %. If you can guarantee that there aren't going to be more than 9 parameters, then you only need to get the first character after the %, so I'll assume that's the case (let me know if it's not). So you can use the following to get the first character after the %:

  substring(substring-after($string, '%'), 1, 1)

That gives you (a string representation of) a number, which you can use to index into the parameters passed as the value of the $parameters parameter:

  $parameters[position() =
              substring(substring-after($string, '%'), 1, 1)]

or:

  $parameters[number(substring(substring-after($string, '%'), 1, 1))]

I'd apply templates to this parameter - you can have a template matching ErrorParameter elements that does the relevant pretty formatting:

<xsl:template name="substituteParameters">
  <xsl:param name="string" />
  <xsl:param name="parameters" select="/.." />
  <xsl:choose>
    <xsl:when test="contains($string, '%')">
      <xsl:value-of select="substring-before($string, '%')" />
      <xsl:apply-templates
        select="$parameters[number(substring(
                                     substring-after($string, '%'),
                                     1, 1))]" />
        ...
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$string" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
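What counts as pretty formatting depends on your data, which I haven't seen. Purely as an illustration, a template that trims each parameter value and wraps it in single quotes could be as small as this (the quoting is an invented choice, not something from the original question):

```xml
<!-- Hypothetical formatting: trim the value and put it in single quotes. -->
<xsl:template match="ErrorParameter">
  <xsl:text>'</xsl:text>
  <xsl:value-of select="normalize-space(.)" />
  <xsl:text>'</xsl:text>
</xsl:template>
```

If you don't supply any such template, the built-in templates just output the text of the ErrorParameter.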

Finally, the recursive call to the template needs to pass in the same set of parameters whilst amending the string to being the 'rest' of the substring after the '%':

<xsl:template name="substituteParameters">
  <xsl:param name="string" />
  <xsl:param name="parameters" select="/.." />
  <xsl:choose>
    <xsl:when test="contains($string, '%')">
      <xsl:value-of select="substring-before($string, '%')" />
      <xsl:apply-templates
        select="$parameters[number(substring(
                                     substring-after($string, '%'),
                                     1, 1))]" />
      <xsl:call-template name="substituteParameters">
        <xsl:with-param name="string"
          select="substring(substring-after($string, '%'), 2)" />
        <xsl:with-param name="parameters" select="$parameters" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$string" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

To call it, you'd do something like the following (assuming that SynchError is the current node):

  <xsl:call-template name="substituteParameters">
    <xsl:with-param name="string" select="ErrorText" />
    <xsl:with-param name="parameters" select="ErrorParameter" />
  </xsl:call-template>
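For concreteness, here is a sketch of the kind of input this call assumes. The original question doesn't show the full document, so the nesting and the parameter values below are invented for illustration; only the element names SynchError, ErrorText and ErrorParameter come from the discussion above.

```xml
<!-- Hypothetical input: ErrorText and ErrorParameter are children of
     SynchError, with the parameters in %1, %2 order. -->
<SynchError>
  <ErrorText>%1 %2 is not a valid date format.</ErrorText>
  <ErrorParameter>StartDate</ErrorParameter>
  <ErrorParameter>31/02/2001</ErrorParameter>
</SynchError>
```

With input shaped like this, %1 picks up the first ErrorParameter (StartDate) and %2 the second.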

Having said all of that, you would be a lot better off if you could change the XML structure so that you used elements to indicate where the parameters should be inserted. Something like:

  <ErrorText>
    <Insert param="1" /> <Insert param="2" /> is not a valid date
    format.
  </ErrorText>

That way, you could just apply templates to the content of the ErrorText element and use templates matching Insert elements to insert the value of the relevant parameter.
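A minimal sketch of those templates, assuming the ErrorParameter elements are still children of the same SynchError element as the ErrorText (that nesting is my assumption, as before):

```xml
<!-- Output the message text, handling any Insert elements inside it. -->
<xsl:template match="SynchError">
  <xsl:apply-templates select="ErrorText" />
</xsl:template>

<!-- Replace each Insert with the parameter that its param attribute
     points to, counted among the ErrorParameter siblings of ErrorText. -->
<xsl:template match="Insert">
  <xsl:variable name="n" select="number(@param)" />
  <xsl:apply-templates
      select="ancestor::SynchError[1]/ErrorParameter[$n]" />
</xsl:template>
```

The built-in templates take care of the text inside ErrorText, so no string handling is needed at all. Any whitespace around the Insert elements is copied through as it stands, so you may want to normalise it.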