XSLT and CSV

Comma Separated Data

1. How to import CSV files
2. Comma separated data
3. Comma separated record to table
4. CSV to list of words (a functional tokeniser)
5. Comma separated data

1.

How to import CSV files

Jeni Tennison





 > Is there a clever way to import CSV data into the stylesheet? The
 > following fails of course:
 >
 >   <xsl:variable name="csv" select="document('data.csv')"/>
 >
 > I can't alter the file data.csv but need the data in it for a helper
 > lookup function. 

If data.csv stays the same, then convert it to an XML file using sed or perl or whatever you fancy, and use that.

If data.csv changes, but doesn't contain any < or & characters, then you could create a wrapper XML document, data.xml, that pulls it in as an external entity:

 <?xml version="1.0"?>
 <!DOCTYPE data [
 <!ENTITY data SYSTEM 'data.csv'>
 ]>
 <data>&data;</data>

You could then access the data using the document() function, though you'd have to work through the CSV document using string manipulation functions in XSLT/XPath, which isn't all that fun.
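
As a rough illustration of that string manipulation, the sketch below reads the wrapper document and breaks its text into one element per line. It assumes data.xml sits next to the stylesheet and that the XML parser is allowed to load external entities (some processors switch this off by default). The template name split-lines and the rows/row elements are made up for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- data.xml is the wrapper document; its text content is the raw CSV -->
  <xsl:variable name="csv" select="string(document('data.xml')/data)"/>

  <xsl:template match="/">
    <rows>
      <xsl:call-template name="split-lines">
        <xsl:with-param name="text" select="$csv"/>
      </xsl:call-template>
    </rows>
  </xsl:template>

  <!-- Hypothetical helper: emit one row element per non-blank line of $text -->
  <xsl:template name="split-lines">
    <xsl:param name="text"/>
    <!-- Everything up to the first line feed, or the whole text if there is none.
         The XML parser has already normalised line ends to a single line feed. -->
    <xsl:variable name="line"
        select="substring-before(concat($text, '&#10;'), '&#10;')"/>
    <xsl:if test="normalize-space($line)">
      <row><xsl:value-of select="$line"/></row>
    </xsl:if>
    <!-- Recurse on whatever follows the first line feed -->
    <xsl:if test="contains($text, '&#10;')">
      <xsl:call-template name="split-lines">
        <xsl:with-param name="text" select="substring-after($text, '&#10;')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Each row still holds a raw comma-separated line, so it would need a second recursive pass to split it into fields. The recursion goes one level deeper per line, which may matter for very large files.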

Alternatively, if you're using an XSLT processor that accepts SAX events, you could write a custom entity resolver that reads in the CSV document and generates SAX events, so that the XSLT processor thinks it has accessed an XML document. That would also allow you to take advantage of the string manipulation support in whatever programming language you were using, such that the XSLT processor 'sees' the CSV file as elements.

2.

Comma separated data

Arnold, Curt


I need to separate the following data into separate fields:
 
<years title="year">1994, 1995, 1996, 1997,
1998, 1999, 2000, 2001, 2002, 2003,   </years>
 
For instance, the output needs to read:
 
<tr><td>1994</td><td>1995</td><td>1996</td> 
... etc ... </tr>





It should be possible using recursive named template calls,
something like:
 
<xsl:template name="displayyear">
    <xsl:param name="yearlist"/>
    <!-- Text before the first comma, or the whole list if there is no comma;
         normalize-space() trims the blanks and line breaks around it -->
    <xsl:variable name="year"
        select="normalize-space(substring-before(concat($yearlist,','),','))"/>
    <!-- Skip empty items, such as the one after a trailing comma -->
    <xsl:if test="$year">
        <td><xsl:value-of select="$year"/></td>
    </xsl:if>
    <!-- Recurse on whatever follows the first comma -->
    <xsl:if test="contains($yearlist,',')">
        <xsl:call-template name="displayyear">
            <xsl:with-param name="yearlist"
                select="substring-after($yearlist,',')"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>
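
The template still has to be called from somewhere. A minimal sketch of a caller follows, assuming the displayyear template has been wrapped in an xsl:stylesheet element and saved next to this file as displayyear.xsl (a hypothetical file name), and that the list lives in a years element as in the question.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: displayyear.xsl holds the displayyear template shown above -->
  <xsl:import href="displayyear.xsl"/>

  <xsl:output method="html"/>

  <!-- One table row per years element, one cell per item in its list -->
  <xsl:template match="years">
    <tr>
      <xsl:call-template name="displayyear">
        <xsl:with-param name="yearlist" select="string(.)"/>
      </xsl:call-template>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```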

3.

Comma separated record to table

Jorg Heinicke

How can I convert comma-separated input like this into an HTML table?

  <Data>
       <Record>Full Name, Address1, Address2, Address3, 
        Phone  number, Age,   Hobby</Record>
       <Record>Anne Brown, 25A Symonds St, , Auckland, 09373535, 29, Reading</Record>
       <Record>Mark Smith, 30 Whiteney St, Blockhouse Bay, 
 Auckland, 09  6232653, 31, Swimming</Record>
       <Record>Dane Anderson ,1 Crescent Dr, Newton,  Auckland, 09373995, 20,
       </Record>
       ...
 </Data>
 

The following solves it with a recursive named template, divide, which writes one table cell per field:

  <xsl:template match="Record">
      <tr>
          <xsl:call-template name="divide">
              <xsl:with-param name="to-be-divided" select="."/>
              <xsl:with-param name="delimiter" select="','"/>
          </xsl:call-template>
      </tr>
  </xsl:template>
  
  <xsl:template name="divide">
      <xsl:param name="to-be-divided"/>
      <xsl:param name="delimiter"/>
      <xsl:choose>
          <!-- More fields follow: emit the first one, then recurse on the rest -->
          <xsl:when test="contains($to-be-divided,$delimiter)">
              <td><xsl:value-of
                  select="substring-before($to-be-divided,$delimiter)"/></td>
              <xsl:call-template name="divide">
                  <xsl:with-param name="to-be-divided"
                      select="substring-after($to-be-divided,$delimiter)"/>
                  <!-- Pass the caller's delimiter on rather than hard-coding ',' -->
                  <xsl:with-param name="delimiter" select="$delimiter"/>
              </xsl:call-template>
          </xsl:when>
          <!-- No delimiter left: this is the last field -->
          <xsl:otherwise>
              <td><xsl:value-of select="$to-be-divided"/></td>
          </xsl:otherwise>
      </xsl:choose>
  </xsl:template>

4.

CSV to list of words (a functional tokeniser)

Dimitre Novatchev


 > Now in XSL land I want to iterate over a
 > nodelist and compare some attribute of the current node to
 > each value in the CSV for equality.

You have a CSV string (a list of characters). You need to inspect every character and gradually accumulate the result -- a list of words that were delimited by special characters (in this particular case by comma and/or white space).

A "generic accumulator" function over the elements of a list is the "foldl" function -- the classic king of generic list processing. We pass to "foldl", as a parameter, a function that will be called with two arguments -- the result accumulated so far (the list of tokens) and the next character in the input string. Based on these two arguments, this function updates the accumulated result appropriately -- it either appends the character to the last token, or "cuts" the last token and starts a new one.
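
The imported str-foldl.xsl is not reproduced here. To make the idea concrete, the following is only a sketch of what such a template could look like, not the contents of that file. It assumes the parameter names pFunc, pA0, pStr, arg1 and arg2 used by the calling code in this section, and it relies on the MSXML-specific msxsl:node-set() extension; other processors would need their own equivalent.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">

  <!-- Sketch only: fold pFunc over the characters of pStr, left to right -->
  <xsl:template name="str-foldl">
    <!-- Node whose matching template plays the role of the function -->
    <xsl:param name="pFunc" select="/.."/>
    <!-- Result accumulated so far -->
    <xsl:param name="pA0" select="/.."/>
    <!-- Characters still to be processed -->
    <xsl:param name="pStr"/>
    <xsl:choose>
      <!-- No characters left: the accumulator is the final result -->
      <xsl:when test="not(string($pStr))">
        <xsl:copy-of select="$pA0"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- "Call" the function by applying templates to its node -->
        <xsl:variable name="vStep">
          <xsl:apply-templates select="$pFunc">
            <xsl:with-param name="arg1" select="$pA0"/>
            <xsl:with-param name="arg2" select="substring($pStr, 1, 1)"/>
          </xsl:apply-templates>
        </xsl:variable>
        <!-- Recurse on the rest of the string with the updated accumulator -->
        <xsl:call-template name="str-foldl">
          <xsl:with-param name="pFunc" select="$pFunc"/>
          <xsl:with-param name="pA0" select="msxsl:node-set($vStep)"/>
          <xsl:with-param name="pStr" select="substring($pStr, 2)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The function's output is a result tree fragment, so it has to be converted back to a node-set before the next step can navigate it with a path such as $arg1/*. That is why the extension function is needed in XSLT 1.0.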

And here's the solution:

 <xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  xmlns:str-split2words-func="f:str-split2words-func"
  exclude-result-prefixes="xsl msxsl str-split2words-func">

   <xsl:import href="str-foldl.xsl"/>

   <str-split2words-func:str-split2words-func/>

   <xsl:param name="pDelimiters"
              select="', &#9;&#10;&#13;'"/>

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

   <xsl:template match="/">
     <xsl:call-template name="str-split-to-words">
       <xsl:with-param name="pStr" select="/*/*"/>
     </xsl:call-template>
   </xsl:template>

   <xsl:template name="str-split-to-words">
     <xsl:param name="pStr" select="dummy"/>

     <xsl:variable name="vsplit2wordsFun"
         select="document('')/*/str-split2words-func:*[1]"/>

     <xsl:call-template name="str-foldl">
       <xsl:with-param name="pFunc" select="$vsplit2wordsFun"/>
       <xsl:with-param name="pStr" select="$pStr"/>
       <xsl:with-param name="pA0" select="/.."/>
     </xsl:call-template>

   </xsl:template>

   <xsl:template match="str-split2words-func:*">
     <xsl:param name="arg1" select="/.."/>
     <xsl:param name="arg2"/>

     <xsl:choose>
       <xsl:when test="contains($pDelimiters, $arg2)">
         <xsl:copy-of select="$arg1/*"/>
         <xsl:if test="string($arg1/*[last()])">
           <word/>
         </xsl:if>
       </xsl:when>
       <xsl:otherwise>
         <xsl:copy-of
             select="$arg1/*[position() &lt; last()]"/>
         <word><xsl:value-of
             select="concat($arg1/*[last()], $arg2)"/></word>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>

 </xsl:stylesheet>
 

When applied to the following XML document:

 <contents>
   <csv>Fredrick, Aaron, john, peter</csv>
 </contents>

The result is:

 <word>Fredrick</word>
 <word>Aaron</word>
 <word>john</word>
 <word>peter</word>

We need just one more small step in order to obtain the ultimate tokenizer -- if we manage to pass the list of delimiters to the accumulating function that we give to str-foldl as a parameter, then we have the most general tokenizer function. You will never again need to code your own tokenizer; just call this one with your parameters.

The solution is to always specify the list of delimiters as the first element of the "accumulator" list:

 <xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  xmlns:str-split2words-func="f:str-split2words-func"
  exclude-result-prefixes="xsl msxsl str-split2words-func">

   <xsl:import href="str-foldl.xsl"/>

   <str-split2words-func:str-split2words-func/>

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

   <xsl:template match="/">
     <xsl:call-template name="str-split-to-words">
       <xsl:with-param name="pStr" select="/*/*"/>
       <xsl:with-param name="pDelimiters"
           select="',  &#9;&#10;&#13;'"/>
     </xsl:call-template>
   </xsl:template>

   <xsl:template name="str-split-to-words">
     <xsl:param name="pStr"/>
     <xsl:param name="pDelimiters"/>

     <xsl:variable name="vsplit2wordsFun"
         select="document('')/*/str-split2words-func:*[1]"/>

     <xsl:variable name="vrtfParams">
       <delimiters><xsl:value-of
           select="$pDelimiters"/></delimiters>
     </xsl:variable>

     <xsl:variable name="vResult">
       <xsl:call-template name="str-foldl">
         <xsl:with-param name="pFunc"
             select="$vsplit2wordsFun"/>
         <xsl:with-param name="pStr" select="$pStr"/>
         <xsl:with-param name="pA0"
             select="msxsl:node-set($vrtfParams)"/>
       </xsl:call-template>
     </xsl:variable>

     <xsl:copy-of select="msxsl:node-set($vResult)/word"/>

   </xsl:template>

   <xsl:template match="str-split2words-func:*">
     <xsl:param name="arg1" select="/.."/>
     <xsl:param name="arg2"/>

     <xsl:copy-of select="$arg1/*[1]"/>
     <xsl:copy-of select="$arg1/word[position() != last()]"/>

     <xsl:choose>
       <xsl:when test="contains($arg1/*[1], $arg2)">
         <xsl:if test="string($arg1/word[last()])">
           <xsl:copy-of select="$arg1/word[last()]"/>
         </xsl:if>
         <word/>
       </xsl:when>
       <xsl:otherwise>
         <word><xsl:value-of
             select="concat($arg1/word[last()], $arg2)"/></word>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>

 </xsl:stylesheet>
 

And with the same xml source document, here's the result:

 <word>Fredrick</word>
 <word>Aaron</word>
 <word>john</word>
 <word>peter</word>

5.

Comma separated data

Stuart Brown



> I have a problem regarding parsing a string and using it in XSL Sheet.
>
> For example ..
> <data>A,vc,dfg,aa,dfr,r</data>
>
> Now I want to use the comma separated fields in my XSLT.
> Like  match field 1 = something
>     out some thing ..
> match field 2 = something
>     out some thing.
>

You can use a recursive template to break down the string:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="data">
    <xsl:variable name="dataString" select="text()"/>
    <xsl:call-template name="commaSplit">
      <xsl:with-param name="dataString" select="$dataString"/>
      <xsl:with-param name="position" select="1"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="commaSplit">
    <xsl:param name="dataString"/>
    <xsl:param name="position"/>
    <xsl:choose>
      <xsl:when test="contains($dataString,',')">
        <!-- Select the first value to process -->
        <xsl:call-template name="doWhatever">
          <xsl:with-param name="doWith"
                          select="substring-before($dataString,',')"/>
          <xsl:with-param name="position" select="$position"/>
        </xsl:call-template>
        <!-- Recurse with remainder of string -->
        <xsl:call-template name="commaSplit">
          <xsl:with-param name="dataString"
                          select="substring-after($dataString,',')"/>
          <xsl:with-param name="position" select="$position + 1"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- This is the last value so we don't recurse -->
        <xsl:call-template name="doWhatever">
          <xsl:with-param name="doWith" select="$dataString"/>
          <xsl:with-param name="position" select="$position"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Process each individual value here -->
  <xsl:template name="doWhatever">
    <xsl:param name="doWith"/>
    <xsl:param name="position"/>
    <outputValueInElementWithABigLongName position="{$position}">
      <xsl:value-of select="$doWith"/>
    </outputValueInElementWithABigLongName>
  </xsl:template>

  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>

There are plenty of other examples of this kind of thing in the archives.
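
To get the "match field 1 = something" behaviour asked for in the question, the per-value template can branch on the position it is given. The sketch below assumes the stylesheet above has been saved as comma-split.xsl (a hypothetical file name) and overrides its doWhatever template through import precedence. The output element names and the tested value 'A' are only placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumption: comma-split.xsl is the stylesheet shown above -->
  <xsl:import href="comma-split.xsl"/>

  <!-- Takes precedence over the imported template of the same name -->
  <xsl:template name="doWhatever">
    <xsl:param name="doWith"/>
    <xsl:param name="position"/>
    <xsl:choose>
      <!-- Field 1: output something only when it holds a particular value -->
      <xsl:when test="$position = 1 and $doWith = 'A'">
        <firstFieldIsA/>
      </xsl:when>
      <!-- Field 2: output its value whatever it is -->
      <xsl:when test="$position = 2">
        <secondField><xsl:value-of select="$doWith"/></secondField>
      </xsl:when>
      <!-- All other fields are ignored in this sketch -->
      <xsl:otherwise/>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the splitting logic in the imported file and the per-field decisions in the importing one means the recursive template can be reused unchanged for other data.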