Identity

1. Identity transform.
2. Change only a small part of an XML file
3. How to create a node set that excludes some descendant elements?
4. Identity transform to stylesheet with includes
5. Why is the identity transform the way it is?
6. Recording the changes made
7. Identity variants

1.

Identity transform.

Mike Kay


 > Is there a way to change an attribute in any occurrence in a given
 > element, including children?  I have been able to do it recursively,
 > with a little knowledge of where the attributes might be, but is it
 > possible to do it all at once?  I want to assume that I don't know
 > where the attribute will be at.
 

If you base your stylesheet on the identity template rule

 <xsl:template match="node()|@*">
   <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>

then you can define any special processing in additional templates, e.g. to rename every fred attribute to bill:

 <xsl:template match="@fred">
   <xsl:attribute name="bill">
     <xsl:value-of select="."/>
   </xsl:attribute>
 </xsl:template>
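
To confine the change to one element and everything beneath it, as the question asks, the match pattern can name that element. The following is only a sketch: section is a hypothetical element name and 'changed' is a placeholder value. Because // abbreviates /descendant-or-self::node()/, the pattern covers fred attributes on section itself as well as on its descendants.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- "section" is a hypothetical name; this matches fred on section and on anything below it -->
  <xsl:template match="section//@fred">
    <!-- placeholder value: construct the new attribute value here -->
    <xsl:attribute name="fred">changed</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

A fred attribute outside any section element is not matched by the second template and is copied unchanged by the identity rule.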

2.

Change only a small part of an XML file

Steve Muench

Transformations of this sort are best done as a variation on the identity transformation.

If you put the following handy identity transformation into a file like "identity.xsl"...

<!-- The Identity Transformation -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Whenever you match any node or any attribute -->
  <xsl:template match="node()|@*">
    <!-- Copy the current node -->
    <xsl:copy>
      <!-- Including any attributes it has and any child nodes -->
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

...and then build a stylesheet that imports "identity.xsl", as the one below does, then whatever templates you put in the importing stylesheet take precedence over the imported template that performs the identity transformation of each node. The result is that the tree gets copied verbatim, except for the "overrides" that your templates below pick up...

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Import the identity transformation. -->
  <xsl:import href="identity.xsl"/>
  <!-- 
   | This will match any element's "date" attribute
   | Make the pattern more specific if this is not appropriate
   +-->
  <xsl:template match="@date">
     <!-- This will construct a "date" attribute having value of its content -->
     <xsl:attribute name="date">
       <!-- Change what's in here to construct the "new" date format -->
       <xsl:value-of select="concat('new',.)"/>
     </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>

This will transform a document like:

  <a date="1">
    <b date="2"/>
    <c>
      <d date="3"/>
    </c>
  </a>

into:

  <a date="new1">
    <b date="new2"/>
    <c>
      <d date="new3"/>
    </c>
  </a>

Jeni Tennison offers:

You want to create copies of most nodes. The 'copy me, but pay attention to my contents' templates look like:

<xsl:template match="*">
  <xsl:copy>
    <xsl:apply-templates select="@*" />
    <xsl:apply-templates />
  </xsl:copy>
</xsl:template>


<xsl:template match="@*">
  <xsl:copy-of select="." />
</xsl:template>

[Uses the built-in templates to get to the document element and to copy the value of any text nodes. I haven't included copying processing instructions etc. here.]

These templates are very general - they don't give names to match on or use predicates - so they're given low priority. Any template that you have that *does* use a name or a predicate will be applied in preference. So, for your exception, do:

<xsl:template match="@date[. = 'some format']">
  <xsl:attribute name="date">some new format</xsl:attribute>
</xsl:template>

This will make any date attribute with a value of 'some format' be changed into a date attribute with a value of 'some new format'. If there are any other restrictions on what the date should be (like what element it's an attribute on), you can put them in the match pattern as well.
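
As a sketch of narrowing the pattern further, the stylesheet below only rewrites date attributes that sit on a hypothetical event element (the name event is an assumption made for illustration). Date attributes on other elements, or with other values, fall through to the general copying templates.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy each element, then process its attributes and children -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy attributes unchanged by default -->
  <xsl:template match="@*">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Exception: only date attributes on the (hypothetical) event element -->
  <xsl:template match="event/@date[. = 'some format']">
    <xsl:attribute name="date">some new format</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```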

3.

How to create a node set that excludes some descendant elements?

Michael Kay



I think you are a little confused between the terms "node-set" and "result-tree-fragment". (Not surprising, most people are).

You select a node-set using

<xsl:variable name="n" select="--- some path expression ---"/>

The result is a set of nodes - I actually prefer to think of it as a set of *references* to nodes. These are original nodes in the source document, and they retain their original position in the source document, which means for example that you can process each node in the set to ask how many ancestors it has.

You create a result tree fragment (or in 2.0 terminology a temporary tree) using

<xsl:variable name="n">
  --- some instructions ---
</xsl:variable>

The result is a new document. (The term "fragment" comes from DOM, and means a document that isn't constrained to have a single element at the top level.) The nodes in this tree are newly constructed nodes; they may be copies of nodes in the source tree (or not) but they have separate identity and have lost their relationships to other nodes in the source tree.
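
The difference can be observed with a small sketch. It assumes an XSLT 2.0 processor (in 1.0 a result tree fragment cannot be navigated without an extension such as EXSLT's node-set() function) and a hypothetical source document in which item elements are nested inside other elements; the variable names refs and copies are likewise made up for illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A node-set: references to the original item nodes in the source tree -->
    <xsl:variable name="refs" select="//item"/>

    <!-- A temporary tree: a new document holding copies of those nodes -->
    <xsl:variable name="copies">
      <xsl:copy-of select="//item"/>
    </xsl:variable>

    <!-- The original node still knows its element ancestors -->
    <xsl:value-of select="count($refs[1]/ancestor::*)"/>
    <xsl:text> </xsl:text>
    <!-- The copy sits directly under the new document node, so it has none -->
    <xsl:value-of select="count($copies/item[1]/ancestor::*)"/>
  </xsl:template>

</xsl:stylesheet>
```

For an input such as `<list><group><item/></group></list>`, this prints `2 0`: the original item has two element ancestors, while its copy has lost them.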

The example that you've given suggests that you do actually want to create a new tree that is a selective copy of the original tree. The way to do this is to walk the original tree applying templates. If a node is to be copied into the new tree, you apply the identity template, if it is to be removed, you apply an empty template, and of course you can also have templates that modify selected elements. So it looks something like this:

<xsl:variable name="subset">
  <xsl:apply-templates select="a" mode="subset"/>
</xsl:variable>

<!-- by default, an element is copied -->

<xsl:template match="*" mode="subset">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates mode="subset"/>
  </xsl:copy> 
</xsl:template>

<!-- I want to include [only] the first <y> element that is contained
     within <c>, no matter where it occurs. There may be no <y>
     elements present. -->

<!-- XPath 2.0 form:
     <xsl:template match="y[not(. is (ancestor::c//y)[1])]" mode="subset"/> -->
<!-- XPath 1.0 form: -->
<xsl:template match="y[not(generate-id(.) = generate-id((ancestor::c//y)[1]))]" mode="subset"/>



<!-- I want to exclude all <z> elements that are contained within <c>,
     no matter where they occur. Again, there may be none present. -->

<xsl:template match="z" mode="subset"/>

The commented-out template uses the XPath 2.0 "is" operator for comparing node identity. The 1.0 equivalent of "A is B" is generate-id(A)=generate-id(B).

4.

Identity transform to stylesheet with includes

Michael Kay



> I need to perform an identity transform on a stylesheet to 
> add an attribute to each and every LRE.  The stylesheet(s) 
> are of the form:

> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:include href="foo.xsl"/>
> <xsl:template match="/">
>   <xsl:apply-templates/>
> </xsl:template>
> </xsl:stylesheet>

> The issue is performing the transform after all of the 
> includes/imports of the xsl have been processed, rather than 
> just the 6 line xml document.

If the above is your source document, then it is just data, so there is no sense in which any xsl:include and xsl:import elements are going to be "processed", other than being processed in the same way as any other element in the source document.

If you want to apply the same transformation to the above document and to all the documents referenced (recursively) by xsl:include and xsl:import elements you can do:

<xsl:template match="xsl:stylesheet | xsl:transform">
  <xsl:apply-templates select="." mode="add-attributes"/> 
  <xsl:for-each select="document(xsl:include/@href | xsl:import/@href)">
     <xxx:result-document href=".">
       <xsl:apply-templates select="*"/>
     </xxx:result-document>
  </xsl:for-each>
</xsl:template>

Where xxx:result-document is an extension element provided by your processor to produce multiple output files.
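
The add-attributes mode itself might look something like the sketch below, shown as a standalone stylesheet that handles a single stylesheet document. The attribute name marked and its value are hypothetical, and treating every element outside the XSLT namespace as a literal result element is a simplification: extension elements and top-level user data elements would also be caught.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: process the whole input stylesheet in add-attributes mode -->
  <xsl:template match="/">
    <xsl:apply-templates mode="add-attributes"/>
  </xsl:template>

  <!-- Default in this mode: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()" mode="add-attributes">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="add-attributes"/>
    </xsl:copy>
  </xsl:template>

  <!-- Elements outside the XSLT namespace (assumed here to be LREs) get the extra attribute -->
  <xsl:template match="*[namespace-uri() != 'http://www.w3.org/1999/XSL/Transform']"
                mode="add-attributes">
    <xsl:copy>
      <!-- "marked" is a hypothetical attribute name -->
      <xsl:attribute name="marked">yes</xsl:attribute>
      <xsl:apply-templates select="@*|node()" mode="add-attributes"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The new attribute is written before the existing ones are copied, so an attribute of the same name already present on the element would replace it.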

5.

Why is the identity transform the way it is?

Tom Passin



> why does the standard identity transform use copy instead of copy-of?

One reason is that you often do not *quite* want an identity transform: you want to change one or a few things but leave everything else the same. With the standard approach you can do that; with copy-of you cannot.

Wendell gives the example:

<!-- The identity transform -->
<xsl:template match="/ | @* | node()">
   <xsl:copy>
     <xsl:apply-templates select="@* | node()"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="broken">
   <fixed>
     <xsl:apply-templates select="@* | node()"/>
   </fixed>
</xsl:template>

Put these two together and you get a stylesheet which takes arbitrary input and returns it as the result -- except all <broken> elements, at any level, are now <fixed>.

The copy-of method is, of course, more efficient if you actually need to clone a subtree, rather than copy-it-mostly-except, which is what most applications of an identity transform actually need to do.
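
For contrast, a copy-of based "identity" can be sketched as follows. It clones the whole document in one step, and since xsl:copy-of does not apply templates to the nodes it copies, the broken template in this sketch is never invoked.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Deep-copy the entire source tree in a single instruction -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Never fires here: nothing ever applies templates to broken elements -->
  <xsl:template match="broken">
    <fixed>
      <xsl:apply-templates select="@* | node()"/>
    </fixed>
  </xsl:template>

</xsl:stylesheet>
```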

6.

Recording the changes made

David Carlisle




>I have the following stylesheet which essentially performs an identity 
>transform on a given xml instance, producing a copy with only slight 
>changes (the e.g. below has one of these -- stripping out the 
>concluding punctuation from element 'onlyNum' -- but the actual 
>stylesheet will have more). I would like to produce another document 
>that records these changes, at least as a list containing the original 
>and changed node. This can be either text or xml, something on the lines
>of:
><onlyNum>E345.</onlyNum> changed to
><onlyNum>E345</onlyNum>

This is really like a table of contents, so the canonical way of proceeding is to process the document twice, with two modes:

<xsl:template match="/">
  <xsl:apply-templates/>
  <xsl:result-document ....>
    <xsl:apply-templates mode="log"/>
  </xsl:result-document>
</xsl:template>

You then just duplicate every

<xsl:template match="something">

that's doing something other than an identity and have

<xsl:template match="something" mode="log">  I did something </xsl:template>

If processing the document twice and duplicating the templates doesn't feel right, then an alternative plan (which isn't really following the one-true-path to functional programming purity) is to stick an

<xsl:message>I did something </xsl:message> 

into any template that does anything, then use your command line or API to redirect the xsl:message output to your log file.

Note, however, that the first way is guaranteed to produce the log in a logical order, whereas the second will most likely produce the log in the order in which the system actually evaluated the templates, which, if you have some (perhaps mythical) highly parallelised and optimised XSLT engine, might be any order at all.
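
Put together, the two-mode approach might be sketched as below for the onlyNum case from the question. This assumes an XSLT 2.0 processor. The output name changes.xml, the wrapper elements changes, change, from and to, and the set of characters treated as concluding punctuation are all assumptions made for illustration. It also assumes onlyNum contains only text.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Pass 1: the modified copy goes to the principal output -->
    <xsl:apply-templates/>
    <!-- Pass 2: the log goes to a secondary output (hypothetical file name) -->
    <xsl:result-document href="changes.xml">
      <changes>
        <xsl:apply-templates mode="log"/>
      </changes>
    </xsl:result-document>
  </xsl:template>

  <!-- Default mode: identity -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Default mode: strip trailing punctuation from onlyNum -->
  <xsl:template match="onlyNum">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:value-of select="replace(., '[.,;:]+$', '')"/>
    </xsl:copy>
  </xsl:template>

  <!-- Log mode: suppress text that the built-in rules would otherwise output -->
  <xsl:template match="text()" mode="log"/>

  <!-- Log mode: record the original and the changed element -->
  <xsl:template match="onlyNum" mode="log">
    <change>
      <from><xsl:copy-of select="."/></from>
      <to>
        <onlyNum><xsl:value-of select="replace(., '[.,;:]+$', '')"/></onlyNum>
      </to>
    </change>
  </xsl:template>

</xsl:stylesheet>
```

Elements with no log-mode template of their own are handled by the built-in rules, which simply keep walking down the tree in the same mode, so the log entries come out in document order.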

7.

Identity variants

Michael Kay

There are two variants of the identity template in common use. This version:

<xsl:template match="*">
  <xsl:copy>
  <xsl:copy-of select="@*"/>
  <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

processes all *elements* by copying them, and can be overridden for individual elements.

This version:

<xsl:template match="node()|@*">
  <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

processes all *nodes* by copying them, and can be overridden for individual elements, attributes, comments, processing instructions, or text nodes.
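
One place the difference shows is with comments and processing instructions. With the first variant they fall to the built-in rules, which discard them; with the second they are copied unless overridden. The sketch below uses the second variant to strip comments while keeping everything else, including processing instructions.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy all nodes and attributes by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for one node kind: an empty template drops every comment -->
  <xsl:template match="comment()"/>

</xsl:stylesheet>
```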