XSLT Design Patterns

Design Patterns

1. Multiple views on an xml document
2. Summation Pattern
3. Design Patterns in XSLT
4. Design pattern in the XSLT and best practices in the XML.
5. General and Special case stylesheets
6. XSL Design Patterns
7. XSLT standard library
8. Library of stylesheets
9. Design Choices
10. Multiple media output
11. Decorator Pattern
12. Pipelines and multi-pass
13. Generating something like, but not quite, XML

1.

Multiple views on an xml document

Michael Kay


    
> Does anyone have any insights 
> into a nice elegant solution to the problem of generating multiple
> views of the same document?
 

Use <xsl:import> to import common, general-purpose rules into a stylesheet designed to handle the specific transformation, not the other way around. That is, make summary-view and detailed-view the principal stylesheets for the transformation, and import the shared rules into each of these.
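
A minimal sketch of that arrangement, assuming a shared module called common.xsl and an element vocabulary (section, title) invented purely for illustration; neither name comes from the original thread:

```xml
<?xml version="1.0"?>
<!-- summary-view.xsl: principal stylesheet for the summary view (hypothetical file name).
     detailed-view.xsl would be built the same way, with its own overriding rules. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:import must precede every other top-level element -->
  <xsl:import href="common.xsl"/>

  <!-- View-specific rule: it has higher import precedence than any
       rule for section that common.xsl may contain -->
  <xsl:template match="section">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

</xsl:stylesheet>
```

The user picks the view by choosing which principal stylesheet to run; the shared rules are never aware of the views that import them.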

2.

Summation Pattern

Paul Tchistopolskii


> Is it possible to implement running counters in pure 100% XSLT, without
> using proprietary extensions?

    

There is a general pattern for a 'counter':

template accumulator ( list ) {

    $first = take the first element of the list ( position() = 1 )
    $rest  = take the rest of the list ( position() > 1 )

    $total-of-rest  =  {
            if ( $rest ) { 
                 call accumulator( $rest )
            } else {
                 0   <!-- End of recursion. Without a number here the sum below would be NaN. -->
            }
    }

    <xsl:value-of select = " $first + $total-of-rest " />
}


Invocation: 

call accumulator( list-to-process )
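
The pseudocode can be sketched in plain XSLT 1.0 roughly as follows. The input vocabulary (an order element containing item elements with price and qty attributes) is an assumption made for the example; it totals price times quantity, which a plain sum() cannot do in XSLT 1.0.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Recursive accumulator: value of the first node plus the total of the rest -->
  <xsl:template name="accumulator">
    <xsl:param name="list"/>
    <xsl:choose>
      <xsl:when test="$list">
        <xsl:variable name="total-of-rest">
          <xsl:call-template name="accumulator">
            <xsl:with-param name="list" select="$list[position() &gt; 1]"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:value-of select="$list[1]/@price * $list[1]/@qty + $total-of-rest"/>
      </xsl:when>
      <!-- Empty list: end of recursion, contributes zero -->
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Invocation (hypothetical input: order/item with price and qty attributes) -->
  <xsl:template match="/">
    <xsl:call-template name="accumulator">
      <xsl:with-param name="list" select="/order/item"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

For an input of <order><item price="2" qty="3"/><item price="5" qty="1"/></order> this writes 11. Recursion depth grows with the length of the list, so very long lists may hit processor limits.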

3.

Design Patterns in XSLT

Dan Morrison

Personally I've been a stickler for Content & Presentation being abstracted from each other as much as possible, and generally add Logic/Navigation as a third independent entity.

While I agree that hand-coding tiny variations is bad, I've been achieving results using XSL layout 'libraries' and xsl:include, so only the unique bits end up in a new file.

As I see it, the alternative is to 'contaminate' the XSL with lots of conditionals. Now that's messy. ( It's how I started :-) )

Does anyone have pointers on the genuine design patterns of using XSLT? So far there's a couple of dozen HOWTOs, but a poor selection of WHENTOs and only some abstract WHYTOs. We know that XML+XSL=Presentation of Data, we know that XML+XSL[2]=Alternative Presentation of Data. Has anyone got a resource on the thinking behind how to glue this together?

This is EXACTLY what I'm working on now, and would love to find some prior art...

Mike Brown continues this debate:

The point that was being missed was that with XSL you have at your disposal any number of hierarchical data sources from which you can obtain information that helps you build a new hierarchical data structure. To restrict the source data repositories to house only "content" is to overlook the numerous possibilities offered by XML and XSL -- lots of information can be put into the source trees, data that the XSL can use as cues as to how to go about building up the result tree.

xsl:apply-templates can be very powerful when used to process, for example, a source tree consisting of a purely structural description of a web site, and secondary source trees (retrieved via document()) consisting of presentational variables (colors, text styles, image names and attributes) referenced by the structural tree.
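
As a rough illustration of that idea, the sketch below assumes a structural source tree made of page elements carrying a style attribute, and a secondary file styles.xml holding the presentational values. Both the file name and the element and attribute names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Secondary source tree with presentational variables, e.g.
       <styles><style id="news" color="#003366"/></styles> (assumed layout) -->
  <xsl:variable name="styles" select="document('styles.xml')/styles"/>

  <!-- The structural tree only names a style; the values are looked up here -->
  <xsl:template match="page">
    <xsl:variable name="style" select="$styles/style[@id = current()/@style]"/>
    <div style="color: {$style/@color}">
      <h1><xsl:value-of select="@title"/></h1>
      <xsl:apply-templates/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Changing the look of the site then means editing styles.xml, not adding conditionals to the stylesheet.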

I can see how conditionals can get messy, but templates that match different things, or spaghetti choose-when'ing ... there's not much difference! You have to draw a line and say "in this situation I want to do a,b,c and in this other situation I want to do x,y,z" ... it's just better to have as few of those as possible, IMO. Quantifying and acting on presentational data alongside, but separate from, the content data, is one way to go about that.

Ednote: I'd love to see some more design patterns in XSLT.

4.

Design pattern in the XSLT and best practices in the XML.

Various.

Wrappers:

- If they exist in the XML, proceed as per DaveP below.

  Often one additional wrapper in the source XML makes all the difference in the world to the ease of processing via XSLT. To be able to sit (sorry, template match) on the wrapper, and play with the children (??) of that wrapper is a piece of cake compared to matching on one of many, and chasing along the axis to do something.

- If they do not exist, consider two transformations piped together, the first one inserting the wrappers.

If wrappers do not exist but IDs do, process with template match.

If IDs do not exist, consider generate-id() and then template match or keys.

If both wrappers and IDs exist, select the most appropriate access method.

If neither wrappers nor IDs exist...

In short, good XML markup uses both elements and attributes to create as much specificity in the nodes as possible. Good XSLT can transform XML to achieve greater degrees of specificity in the markup or can use XML marked up with an adequate degree of specificity to extract information.

It does seem almost counter-intuitive to be adding something in order to speed up the extraction. The metaphor of a catalyst might make it seem more intuitive. Don't know if the "add in order to subtract" principle could make sense of axis and node navigation....

Mike Kay adds:

The simplest rule is, wherever you've got a list, put a tag around the list as well as tags around the members of the list.

If you haven't done this, it's a good idea to put in a SAX filter to add the wrapper before the XSLT stylesheet sees the data. This will generally be much faster than doing it with XSLT.
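
To show why the wrapper helps, here is a small sketch assuming a source in which author elements are enclosed in an authors wrapper; the element names are invented for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match once on the wrapper: the list container is emitted exactly once -->
  <xsl:template match="authors">
    <ul>
      <xsl:apply-templates select="author"/>
    </ul>
  </xsl:template>

  <!-- Each member only has to worry about itself -->
  <xsl:template match="author">
    <li><xsl:value-of select="."/></li>
  </xsl:template>

</xsl:stylesheet>
```

Without the wrapper, the stylesheet would have to detect the first author in each run of siblings and walk the following-sibling axis to find where the run ends.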

5.

General and Special case stylesheets

Mike Kay


> It seems that there is no way to dynamically import or include another
> stylesheet, unless one writes the calling stylesheet at runtime.
	

Correct, xsl:include and xsl:import are compile-time facilities.

> More concretely, I am trying to create a sheet that uses a default
> stylesheet for a portion of an HTML document, but I want that
> portion to be able to be replaced by a user simply specifying
> an alternate stylesheet source in the source XML of the translation.

The desire for run-time xsl:import is voiced quite often, and for once it's nice to hear exactly why you want it, so that I can tell you why you don't!

Instead of A conditionally importing stylesheet B1, B2, or B3 each of which replaces part of A, the user should select stylesheet B1, B2, or B3, each of which imports the fixed stylesheet A and replaces or overrides parts of it. The special-case stylesheet should import the general-case stylesheet, not the other way around.
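
A sketch of one such special-case stylesheet, assuming the general stylesheet is stored as a.xsl and contains a rule for a sidebar element (both names are hypothetical):

```xml
<?xml version="1.0"?>
<!-- b1.xsl: the stylesheet the user selects; it imports the fixed stylesheet A -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="a.xsl"/>

  <!-- Overrides A's rule for sidebar; xsl:apply-imports still invokes
       the overridden rule so its output can be reused inside the new markup -->
  <xsl:template match="sidebar">
    <div class="custom-sidebar">
      <xsl:apply-imports/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Dropping the xsl:apply-imports call replaces A's behaviour for that portion outright instead of wrapping it.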

6.

XSL Design Patterns

Dimitre Novatchev

While collecting XSLT design patterns is a good idea, there are very nice repositories of such -- e.g. VBXML.Com's Snippetcentral.

However, the first that immediately come to mind are:

- The Kaysian method for set intersection, difference and symmetric difference.
- The Muenchian method for grouping.
- The Wendell Piez method of non-recursive looping.
- Oliver Becker's method of conditional selection of a value in a single expression.
- Multi-pass processing.
- Using customised data structures under a non-xsl namespace, embedded in the stylesheet.
- Iterative processing based on recursion.
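
Three of the idioms in this list can be sketched briefly in XSLT 1.0, as they are commonly described. The input (an items element holding item elements with category and price attributes) is assumed for the illustration only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Muenchian grouping: index the items by their grouping value -->
  <xsl:key name="by-category" match="item" use="@category"/>

  <xsl:template match="/items">
    <!-- Visit only the first item of each category, then fetch its group via the key -->
    <xsl:for-each select="item[generate-id() =
                               generate-id(key('by-category', @category)[1])]">
      <xsl:value-of select="@category"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(key('by-category', @category))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- Kaysian set operations on two node-sets -->
    <xsl:variable name="a" select="item[@category = 'fruit']"/>
    <xsl:variable name="b" select="item[@price &gt; 1]"/>
    <xsl:text>in both: </xsl:text>
    <xsl:value-of select="count($a[count(. | $b) = count($b)])"/>
    <xsl:text>&#10;only in a: </xsl:text>
    <xsl:value-of select="count($a[count(. | $b) != count($b)])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Becker's conditional value: substring() from position 1 keeps the string,
         from position Infinity (1 div 0) yields the empty string -->
    <xsl:variable name="none-costly" select="count($b) = 0"/>
    <xsl:value-of select="concat(substring('all cheap', 1 div $none-costly),
                                 substring('some costly', 1 div not($none-costly)))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```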

7.

XSLT standard library

Steve Ball

The XSLT Standard Library, xsltsl, provides the XSLT developer with a set of XSLT templates for commonly used functions. These are implemented purely in XSLT; that is, they do not use any extensions.

Goals are stated as:

- Provision of a high-quality library of XSLT templates, suitable for inclusion by vendors in XSLT processor software products.
- Demonstration of best practice in XSLT stylesheet development and documentation.
- Provision of examples of various techniques used to develop XSLT stylesheets.

Presently (Nov 2001) it contains modules for:

- String Processing
- Nodes
- Date/Time Processing
- URI (Uniform Resource Identifier) Processing

See Sourceforge for more detail.

8.

Library of stylesheets

Jean-Marc Vanel

I have a library of reusable and modular transforms. They are under the LGPL license. They are meant to be used either stand-alone, or imported, or in pipelines (e.g. in Cocoon). Some process or generate XML Schemas or XHTML; some are generic, e.g. replace all attributes by elements. Some others are design patterns. Some require Saxon, and some use the XSLT 2.0 draft. See the readme.

They are also available from CVS on sourceforge.net.

9.

Design Choices

Michael Kay


> My current code is oriented towards xhtml output.  But I want to 
> be able to write simple output drivers for other formats that 
> just take the content in the formatted-biblist variable (which is 
> just xhtml) and transform it (into, for example, TeX or WordML).

There's an interesting design choice that has to be made between using a single-stylesheet pipeline and a multi-stylesheet pipeline. Generally, the more complex things get, the more benefits there are in using a multi-stylesheet pipeline, where the transformation is broken up into a sequence of small transformations each done by its own stylesheet. The problem with this approach is that you need to write code in another language to control the pipeline flow. There are pipeline languages that do this (XPipe, Orbeon), or you can do it in Java, but it's another bit of technology to add to the mix.

Once you've got a pipeline of more than one stylesheet, and a decent control engine for coordinating them, you tend to find that the individual stylesheets become smaller and smaller, and more and more reusable. Until you get to that point, you tend to construct a single-stylesheet pipeline in which variables are used to hold the results of successive transformation phases: and the stylesheet gradually gets more and more complex.

I don't know what the right answer is for you - I'm just trying to clarify your options.

10.

Multiple media output

Michael Kay


The question and Michael Kay's answer for this item are identical to those under item 9, Design Choices, above.

11.

Decorator Pattern

Wendell Piez

I'm not sure whether this could rightly be said to map to the "decorator" pattern (maybe so), but it seems to me you may be looking for the idiom:

<xsl:template match="something">
   <xsl:call-template name="some-wrapper"/>
</xsl:template>

<xsl:template name="some-wrapper">
   <xsl:param name="contents">
     <xsl:apply-templates/>
   </xsl:param>
   <wrapper>
     <xsl:copy-of select="$contents"/>
   </wrapper>
</xsl:template>

This also allows you to do things like

<xsl:template match="something-else">
   <xsl:call-template name="some-wrapper">
     <xsl:with-param name="contents">
       <xsl:call-template name="some-other-wrapper"/>
     </xsl:with-param>
   </xsl:call-template>
</xsl:template>

or even

<xsl:template match="another-something-else">
   <xsl:call-template name="some-wrapper">
     <xsl:with-param name="contents">
       <xsl:text>Wrap me!</xsl:text>
     </xsl:with-param>
   </xsl:call-template>
   <xsl:apply-templates/>
</xsl:template>

That is, the "wrapper" (or decorator) template can wrap anything you pass it.

A "thick" solution to a nasty problem (map a set of attributes to a set of wrappers) using this idiom (although by matching templates, not calling them by name) appears at

But I should also caution you that it isn't very common to have to do this; lighter-weight solutions are usually possible. Also, as Jay says, the requirements this addresses are often better answered by pushing the problem into the next layer up, for example into a CSS stylesheet to be applied to HTML output.

12.

Pipelines and multi-pass

Michael Kay




> But what do you mean with the second pass. Do I have to invoke the 
> transformer with another xsl or is it possible to invoke this second 
> transformation within the original xsl file?

It's surprising how rarely this technique is taught and discussed. Splitting complex transformations up into a pipeline of simple transformations is something that ought to be a standard design pattern used by every XSLT developer.

There are two ways of doing it: multiple stylesheets, and multiple phases within a single stylesheet. I use both approaches, often within the same pipeline.

Multiple stylesheets can be linked into a pipeline in a number of ways:

* with your own custom Java code, e.g. using the JAXP interfaces

* with a pipeline processor such as Orbeon

* from a shell script

* Saxon has a custom extension, saxon:next-in-chain, that allows one stylesheet to direct its output to be processed by another stylesheet

Within a single stylesheet, a pipeline is expressed as a series of variables:

<xsl:variable name="phase-1-output">
  <xsl:apply-templates select="/" mode="phase-1"/>
</xsl:variable>

<xsl:variable name="phase-2-output">
  <xsl:apply-templates select="$phase-1-output" mode="phase-2"/>
</xsl:variable>

<!-- ... and so on for each further phase, up to phase-8-output ... -->

<xsl:template match="/">
  <xsl:copy-of select="$phase-8-output"/>
</xsl:template>

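To do this in XSLT 1.0, you need the xx:node-set() extension, where xx stands for the namespace prefix of whichever node-set() implementation your processor provides (EXSLT's exsl:node-set(), for example). It converts the result tree fragment held in each variable into a node-set that the next phase can select from.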
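
A two-phase sketch for XSLT 1.0, assuming a processor that supports the EXSLT common module. The two phases themselves (renaming b to strong, then dropping empty strong elements) are invented only to give the pipeline something to do.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl">

  <!-- Phase 1: identity copy, except that b becomes strong -->
  <xsl:template match="@* | node()" mode="phase-1">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="phase-1"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="b" mode="phase-1">
    <strong><xsl:apply-templates select="@* | node()" mode="phase-1"/></strong>
  </xsl:template>

  <!-- Phase 2: identity copy, except that empty strong elements are dropped -->
  <xsl:template match="@* | node()" mode="phase-2">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="phase-2"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="strong[not(node())]" mode="phase-2"/>

  <!-- The pipeline: each variable holds the output of one phase -->
  <xsl:variable name="phase-1-output">
    <xsl:apply-templates select="/" mode="phase-1"/>
  </xsl:variable>

  <xsl:variable name="phase-2-output">
    <!-- exsl:node-set() turns the result tree fragment into a node-set -->
    <xsl:apply-templates select="exsl:node-set($phase-1-output)" mode="phase-2"/>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:copy-of select="$phase-2-output"/>
  </xsl:template>

</xsl:stylesheet>
```

Giving every phase its own mode keeps the rule sets from interfering with one another, which is what makes it safe to hold them all in one stylesheet.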

Using multiple stylesheets gives you greater modularity and reusability, but is a bit more complex to deploy. Importantly, it also allows you to incorporate steps into the pipeline [I've never been sure what a step in a pipeline should be called!] that are implemented using technologies other than XSLT - for example, STX, XQuery, Java SAX filters, Perl scripts.

13.

Generating something like, but not quite, XML

Michael Kay



> I just realized we can't even do that as InDesign uses mandatory 
> nested tags to define paragraph styles, like <this is a tag<this is a 
> nested tag>>, so perhaps I will have to look at this all another way??

This is the general problem of generating output in a format that has a passing resemblance to XML but is not actually XML. There are a surprising number of such formats still in use (some of them, of course, are valid SGML). There are a number of choices available:

(a) define an XML representation of the required output and generate that using XSLT. Then write a converter in some other language to convert this XML to the target form.

(b) write a stylesheet that outputs text (xsl:output method="text")

(c) write a stylesheet that outputs a mixture of XML and text (xsl:output method="xml" with disable-output-escaping). Messy but sometimes pragmatic.

(d) implement a custom serialization method. If you have some Java skills, this isn't as daunting as it may sound; in Saxon, for example, you can do it by subclassing the serializer that comes with the product.

The main thing is to try and keep the peculiarities contained to as small a part of your code as you can. I'd suggest going for (a) if you can.
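
Option (b) might look something like the sketch below. The source vocabulary (para elements with style and charstyle attributes) and the pstyle/cstyle tag names are placeholders modelled on the nested-bracket shape quoted in the question; they are not taken from any real InDesign tag specification.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Text output: nothing is escaped, so "<" and ">" are written literally -->
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Each para becomes one line of the form <pstyle:NAME<cstyle:NAME>>content -->
  <xsl:template match="para">
    <xsl:text>&lt;pstyle:</xsl:text>
    <xsl:value-of select="@style"/>
    <xsl:text>&lt;cstyle:</xsl:text>
    <xsl:value-of select="@charstyle"/>
    <xsl:text>&gt;&gt;</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

All knowledge of the odd bracket syntax sits in this one template, in keeping with the advice to contain the peculiarities in as small a part of the code as possible.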