xslt output problem

Output

1. xsl output problems
2. XML to ASCII Conversion
3. No closing tag on html output method
4. RTFs and Node sets
5. How to create positional ASCII text layout
6. percent sign in output attributes
7. Closure - transformation and output separation
8. Dynamic output method?

1.

xsl output problems

Michael Kay

Can I parameterise encoding in xsl:output?

> The encoding attribute of the <xsl:output> element is set as a string.
> Mike - is the fact that SAXON doesn't report an error a bug?

If the encoding isn't supported, Saxon (as permitted by the spec) reverts to UTF-8 and puts out a warning message that it is doing so. Unfortunately the encoding as written to the XML declaration or the HTML META element is the encoding that was requested, not the one that was actually used.

Which may cause browser confusion!
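
As a minimal sketch of the point in the quoted question (the encoding name here is only an example, not taken from the original thread): in XSLT 1.0 the attributes of xsl:output are not attribute value templates, so the encoding has to be written as a literal string in the stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <!-- Literal value only: encoding="{$enc}" is not expanded in XSLT 1.0 -->
  <xsl:output method="xml" encoding="ISO-8859-1"/>
  <!-- Copy the whole input so the effect of the encoding is easy to inspect -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

If the processor falls back to UTF-8 as described above, it is worth checking that the declared encoding and the bytes actually written agree before serving the result.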

2.

XML to ASCII Conversion

Chuck White

Use the xsl:output element at the top level of your stylesheet, and make sure you are using the latest version of XT:

<?xml version='1.0'?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <!-- serialize the result as plain text, with no markup -->
  <xsl:output method="text"/>
  <xsl:template match='/'>
    <xsl:apply-templates select="yourTemplate"/>
  </xsl:template>
</xsl:stylesheet>

With fuller example from Mike Brown.

I want to translate the following...

<orderlist>
    <order ordernum="1">
        <customer>
            <firstname>John</firstname>
            <lastname>Doe</lastname>
            <phone>(510) 555-1212</phone>
        </customer>
    </order>
    <order ordernum="2">
        <customer>
            <firstname>Jane</firstname>
            <lastname>Smith</lastname>
            <phone>(916) 555-1212</phone>
        </customer>
    </order>
</orderlist>

into a tab-delimited ascii file for import into QuickBooks like this...

firstname &#x9;lastname&#x9; phone
John  &#x9;       Doe   &#x9;      (510) 555-1212
Jane  &#x9;       Smith &#x9;      (916) 555-1212

Here you go, with some comments to explain:

<?xml version="1.0"?>
<xsl:stylesheet 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"     
 version="1.0">
 <!-- if using an implementation of the current XSLT WD -->
 <xsl:output method="text"/>
 <!-- execute this template if 
       the element name is 'orderlist' -->
 <xsl:template match="orderlist">
  <!-- emit column headers with tabs and a newline -->
  <xsl:text>firstname&#x9;lastname&#x9;phone&#xA;</xsl:text>
  <!-- process elements named 
     'customer' that are children of
       elements named 'order' 
       that are children of the current
       node -->
  <xsl:for-each select="order/customer">
   <!-- select as a node-set the elements named 
        'firstname'
        that are children of the current node, and emit the
        concatenation of text nodes contained within the
        first node in that node-set -->
   <xsl:value-of select="firstname"/>
   <xsl:text>&#x9;</xsl:text>
   <xsl:value-of select="lastname"/>
   <xsl:text>&#x9;</xsl:text>
   <xsl:value-of select="phone"/>
   <xsl:text>&#xA;</xsl:text>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

3.

No closing tag on html output method

Phil Lanch

When you use HTML output method, closing tags aren't generated for HTML tags that are 'empty by definition' (including <input>).

When the top-level output tag is <html>, you get HTML output method by default.

You can override the default by saying:

<xsl:output method="xml"/>

In other words: it's a feature, not a bug.
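
A small sketch of the difference, using a made-up form (the element and attribute values are illustrative only). With the html method shown here, empty-by-definition elements are written without end tags; switching the method to xml writes them as empty-element tags instead.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <!-- html method: input and br are serialized as <input ...> and <br> -->
  <!-- change to method="xml" to get <input .../> and <br/> instead -->
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <form action="search">
          <input type="text" name="q"/>
          <br/>
        </form>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```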

Mike Kay adds:

If your complaint (see later messages) is that with XML output it's generating <input/> rather than <input></input>, then the answer is that you can't influence this; you shouldn't need to, because they are 100% equivalent.

Mark Hayes adds: If you REALLY need to force an </input> tag to appear, add a preserve-space element at the top level for input, and some blank text inside the <input> output element:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
    <xsl:output method="xml"/>
    <xsl:preserve-space elements="input"/>  
                  <!-- *** here *** -->
    <xsl:template match="/">
        <html>
            <body>
                <xsl:element name="input">
                  <xsl:attribute 
                   name="type">text</xsl:attribute>
                  <xsl:attribute name="name">
                    <xsl:value-of select="@name"/>
                  </xsl:attribute>
                <xsl:text> </xsl:text>   
                        <!-- *** here *** -->
                </xsl:element>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

4.

RTFs and Node sets

Mike Kay

Result Tree Fragment. Not a pretty name, and the abbreviation RTF is unfortunate, but we have to live with it.

When the body of an <xsl:variable> element is evaluated (or "instantiated" to use the correct jargon), the result is written to an RTF. There are only three things you can do with an RTF: you can use xsl:copy-of to copy it to the result tree (or to another RTF), you can convert it implicitly or explicitly to a string, and you can pass it to a function. There aren't any standard functions that process RTFs, so in practice this means an extension function.

SAXON and xt both provide extension functions to convert an RTF to a node-set. This conversion can't be done implicitly. The reason your xsl:for-each fails is that the expression in the select attribute must yield a node-set. Nothing else will do; in particular, it cannot be an RTF.

David Carlisle adds:

A node set is what you get back from a select expression so select="aaa[@xxx]|aaa[bbb]"

gives you the set of all elements with name aaa and either a xxx attribute or a bbb child. Note this is a set not a list. If some aaa element has both xxx attribute and bbb child, you only get it once. The set is however ordered (in document order, normally)

A node set is what you can apply templates to

<xsl:apply-templates select="aaa[@xxx]|aaa[bbb]"/>

i.e. it's the relevant part of the input document (or some secondary input document via the document() function).

A result tree fragment is what you produce in a template. You can save it in a variable, and while it has a similar structure to a node set (it's a bunch of XML nodes), it is essentially opaque to XSL. You cannot apply templates to it or interrogate its structure. The only thing you can do is use xsl:copy-of to put the value of the variable holding the result tree fragment into the result tree at some point.

xt and saxon (at least) have an extension function that converts result tree fragments to node sets.

>     <xsl:for-each select="$members">

members holds the result tree fragment, so you can't select into it.

You could use

    <xsl:for-each select="xt:node-set($members)">
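
A self-contained sketch of the same idea follows. It swaps in the EXSLT function exsl:node-set (namespace http://exslt.org/common), which is an assumption on my part rather than what the thread used; the xt and SAXON functions play the same role under their own namespaces. The member elements are made up for illustration. Note that the converted value is the root of the fragment, so the path steps down to the member children.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  exclude-result-prefixes="exsl"
  version="1.0">
  <xsl:output method="text"/>

  <!-- The variable body is instantiated into a result tree fragment -->
  <xsl:variable name="members">
    <member>Ann</member>
    <member>Bob</member>
  </xsl:variable>

  <xsl:template match="/">
    <!-- select="$members/member" would be an error in XSLT 1.0 -->
    <xsl:for-each select="exsl:node-set($members)/member">
      <xsl:value-of select="."/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This only runs on processors that implement the EXSLT common module; on others, the processor's own node-set extension would have to be substituted.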

Mike Brown adds:

You can identify *any combination* of unique nodes from different places in the source tree, using an XPath expression that selects the ones you want. Those nodes are a "node set". They don't have to form a hierarchy or anything.

You can create a new hierarchy of nodes (or multiple hierarchies that are siblings of each other), using various XSLT instructions and/or literal result elements. Those nodes are a "result tree fragment". They're branches of a tree.

So a result tree fragment *is* a set of nodes. It's just not a "node set".

5.

How to create positional ASCII text layout

Steve Tinney

It is perfectly feasible to create XSL outputs which not only do not add a single additional space, but which also strip various kinds of space from the output.

The key techniques are:

1) use xsl:strip-space to kill whitespace on input
2) use normalize-space() to kill leading and trailing whitespace from text elements, as well as reduce interior multiple spaces to a single space
3) use xsl:text to do output rather than putting the bare literals in the XSL stylesheet
4) (probably unnecessary if you do all of the above fastidiously) format the XSL stylesheet to put code-prettying whitespace inside the tags, e.g., <xsl:apply-templates select="something/long" />
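
The techniques above can be sketched roughly as follows, assuming input shaped like the orderlist document from the earlier entry. The fixed column width of 10 and the padding trick with substring() and concat() are my own additions for illustration, not part of the original answer; names longer than the column width are truncated.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:output method="text"/>

  <!-- 1) drop whitespace-only text nodes from the input -->
  <xsl:strip-space elements="*"/>

  <!-- ten spaces, used to pad each field to a fixed width (assumed width) -->
  <xsl:variable name="pad" select="'          '"/>

  <xsl:template match="customer">
    <!-- 2) normalize-space() trims and collapses whitespace in the value -->
    <xsl:value-of
      select="substring(concat(normalize-space(firstname), $pad), 1, 10)"/>
    <xsl:value-of
      select="substring(concat(normalize-space(lastname), $pad), 1, 10)"/>
    <!-- 3) literal output goes through xsl:text, so nothing extra leaks in -->
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the customer template does not apply templates to its children, the phone element is simply not output; the built-in templates take care of walking down from the root to each customer.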

6.

percent sign in output attributes

David Carlisle

Q expansion: when I put this in XSL:

<a href="foo.cgi?formula=xml%2Bxsl&amp;result=html">...</a>

it turns into this in HTML (Saxon 5.1, IBM's XML parser):

<a href="foo.cgi?formula=xml%252Bxsl&amp;result=html">...</a>

Notice that the intended "%2B" got escaped to "%252B".

Well, the spec says:

The html output method should escape non-ASCII characters in URI attribute values using the method recommended in Section B.2.1 of the HTML 4.0 Recommendation.

So the point is that _you_ shouldn't be %-escaping anything: just put the character directly in the URL, and XSL will do the escaping for you. As it stands, it is escaping your %.

7.

Closure - transformation and output separation

Michael Kay



> That leaves me with the question why it is not encouragable to couple 
> transformers and serializers... may we assume that serializers are 
> kept out of the spec's domain is because serializers are too system 
> specific?

It comes down to pipelining, or closure.

A property of XSLT is that the input data model is the same as the output data model. The operations in XSLT take trees as input and produce trees as output. The language is "closed" over the data model. The benefit of this is composability: any two transformations can be combined to produce a larger transformation. Hence pipelines.

Serialization should be separate because it breaks away from the data model and produces something different: its output is a different kind of thing from its input. Only by keeping serialization separate from transformation do you preserve the closure property of the transformation language, and hence its composability.
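
To make the composability point concrete, here is a rough sketch of two transformations chained inside one stylesheet, each taking a tree and producing a tree, with serialization happening only once at the end. It assumes the EXSLT exsl:node-set extension (XSLT 1.0 needs one to feed a variable's tree into a second pass), and the two stages, the mode names, and the client element are invented for illustration against the orderlist input from the earlier entry.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  exclude-result-prefixes="exsl"
  version="1.0">
  <xsl:output method="xml"/>

  <xsl:template match="/">
    <!-- stage 1: tree in, tree out (held in a variable) -->
    <xsl:variable name="stage1">
      <xsl:apply-templates select="*" mode="stage1"/>
    </xsl:variable>
    <!-- stage 2 consumes the tree built by stage 1 -->
    <xsl:apply-templates select="exsl:node-set($stage1)/*" mode="stage2"/>
  </xsl:template>

  <!-- stage 1: copy everything except phone elements -->
  <xsl:template match="@*|node()" mode="stage1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="stage1"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="phone" mode="stage1"/>

  <!-- stage 2: copy everything, renaming customer to client -->
  <xsl:template match="@*|node()" mode="stage2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="stage2"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="customer" mode="stage2">
    <client>
      <xsl:apply-templates select="@*|node()" mode="stage2"/>
    </client>
  </xsl:template>
</xsl:stylesheet>
```

Neither stage knows or cares how the final tree will be written out, which is what lets them be combined in either order or extended with further stages.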

8.

Dynamic output method?

Dimitre Novatchev


 > Is there a way to dynamically set the xsl:output method?  I'd like 
 > to programmatically set the output for HTML and/or XML.
 >

Even in XSLT 1.0 there's a way to get either

   method="xml" 
or
   method="html"

If no value is explicitly specified for the "method" attribute and the top-level element generated by the transformation has the local name "html" (in no namespace), then the method used for serialisation will be "html"; otherwise it will be "xml".

From the XSLT 1.0 specification (http://w3.org/TR/xslt#output):

"The default for the method attribute is chosen as follows. If

the root node of the result tree has an element child,

the expanded-name of the first element child of the root node (i.e. the document element) of the result tree has local part html (in any combination of upper and lower case) and a null namespace URI, and

any text nodes preceding the first element child of the root node of the result tree contain only whitespace characters,

then the default output method is html; otherwise, the default output method is xml. The default output method should be used if there are no xsl:output elements or if none of the xsl:output elements specifies a value for the method attribute"
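
A minimal sketch of how this default can be exploited: leave the method unspecified and let a stylesheet parameter decide which top-level element is generated. The parameter name format and the report element are hypothetical, chosen only for this example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <!-- No xsl:output method is given, so the default rule applies -->

  <!-- hypothetical parameter, set by the caller to 'html' or anything else -->
  <xsl:param name="format" select="'html'"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- document element named html, no namespace: html method is used -->
      <xsl:when test="$format = 'html'">
        <html>
          <body>
            <xsl:apply-templates/>
          </body>
        </html>
      </xsl:when>
      <!-- any other document element: xml method is used -->
      <xsl:otherwise>
        <report>
          <xsl:apply-templates/>
        </report>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

How the parameter is passed in depends on the processor or API being used, so that part is left out here.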