XSLT XML declaration

Declarations

1. How to include the xml and doctype declaration in output.
2. xml declaration vs text declaration

1.

How to include the xml and doctype declaration in output.

Mike Brown

An XSL stylesheet does not transform one XML document into another. It instructs an XSL processor how to create a result tree (of nodes) from a source tree that has been derived from an XML document. It also tells the processor you would prefer that it output that result tree somehow, perhaps as an XML document.

> <xsl:template match="/">
> <?xml blah blah blah
> </xsl:template>
            

No. A stylesheet does not construct the output; it constructs the result tree of nodes from which output may be derived. The XML declaration and the DTD are not currently part of the source tree and they also cannot be inserted (as such) into the result tree.

However, you can tell an XSLT 1.0-compliant processor that, when emitting the result tree as XML, you want a particular XML version and DOCTYPE. This is explained fully in section 16 of the XSLT 1.0 Recommendation: http://www.w3.org/TR/xslt#output. It involves putting an <xsl:output/> element at the top level of your stylesheet, with attributes such as version, encoding, doctype-public and doctype-system.
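As a rough sketch of what that looks like, the stylesheet below asks for XML output with a version, an encoding and a DOCTYPE. The XHTML public and system identifiers and the literal result elements are only illustrative choices, not something from the original question.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialization hints for the result tree. The DOCTYPE identifiers
       below are an assumed example; substitute your own. -->
  <xsl:output method="xml"
              version="1.0"
              encoding="UTF-8"
              doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
              doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
              indent="yes"/>

  <!-- The template builds result tree nodes only; it contains no
       declaration and no DOCTYPE. -->
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>Example</title></head>
      <body><p>Hello</p></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The processor, not the template, is then responsible for writing the XML declaration and the DOCTYPE when it serializes the result tree. A processor that does not serialize (for example, one handing the tree to another application) may ignore xsl:output altogether.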

David Carlisle adds:

You have stated here exactly the cause of your problem:

> Since my transformation stylesheet is an XML document,

An XSL stylesheet is an XML document, thus this:

>  <xsl:template match="/">
> <?xml blah blah blah
            

is an XML error and will generate a parse error from the XML parser before the XSL system even gets started. <? is the XML syntax for a text declaration, an XML declaration or a processing instruction. A PI target is not allowed to be `xml', and declarations are not allowed inside XML elements, so the above is not a well-formed XML document.

The output of XSLT is the tree representing your document. If that tree is linearised into a file, an XML declaration will be added by the system if it is needed. The declaration is not part of the tree itself, and so does not need to be in a template (even if that were allowed by XML rules). The declaration just tells a parser how to convert the linearised form in the file back to the parse tree that corresponds to your output. In particular, if you specify an encoding other than UTF-8 or UTF-16 using xsl:output, then an XML declaration will be added by the system (if it supports the encoding you ask for).
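To illustrate the encoding case, here is a minimal sketch that requests ISO-8859-1. The greeting element and its content are made up for the example. A processor that supports that encoding and writes the result out as a file would be expected to begin the output with an XML declaration naming it, though the exact spelling and layout of the declaration can vary between processors.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Request a non-UTF encoding; the serializer then has to emit an
       XML declaration so a parser can read the file back correctly. -->
  <xsl:output method="xml" encoding="ISO-8859-1"/>

  <xsl:template match="/">
    <!-- &#233; is e-acute, which ISO-8859-1 can represent directly. -->
    <greeting>caf&#233;</greeting>
  </xsl:template>

</xsl:stylesheet>
```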

2.

xml declaration vs text declaration

Tony Graham

  > To clarify: the syntax for a "text declaration" 
  > is exactly  the same as
  > an XML declaration e.g.:
  > 
  > <?xml version="1.0" encoding="utf-8"?>
  > 
  > Is that correct?
 

It is not correct that a text declaration has exactly the same syntax as an XML declaration.

An XML declaration is used for the document entity, and it has three parts: version information, encoding declaration, and standalone document declaration. A fully-featured XML declaration looks like this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

The version information identifies the version of XML to which the document conforms. If there is an XML declaration at the start of the document, it has to have the version information.

The encoding declaration tells the encoding of the characters making up the current parsed entity (i.e. the current file in most cases). It is optional if the encoding is UTF-8 or UTF-16, since all XML processors must be able to handle both, and the requirement that parsed entities in UTF-16 begin with a byte order mark allows the XML processor to reliably distinguish between UTF-8 and UTF-16. It is required if the encoding is not UTF-8 or UTF-16.

The standalone document declaration tells whether anything in the external subset of the document's DTD affects what the XML processor passes to the application. Its value is either "yes" (the document does stand alone) or "no" (the external DTD subset changes things). It is optional, and if it is not present, "no" is assumed (unless there is no external DTD subset, in which case the standalone document declaration can be safely ignored). An example of when the document cannot be declared standalone="yes" (so the value has to be "no", whether stated or assumed) is when the external DTD subset declares a default value for an attribute and an element in the document omits that attribute. Without the information in the external subset, if an element in the document doesn't have that particular attribute, no information about the attribute will be passed to the application. With the information in the external subset, if an element does not have that attribute, the default value will be supplied by the XML processor as if the attribute with its default value had been present on that element.
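A small sketch of the defaulted-attribute case may help. The file name note.dtd, the note element and the priority attribute are all hypothetical, and the DTD contents are shown in a comment only so the example fits in one listing.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE note SYSTEM "note.dtd">
<!-- Assumed contents of the hypothetical external subset note.dtd:
       <!ELEMENT note (#PCDATA)>
       <!ATTLIST note priority CDATA "normal">
-->
<!-- The note element omits priority. A processor that reads note.dtd
     reports priority="normal"; one that skips it reports no such
     attribute. That difference is why "yes" would be wrong here. -->
<note>Remember the milk</note>
```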

The entire XML declaration can be omitted if the document entity is in either UTF-8 or UTF-16. If the document entity is not in one of those two encodings, the XML declaration has to be present because you have to identify the encoding.

A text declaration is used for external parsed entities, and it has two parts: version information and encoding declaration. It doesn't need a standalone document declaration because external parsed entities don't have separate DTDs. A fully-featured text declaration looks like this:

<?xml version="1.0" encoding="utf-8"?>

The version information has the same meaning as in the XML declaration, except this time the version information is optional.

The encoding declaration has the same meaning as in the XML declaration, except this time the encoding declaration is required.

The text declaration can be omitted if the external parsed entity is in either UTF-8 or UTF-16. If the external parsed entity is not in one of those two encodings, the text declaration has to be present because you have to identify the encoding.
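For comparison, here is a sketch of an external parsed entity that starts with a text declaration. The file name chapter1.xml and its content are hypothetical; a document would pull it in with an entity declaration along the lines of <!ENTITY chapter1 SYSTEM "chapter1.xml"> and a reference &chapter1; in its content.

```xml
<?xml encoding="ISO-8859-1"?>
<!-- Hypothetical file chapter1.xml, an external parsed entity.
     The first line is a legal text declaration: encoding is present
     and version is omitted. The same line would not be a legal XML
     declaration, because an XML declaration must carry version. -->
<chapter>Text of the first chapter</chapter>
```

Going the other way, a declaration with a standalone part, such as the fully-featured XML declaration shown earlier, is not allowed at the start of an external parsed entity.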