Printing

Printing XML using XSL

This section introduces simple printing for XML using the available XSL implementations. There are links below to a document type definition, an XML file and an XSLT stylesheet. Used in combination with an implementation of XSL, these will produce a Portable Document Format (PDF) file, which can be viewed and printed with Adobe's Acrobat product. There are links to XSL implementations in the links section below.

I have tried to make it complete, including getting a Java runtime set up. If it doesn't work for you, please let me know and I will update it. I'm using Win32, so that is what I will provide. I doubt if there will be any problems moving any of it to a Unix basis.

This is a work in progress. I intend to add to it to provide examples covering a useful range of features for printing. I doubt if it will ever cover the full range of capabilities of XSL, but I hope there will be enough to permit good quality print output.

All these examples use the syntax of the W3C Candidate Recommendation of 21 November 2000. I've tested them with the Renderx product and the Antenna House product. Both give similar results.

Setting up

If you are OK running Java stuff, you can skip this whole section.

The list of software needed quite possibly matches most people's needs when dealing with XML. The list is as follows.

A java runtime
A command line XSLT implementation
An XSL implementation
A couple of command line scripts.

The software I have for the above consists of the Sun JRE, Saxon or XT for the XSLT, and xep or Antenna House for the XSL implementation.

Obtaining the software.

Java runtime. Obtain the latest version from Sun. (Rather large, at around 5 MB.)
Saxon. Obtain the latest version from Mike Kay's site.
An XSL-FO processor. I use Renderx; obtain it from the Renderx site. The Antenna House product can be obtained from their English Web site.
The command line scripts are available here.

Installation.

The Java runtime from Sun comes as a normal installation. Follow the instructions.

Installation location.

If you have never done this before, you may like to follow what I have done. Otherwise, modify the scripts to suit your own installation. My Java runtime environment is located with other applications in a root directory named apps. All other Java stuff (mainly jar files) is in another root directory named myjava. All my XML processing stuff is in a root directory called sgml. Within /sgml I have a directory called bin to hold executables. I'll leave it to you to work out any changes you may need for your own installation.

I put the java runtime into /apps/jre.

I put saxon into /sgml/bin

The Antenna House product is Win32 only. It consists of a standalone executable with a GUI and an ActiveX control, which is the XSL formatting/rendering engine. XEP2 comes with its own setup process.

In the Antenna House formatter, if the unsupported elements bother you, try "silent mode" (error logging): open the [Formatter Options] dialog box, select the [File Output, etc.] tab, check the [Error Logging] check box, and press [OK].

I installed XEP2 to /myjava/xep2

Path changes

I needed to make sure the following locations were included in my path variable.

/sgml/bin to access the XML related binaries.

/apps/jre to access the java runtime environment.

Using Java necessitates the use of jar files, which are essentially zip archives of compiled classes. Much as binaries find other stuff via the path variable, Java picks up its bits and pieces via a classpath environment variable or via a command line parameter to the java executable. I use the latter method, since the classpath needs to change quite frequently. Hence no changes are needed to any environment variables. See the command line scripts below, where I use java -cp.

Command Line Utilities.

These are the batch files I use. Their function is noted alongside each one.

Transform from XML to fo, using Saxon. The command line is \sgml\bin\saxon.exe -o %3 %1 %2. The first parameter (%1) is the XML file, the second (%2) is the XSLT stylesheet and the third (%3) is the required output file. For example:

\sgml\bin\saxon -o tmp.fo foxml.xml foxsl.xsl

Render the fo file into pdf using xep2.

java -cp f:/myjava/xep2/lib/xep.jar;\
f:/myjava/xep2/lib/xerces-1.0.0.jar;\
f:/myjava/xep2/lib/JimiProClasses.zip\
 -Dcom.renderx.FO2PDF.ROOT=f:/myjava/xep2\
  com.renderx.FO2PDF.Driver %1

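A little explanation. Note that the backslash at the end of each line indicates it's all one line! Remove the backslashes and join the lines when you type the command or put it in a batch file.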

And that's it. If the provided files don't render then carefully go back over the installation for each product and check that you did as asked.

How the example files are presented.

If you look at the example XML file, which uses a very simple DTD, you will see each para has a style attribute. The value of the style attribute is the name of the style I'm trying to present in that particular paragraph. So, for example, the style="normal" paragraph presents as a simple block element, fo:block, with no further attributes. Similarly, the in-line elements are marked out within a pair of style tags, again with a style attribute (sorry if that sounds confusing, you'll soon get the idea). Hence if you want an indented paragraph with some italic content, combine a block element carrying the same attributes as the indented paragraph with an in-line element carrying the same attributes as the italic style in-line element, and you will get what you want.
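To make that description concrete, here is a rough sketch of what such input might look like. It is not the actual example file: the root element name doc and the style value indent are assumptions made for illustration, and only para, the in-line style element and the style attribute come from the description above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical root element; the real file follows its own DTD -->
<doc>
  <!-- Presented as a plain fo:block with no further attributes -->
  <para style="normal">A plain paragraph.</para>

  <!-- Block-level style on the para, in-line style on the nested element.
       The style value "indent" is an assumed name. -->
  <para style="indent">An indented paragraph with some
    <style style="italic">italic</style> content.</para>
</doc>
```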

The XSLT stylesheet is applied to the XML file to produce another XML file which is suitable for input to an XSL processor. To keep it simple, I have included lots of very simple match templates, one per paragraph and style combination. From that you should be able to see just how each of the styles is created.
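A minimal sketch of the one-template-per-style idea follows. It assumes the paragraph element is called para and the in-line element is called style, as described above; the style value indent and the indent distance are made up for illustration. The page layout (fo:root and friends) is deliberately left out, so on its own this produces only the blocks, not a complete fo file.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- style="normal": a simple block with no further attributes -->
  <xsl:template match="para[@style='normal']">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- Assumed style name "indent": the same block, pushed in from the left -->
  <xsl:template match="para[@style='indent']">
    <fo:block start-indent="2em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- In-line italic content inside any paragraph -->
  <xsl:template match="style[@style='italic']">
    <fo:inline font-style="italic">
      <xsl:apply-templates/>
    </fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

Each template does one job, so an indented paragraph containing italic text is handled by the second and third templates working together.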

The only really messy bit is the page layout stuff! I have isolated it in a separate file which the stylesheet includes. This keeps it out of the way until you have the confidence to go and play with it yourself. If you are just trying to see how to get some print out onto paper, then I suggest you leave it well alone initially, and include it with your stylesheets, as I have done. The technique is to use the xsl:include directive to the XSLT processor, which effectively adds the external file into your stylesheet at the point of inclusion. Note that this needs to be a top level element, so it's no good putting it inside a template.
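As a sketch of where xsl:include goes, consider the stylesheet below. The file name pagelayout.xsl is hypothetical, and it is assumed here to hold the template matching the document root, which builds fo:root, the page masters and the page sequence before applying the other templates.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Must be a direct child of xsl:stylesheet, never inside a template.
       pagelayout.xsl is an assumed name for the page layout file. -->
  <xsl:include href="pagelayout.xsl"/>

  <!-- Your own content templates follow as usual -->
  <xsl:template match="para[@style='normal']">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The href is resolved relative to the including stylesheet, so keeping both files in the same directory is the simplest arrangement.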

The only other point worthy of mention in the stylesheet is the use of named attribute sets, as per the XSLT rec, para 7.1.4. Apparently these are not commonly used, but I found them very handy because of the lengthy nature of the attributes needed to fully specify something. Rather than keep on typing out, for instance, a long list of attributes to have my headings set up in a particular font, with spacing before and after, trying to keep the heading with the following text and so on, I simply typed it all out once in a named attribute set, and called it up as and when required. All the named attribute sets are at the top of the stylesheet. I could have used another include file, but in case they are wanted at the same time as the other styles, I have left them in.
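By way of illustration, a named attribute set for headings might look something like this. The set name heading, the style value heading and all the property values are assumptions chosen for the sketch, not taken from the actual stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Typed out once, at the top level of the stylesheet -->
  <xsl:attribute-set name="heading">
    <xsl:attribute name="font-family">sans-serif</xsl:attribute>
    <xsl:attribute name="font-size">14pt</xsl:attribute>
    <xsl:attribute name="font-weight">bold</xsl:attribute>
    <xsl:attribute name="space-before">12pt</xsl:attribute>
    <xsl:attribute name="space-after">6pt</xsl:attribute>
    <!-- Try to keep the heading with the text that follows it -->
    <xsl:attribute name="keep-with-next">always</xsl:attribute>
  </xsl:attribute-set>

  <!-- Called up wherever a heading is wanted -->
  <xsl:template match="para[@style='heading']">
    <fo:block xsl:use-attribute-sets="heading">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

On a literal result element such as fo:block the attribute is written xsl:use-attribute-sets, with the prefix. On xsl:element or xsl:copy it is written use-attribute-sets, without the prefix.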

One more point. For most uses, I refer to a stylesheet-defined font-size, and then re-use that elsewhere. My reasoning for doing this is that if I want to increase the font size of the whole document, rather than change it in 101 different places, I can change it in one, and all other font sizes will then automatically increase proportionately.
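One way to get this effect, sketched here under the assumption that the size is held in a top level variable (the name base-size and the style value heading are made up), is to derive every other size from that single value.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- The one place to change: the base font size, in points -->
  <xsl:variable name="base-size" select="10"/>

  <!-- Body text uses the base size directly -->
  <xsl:template match="para[@style='normal']">
    <fo:block font-size="{$base-size}pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- Headings are one and a half times the base size -->
  <xsl:template match="para[@style='heading']">
    <fo:block font-size="{$base-size * 1.5}pt" font-weight="bold">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The curly braces are attribute value templates: the XSLT processor evaluates the expression inside them and writes the result into the attribute, so changing base-size rescales both templates at once.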

The local files.

Linked from this page are the XML, XSL and DTD files.

The XML file is used to hold some data for presentation. Note that this is not meant to be loaded into your browser, even if it is IE5+! Save the files and run an XSLT and XSL-FO processor on them.

The Document type definition is here. Alter the XML file if you don't put it in the same place! You may note that I've added a couple of elements and lots of attributes to make it easy to spot the styling.

The XSLT file is the main stylesheet, which calls up an included file. The example also uses the first image and the second image.

And that's it. I really ought to thank Nikolai at Renderx for the test files that they have published. Most of what I know comes from studying them.

Enjoy. Comments or additions gratefully received. I'll add more to it over time. Hope this is enough to get you started.