page numbering in xsl, xsl-fo

Page Numbering

1. How do I start the page numbers from 24
2. How to get page 3 of 67
3. Page numbering in Roman Numerals
4. page-numbering in xsl-fo
5. Page numbering
6. How to use page-number in xsl:if
7. Cross references to pages

1.

How do I start the page numbers from 24

Nikolai Grigoriev

<fo:page-sequence initial-page-number="24">

2.

How to get page 3 of 67

Tokushige Kobayashi



 >Suppose we wish to write a report this way "Page 3 of 67".

<fo:page-number-citation ref-id="endofdoc"/> 

will produce the total (the "67"; use <fo:page-number/> for the "3"), if the last thing in your document is something with 'id="endofdoc"', e.g.

	  <fo:block id="endofdoc"></fo:block>

It will fail if:

* The block with the id doesn't end up on the last page. It's hard to ensure that this doesn't happen if you have floats that may float to the end of the document thus generating pages after your "last" block.

* The printed page number on the "last" page doesn't reflect the number of pages in the document. This will happen, for example, if your front matter is numbered i, ii, ... and then you restart the numbering at 1 at the beginning of the main matter.
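
Putting the pieces together, a minimal sketch of a footer that prints "Page 3 of 67" might look like this. Only the id "endofdoc" comes from the answer above; the master name "page" and the page dimensions are assumptions for illustration, and "xsl-region-after" / "xsl-region-body" are the default region names.

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- "page" and all dimensions are placeholder values -->
    <fo:simple-page-master master-name="page"
                           page-height="297mm" page-width="210mm"
                           margin-top="20mm" margin-bottom="20mm"
                           margin-left="20mm" margin-right="20mm">
      <fo:region-body margin-bottom="15mm"/>
      <fo:region-after extent="10mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">
        Page <fo:page-number/> of <fo:page-number-citation ref-id="endofdoc"/>
      </fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Document content goes here.</fo:block>
      <!-- empty block that has to land on the last page -->
      <fo:block id="endofdoc"/>
    </fo:flow>
  </fo:page-sequence>
</fo:root>

The two failure cases listed above still apply to this sketch.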

3.

Page numbering in Roman Numerals

Nikolai Grigoriev


> 1. Headers. Ideally, I need:
>   a.  roman numeral page numbers in the frontmatter and 
> standard numbering in the chapters.

Use a separate page sequence for frontmatter, and set number format for it to be Roman numbers:

<fo:page-sequence master-reference="frontmatter" format="i"> ....

>   b. numbers reset for each chapter, annex and appendix.

Use a separate page sequence for each chapter/annex/appendix, and reset initial page number to 1:

<fo:page-sequence master-reference="chapter" initial-page-number="1"> ....

Be careful: according to the spec, setting initial-page-number to 1 should lengthen the preceding page sequence to end on an even page. Unless you actually need this, you should specify force-page-count="no-force" on the preceding page sequence.
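
As a sketch of how the two sequences fit together (assuming page masters named "frontmatter" and "chapter" are defined in the layout-master-set, and using the master-reference attribute of the XSL 1.0 Recommendation):

<!-- front matter: numbered i, ii, iii, ... and not padded to an even page -->
<fo:page-sequence master-reference="frontmatter" format="i"
                  force-page-count="no-force">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Preface text.</fo:block>
  </fo:flow>
</fo:page-sequence>

<!-- first chapter: numbering restarts at 1 in the default decimal format -->
<fo:page-sequence master-reference="chapter" initial-page-number="1">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Chapter text.</fo:block>
  </fo:flow>
</fo:page-sequence>

Drop force-page-count="no-force" if you do want each chapter to open on a right-hand page.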

>   c. page numbers are prefixed by the chapter number, annex letter,
> appendix number where appropriate. 

That's an XSLT task - chapter number etc. are known prior to the formatting, so everything but the actual page number can be generated by the stylesheet. No XSL FO wizardry required.
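
For example, a stylesheet template along these lines could emit the prefix. This is only a sketch: the source element name "chapter", the master name "chapter" and the "-" separator are assumptions, not something the original question specifies.

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="chapter">
    <!-- the chapter number is known at transform time -->
    <xsl:variable name="prefix">
      <xsl:number count="chapter" format="1"/>
    </xsl:variable>
    <fo:page-sequence master-reference="chapter" initial-page-number="1">
      <fo:static-content flow-name="xsl-region-after">
        <fo:block text-align="center">
          <!-- prefix from XSLT, page number from the formatter -->
          <xsl:value-of select="$prefix"/>
          <xsl:text>-</xsl:text>
          <fo:page-number/>
        </fo:block>
      </fo:static-content>
      <fo:flow flow-name="xsl-region-body">
        <xsl:apply-templates/>
      </fo:flow>
    </fo:page-sequence>
  </xsl:template>

</xsl:stylesheet>

For annexes, the same template with format="A" on xsl:number would give letter prefixes.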

4.

page-numbering in xsl-fo

Nikolai Grigoriev


> How can I give every page in a
> pdf-document another page number.
> For instance:
>
> page 1 -> 1
> page 2 -> IV
	

Try to make every page a separate page sequence, and then play with the initial-page-number and format properties. There is no option in XSL FO to specify a custom sequence of tokens for continuous page numbering.
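
A sketch of that idea for the two pages in the question, assuming a page master named "page" with a region-after and assuming the stylesheet has already split the content so that each flow fits on one page (content does not flow from one page sequence into the next):

<!-- physical page 1, printed as "1" -->
<fo:page-sequence master-reference="page"
                  initial-page-number="1" format="1"
                  force-page-count="no-force">
  <fo:static-content flow-name="xsl-region-after">
    <fo:block text-align="center"><fo:page-number/></fo:block>
  </fo:static-content>
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Content of the first page.</fo:block>
  </fo:flow>
</fo:page-sequence>

<!-- physical page 2, printed as "IV": page number 4 in upper-case Roman -->
<fo:page-sequence master-reference="page"
                  initial-page-number="4" format="I"
                  force-page-count="no-force">
  <fo:static-content flow-name="xsl-region-after">
    <fo:block text-align="center"><fo:page-number/></fo:block>
  </fo:static-content>
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Content of the second page.</fo:block>
  </fo:flow>
</fo:page-sequence>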

5.

Page numbering

Ken Holman



>I have a problem with the page numbering. The result of my XSL-FO PDF 
>file is printed like a book.
>So this would mean that the page numbers should appear alternately 
>in the right and left corner at the bottom of the page for even and odd 
>numbered pages respectively. Is there some way of automatically 
>producing this?

Point your page-sequence to a page-sequence-master that defines a repeatable sequence of alternatives between two conditional-page-master-references: one testing page parity for odd page numbers and the other testing page parity for even page numbers.

The two simple-page-masters that you point to need then to define differently-named region-after extents, one used as the odd-page and one for the even-page.

Your page sequence then defines two pieces of static content, one with the page number on the right for the region-after named in the odd page master, and one with the page number on the left for the region-after named in the even page master.

The formatter will obtain "the next page" from the page sequence that alternates between the two page masters and you will get alternating footers.

Here is an example for headers that you can adapt for footers:

<layout-master-set>
   <simple-page-master master-name="frame-odd"
                      page-height="&page-height;" page-width="&page-width;"
                      margin-top="&margin-top;" margin-bottom="&margin-bottom;"
                      margin-left="&margin-left;"
                      margin-right="&margin-right;">
     <region-body region-name="frame-body"
                  margin-top="&before-extent;" margin-bottom="&after-extent;"/>
     <region-before extent="&before-extent;" region-name="frame-before-o"/>
   </simple-page-master>
   <simple-page-master master-name="frame-even"
                      page-height="&page-height;" page-width="&page-width;"
                      margin-top="&margin-top;" margin-bottom="&margin-bottom;"
                      margin-left="&margin-left;" 
                      margin-right="&margin-right;">
     <region-body region-name="frame-body"
                  margin-top="&before-extent;" margin-bottom="&after-extent;"/>
     <region-before extent="&before-extent;" region-name="frame-before-e"/>
   </simple-page-master>
   <page-sequence-master master-name="frame-pages">
     <repeatable-page-master-alternatives>
       <conditional-page-master-reference odd-or-even="even"
         master-reference="frame-even"/>
       <conditional-page-master-reference odd-or-even="odd"
         master-reference="frame-odd"/>
     </repeatable-page-master-alternatives>
   </page-sequence-master>
</layout-master-set>

<page-sequence master-reference="frame-pages" force-page-count="even">

<static-content flow-name="frame-before-e">
   <block>frame-even - <page-number/></block>
</static-content>
<static-content flow-name="frame-before-o">
   <block>frame-odd - <page-number/></block>
</static-content>

<flow flow-name="frame-body" font-size="40pt">
   <block...>


Ednote: Ken didn't provide values for his page sizes,
but they are self-explanatory. Replace with suitable
values for your use.
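
A footer version of the same technique might look like the sketch below. The region names "frame-after-o" and "frame-after-e" and all dimensions are assumed values chosen for illustration; the page number goes on the right for odd pages and on the left for even pages, as described above.

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="frame-odd"
                           page-height="297mm" page-width="210mm"
                           margin-top="15mm" margin-bottom="15mm"
                           margin-left="20mm" margin-right="20mm">
      <fo:region-body region-name="frame-body" margin-bottom="12mm"/>
      <fo:region-after extent="10mm" region-name="frame-after-o"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="frame-even"
                           page-height="297mm" page-width="210mm"
                           margin-top="15mm" margin-bottom="15mm"
                           margin-left="20mm" margin-right="20mm">
      <fo:region-body region-name="frame-body" margin-bottom="12mm"/>
      <fo:region-after extent="10mm" region-name="frame-after-e"/>
    </fo:simple-page-master>
    <fo:page-sequence-master master-name="frame-pages">
      <fo:repeatable-page-master-alternatives>
        <fo:conditional-page-master-reference odd-or-even="even"
                                              master-reference="frame-even"/>
        <fo:conditional-page-master-reference odd-or-even="odd"
                                              master-reference="frame-odd"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="frame-pages">
    <!-- odd pages: number in the bottom right corner -->
    <fo:static-content flow-name="frame-after-o">
      <fo:block text-align="right"><fo:page-number/></fo:block>
    </fo:static-content>
    <!-- even pages: number in the bottom left corner -->
    <fo:static-content flow-name="frame-after-e">
      <fo:block text-align="left"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="frame-body">
      <fo:block>Body text.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>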


6.

How to use page-number in xsl:if

David Tolpin



>What I tried was to insert a blank page if the current page mod 2 = 0 
>because I had no other idea to start a new page-sequence on an
>odd page.
 
<xsl:if test="fo:page-number mod 2 = 0">
    <fo:block break-after="page">
        <xsl:text>&#xA;</xsl:text>
    </fo:block>
</xsl:if>

This will not work: fo:page-number is a formatting object in the result tree, not a function or a value that XSLT can test (see the W3C XSL specification).

Page layout is carried out by the formatter, so the page number is not available at the transform stage.
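
For the underlying goal, starting a page sequence on an odd page, the formatter can be asked to do the work instead. A sketch, assuming a page master named "chapter":

<!-- "auto-odd" continues the numbering but requires an odd first page number -->
<fo:page-sequence master-reference="chapter" initial-page-number="auto-odd">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>This sequence starts on an odd page.</fo:block>
  </fo:flow>
</fo:page-sequence>

With the default force-page-count="auto" on the preceding page sequence, a conforming formatter should pad that sequence with a blank page when needed. Setting force-page-count="end-on-even" on the preceding sequence is another way to get a similar result.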

7.

Cross references to pages

Eliot Kimber et al


> The goal: Xrefs only include the page number if the item referenced is
> on a different page from the xref.
>
> It appears that XSL 1.0 and 1.1 provides no capabilities to achieve
> this goal.  Does XEP provide an extension to help satisfy this goal?

Let me try, because this is a very common requirement that is one of the key things you cannot do with XSL-FO out of the box (without a two-pass process).

Say you have these FOs:

<fo:block id="block-01">Some Title Text</fo:block>
...
<fo:block>See
<fo:basic-link internal-destination="block-01">Some Title
Text</fo:basic-link> on page <fo:page-number-citation
ref-id="block-01"/></fo:block>

The requirement is that when block block-01 and the page-number-citation are on the same page, then the entire "on page <fo:page-number-citation/>" be suppressed.

If they are on different pages, then the text is not suppressed.

You could do this at the XSLT level by using a two-pass process that uses the XEP (or equivalent other proprietary intermediate form) area tree to determine the relative page relationships of the different elements and suppress them or not on the second pass.

But I could also imagine an extension element such as:

<xep:suppress-if-same-page ref-id="block-01"> on page
<fo:page-number-citation ref-id="block-01"/></xep:suppress-if-same-page>

Jirka responds

You can use rx:pinpoint to mark the places where link targets appear. During the second pass you generate the content "on page <fo:page-number-citation/>" only when the rx:pinpoint is on a different page than the source of the link. I usually generate FO using XSLT, and during the second pass I access the result of the first pass, in an intermediate format, using the document() function.

The idea is to have two slightly different FO sources, one for each pass. Of course, there is a problem: FO with the "on page ..." content removed could produce different page breaks, some content could shift, and you could get a wrong result. So you can do a third pass to check that everything is OK. In very rare cases you can have an "unstable" document, which has the text "on page ..." near a page break and for which it is impossible to generate proper content. But this is a very rare situation; I have faced it only once (and that was long ago, before the FO days, in a TeX system).

Then Eliot completes the picture.

> Have you implemented something that does this, and are willing to
> share?

We have done some limited things using Antenna House's Java API where we pre-render specific constructs, such as tables or page sequences to see how many pages they take up so that the second pass can then reflect that (for example, when generating a list of effective pages that has to reflect the length of the list of effective pages itself).

I don't think we've done anything more general but the basic technique would be the same regardless of the details of the intermediate format:

1. Generate the initial FO instance, adding appropriate "marker" elements or content (i.e., the rx:pinpoint Jirka mentioned or wrappers with specific ID values or whatever will work) so that you can correlate original input XML elements to their rendered location in the first pass.

2. Using the intermediate area tree, which reflects the paginated result of processing the first FO instance, process the input XML again, examining the pass 1 area tree as needed in order to make decisions based on the pass-1 layout result to generate the pass 2 FO instance.

3. As Jirka points out, if necessary, run a third pass to resolve any page break changes from pass 1 to pass 2. Ideally, detect any oscillations that result (i.e., a target on page X in pass 1, page Y in pass 2, back on page X in pass 3).

There are several practical issues with this approach as a general technique:

1. The different FO implementations have different non-standard ways of representing their area trees. So this approach will of necessity be engine-specific.

2. The area tree can be many times larger than the original input XML or the resulting FO document, leading to potential performance issues with either memory usage or I/O time needed to write and read the intermediate file.

I have proposed a "standard" extension that would allow the creator of the FO instance to write page-aware data to a "side file" that would then only have the information you need to support a second pass. However, none of the vendors have yet shown any interest in this suggestion (see http://exslfo.sourceforge.net/requirements.html#side-files for my original statement of requirements). My initial idea was to have something like this:

<fo:root xmlns:exslfo="http://exlso.org"
xmlns:msf="http://www.example.com/mysidefilemarkup"
 >
<exslfo:side-file href="./sidefiles/mysidefile.xml"
  root-tag="msf:mysidefiledata"
 />
 ...
<fo:block id="block-01"
 ><exslfo:side-file-data xmlns="http://www.example.com/mysidefilemarkup">
<element-to-page-mapping-item>
<orig-element-id>para-01</orig-element-id>
<rendered-page><fo:page-number/></rendered-page>
</element-to-page-mapping-item>
</exslfo:side-file-data>
...{rest of content of the fo:block} ...
</fo:block>
...
</fo:root>

This would result in the following side file:

<?xml version="1.0"?>
<mysidefiledata xmlns:msf="http://www.example.com/mysidefilemarkup">
<element-to-page-mapping-item>
<orig-element-id>para-01</orig-element-id>
<rendered-page>10</rendered-page>
</element-to-page-mapping-item>
</mysidefiledata>

This would let you create the smallest possible intermediate data set needed to support your second pass processing.
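
A second-pass stylesheet could then consult that side file through document(). The following is only a sketch under several assumptions: the side file has the structure shown above (elements in no namespace), pass 1 recorded a mapping item for the cross reference itself as well as for its target, the input vocabulary has an "xref" element with "id" and "linkend" attributes, and the target's FO id equals the linkend value. None of these names come from a real product.

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- hypothetical location of the side file written during pass 1 -->
  <xsl:param name="side-file" select="'sidefiles/mysidefile.xml'"/>
  <xsl:variable name="items"
      select="document($side-file)//element-to-page-mapping-item"/>

  <!-- xref, @id and @linkend are assumed names in the input XML -->
  <xsl:template match="xref">
    <!-- pass-1 page of the cross reference itself -->
    <xsl:variable name="here"
        select="string($items[orig-element-id = current()/@id]/rendered-page)"/>
    <!-- pass-1 page of the element being referenced -->
    <xsl:variable name="there"
        select="string($items[orig-element-id = current()/@linkend]/rendered-page)"/>
    <fo:basic-link internal-destination="{@linkend}">
      <xsl:apply-templates/>
    </fo:basic-link>
    <!-- keep the citation when the pages differ or pass-1 data is missing -->
    <xsl:if test="$here = '' or $there = '' or $here != $there">
      <xsl:text> on page </xsl:text>
      <fo:page-number-citation ref-id="{@linkend}"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

As noted earlier, removing the "on page ..." text can itself move page breaks, so a further checking pass may still be needed.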

The reason this approach would not be appropriate for the FO standard itself is that it presumes a particular implementation processing approach, namely a linear pass over the document, which the FO specification does not require. However, for any tool that does in fact do linear processing (which includes XEP, XSL Formatter, and FOP) it seems like a perfectly reasonable approach.

Note also that you can get the same effect, although with a little less convenience, by generalizing Ken Holman's technique for putting this type of data in your generated PDF and then extracting it from there on the second pass (see Ken's site). This approach is FO implementation independent but does require the ability to extract data from PDF (which isn't that hard but is not part of the usual standard tool set).