XSLT document function, document available, exists

Document

1. How to check if document(file) was available?
2. Test for file existence
3. xml:base for document()
4. document() function, relative to where?
5. Merging two documents
6. document function
7. Merging two documents with document function
8. Document function example
9. How to have multiple inputs with one xsl file
10. document root question
11. How can I pass the result of a document call to other templates
12. Explain document function
13. Multiple input files to one output file
14. Using a variable to see if element exists in another xml doc.
15. How to process a list of files
16. document question
17. Document function with two arguments
18. Accessing 2 xml files in xsl
19. Combining XML data from different files
20. document doesn't pick up a parameter
21. document(), location relative to XML file.
22. relative paths in document calls
23. Relative location
24. Checking if document() is working
25. Document function with second parameter
26. Security and the document() function
27. relative document problems
28. relative to ....
29. Matching by id in external document()

1.

How to check if document(file) was available?

David Carlisle

It's actually an error to refer to a document that is not there. Your XSLT system may just give up at that point and stop. If so there is nothing you can do from within the stylesheet.

However, the XSLT spec does say what the processor must do if it does not die at that point: it has to return an empty node-set. In that case, if $Descriptions holds the result of the document() call,

<xsl:when test="$Descriptions">
<xsl:when test="count($Descriptions)>0">

Both of those should work: each is true if the document is there and false otherwise.

$Descriptions will either be an empty node set or a set consisting of a single / node, depending on whether the document is there or not.
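As a minimal sketch of the test described above: the file name descriptions.xml and the messages are assumptions made for illustration, and the sketch only helps on a processor that returns an empty node-set for a missing file instead of stopping.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- descriptions.xml is a hypothetical file name -->
<xsl:variable name="Descriptions" select="document('descriptions.xml')"/>

<xsl:template match="/">
  <xsl:choose>
    <!-- a non-empty node-set converts to true in a test -->
    <xsl:when test="$Descriptions">
      <xsl:text>descriptions.xml was loaded</xsl:text>
    </xsl:when>
    <!-- empty node-set: the document could not be retrieved -->
    <xsl:otherwise>
      <xsl:text>descriptions.xml is not available</xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>
```

Binding the result to a top-level variable means the document is requested once and the same node-set can be tested wherever it is needed.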

2.

Test for file existence

M. David Peterson

The easiest way to solve your problem is to wrap the document() function within a boolean() function which will evaluate to true if the file exists and false if it does not. You can also wrap string() and concat() within the document() function to use a dynamically created reference to a file... for example

boolean(document(string(concat('/foo/bar/', $foobar, '.xml'))))

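When used within a test attribute, this evaluates to true if the file named by the string that concat() builds exists at that location, and to false if it does not (on a processor that returns an empty node-set for a missing file rather than stopping).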

3.

xml:base for document()

David Carlisle



  When I call document() with a variable as first argument, file1.xml
  is opened from directory "xsl".

  When I call document() with exsl:node-set($file-set) as first
  argument, the files from $file-set are opened (or the processor
  tries to open them) from the current directory.

You can force the base URI used for the document function by supplying a node as the second argument of document(). If you give a node from the stylesheet, e.g. document(''), it will use the URI of the stylesheet. If you use a node from your source file, it will use the URI of that file.

If your file list was in an external file, not in a variable, you wouldn't need the node-set extension and you would have a different and more reliable base URI (the URI of the XML file that holds the file list).

4.

document() function, relative to where?

David Carlisle



> Is there a way using normal XSLT facilities to get or operate relative 
> to the location of the style sheet without passi

Yes, document() works relative to the stylesheet if you give it a string rather than a node (given a node, it works relative to the node's URI), so you can just force the argument to be a string with string().

Or, what I usually do, as it's more obvious when you come back after coffee, is to explicitly force the base URI to be the base URI of the stylesheet by giving document('') as the second argument to document().

document(@src,document('')) fetches the thing in @src relative to the base URI of the stylesheet.
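The three forms can be put side by side in a short sketch. The element name include and its src attribute are hypothetical source markup, chosen only to give @src something to live on.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- "include" and its src attribute are assumed source markup -->
<xsl:template match="include">
  <!-- node argument: resolved against the base URI of the @src node,
       that is, the source file that contains it -->
  <xsl:variable name="from-source" select="document(@src)"/>

  <!-- string argument: resolved against the base URI of the stylesheet -->
  <xsl:variable name="from-stylesheet" select="document(string(@src))"/>

  <!-- explicit second argument: normally the same file as the string
       form, but the intent is visible in the call itself -->
  <xsl:variable name="explicit" select="document(@src, document(''))"/>

  <!-- copy the document element of the stylesheet-relative file -->
  <xsl:copy-of select="$explicit/*"/>
</xsl:template>

</xsl:stylesheet>
```

Only one of the three variables is used here; the other two are kept to show which base URI each call picks.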

5.

Merging two documents

Ken Holman


A working example using XT-19990813 is below.



doc1.xml

<?xml version="1.0"?>
<!DOCTYPE BookSet [
<!ATTLIST Book id ID #IMPLIED>
]>
<BookSet>
   <Book id="id1"><Name
 >The wizard of OZ</Name></Book>
   <Book id="id2"><Name
 >Java Servlet Programming</Name></Book>
   <Book id="id3"><Name
 >John Coltrane Rage</Name></Book>
</BookSet>

doc2.xml

<BookList>
    <Book id="id1"/>
    <Book id="id2"/>
</BookList>

list.xsl

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 version="1.0">

<xsl:output method="xml" indent="yes"/>

<xsl:param name="source" select="''"/>    
     <!--source of data-->

<xsl:template match="/BookList">        
     <!--document element-->
   <BookList>
     <xsl:for-each select="Book">
       <Book id="{@id}">
         <xsl:variable name="id" select="string(@id)"/>
            <!--note you cannot use document($source)/id($id)-->
         <xsl:for-each select="document($source)">
           <xsl:copy-of select="id($id)/*"/>
         </xsl:for-each>
       </Book>
     </xsl:for-each>
   </BookList>
</xsl:template>

</xsl:stylesheet>

  Output

<BookList>
<Book id="id1">
<Name>The wizard of OZ</Name>
</Book>
<Book id="id2">
<Name>Java Servlet Programming</Name>
</Book>
</BookList>


            

6.

document function

Ken Holman.

Q: Expansion
>I have a string in a variable and I want to convert it
>to a document via the document() function.
    

Trying to feed a variable of rich markup to the document() function is impossible.

However ... getting data from the stylesheet isn't impossible and if that is what you want to do, an example is below.

In this example I have stopped using ID so that I can use the same id attribute values in two places. I have invoked the engine twice, once with a default value and a second time with an argument.

Note that a stylesheet writer does not have control over an XSLT engine's emission of namespace declarations.

T:\ftemp>type doc2.xml
<BookList>
    <Book id="1"/>
    <Book id="2"/>
</BookList>

T:\ftemp>type list3.xsl
<?xml version="1.0"?>
<xsl:stylesheet 
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0"
     xmlns:data="any-uri">

<xsl:output method="xml" indent="yes"/>

<data:BookSet set="first">
   <Book id="1"><Name
 >The wizard of OZ</Name></Book>
   <Book id="2"><Name
 >Java Servlet Programming</Name></Book>
   <Book id="3"><Name
 >John Coltrane Rage</Name></Book>
</data:BookSet>

<data:BookSet set="second">
   <Book id="1"><Name
 >An Uninteresting Book</Name></Book>
   <Book id="2"><Name
 >Another Uninteresting Book</Name></Book>
   <Book id="3"><Name
 >Yet Another Uninteresting Book</Name></Book>
</data:BookSet>

  
<xsl:param name="source" select="'first'"/>

<xsl:template match="/BookList">          
     <!--document element-->
   <BookList>
     <xsl:for-each select="Book">
       <Book id="{@id}">
         <xsl:variable name="id" select="string(@id)"/>
          <!--note you cannot use document("")/id($id)-->
         <xsl:for-each select='document("")'
 ><!--the stylesheet-->
           <xsl:copy-of select="//data:BookSet[@set=$source]
                                 /Book[@id=$id]
                                 /*"/>
         </xsl:for-each>
       </Book>
     </xsl:for-each>
   </BookList>
</xsl:template>

</xsl:stylesheet>

T:\ftemp>xt doc2.xml list3.xsl result1.xml

T:\ftemp>type result1.xml
<BookList xmlns:data="any-uri">
 <Book id="1">
  <Name xmlns:xsl="http://www.w3.org/1999/XSL/Transform">The wizard of OZ</Name>
 </Book>
 <Book id="2">
  <Name xmlns:xsl="http://www.w3.org/1999/XSL/Transform">Java Servlet Programming</Name>
 </Book>
</BookList>

T:\ftemp>xt doc2.xml list3.xsl result2.xml source=second

T:\ftemp>type result2.xml
<BookList xmlns:data="any-uri">
 <Book id="1">
  <Name xmlns:xsl="http://www.w3.org/1999/XSL/Transform">An Uninteresting Book</Name>
 </Book>
 <Book id="2">
  <Name xmlns:xsl="http://www.w3.org/1999/XSL/Transform">Another Uninteresting Book</Name>
 </Book>
</BookList>

7.

Merging two documents with document function

Ken Holman

What if both documents had attributes declared as ID attributes?

What if there were an element in each document that happened by coincidence to have the same unique identifier?

Using document() allows you to still access both documents, but access the unique identifiers found in both documents without the risk of collision. My examples earlier showed how using <xsl:for-each> with the document() function allows you to access the ID space within each document independently.

If both documents, each with the same ID in use, were merged into one, it couldn't be validated (because of duplicate IDs) and the id() function of XSLT would at best only find one and at worst report the duplicate ID as an error and refuse to execute.

Of course this isn't a problem if you aren't using IDs, but I assumed you were asking about the general case.

8.

Document function example

Clark C. Evans


Here is an example for illustration:

- ----------------- stylesheet.xsl ------------------
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="html"/>
 
<xsl:template match="/">
   <html>
  <xsl:for-each select="document('two.xml')//day-list" >
     <xsl:value-of select="day" />
     <xsl:value-of select="sum(//amount)" />
   </xsl:for-each>
  </html>
</xsl:template>
</xsl:stylesheet>
- ----------------- one.xml -------------------------
<entry-list>
  <entry>
    <day>mon</day>
    <amount>34</amount>
  </entry>
</entry-list>
- ----------------- two.xml -------------------------
<root>
  <day-list>
    <day>mon</day>
  </day-list>
  <amount> 99 </amount>
</root>
- ----------------- command line --------------------
xt one.xml stylesheet.xsl
- ----------------- expected output -----------------
mon 34
- ----------------- actual output -------------------
mon 99
- ---------------------------------------------------
    

The sum(//amount), unfortunately, seems to be pointing at document two.xml, not document one.xml.

Of course, I could use "sum(document('one.xml')//amount)"; however:

a) I was not expecting document('two.xml') to change the current root node;

b) I found that document('') means the stylesheet... so I was hoping for some other similar trick!

Steve Muench added: You could use a top-level parameter:

 <xsl:stylesheet>
   <xsl:param name="monthlookup">defaultfile.xml</param>

and later use:

    document($monthlookup)//xyz

James Clark refined this as:

XPath says that "A / by itself selects the root node of the document ***containing the context node.***"

The solution is to have a top-level variable pointing to the root of the main document:

  <xsl:stylesheet>

  <xsl:variable name="main" select="/"/>

and then use

  sum($main//amount)
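Put together with the stylesheet from the question, the top-level variable approach might look like the sketch below. It reuses one.xml and two.xml as given in the question; the space between the two values is added here for readability.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="html"/>

<!-- evaluated with the main source document (one.xml) as context,
     so $main is the root node of one.xml -->
<xsl:variable name="main" select="/"/>

<xsl:template match="/">
  <html>
    <xsl:for-each select="document('two.xml')//day-list">
      <!-- inside the loop the context node belongs to two.xml -->
      <xsl:value-of select="day"/>
      <xsl:text> </xsl:text>
      <!-- $main still points at one.xml, so its amounts are summed -->
      <xsl:value-of select="sum($main//amount)"/>
    </xsl:for-each>
  </html>
</xsl:template>

</xsl:stylesheet>
```

With the sample data the sum is taken over the single amount in one.xml (34) rather than the one in two.xml.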

9.

How to have multiple inputs with one xsl file

Mike Brown

Say I have foo.xml and bar.xml. I need the contents of both for foobar.html, which uses foobar.xsl, in one process.

If your XSLT processor implements the document() function, then yes. This is explained in section 12.1 of the XSLT 1.0 Recommendation, at http://www.w3.org/TR/xslt#document

Example:
<xsl:apply-templates select="document('b.xml')"/>

            

10.

document root question

Sebastian Rahtz


Q expansion

with the following XML

<page>
    <about>blah blah blah</about>
    <title>Blah</title>
    [...]
    <links>
        <link type="parent">
            <uri>../ma.xml</uri>
        </link>
    </links>
</page>

and XSL

<xsl:template match="link" mode="parent">
    <xsl:variable name="lp" select="document(uri)"/>
    <a href="{uri}" title="{$lp/about}"
	><xsl:value-of select="$lp/title"/></a>
</xsl:template>

I am trying to use this template to take the "uri" of my parent link, pull the top-level node of the parent external document into a variable, and then access its "about" and "title" elements via the variable.

That's because document() returns the root of the XML document, not the top element. You need "page/about", not "about", etc. E.g.

 <a href="{uri}" title="{$lp/page/about}"
          ><xsl:value-of select="$lp/page/title"/></a>
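An alternative sketch, not part of the original answer, is to step to the document element when the variable is bound, so that the shorter paths from the question work unchanged. It assumes the linked file has the same page structure as the sample shown.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="link" mode="parent">
  <!-- document(uri) is the root node; /* steps down to the document
       element (page in the sample), whatever it is called -->
  <xsl:variable name="lp" select="document(uri)/*"/>
  <a href="{uri}" title="{$lp/about}">
    <xsl:value-of select="$lp/title"/>
  </a>
</xsl:template>

</xsl:stylesheet>
```

Writing document(uri)/page instead of document(uri)/* would make the expected document element explicit.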

            

11.

How can I pass the result of a document call to other templates

Michael Kay


Same way as any other value:
<xsl:call-template name="f">
<xsl:with-param name="x" select="document('abc.xml')"/>
</xsl:call-template>

Or you can apply templates to the result of the document() call, though it is usually better to pass the document element rather than the root node (because it's easier to match that in a pattern):

<xsl:apply-templates select="document('abc.xml')/*"/>

> I need is to append the node-set created from the 
> document() call to the current one. 
> Any idea how this can be done?

Using the union operator:

<xsl:variable name="a" select="$b | document('abc.xml')"/>

Of course you cannot "append" in the sense of updating a variable in situ. (unless you feel like using <saxon:assign>, which this community regards as a crime almost as bad as using IE5)
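Both ideas, passing the document as a parameter and forming a union, can be combined in one small sketch. The template name count-elements and the parameter name docs are made up for the example; abc.xml is the file name used above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- union of the source root node and the root node of abc.xml -->
  <xsl:variable name="both" select="/ | document('abc.xml')"/>
  <xsl:call-template name="count-elements">
    <xsl:with-param name="docs" select="$both"/>
  </xsl:call-template>
</xsl:template>

<!-- hypothetical named template: one line per document passed in -->
<xsl:template name="count-elements">
  <xsl:param name="docs"/>
  <xsl:for-each select="$docs">
    <xsl:value-of select="count(.//*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The relative order of nodes from different documents is up to the processor, so the two counts may come out in either order.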

Eric van der Vlist adds:

If you want to transform a set of source files into a set of output files, you can just do something like :

<xt:document method="html" href="doc1.html"> 
<xsl:apply-templates select="document('doc1.xml')"/>
</xt:document> 
 
<xt:document method="html" href="doc2.html"> 
<xsl:apply-templates select="document('doc2.xml')"/>
</xt:document> 

12.

Explain document function

Vun Kannon, David

Suppose you had a situation where the XML vocabulary supported an "import" mechanism, and of course imported node sets could themselves import other node sets recursively, on and on. XSchema does this. Here is a stylesheet that slightly modifies the identity transform. The additional piece brings in the imported nodes so that you can see in the result document the original source document plus all the imported nodes.

<stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0">

<template match="/">
<apply-templates/>
</template>
<template match="@*|comment()|text()|processing-instruction()">
<copy>
<apply-templates
select="@*|comment()|text()|processing-instruction()"/>
</copy>
</template>
<template match="*">
<copy>
<apply-templates select="@*"/>

<!-- now bring any imported nodes -->
<if test="self::import/@uri">
<apply-templates select="document(@uri)"/>
</if>

<apply-templates
select="*|comment()|text()|processing-instruction()"/>
</copy>
</template>
</stylesheet>

Here are a set of faux XSchema documents that refer to each other:

fabulous.xsd:
<schema name="Fabulous">
<import uri="kpmg.xsd"/>
<element name="DomainName"/>
</schema>

kpmg.xsd:
<schema name="KPMG Media">
<import uri="core.xsd"/>
<element name="BrandEquity"/>
</schema>

core.xsd:
<schema name="Core">
<element name="BalanceSheet"/>
</schema>

Run the stylesheet against the first schema document given above and you should get

<?xml version="1.0" encoding="utf-8"?>
<schema name="Fabulous">
<import uri="kpmg.xsd"><schema name="KPMG Media">
<import uri="core.xsd"><schema name="Core">
<element name="BalanceSheet"/>
</schema></import>
<element name="BrandEquity"/>
</schema></import>
<element name="DomainName"/>
</schema>

Actual XSchema document processing might replace the import element with the imported nodes (in order to keep it a valid schema); my point is not to be explicit on that kind of issue. I just want to show how I've used document().

13.

Multiple input files to one output file

David Carlisle


> Is it possible with XSLT to have lots of input files, 
> and create one output
> file listing specific data from each of the files? 

Yes, that's exactly what the document() function does.

Mike Brown adds: for more than one file, in filenames_file.xml you could have:

<someURIs>
  <file>file1.xml</file>
  <file>file2.xml</file>
  <file>http://foo/file3.xml</file>
  <file>file://D|/dev/src/file4.xml</file>
  <file>../../file5.xml</file>
</someURIs>

Then in a template in your XSL you could have:

<xsl:for-each 
 select="document('filenames_file.xml')/someURIs/file/text()">
  <xsl:variable name="current_file_root" 
       select="document(string(.))"/>
  <!-- the next lines are just for example -->
  <xsl:text>

current file: </xsl:text>
  <xsl:value-of select="."/>
  <xsl:text>
# of elements: </xsl:text>
  <xsl:value-of select="count($current_file_root//*)"/>
</xsl:for-each>

14.

Using a variable to see if element exists in another xml doc.

John E. Simpson

>I am trying to use the variable myKey to check to see if its value (i.e.
>"currencyCode") is an element in another document.
>I am having some problems.  does anyone know what is wrong with this xsl.

Try this:

<xsl:template match="object">
      <xsl:for-each select="property">
           <xsl:variable name="myKey" select="key"/>
           <xsl:if test="$myKey='currencyCode'">
                This works; try the next if
                <xsl:if test="document('en_US.xml')/locale/*
     [name()=$myKey]">
                     insert label
                </xsl:if>
           </xsl:if>

            </xsl:for-each>
   </xsl:template>

15.

How to process a list of files

Steve Tinney



> <mother>
>   <file>files\overview.html</file>
>   <file>files\book1\page1.html</file>
>   <file>files\book1\chap1\page2.html</file>
>   <file>files\book2\page1.html</file>
>   <file>files\book2\chap1\page2.html</file>
> </mother>
> 
> How can I use XSL to enter each listed file,
> manipulate it and copy the result onto itself,
> thus retaining the file names and directory structure?

You would need to use document(), as you already realize, in conjunction with a processor that has at least an output extension, and possibly a nodeset extension also (depending on what you want to do). You would need to be sure that after a document() call, the processor closed the file (i.e., do some testing with unimportant data first or, better, read the source for your processor). So, the structure would be something like this:

    <xsl:template match="file">
      <xsl:variable name="pathname" select="."/>
      <xsl:variable name="contents" 
             select="document($pathname)"/>
      <xsl:variable name="newcontents-rtf">
        ... rewrite $contents here ...
      </xsl:variable>
      ... possibly do further transforms here ...
      <saxon:output 
        file="{$pathname}" method="xml" encoding="utf-8">
        <xsl:copy-of select="$newcontents-rtf"/>
      </saxon:output>
    </xsl:template>

You probably need to ensure that you reference $contents before doing the file open for the new output because most(? some?) processors do lazy evaluation of the select expressions.

Mike Kay adds

You can do something like:

<xsl:template match="mother">
<xsl:for-each select="file">
   <xsl:variable name="f" select="document(.)"/>
   <saxon:output file="{.}">
      <xsl:apply-templates select="$f"/>
   </saxon:output>
</xsl:for-each>
</xsl:template>

It will only overwrite the existing files if the "mother" document is in the current directory, as document() interprets filenames relative to the "mother" document, and saxon:output to the current directory. I wouldn't recommend overwriting anyway, as the process then isn't restartable in the event of a failure.

Obviously you can replace saxon:output by its equivalent in xt or Xalan. If overwriting, take especial care with xt because it does lazy evaluation. I don't think it writes the output before reading the input, but with James you never know :-)

Mike Brown adds:

Do you want an XSL stylesheet to refer to data in a remote XML file? If so, then you need a URI that points to the XML file. Use it as the argument to the document() function:

<xsl:variable name="foo" 
 select="document('http://remoteserver/file.xml')"/>

Then, if the resource identified by the URI could be parsed, $foo will be a node-set containing the root node from file.xml. You can put it in an XPath expression to get data from the document:

<xsl:value-of select="$foo/path/to/some/nodes"/>

If you are going to be referring to a document multiple times, it makes more sense to only call document() once and bind the node-set it returns to a variable, then refer to that variable.

Like any other function, it returns an object of one of the types: node-set, number, boolean, string, result tree fragment. You know that it returns a node-set, so you can use it anywhere where you expect to see a node-set. So it could be the first location step in an XPath expression. I am not aware of any XSL processors (aside from IE) that would have any trouble with that.

If the resource identified by the URI could not be parsed, your XSL processor is supposed to either signal an error or return an empty node-set. XT does the former and aborts processing, so it's best to have some control over the documents you are obtaining in this manner.

You can also use a node-set as the argument to document(). The string-values of the nodes will be used as a list of URIs, and the function will return the union of root nodes from those documents.
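A short sketch of the node-set form, assuming the source document is a file list shaped like filenames_file.xml from question 13 (a someURIs element with file children):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="/someURIs">
  <!-- one call loads every listed document; each relative name is
       resolved against the base URI of the file element holding it -->
  <xsl:for-each select="document(file)">
    <!-- the context node is the root node of one loaded document;
         print the name of its document element -->
    <xsl:value-of select="name(*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Because the argument is a node-set, relative names are taken relative to the list file, not to the stylesheet.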

There are a few other features of the document() function explained in the XSLT spec at W3C

16.

document question

Martin Algesten

If I haven't misunderstood the XSLT spec completely, I should be able to use "document( node-set )" as an expression. The XSLT engine would then read the value of each node in the node-set as a string, do a document( string ) on each of those values, and return the union of the files read.

In the following example I have a file-list in files.xml defined like this:

<file-list>
  <absolute>file://\z:\myroot\file1.xml</absolute>
  <absolute>file://\z:\myroot\file2.xml</absolute>
</file-list>

my stylesheet has a template like this (output html):

<xsl:template match="page">
  <BODY>
  <xsl:variable name="files"
                select="document('file:///Z:/myroot/files.xml')"/>
  <xsl:for-each select="document($files//file-list/absolute)">
     <xsl:value-of select="."/>
  </xsl:for-each>
  </BODY>
</xsl:template>

17.

Document function with two arguments

G Ken Holman

I have a "share" construct in my XML that allows me to define content in another XML file and access it algorithmically through an unparsed NDATA entity. I am not using the other XML file syntactically through XML parsed entity referencing ... each package of content is named in the shared file and I am pointing to a package through an NDATA entity for the file and pulling in content from that remote entity according to the share construct. With my book spread out over a number of subdirectories, I have relative paths between subdirectories in my SYSTEM identifiers for my NDATA entities.

Recall that every node in the source and stylesheet node trees has a "Base URI" being the URI of the entity in which the node is found when read by the XML processor inside the XSLT processor.

Recall also that if you do not supply a second argument to the document() function, then a relative first argument given as a string is resolved relative to the stylesheet node's base URI (hence, relative to the subdirectory in which the stylesheet is found). If you do supply a second argument to the document() function, a relative first argument is resolved relative to the base URI of the node in the supplied second argument.

In my situation I was using relative SYSTEM identifiers in my unparsed entity declaration and I was omitting the second argument ... thus, when the XSLT processor resolved my relative addresses, it was going relative to my stylesheet directory which is one level higher in my directory structure than my XML storage directory ... hence, my shared file of XML was never being found!

My fix was to change the call from "document($file)" to be "document($file,.)" which supplied my current XML source file node instead of the default current XSLT stylesheet file node ... and my relative addresses were then being resolved relative to my data directory and not relative to my stylesheet directory ... everything started to work and I *finally* understood that second argument for myself.

Ken later added:

To illustrate my understanding, please consider my example below. XT is being invoked with the test1.xsl stylesheet from the test1 subdirectory. This includes test2.xsl which happens to be in a sibling directory named test2. Both directories contain a file named "test.xml". The test reveals that all calls using either no second argument or a second argument using the document('') function as you describe above return information from the subdirectory in which the document() function is being executed. Only in the last case where the source tree node is supplied does the function refer to the original directory. I do not believe you can reference the root of the initially invoked stylesheet file using an expression executed in a referenced stylesheet file.



T:\ftemp\test1>type test.xml
<?xml version="1.0"?>
<test>
   This is test.xml from the test1 directory.
</test>
T:\ftemp\test1>type ..\test2\test.xml
<?xml version="1.0"?>
<test>
   This is test.xml from the test2 directory.
</test>
T:\ftemp\test1>type test1.xsl
<?xml version="1.0"?><!--filename.xsl-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:ken="ken"
                 version="1.0">

<ken:data>
   The file being accessed is test1.xsl.
</ken:data>

<xsl:include href="../test2/test2.xsl"/>

<xsl:template match="/">
   <xsl:call-template name="test2"/>
</xsl:template>

</xsl:stylesheet>

T:\ftemp\test1>type ..\test2\test2.xsl
<?xml version="1.0"?><!--filename.xsl-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:ken="ken"
                 version="1.0">

<ken:data>
   The file being accessed is test2.xsl.
</ken:data>

<xsl:template name="test2">
   <xsl:value-of select="document('')//ken:data"/>
   <xsl:value-of select="document('test.xml')"/>
   <xsl:value-of select="document('test.xml', document(''))"/>
   <xsl:value-of select="document('test.xml',.)"/>
</xsl:template>

</xsl:stylesheet>

T:\ftemp\test1>xt test1.xsl test1.xsl
<?xml version="1.0" encoding="utf-8"?>

   The file being accessed is test2.xsl.

   This is test.xml from the test2 directory.

   This is test.xml from the test2 directory.

   This is test.xml from the test1 directory.

Michael Gruber later added an example using just the one parameter.

   document (test.xsl, test.xml)
         |
         |---- input (test.xml)
         |
         |---- ent (test.xml)


*******document\test.xml
<?xml version="1.0"?>
 <test>
   This is test.xml from the document directory.
 </test>

*******D:\TEST\document\input\test.xml

<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY testentity SYSTEM "../ent/test.xml" >
                ]>
<root>
 <node>test.xml</node>
 <test>
   This is test.xml from the input directory.
 </test>
 &testentity;
</root>

*******D:\TEST\document\ent\test.xml
<ent>
 <node-enti>test.xml</node-enti>
 <test>
   This is test.xml from the enti directory.
 </test>
</ent>


*******document\test.xsl

<?xml version="1.0" ?>
<!DOCTYPE xsl:stylesheet>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="/">
<xsl:value-of select="document('test.xml')//test"/>
<xsl:value-of select="document(root/node)//test"/>
<xsl:value-of select="document(root/ent/node-enti)//test"/>
<xsl:value-of select="document(root/ent/node-enti, .)//test"/>
<xsl:value-of select="document('test.xml', document(''))//test"/>
<xsl:value-of select="document('test.xml',.)//test"/>
<xsl:value-of select="document('test.xml',root/ent/node-enti)//test"/>
</xsl:template>

</xsl:stylesheet>


D:\TEST\document>type result.txt
<?xml version="1.0" encoding="utf-8"?>

   This is test.xml from the document directory.
   This is test.xml from the input directory.   <---!!!!!
   This is test.xml from the enti directory.    <---!!!!!
   This is test.xml from the input directory.
   This is test.xml from the document directory.
   This is test.xml from the input directory.
   This is test.xml from the enti directory.

18.

Accessing 2 xml files in xsl

Steve Muench

Consider XML files a.xml, b.xml, and c.xml below

<!-- a.xml -->
<row>
  <ename>Steve</ename>
</row>

<!-- b.xml -->
<row>
  <ename>Francis</ename>
</row>

<?xml version="1.0"?>
<!-- c.xml -->
<!DOCTYPE rowset [
  <!ENTITY a SYSTEM "a.xml">
  <!ENTITY b SYSTEM "b.xml">
]>
<rowset>
 &a;
 &b;
</rowset>

The stylesheet below illustrates two ways to "process both a.xml and b.xml files". The first way uses "c.xml", which includes a.xml and b.xml as external entities to "glue" them together. The second uses the XPath | (union) operator to combine the "a.xml" and "b.xml" docs. The result of doing:

$ oraxsl anything.xml stylesheetbelow.xsl

is:

Steve
Francis
Steve
Francis

Hope this gives some ideas...

<?xml version="1.0"?>
<xsl:stylesheet 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:apply-templates select="document('c.xml')//rowset"/>
    <xsl:apply-templates select="  document('a.xml')//row
                                 | document('b.xml')//row"/>
  </xsl:template>

  <xsl:template match="rowset">
    <xsl:apply-templates select="row"/>
  </xsl:template>

  <xsl:template match="row">
    <xsl:value-of select="ename"/>
    <xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

19.

Combining XML data from different files

Jeni Tennison

One of the difficulties lies in identifying the 'same node' in two different documents. I can think of two main ways of saying that two nodes in different documents are the 'same node':

1. they are in the same position (relative to other nodes)
2. they have the same identifier

How you go about merging and reporting on differences between the two XML documents really depends on which of these can be used.

If it is the first (same position), then you're right that you have to keep track of the 'context node' in the two documents. It's easy to keep track of one of the context nodes (the processor does that for you), but the other will have to be passed from template to template, making especially sure that it is never lost through the use of the built-in templates. So, with document1.xml and document2.xml as the two documents, and document1.xml being the input, something like:

<xsl:template match="/">
  <xsl:variable name="doc2node" select="document('document2.xml')" />
  <xsl:for-each select="*">
    <xsl:variable name="index" select="position()" />
    <xsl:apply-templates select=".">
      <xsl:with-param name="doc2node" select="$doc2node/*[position() = $index]" />
    </xsl:apply-templates>
  </xsl:for-each>
</xsl:template>

<xsl:template match="*">
  <xsl:param name="doc2node" />
  <!-- do your element comparison here -->
  <xsl:for-each select="@*">
    <xsl:variable name="name" select="name()" />
    <xsl:apply-templates select=".">
      <xsl:with-param name="doc2node" select="$doc2node/@*[name() = $name]" />
    </xsl:apply-templates>
  </xsl:for-each>
  <xsl:for-each select="*">
    <xsl:variable name="index" select="position()" />
    <xsl:apply-templates select=".">
      <xsl:with-param name="doc2node" select="$doc2node/*[position() = $index]" />
    </xsl:apply-templates>
  </xsl:for-each>
</xsl:template>

<xsl:template match="@*">
  <xsl:param name="doc2node" />
  <!-- do your attribute comparison here -->
</xsl:template>

If, on the other hand, you have a structure in which elements can be individually identified somehow, then you don't have to keep track of where you are in the second document all the time - you can just index into it to get the node to compare. So, say you had two documents, each of which had a load of elements with @id attributes on them, you could have something like:

<xsl:key name="ided-nodes" match="*[@id]" use="@id" />

<xsl:template match="*[@id]">
  <xsl:variable name="doc1node" select="." />
  <xsl:for-each select="document('document2.xml')">
    <xsl:variable name="doc2node" select="key('ided-nodes', $doc1node/@id)" />
    <!-- do your comparison between $doc1node and $doc2node here -->
  </xsl:for-each>
</xsl:template>

Both of these approaches take document1.xml as the primary document - it is specified as the input and is used as the basis of the comparison. You'd have to do something a bit more complicated if you wanted to do the comparison the other way as well (e.g. identify the elements that exist in document2 but not in document1).
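
As a rough sketch of that reverse check (not part of the original answer), the stylesheet below walks the elements with an id in document2.xml and reports those with no counterpart in the input document. It reuses the ided-nodes key from above; the $doc1 variable and the message wording are illustrative assumptions. Because key() only searches the document that holds the context node, the inner xsl:for-each switches the context back to document1.xml before the lookup.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="ided-nodes" match="*[@id]" use="@id" />

  <xsl:template match="/">
    <!-- remember the root node of the input document (document1.xml) -->
    <xsl:variable name="doc1" select="/" />
    <xsl:for-each select="document('document2.xml')//*[@id]">
      <xsl:variable name="id" select="@id" />
      <xsl:variable name="name" select="name()" />
      <!-- change context to document1.xml so that key() looks there -->
      <xsl:for-each select="$doc1">
        <xsl:if test="not(key('ided-nodes', $id))">
          <xsl:message>
            <xsl:text>Only in document2.xml: </xsl:text>
            <xsl:value-of select="$name" />
            <xsl:text> id=</xsl:text>
            <xsl:value-of select="$id" />
          </xsl:message>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>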

20.

document doesnt pick up a parameter

Jeni Tennison

>The XSL statement "with-param" doesn't seem to be carried over to processing of
>nodesets from "document(...)" calls.

When you use document(), it gives you the *root node* of the document that's named. So when you do:

<xsl:apply-templates select="document('incoming.xml')">
  <xsl:with-param name="blah" select="555"/>
</xsl:apply-templates>

you're applying templates to the root node (/) of 'incoming.xml'. In your case, you have no template that explicitly matches the root node, so the built-in template matches instead:

<xsl:template match="*|/">
  <xsl:apply-templates />
</xsl:template>

You'll notice that the built-in template neither declares any parameters nor passes any on in its xsl:apply-templates. Any parameters you pass in are therefore silently dropped, which is why they appear not to be passed.

Rather than applying templates to the root node of the document, you want to apply templates to the document element ('matchme'):

<xsl:apply-templates select="document('incoming.xml')/matchme">
  <xsl:with-param name="blah" select="555"/>
</xsl:apply-templates>
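
For the value to be of any use, the template that matches the document element also has to declare the parameter. A minimal sketch, assuming the 'matchme' element from the question; the default value and what is done with $blah are illustrative only:

<xsl:template match="matchme">
  <!-- must be declared here, or the value passed by with-param is ignored -->
  <xsl:param name="blah" select="0" />
  <xsl:value-of select="$blah" />
</xsl:template>

If you would rather keep applying templates to the root node, a root template in its own mode can pass the parameter on explicitly. The mode name 'incoming' is made up for this sketch, and the matchme template would then need the same mode:

<xsl:apply-templates select="document('incoming.xml')" mode="incoming">
  <xsl:with-param name="blah" select="555" />
</xsl:apply-templates>

<xsl:template match="/" mode="incoming">
  <xsl:param name="blah" />
  <!-- forward the parameter instead of relying on the built-in template -->
  <xsl:apply-templates select="*" mode="incoming">
    <xsl:with-param name="blah" select="$blah" />
  </xsl:apply-templates>
</xsl:template>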

21.

document(), location relative to XML file.

Jeni Tennison


<xsl:variable name="file2"
              select="document(/files/file[2]/@href, /)" />

[Note the second argument to document() ensures that the file names are resolved relative to your input.xml document rather than the stylesheet.]
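
To make that concrete, here is the kind of input the variable above assumes. The file names are invented for illustration; only the files/file/@href structure comes from the XPath in the variable.

<!-- input.xml (hypothetical content) -->
<files>
  <file href="first.xml" />
  <file href="data/second.xml" />
</files>

With this input, $file2 holds the root node of second.xml, looked up in the data directory next to input.xml, wherever the stylesheet itself lives.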

22.

relative paths in document calls

David Carlisle

> The document() function works only with an absolute path 
>(so it looks to me).

If you want a path relative to the source document (or to the current node, which usually has the same base URI), use document('file.xml', .) or document('file.xml', /).

23.

Relative location

Michael Kay

data.xml references a number of other XML data sources via relative references that are relative to the data.xml document rather than relative to the stylesheet that is being applied to data.xml (eg: <xsd:import namespace="namespaceURI" schemaLocation="referencedSchemaDocument.xsd"/>).

Unfortunately, these relative references are being processed as being relative to the location of the stylesheet.

Answer:

The document() function works relative to the source document if the first argument is a node-set, and relative to the stylesheet if it is a string. So you must have extracted the relative URI and turned it into a string before passing it to the document() function. Sometimes you need to do this, because you need to do a computation on the value. In this case the second argument of document() can help you: supply the root node of the source document, and that will be used as the base URI.
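
A minimal sketch of the computed-URI case, assuming the xsd:import elements from the question. The 'schemas/' prefix is a made-up computation, there only to force the URI into a string; the xsd prefix is bound to the usual XML Schema namespace.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="xsd:import">
    <!-- concat() returns a string, so the attribute's own base URI is lost -->
    <xsl:variable name="loc" select="concat('schemas/', @schemaLocation)" />
    <!-- the second argument (/) supplies the source document as the base URI -->
    <xsl:apply-templates select="document($loc, /)/*" />
  </xsl:template>

</xsl:stylesheet>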

24.

Checking if document() is working

David Carlisle

 <xsl:apply-templates
   select="document(http://localhost/econtent/Content.do?state=resource&amp;resource=250 '')" />

Well, that would be a syntax error (which would be reported after the ':', as http:// isn't valid XPath).

The URL should be an XPath string literal, so:

 <xsl:apply-templates
   select="document('http://localhost/econtent/Content.do?state=resource&amp;resource=250')" />

which is possibly what you really have in your code, if not in your email.

That is OK, but does that URL really return an XML file? It's hard to tell from here.

Rather than do apply-templates, do

<xsl:message>The document is:
  <xsl:value-of
    select="document('http://localhost/econtent/Content.do?state=resource&amp;resource=250')" />
</xsl:message>

If it did return an XML file, you would throw the XSLT processor into an infinite loop: the first node you would find is the root node of the new document, which would again match your template for "/", which would again call the document function, and so on.

You typically need to start applying templates at the first element in the included doc, so:

 <xsl:apply-templates
   select="document('http://localhost/econtent/Content.do?state=resource&amp;resource=250')/*" />
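
Putting those points together, a defensive version might look like the sketch below. It assumes your processor returns an empty node-set rather than stopping when the URL cannot be fetched or parsed, which not every processor does; the variable name and the message text are illustrative.

<xsl:template match="/">
  <xsl:variable name="remote"
    select="document('http://localhost/econtent/Content.do?state=resource&amp;resource=250')" />
  <xsl:choose>
    <xsl:when test="$remote">
      <!-- start at the document element, not the root node, so "/" is not matched again -->
      <xsl:apply-templates select="$remote/*" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:message>document() returned no nodes; check that the URL serves well-formed XML.</xsl:message>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>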

25.

Document function with second parameter

Jeni Tennison

The second argument to the document() function is a *node-set* and the base URI (used to resolve the first argument) is the base URI of the first node in that node set. The base URI of a node is the location in which it originated. Usually you'd pass the context node as the second argument, something like:

<xsl:template match="xi:include">
  <xsl:copy-of select="document(@href, .)" />
</xsl:template>

to ensure that the first argument is resolved relative to the XML document in which the URI is specified rather than relative to the stylesheet itself.

(Also, the first argument can be a node-set rather than a single URI, in which case you get a node-set containing the root nodes of all the documents referenced.)

From The XSLT Programmer's Reference, 2nd Edition, pg. 466: strictly speaking, the first parameter is a URI (in my case, a file name) and the second parameter supplies the base URI used to resolve any relative reference contained in the first. If the first parameter contains no relative reference, the second parameter is not necessary.
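
As an illustration of the node-set form (the list and item element names and the merged wrapper are invented for this sketch), a single call can pull in every referenced document, each URI being resolved against the base URI of the attribute node that holds it:

<xsl:template match="/list">
  <merged>
    <!-- document() returns one root node per distinct document named by an item/@href -->
    <xsl:copy-of select="document(item/@href)/*" />
  </merged>
</xsl:template>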

26.

Security and the document() function

Michael Kay



> I don't get it. I hear there are security issues with the document()
> function, but I don't see how that could be possible. Since document() only
> reads an XML file for further processing, how can this be any worse than
> using wget to download a file? I must be missing something...

Here is one scenario where the document() function can be a risk. You write a servlet to do transformations, that accepts URLs for the source document and the stylesheet as query parameters. Like the one at http://www.w3.org/2001/05/xslt, for example. Someone calls this servlet supplying http://www.evil.com/malicious.xsl as the stylesheet. You execute this untrusted stylesheet on your machine. It calls the document() function with a URL of file:///usr/victim/data.xml, and returns the contents of a data file residing on the machine where the transformation took place.

Allowing an untrusted stylesheet to run on your machine is like running any other untrusted code on your machine; you have no idea what damage it might do.

An even bigger risk, of course, is that the untrusted stylesheet will call arbitrary Java extension functions. The W3C servlet cited above runs with a version of xt that has been modified to prevent extension functions being executed. The modification was only done after I demonstrated to them how it could be exploited.
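
To see how little it takes, the untrusted stylesheet in the document() scenario need be no more than the sketch below. The file URL is the one from the scenario; everything else is illustrative.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- copies a file local to the transforming machine into the result returned to the caller -->
    <xsl:copy-of select="document('file:///usr/victim/data.xml')" />
  </xsl:template>
</xsl:stylesheet>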

27.

relative document problems

Ken Holman


>Here's my document list xml file:
>...
><nAnnualReport filename="FSUSA00386-FOUSA00H0A.xml" />

I note that you are using relative URI values for your filenames.

>Here's partial of xsl:
>...
>   <xsl:copy-of select="document(@filename)//Date"/>

I note that you are omitting the second argument to document(), thus relative URI values are going to be resolved relative to the stylesheet fragment from which the element with the function call was read.


>It seems document() doesn't work at all.

Perhaps the processor is not finding your data files in the same subdirectory as your stylesheet files.

>I can't figure out what's wrong.

Relative URI address resolution is based on the presence/absence of the second argument. If your data files for document() are in the same subdirectory as the start of the XML fragment then use:

   document( @filename, / )

or if you are using external parsed general entities and it is relative to the element with the filename attribute, use:

   document( @filename, . )

28.

relative to ....

Michael Kay

It's relevant to know how you invoked the transformation (because the base URIs of source document and stylesheet depend on this).

Remember that document(.) interprets the URI relative to the source document, while document(string(.)) interprets it relative to the stylesheet.

> what is actually the difference between "the source document" 
> and the "stylesheet"? 

> should an xslt processor 
> resolve the URI parameter of a document function call - e.g. 
> document('somedir/somefile.xml') - normally relative to the stylesheet 
> in which the call is in?

The rules for the document() function are explicit: if the first argument is a node-set, the URI is resolved relative to the node (typically in a source document) that contains the URI in question; if the argument is a string, then the URI is resolved relative to the base URI of the element in the stylesheet containing the call to document().
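
Side by side, assuming a hypothetical source element such as <ref>somedir/somefile.xml</ref> (the element name and the mode name are made up for this sketch):

<xsl:template match="ref">
  <!-- node-set argument: resolved against the base URI of this ref element -->
  <xsl:apply-templates select="document(.)/*" />
</xsl:template>

<xsl:template match="ref" mode="from-stylesheet">
  <!-- string argument: resolved against the base URI of the stylesheet element containing this call -->
  <xsl:apply-templates select="document(string(.))/*" />
</xsl:template>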

29.

Matching by id in external document()

David Carlisle

This template works - it accesses the correct target element in the external document based on its id attribute matching the suffix of the local srcfile attribute.

<xsl:template match="jump[@srcfile]">
  <xsl:variable name="jfil">
    <xsl:value-of select="substring-before(@srcfile,'#')" />
  </xsl:variable>
  <xsl:variable name="jid">
    <xsl:value-of select="substring-after(@srcfile,'#')" />
  </xsl:variable>
  <xsl:copy>
    <xsl:apply-templates select="@*[local-name() != 'text']"/>
    <xsl:attribute name="text">
      <xsl:value-of select="document($jfil,/)//target[@id = $jid]" />
    </xsl:attribute>
  </xsl:copy>
</xsl:template>

In XSLT 1.0 you can also use id():

<xsl:for-each select="document($jfil,/)">
  <xsl:value-of select="id($jid)" />
</xsl:for-each>
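
Note that id() only finds elements whose id attribute is declared as type ID in a DTD that the parser actually reads. Where the external document has no such DTD, a key gives the same effect. A sketch, reusing $jfil and $jid from the template above; the key name is made up:

<!-- top level of the stylesheet -->
<xsl:key name="target-by-id" match="target" use="@id" />

<!-- inside the jump[@srcfile] template -->
<xsl:for-each select="document($jfil, /)">
  <!-- key() searches the document holding the context node, here the external one -->
  <xsl:value-of select="key('target-by-id', $jid)" />
</xsl:for-each>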