DTD

1. Where can I find the XSLT DTD
2. Mathml and XHTML in IE browser
3. How can I access the DTD declaration
4. Use of a DTD

1.

Where can I find the XSLT DTD

John E Simpson

There can't be one for all cases.

>I'll never be able to validate ANY of my XSL doc?

No... unless you do as suggested, and create an application-specific DTD for use in validating your stylesheet. This can be quite complicated; if XHTML were the result tree's vocabulary, for instance, you'd have to allow for the appearance of just about any XHTML element as a child of just about any XSLT element.

As someone else said, almost no one bothers checking XSLT stylesheets for validity -- well-formedness is all right, as long as the XSLT processor (XT, SAXON, whatever) detects syntax and other XSLT-specific errors. Validity in the XML sense is not critical for XSLT. Actually, I'd guess that absolutely no one bothers to check validity of stylesheets; the "almost" is just a hedge. :)
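As a rough illustration of the "well-formedness only" check described above, here is a minimal sketch using Python's standard library. The helper name `is_well_formed` and both sample stylesheets are made up for the example. It only confirms that the stylesheet parses as XML; XSLT-specific errors are still left for the XSLT processor to report.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical stylesheet mixing XSLT instructions with XHTML literal
# result elements.
GOOD_STYLESHEET = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <html><body><p><xsl:value-of select="title"/></p></body></html>
  </xsl:template>
</xsl:stylesheet>
"""

# The same stylesheet with a mismatched end tag, so it is not well-formed.
BAD_STYLESHEET = GOOD_STYLESHEET.replace("</p>", "</div>")

print(is_well_formed(GOOD_STYLESHEET))
print(is_well_formed(BAD_STYLESHEET))
```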

Joe English adds

Validators usually give better error messages than XSLT processors, which is helpful for catching gross structural errors.

Plus, in cases where the stylesheet makes heavy use of literal result elements, this can go a long way towards semantically validating the stylesheet (that is, making sure that the stylesheet produces valid result documents).

However, constructing a DTD against which to validate the stylesheet in this case can be a bit tricky. It's usually not hard to customize the XSLT DTD fragment:

<!ENTITY % xsl.dtd SYSTEM "xslt.dtd">
<!ENTITY % html.dtd PUBLIC
    "-//W3C//DTD XHTML 1.0 Strict//EN" "/dev/null">
%html.dtd;
<!ENTITY % result-elements "%inline; | %block;" >
%xsl.dtd;

but the target DTD *also* has to be parameterized in order to allow XSL instructions inside literal result elements! This isn't difficult either if you "cheat" and use an SGML parser for validation; inclusion exceptions fit the bill nicely here.

2.

Mathml and XHTML in IE browser

David Carlisle



  Well afaict there is something wrong with the dtd - but it's far too big
  to decipher :)

hey watch it, that's "my" DTD:-)

Actually there is something wrong with the DTD (and IE wouldn't read that one anyway as there is something wrong with IE)

the correct URI is http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd


> The xhtml will display in your browser because most likely it makes no 
> attempt to get the dtd (I don't know about Mozilla but IE certainly 
> doesn't use an xml parser for xhtml).

IE will only render XHTML+MathML if it is served with an XML MIME type, and if it is so served it will use an XML parser and will use the specified DTD. For this reason it is a good idea not to specify the DTD in the files being served: the XHTML+MathML DTD is rather large, and the time taken to download it can have a very significant effect on rendering time.


> Are you sure that the dtd really is an xml dtd?

That was addressed to the original poster, but _I'm_ sure of that:-)

The error message in the subject line is in fact caused by a bug in IE's parser, which cannot read the original version. The version at the specified location has an "equivalent" setup of parameter entities that does work in IE, although you need IE6 SP1; otherwise you get a different error, as older versions of IE used an XML parser that would not accept any character references above hex FFFF.

As I say above, the simple solution is not to reference the DTD at all.

3.

How can I access the DTD declaration

Andrew Welch


Bottom line, you can't. 

But! By capturing input events and modifying the source, the entity information can be passed through to the output for further processing. See this for the detail:

Converting cdata sections into markup
Preserving entity references
Preserving the doctype declaration
Marking up comments

Quite versatile.
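To make "capturing input events" a little more concrete, here is a minimal sketch in Python rather than in the tool referred to above. It assumes the standard library's expat-based SAX parser and its lexical-handler property; the class `LexicalRecorder`, the function `lexical_events`, and the sample document are invented for the example. It records the doctype declaration, comments, and CDATA section boundaries, all of which are gone by the time a stylesheet sees the tree.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    property_lexical_handler,
)


class LexicalRecorder:
    """Collects lexical events that the XSLT data model does not expose."""

    def __init__(self):
        self.events = []

    def startDTD(self, name, public_id, system_id):
        self.events.append(("doctype", name, public_id, system_id))

    def endDTD(self):
        pass

    def comment(self, text):
        self.events.append(("comment", text))

    def startCDATA(self):
        self.events.append(("cdata-start",))

    def endCDATA(self):
        self.events.append(("cdata-end",))


def lexical_events(xml_bytes):
    """Parse xml_bytes and return the recorded lexical events."""
    recorder = LexicalRecorder()
    parser = xml.sax.make_parser()
    # Do not try to load the external DTD; only the declaration is wanted.
    parser.setFeature(feature_external_ges, False)
    parser.setContentHandler(ContentHandler())
    parser.setProperty(property_lexical_handler, recorder)
    parser.parse(io.BytesIO(xml_bytes))
    return recorder.events


# Hypothetical input document.
SAMPLE = b'<!DOCTYPE doc SYSTEM "doc.dtd">\n<doc><!-- note --><![CDATA[<b>]]></doc>'

for event in lexical_events(SAMPLE):
    print(event)
```

A pre-processing step along these lines could write the captured information back into the document as ordinary elements, which a stylesheet can then match.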

4.

Use of a DTD

Michael Sperberg-McQueen

Ednote: This arose out of the use of the DTD reference in SVG instances.

(1) The presence of a DOCTYPE declaration does not, in principle, mean that the external DTD file must be dereferenced, though that is often the effect in practice.

The URI "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" given as the system identifier for the DTD must be consulted by any processor performing DTD-based validation on the data. The presence of a DOCTYPE declaration does not constitute an instruction to validate the document, and in principle it would be good if processors like Firefox allowed you to specify whether you want validation performed or not. But in practice, many programs don't provide that kind of user control; instead they assume that if a DOCTYPE declaration is present, they must or should validate the document. For such programs, a request that they read a particular document amounts in effect to an instruction that they should validate it, too, if a DOCTYPE declaration is present.
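As one data point on "a DOCTYPE declaration is not an instruction to validate", here is a small sketch using Python's `xml.dom.minidom`. As far as I know, its expat-based parser is non-validating and does not retrieve the external DTD by default. The element name `not-in-dtd` is made up: it stands for anything the SVG DTD does not declare.

```python
from xml.dom import minidom

# SVG instance whose DOCTYPE names the W3C DTD. <not-in-dtd> is a made-up
# element that DTD-based validation would reject.
SVG_DOC = """\
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
  "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg xmlns="http://www.w3.org/2000/svg" version="1.1">
  <not-in-dtd/>
</svg>
"""

doc = minidom.parseString(SVG_DOC)

# The declaration is recorded on the document...
print(doc.doctype.publicId)
print(doc.doctype.systemId)

# ...but nothing was validated: the undeclared element is still in the tree.
print(len(doc.getElementsByTagName("not-in-dtd")))
```

Other processors may behave differently, which is exactly the "it depends on the program" point.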

Note that a program validating the document may or may not actually hit the network: the authoritative source for the document is the server identified, but if your system has a caching proxy and the DTD is in the cache, there will not necessarily be any network traffic. And software built to work with documents of a particular kind may have and consult a locally cached copy of the DTD instead of retrieving it from the network. In the case of DTDs served from W3C servers, the DTDs change very infrequently and the expiration dates are set to encourage local caching; experience on those servers shows that surprising numbers of programs and packages are willing to request the same resource thousands of times in the same minute, whether the requests succeed or fail. When this happens frequently, it can place a bit of a strain on the server involved, so well behaved software should arrange for some kind of local cache.
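One way software can "arrange for some kind of local cache" is an entity resolver that answers known system identifiers from local copies. The sketch below uses Python's SAX `EntityResolver` interface. The class name `LocalCatalogResolver`, the in-memory catalog, and the placeholder DTD bytes are assumptions for illustration, not how any particular product does it.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource

SVG_DTD_URI = "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"


class LocalCatalogResolver(EntityResolver):
    """Serve known system identifiers from local copies."""

    def __init__(self, catalog):
        # catalog maps a system identifier to the bytes of a local copy.
        self.catalog = catalog

    def resolveEntity(self, publicId, systemId):
        local_copy = self.catalog.get(systemId)
        if local_copy is None:
            # Unknown identifier: fall back to the default behaviour of
            # returning the system identifier for the parser to open.
            return systemId
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(local_copy))
        return source


# Placeholder bytes standing in for a real local copy of the DTD.
resolver = LocalCatalogResolver({SVG_DTD_URI: b"<!-- local copy of svg11.dtd -->"})
cached = resolver.resolveEntity("-//W3C//DTD SVG 1.1//EN", SVG_DTD_URI)
print(cached.getByteStream().read())
```

A resolver like this would be handed to a SAX parser with `setEntityResolver`; the example calls it directly to keep the sketch short.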

See W3C for a more complete account of some relevant issues.

(2) Many programs will fail gracefully (or relatively gracefully) if they can't get to the DTD.

Many programs which attempt validation whenever they see a DOCTYPE declaration will shrug their shoulders and proceed without validation if they don't succeed in retrieving the required external resources (such as the DTD). The logic of this behavior is not completely clear (if you think validation is required, why would you proceed anyway if you can't perform validation?), but it's not uncommon.

(3) Namespace names serve purposes of uniqueness and documentation. They will seldom need to be dereferenced.

The URIs "http://www.w3.org/2000/svg" and "http://www.w3.org/1999/xlink" in your sample graphic identify certain constructs in the XML as being in the SVG or the XLink namespaces, respectively. The crucial effect of this is to ensure that when the same local name is used in two different namespaces, markup can reliably be assigned to one or to the other. There is no need to dereference the namespace URI in order for software to perform that function.

Any software responsible for processing a particular vocabulary will need to know, given an element named (for example) "desc", whether it's the "desc" element they know about (e.g. the SVG desc), or some other "desc" element (any desc in any other namespace). That also does not require that the URI be dereferenced; software built to process SVG, for example, will almost certainly have the SVG URI hard-coded into it somewhere.
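The point that a namespace name is only compared, never fetched, can be seen in a short sketch with Python's ElementTree. The second namespace, `urn:example:notes`, and the sample text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hard-coded, as it would be in software built to process SVG.
SVG_NS = "http://www.w3.org/2000/svg"

# Two elements share the local name "desc" but sit in different namespaces.
DOC = """\
<svg xmlns="http://www.w3.org/2000/svg" xmlns:my="urn:example:notes">
  <desc>A red circle</desc>
  <my:desc>Internal note</my:desc>
</svg>
"""

root = ET.fromstring(DOC)

# ElementTree spells a qualified name as "{namespace-uri}local-name".
svg_descs = [el.text for el in root.iter("{%s}desc" % SVG_NS)]
other_descs = [
    el.text
    for el in root.iter()
    if el.tag.endswith("}desc") and not el.tag.startswith("{%s}" % SVG_NS)
]

print(svg_descs)
print(other_descs)
```

The URI is treated purely as a string to compare against; nothing in this code opens a network connection.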

On the other hand, namespace documents are occasionally used to provide links (e.g. via a RDDL document) to relevant resources, e.g. schema documents in various schema languages. And so software may occasionally dereference a namespace URI to see if it can find relevant resources there.

And of course if a human is trying to understand what this SVG stuff is, then they might do worse than dereference the URI to see if it provides any useful human-readable information, or pointers to such information. (The SVG and XLink URIs do in fact do this.)


> Three of the applications, Firefox, InkScape and Adobe CS3, care about the name of the xmlns URL.

They should: they include special code to process SVG, and that code should work on SVG elements and attributes but not on random markup in other namespaces.

> Something other than www.w3.org trips them up. Antenna House and Saxon don't seem to care.

Saxon, not being an SVG processor, will almost certainly not care what namespace URI is used. But if the namespace URI in the input document and the one in the stylesheet don't match, you are unlikely to be getting the transformation you had in mind.
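The effect of a mismatched namespace URI can be sketched outside XSLT as well. In this Python example the query plays the role of the stylesheet's match pattern. The function `count_circles` and the URI `http://intranet.example/svg` are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def count_circles(xml_text):
    """Count circle elements that are in the SVG namespace."""
    root = ET.fromstring(xml_text)
    return len(root.findall(".//svg:circle", {"svg": SVG_NS}))


STANDARD = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>'
# Hypothetical in-house URI substituted for the W3C one.
RENAMED = '<svg xmlns="http://intranet.example/svg"><circle r="5"/></svg>'

print(count_circles(STANDARD))
print(count_circles(RENAMED))
```

Both documents parse without complaint, but only the first contains anything the query recognises as SVG.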

I don't know why Antenna House behaves as it does.

> With the <!DOCTYPE> declaration I can reference www.w3.org as above, or reference an internal network URL, or drop the declaration altogether, and none of the applications perform differently. All of this is, of course, anecdotal data at best. It would be great to know for sure what is going on.

It sure would :)

> My question: Is there ever an attempt to make an external reference to www.w3.org from either the <!DOCTYPE> declaration or the xmlns reference?

I hope the details above help a bit, even though the answer is a rather disappointing "it depends on the program". Most XML specs work very hard to provide a declarative semantics for what they define, and the result is that conforming software has a fair bit of leeway as to what it does in particular cases.

If your organization is worried about things not working if the network goes down, I think your experiments show that that worry is not well founded. I think you would be best advised not to try to strip out the references to external resources.