The State of MathML : Mathematically Speaking (and Stuttering)
The Emperor has New Clothes : HTML Recast as an XML Application
XML Namespaces : Universal Identification in XML Markup
Published on: Sunday 21st November 1999 By: Pankaj Kamthan
In the last year, XML has emerged as a universal syntax for marking up documents to be served and received on the Web. An XML document, as implied by its data model, consists of a tree of elements. Each element has an element type name and a set of attributes. Applications of XML, where a single XML document may contain elements and attributes that are defined by multiple languages, occur in various contexts. With the distributed nature of the Web, such documents, containing multiple markup vocabularies, pose problems of recognition and collision for processing software. This consideration requires that document constructs should have universal names, whose scope extends beyond their containing document. The mechanism of XML namespaces accomplishes this.
In this article, from the authoring viewpoint, we restrict ourselves to the discussion of the following questions:
We assume that the reader has some background in XML and, although not required, familiarity with some XML applications is recommended. A Technical Introduction to XML and the XML FAQ provide a good starting point.
The next two examples illustrate the nature of the problem.
Example 1. Consider the following example:
<books>
  <title>Book Database by Subject</title>
  <typography>
    <author title="Dr" name="Knuth, Donald" />
    <book title="Digital Typography" isbn="1575860104" pages="720" price="$49.95" year="1999" />
    <publisher name="CSLI Publications" country="USA" />
  </typography>
</books>
In this example, there are three occurrences of the name "title" and two occurrences of the name "name" within markup. This leads to potential conflict and provides insufficient information to allow correct processing by software.
Example 2. The following is a fragment of an XML document which is to be displayed using a CSS stylesheet:
<theatre>
  <reservation>
    <name html:class="font1">Kepler, Johannes</name>
    <seat class="A" html:class="VIP">101</seat>
    <name html:class="font2">Kamthan, Pankaj</name>
    <seat class="B" html:class="General">201</seat>
  </reservation>
</theatre>
In this case, the occurrences of the class attribute are semantically different. Since XML 1.0 does not provide a built-in way to declare "global" attributes, this again results in a potential conflict.
The XML namespace mechanism resolves these conflicts by extending the XML data model to allow element type names and attribute names to be qualified with a Uniform Resource Identifier (URI). Thus, a document that describes the title of a person can use title qualified by one URI, and a document that describes the title of a book can use title qualified by another URI.
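To make the idea concrete, here is a minimal sketch, in Python, of how a namespace-aware parser keeps the two kinds of title apart. The two namespace URIs and the record element are hypothetical, invented only for this illustration; the standard-library xml.etree.ElementTree module is used simply because it reports each name together with its namespace URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, chosen only for this sketch.
PERSON_NS = "http://www.example.org/person"
BOOK_NS = "http://www.example.org/book"

doc = f"""
<record xmlns:p="{PERSON_NS}" xmlns:b="{BOOK_NS}">
  <p:title>Dr</p:title>
  <b:title>Digital Typography</b:title>
</record>
"""

root = ET.fromstring(doc)

# ElementTree names each element as "{namespace URI}local part",
# so the two title elements can be looked up without ambiguity.
person_title = root.find(f"{{{PERSON_NS}}}title")
book_title = root.find(f"{{{BOOK_NS}}}title")

print(person_title.tag)  # {http://www.example.org/person}title
print(person_title.text, "/", book_title.text)
```

Both elements share the local part title, yet a lookup by qualified name never confuses them.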
The idea of XML namespaces has its predecessors. In the real world, address schemes are unique when identified in their entirety (starting from the name of the person to the country of residence). From a programming perspective, a Matlab program can include Fortran or C procedures, or a Java program can include a C procedure via the Java Native Interface (JNI). From the viewpoint of document publishing, XML namespaces have a distant similarity to the concept of Architectural Forms in SGML. For an early motivation for the namespace concept in extensible markup languages, see Web Architecture: Extensible Languages.
XML namespaces provide the following (overlapping) advantages:
An XML namespace is defined by:
In the following subsections, we will expand on each of the components in the definition.
The XML namespaces specification defines a mapping from an XML 1.0 tree, where element type names and attribute names are "local names," into a tree where element type names and attribute names can be "universal names." The mapping, as we see later, is based on the idea of a prefix.
A prefixed qualified name in XML namespaces contains a single colon, separating the name into a namespace prefix and a local part. The prefix must be associated with a namespace URI reference in a namespace declaration. The namespace is identified by a URI, either a Uniform Resource Locator (URL) or a Uniform Resource Name (URN), but it does not matter what (if anything) it points to. URIs are used simply because they are globally unique across the Internet, and thus help produce identifiers that are universally unique. The prefix functions only as a shorthand placeholder for a namespace name, that is, the URI. Because URIs can contain characters that are not allowed in XML names, the prefix is used as a proxy for the URI. The appendix lists namespace names associated with some XML vocabularies.
Example 3. Here is an example of a qualified name serving as an element type:
<books xmlns:b="http://www.foo.com/bar">
  <!-- The 'publisher' element's namespace is http://www.foo.com/bar -->
  <b:publisher>Addison Wesley</b:publisher>
</books>
The attribute "xmlns" is an XML keyword for a namespace declaration.
Example 4. Here is an example of a qualified name serving as an attribute name:
<books xmlns:b="http://www.foo.com/bar">
  <!-- The 'category' attribute's namespace is http://www.foo.com/bar -->
  <book b:category="research">Numerical Analysis of Partial Differential Equations</book>
</books>
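The claim that a prefix is only a shorthand for the URI can be checked with a short Python sketch using the standard xml.etree.ElementTree module. The namespace name is the one from Examples 3 and 4; the second prefix, bk, is an assumption made up for the illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.foo.com/bar"  # namespace name from Examples 3 and 4

# Same namespace name, two different prefixes ("bk" is made up for this sketch).
with_b = ET.fromstring(
    f'<books xmlns:b="{NS}"><b:publisher>Addison Wesley</b:publisher></books>'
)
with_bk = ET.fromstring(
    f'<books xmlns:bk="{NS}"><bk:publisher>Addison Wesley</bk:publisher></books>'
)

# The parser reports the universal name "{URI}local part"; the prefix is gone.
print(with_b[0].tag)
print(with_b[0].tag == with_bk[0].tag)

# A qualified attribute name (as in Example 4) is expanded the same way.
book = ET.fromstring(
    f'<books xmlns:b="{NS}"><book b:category="research">Numerical Analysis</book></books>'
)[0]
print(book.attrib)
```

The two publisher elements compare equal because only the URI and the local part take part in the universal name.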
The following constraints apply to prefixes in namespaces:
- Prefixes beginning with the three-letter sequence x, m, l, in that order, in any case combination, are reserved for use by XML and XML-related specifications.
- The namespace prefix, unless it is xml or xmlns, must have been declared in a namespace declaration attribute in either the start-tag of the element where the prefix is used or in an ancestor element.
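The requirement that a prefix be declared before use can be observed with a namespace-aware parser. The following Python sketch relies on the standard xml.etree.ElementTree module; the helper name parses and the shelf element are hypothetical and serve only the illustration.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if a namespace-aware parse of text succeeds."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Prefix b is used but never declared.
undeclared = "<books><b:publisher>CSLI Publications</b:publisher></books>"

# Prefix b is declared on an ancestor of the element that uses it.
declared_on_ancestor = (
    '<books xmlns:b="http://www.foo.com/bar">'
    "<shelf><b:publisher>CSLI Publications</b:publisher></shelf>"
    "</books>"
)

# The reserved prefix xml needs no declaration.
reserved_prefix = '<books xml:lang="en"/>'

print(parses(undeclared), parses(declared_on_ancestor), parses(reserved_prefix))
```

Only the first document is rejected; the other two satisfy the constraint.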
A namespace is declared using a family of reserved attributes. Such an attribute's name must either be xmlns or have xmlns: as a prefix. These attributes, like other XML attributes, may be provided either explicitly or by default. The attribute's value, a URI reference, is the namespace name identifying the namespace. The namespace name has the characteristic of being unique.
The namespace declaration applies to the element where it is specified and to all elements within the content of that element, unless overridden by another namespace declaration with the same attribute name.
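The scoping rule can be sketched in Python with the standard xml.etree.ElementTree module. The element names and the two URNs below are hypothetical; the point is only that an inner declaration of the same prefix overrides the outer one for its own subtree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: prefix a is declared twice with different URIs.
doc = """
<a:outer xmlns:a="urn:example:first">
  <a:inner xmlns:a="urn:example:second">
    <a:leaf/>
  </a:inner>
  <a:sibling/>
</a:outer>
"""

root = ET.fromstring(doc)

# Walk the tree in document order and show each expanded name.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The inner and leaf elements fall under the second URI, while sibling, which lies outside the overriding declaration, reverts to the first.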
Example 5. The following is an example of a namespace declaration, which associates the namespace prefix m with the namespace name http://www.w3.org/1998/Math/MathML:
<apply xmlns:m="http://www.w3.org/1998/Math/MathML">
  <!-- The 'm' prefix is bound to http://www.w3.org/1998/Math/MathML for the 'apply' element and its contents -->
</apply>
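The following conditions apply to attributes. A tag must not contain two attributes which:
1. have identical names, or
2. have qualified names with the same local part and with prefixes that have been bound to identical namespace names.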
Example 6. The next example shows how these conditions can be violated, resulting in an illegal element y:
<!-- http://www.foo.com is bound to n1 and n2 -->
<x xmlns:n1="http://www.foo.com" xmlns:n2="http://www.foo.com">
  <!-- y violates condition 1 -->
  <y a="1" a="2" />
  <!-- y violates condition 2 -->
  <y n1:a="1" n2:a="2" />
</x>
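A namespace-aware parser can be expected to reject both kinds of duplicate. The Python sketch below uses the standard xml.etree.ElementTree module (built on the expat parser); the helper parse_error and the contrasting URI http://www.bar.com are assumptions introduced for the illustration.

```python
import xml.etree.ElementTree as ET

def parse_error(text):
    """Return the parser's error message, or None if text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None

# Condition 1: two attributes with identical names.
same_name = '<x><y a="1" a="2"/></x>'

# Condition 2: same local part, prefixes bound to the same namespace name.
same_universal_name = (
    '<x xmlns:n1="http://www.foo.com" xmlns:n2="http://www.foo.com">'
    '<y n1:a="1" n2:a="2"/></x>'
)

# Legal contrast: same local part, but two different namespace names.
different_namespaces = (
    '<x xmlns:n1="http://www.foo.com" xmlns:n2="http://www.bar.com">'
    '<y n1:a="1" n2:a="2"/></x>'
)

for label, text in [
    ("condition 1", same_name),
    ("condition 2", same_universal_name),
    ("distinct namespaces", different_namespaces),
]:
    print(label, "->", parse_error(text))
```

The first two documents produce an error message, while the third parses because its two attributes have different universal names.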
With an explicit declaration, you can define a prefix to substitute for the full name of the namespace. You can then use this prefix to qualify elements belonging to that namespace. Explicit declarations are useful when a node contains elements from different namespaces.
Example 7. In the following explicit declaration, all names beginning with b: or money: are considered to be from the namespace urn:BooksAreUs.org:BookInfo or urn:Finance:Money, respectively. This example also shows that multiple namespace prefixes can be declared as attributes of a single element.
<?xml version="1.0"?>
<books>
  <b:book xmlns:b="urn:BooksAreUs.org:BookInfo" xmlns:money="urn:Finance:Money">
    <b:title>Digital Typography</b:title>
    <b:price money:currency="US Dollar">49.95</b:price>
  </b:book>
</books>
Example 8. In this example, the elements prefixed with b are associated with a namespace whose name is urn:BooksAreUs.org:BookInfo, while those prefixed with h are associated with the namespace http://www.w3.org/TR/REC-html40 that is used as the namespace name for HTML.
<?xml version="1.0"?>
<h:html xmlns:h="http://www.w3.org/TR/REC-html40" xmlns:b="urn:BooksAreUs.org:BookInfo">
  <h:head>
    <h:title>Typography</h:title>
  </h:head>
  <h:body>
    <h:p>Welcome to the world of typography! Here is a book that you may find useful.</h:p>
    <b:title h:style="font-family: sans-serif;">Digital Typography</b:title>
    <b:author>Donald Knuth</b:author>
  </h:body>
</h:html>
A default namespace applies to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element. The default namespaces do not apply to attribute names.
Example 9. In this example, all elements within the scope of the declaration (book, title, price) are from the namespace urn:BooksAreUs.org:BookInfo. The unprefixed currency attribute is not itself placed in that namespace; it is interpreted in the context of its price element.
<?xml version="1.0"?>
<books>
  <book xmlns="urn:BooksAreUs.org:BookInfo">
    <title>Digital Typography</title>
    <price currency="US Dollar">49.95</price>
  </book>
</books>
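The rule that a default namespace covers unprefixed elements but not attribute names can be observed by parsing the document of Example 9. This is a minimal sketch using Python's standard xml.etree.ElementTree module; it shows how one particular parser reports the names, not how any given application must treat them.

```python
import xml.etree.ElementTree as ET

# The document from Example 9.
doc = """<?xml version="1.0"?>
<books>
  <book xmlns="urn:BooksAreUs.org:BookInfo">
    <title>Digital Typography</title>
    <price currency="US Dollar">49.95</price>
  </book>
</books>"""

root = ET.fromstring(doc)
book = root[0]
price = book[1]

print(root.tag)      # books: outside the scope of the declaration
print(book.tag)      # {urn:BooksAreUs.org:BookInfo}book
print(price.tag)     # {urn:BooksAreUs.org:BookInfo}price
print(price.attrib)  # {'currency': 'US Dollar'}: no namespace on the attribute
```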
Example 10. In this example, all unprefixed elements are, by default, from the namespace http://www.w3.org/TR/REC-html40 that is used as the namespace name for HTML. The unprefixed style attribute is not in any namespace:
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/TR/REC-html40" xmlns:b="urn:BooksAreUs.org:BookInfo">
  <head>
    <title>Typography</title>
  </head>
  <body>
    <p>Welcome to the world of typography! Here is a book that you may find useful.</p>
    <b:title style="font-family: sans-serif;">Digital Typography</b:title>
    <b:author>Donald Knuth</b:author>
  </body>
</html>
The default namespace can be set to the empty string. If the URI reference in a default namespace declaration is empty, then unprefixed elements in the scope of the declaration are not considered to be in any namespace. This could be useful if we have declared a namespace initially, but want to "free" some of the elements from such a binding later.
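As a hedged illustration of "freeing" elements from a default namespace, the following Python sketch parses a small document in which an inner element sets xmlns to the empty string. The note and text elements are hypothetical additions to the book vocabulary used earlier; xml.etree.ElementTree is the standard-library parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: note resets the default namespace to the empty string.
doc = """
<books xmlns="urn:BooksAreUs.org:BookInfo">
  <book>
    <title>Digital Typography</title>
    <note xmlns="">
      <text>Out of print</text>
    </note>
  </book>
</books>
"""

root = ET.fromstring(doc)

# Expanded names in document order; names without braces are in no namespace.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The note element and its child text are reported without any namespace, while the elements outside note keep the default one.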
XML documents using multiple vocabularies can be authored like any other XML documents. XML Spy is a commercial XML editor with support for various features often required in XML authoring, including XML namespace support for both elements and attributes. Previously-used namespaces are "preserved" for later use.
Such XML authoring environments, however, may not be "sensitive" to specific XML application fragments embedded in a document, and may treat them as generic XML markup.
As a safe practice, all XML documents (with or without namespaces) that are authored should be checked for well-formedness. This is possible with any XML authoring environment, for example, XML Spy discussed above, which supports such a facility.
However, with an XML document using elements from different applications, validation becomes an issue. For a name "foo:bar," there is no standard way to validate that "bar" is a member of the namespace "foo," since there is no standardized mechanism for describing the vocabulary of which "bar" may or may not be a member. For example, an occurrence of "xhtml:table" does not by itself imply that what is being processed is in fact an XHTML element, as an assumption based on the prefix alone is at best a guess. Furthermore, namespace-sensitive validation would require associating each URI corresponding to a namespace name with some sort of schema (similar to a DTD) and being able to validate a document with respect to the schemas for all of the URIs. This is not yet possible, as DTD-based schemas have various limitations that prohibit such a possibility.
The XML Schema initiative aims to fix the problems associated with XML DTDs and, as a result, to provide namespace-sensitive validation. It attempts to provide a mechanism similar to SGML Architectural Forms (without being limited to the constraints of DTD syntax). Also, in this regard, Simple API for XML (SAX) version 2 has added support for XML namespaces. This, however, should not be taken to imply that there is any direct relationship between namespaces and schemas. A namespace is an abstract object with no necessary association between itself and anything. In particular, there is no necessary association between a namespace and a schema.
In previous sections, we have already seen examples where XML can be used in conjunction with HTML. In this section, we provide some further scenarios.
Example 11. A circle is one of the basic mathematical objects. Suppose we wish to include the symbolic as well as the graphical representation of a circle in an Extensible HyperText Markup Language (XHTML) 1.0 document. Using an entirely XML approach, we could do that by representing the equation of a circle in Mathematical Markup Language (MathML), a language for expressing mathematical notation in XML, and the corresponding graphics in Scalable Vector Graphics (SVG), a language for describing two-dimensional graphics in XML. We embed both MathML and SVG markup in an XHTML 1.0 document with the help of the corresponding XML namespaces, as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Equation of a Circle</title></head>
  <body>
    <p>The equation of a circle:</p>
    <!-- MathML Content Markup of the Equation of a Circle -->
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <reln><eq/>
        <apply><plus/>
          <apply><power/><ci>x</ci><cn>2</cn></apply>
          <apply><power/><ci>y</ci><cn>2</cn></apply>
        </apply>
        <cn>1</cn>
      </reln>
    </math>
    <p>can be graphically represented as:</p>
    <!-- SVG Graphic of a Circle -->
    <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd"
         width="250px" height="250px">
      <g><circle style="fill: none; stroke: black" cx="10" cy="10" r="100"/></g>
    </svg>
  </body>
</html>
On a renderer (currently not in existence to the author's knowledge) that supports XHTML, MathML and SVG, this should result in the following output:
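[Figure: the rendered page, showing the text "The equation of a circle:", the typeset equation, the text "can be graphically represented as:", and a drawing of a circle.]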
EzMath was used to author the equation, and the CSIRO SVG viewer to produce the image shown above.
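Although no renderer is assumed here, the way software could tell the three vocabularies of Example 11 apart can be sketched in Python. The document below is an abbreviated, hypothetical variant of Example 11 (the MathML content is shortened), using the namespace names listed in the appendix; namespace_of is a helper invented for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abbreviated variant of Example 11, with shortened MathML content.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Equation of a Circle</title></head>
  <body>
    <p>The equation of a circle:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <reln><eq/><ci>x</ci><cn>1</cn></reln>
    </math>
    <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd">
      <g><circle cx="10" cy="10" r="100"/></g>
    </svg>
  </body>
</html>"""

def namespace_of(tag):
    """Extract the URI from "{URI}local"; names in no namespace give ""."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

# Count the elements belonging to each vocabulary.
counts = Counter(namespace_of(el.tag) for el in ET.fromstring(doc).iter())
for uri, count in counts.items():
    print(count, uri)
```

A processor could use the same grouping to hand each subtree to the component that understands its vocabulary.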
There may be times when a need arises to translate a set of XML files to XHTML 1.0, say, for presenting them to HTML 4.0 user agents. An efficient way of doing that is to use XSL Transformations (XSLT), the "transformational" part of the Extensible Stylesheet Language (XSL).
Example 12. Here is an example of using XSLT to create an XHTML 1.0 document from a given XML document.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="sales.xsl" type="text/xsl"?>
<sales>
  <department id="A"><revenue>100</revenue><profit>5</profit></department>
  <department id="B"><revenue>200</revenue><profit>15</profit></department>
</sales>
The XSL style sheet (sales.xsl) is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <!-- Ignore whitespace-only text in the source and indent the result -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <html xml:lang="en" lang="en">
      <head><title>Sales By Department</title></head>
      <body>
        <table align="center" cellpadding="5" border="1">
          <tr><th>Department</th><th>Revenue</th><th>Profit</th></tr>
          <xsl:apply-templates />
        </table>
      </body>
    </html>
  </xsl:template>
  <!-- List the departments in ascending order of profit -->
  <xsl:template match="sales">
    <xsl:apply-templates select="department">
      <xsl:sort select="profit" data-type="number" order="ascending"/>
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="department">
    <tr>
      <td><xsl:value-of select="@id" /></td>
      <xsl:apply-templates select="revenue" />
      <xsl:apply-templates select="profit" />
    </tr>
  </xsl:template>
  <xsl:template match="revenue | profit">
    <td><xsl:apply-templates /></td>
  </xsl:template>
</xsl:stylesheet>
The files can be processed, for example, using XT under Windows 9x/NT, with the XML document sales_1999.xml as input, applying the XSL stylesheet sales.xsl, and directing the output to sales_1999.html. The result, depending on the renderer, appears in the XHTML document as a table with one row per department, listed in ascending order of profit.
Resource Description Framework (RDF) is a foundation for processing metadata. It provides interoperability between applications that exchange machine-understandable information on the Web.
RDF requires the XML namespace facility to precisely associate each property with the schema that defines the property. Consider the following RDF statement:
W3C is the host of the resource http://www.w3.org/.
After identifying the RDF parts of the description,
RDF PART | VALUE |
---|---|
Object | W3C |
Property | host |
Subject | http://www.w3.org/ |
the markup, in the RDF serialization syntax, is:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:h="http://www.foo.com/consortiums">
  <rdf:Description about="http://www.w3.org/">
    <h:Host>W3C</h:Host>
  </rdf:Description>
</rdf:RDF>
where the elements with prefix h are associated with the namespace name http://www.foo.com/consortiums.
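To show how the namespace ties the Host property to its defining vocabulary, the following Python sketch reads the statement back out of the markup, with the closing tag written as h:Host so that the document parses. It uses only the standard xml.etree.ElementTree module and is not an RDF parser: it handles just this one shape of description, and the variable names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:h="http://www.foo.com/consortiums">
  <rdf:Description about="http://www.w3.org/">
    <h:Host>W3C</h:Host>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(doc)

# Collect (subject, property, object) for every property element.
statements = []
for description in root.findall(f"{{{RDF_NS}}}Description"):
    subject = description.get("about")
    for prop in description:
        statements.append((subject, prop.tag, prop.text))

for statement in statements:
    print(statement)
```

The property is reported as {http://www.foo.com/consortiums}Host, so it cannot be mistaken for a Host property defined by some other vocabulary.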
An immediate consequence of "universal identification" is accurate searchability: without namespaces, a search over a large set of documents for an element name would return irrelevant results, as the search program would not know which element we mean.
An observation based on some of the available namespace names (see, for example, the Appendix) reveals the following issue: there is an apparently inconsistent pattern in namespace name assignment. Some of the specifications use a version number (for example, XSL and XSLT), some use a version date (for example, RDF Syntax and RDF Schema), some use a year (MathML, XHTML), some use a DTD location (for example, SVG), and yet others use the URI of the specification itself (HTML 4.0, SMIL). There is an inconsistency even among the members of the same "family" (for example, RDF and MathML). Such an arrangement can pose difficulties, such as having to make an "educated" guess of their form during authoring. Unified direction on how URIs are assigned to namespaces and managed is needed.
Namespaces, though they provide a mechanism for unique identification of elements and attributes, do not have any implications for semantics (that is, they do not define what these elements and attributes are, or what they mean). Any inferences based on meanings of namespace names are unreliable. For example, there may be several names in different namespaces that map to the same semantics, and conversely, a given name may have different semantics based on its context. There is no way to describe this using namespaces only.
XML Namespaces : Bridges of XML Applications
XML namespaces are an important step towards making XML applications coexist coherently and interoperate transparently without any potential conflict. They work as a "glue" that binds these standards together.
There are certain W3C standards for which a namespace mechanism does not exist. One of them is CSS, for which a namespace enhancement has been proposed in CSS3.
XML has permeated various diverse "islands" of knowledge: mathematics, graphics, multimedia, databases, and electronic commerce, to name a few. With the XML namespace "bridge," elements and attributes of one island can now travel freely to another.
The journey continues.
I would like to thank Hsueh-Ieng Pai and Martin Webb for various useful suggestions.
APPLICATION | XML NAMESPACE NAME |
---|---|
HTML | http://www.w3.org/TR/REC-html40 (This is an example of the fact that the concept of XML namespaces is not limited to XML.) |
XHTML | http://www.w3.org/1999/xhtml |
MathML | http://www.w3.org/1998/Math/MathML |
RDF Syntax | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
RDF Schema | http://www.w3.org/TR/1999/PR-rdf-schema-19990303# |
SMIL 1.0 | http://www.w3.org/TR/REC-smil |
SMIL Animation | http://www.w3.org/TR/smil-animation10 |
SVG | http://www.w3.org/Graphics/SVG/SVG-19991203.dtd |
XSL Formatting Semantics Vocabulary | http://www.w3.org/XSL/Format/1.0 |
XSL Transformation Vocabulary | http://www.w3.org/XSL/Transform/1.0 |