XSL (Extensible Stylesheet Language): Summary of Preliminary Research for Site Developers

Definitions
XML
Definition: The Extensible Markup Language (XML) is the universal format for structured documents and data on the Web.

XSL
Definition: A way to specify output styles. An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary. XSL consists of three parts:

1. XSL Formatting Objects (XSL-FO) -- an XML vocabulary for specifying formatting semantics, where an element is used to describe the appearance of its content rather than its meaning.

2. XSLT (Transformations) -- an XML language that transforms an XML document into another document (an XSL formatting document, HTML, or plain text) with the aid of XPATH. An XSLT stylesheet is itself a well-formed XML document whose elements belong to the XSLT namespace.

3. XPATH -- a general-purpose expression language for interrogating/querying XML document structures and scripting the XSL to manipulate XML data.

Why use XSL?
Site developers should be interested in XML as it relates to XSL because:

1. XML used in conjunction with XSL allows for the separation of data and presentation.

Since our domain mainly covers the presentation layer, we should understand the advantage of being able to port data to many different platforms and media, in different formats, using XML/XSL. XML/XSL realizes the notion of "write once, publish everywhere". We will be able to leverage XSL to author for aural formats, for print, and for wireless or PDA devices, for example, while accessing a single data source.

2. XML allows for the transmission of data between applications.

While this is more of a technology concern, we should understand the concept because it makes data easier to re-use and gives us more control over formatting.

The advantage of using XSL to style or format the presentation of XML documents is obvious -- we may have been using CSS to accomplish some of these tasks. The ability to use a high-level language to manipulate the data (transform it) before serving it to the client has even greater implications for Site Development.

We can manipulate XML data as we might manipulate data pulled from an SQL database. In XSL, rather than searching the structure of a "Table", we are parsing the structure of the "Source Tree" -- the hierarchical representation of the XML document (tree) and its associated nodes (branches) -- in order to produce the "Result Tree". The nodes of the newly formed result tree, with style and formatting applied to them, are expressed as "styled" flow objects. In simpler terms, the client-facing page is delivered.

XSL allows us, at the presentation layer (while building the "Result Tree"), to introduce SQL-like SELECT statements and comparison operations. Transformation may also include aggregating and grouping data, sorting it, and performing arithmetic and string conversions.
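
To make the SQL analogy concrete, here is a minimal sketch in Python (standard library only) of the same kinds of operations: selecting nodes, comparing values, sorting, and aggregating. It is not XSL, and the catalog document and its element names are hypothetical. It only illustrates the sort of work a stylesheet asks the processor to do against the source tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are assumptions.
SOURCE = """
<catalog>
  <item><name>pen</name><price>1.50</price><qty>10</qty></item>
  <item><name>ink</name><price>4.00</price><qty>2</qty></item>
  <item><name>pad</name><price>2.25</price><qty>4</qty></item>
</catalog>
"""

root = ET.fromstring(SOURCE)
items = root.findall("item")

# Like "SELECT ... WHERE price > 2": pick nodes, then apply a comparison.
selected = [i for i in items if float(i.findtext("price")) > 2]

# Like "ORDER BY name": sort the selected nodes on a child element.
selected.sort(key=lambda i: i.findtext("name"))
names = [i.findtext("name") for i in selected]

# Aggregation and arithmetic: total value of all stock.
total = sum(float(i.findtext("price")) * int(i.findtext("qty")) for i in items)

print(names, total)
```

In a stylesheet the same steps would be written declaratively, as select expressions and sort instructions, rather than as explicit loops.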

The transformation of XML documents happens within the XSL template.

XML, XSL, SQL and CMS: Real world considerations
In the real world, it is likely that our XSL implementations will coexist with SQL databases and Content Management Systems (CMS). The illustrations below depict two scenarios for delivering web content after the HTTP request.

The transformation of XML documents requires invoking an XSLT processor, referencing an XSL stylesheet and transforming the original XML document into another format. A server-side solution may use a "stand-alone" XSLT processor to convert the XML into a viewable format such as HTML to be served to the browser. Alternatively, a client-side solution exists where an "embedded processor" residing in the browser transforms the XML into HTML using the XSL stylesheet. The client-side scenario assumes that the client's browser is XML-aware and capable of parsing and transforming XML using XSL.

[illustration: server-side and client-side scenarios for delivering content]

XSL stylesheet creation
As illustrated above, an XSL stylesheet is used to generate a new document that outputs the XML document data in the manner specified in the stylesheet. The classic "Hello World" example below shows the basic elements needed to transform XML data using XSL. You can test these documents in your browser if you have MS IE 5 with the MSXML3 parser installed [available here].

Since XSL documents are XML documents, they must be well-formed to comply with XML rules, including the HTML that comprises the template. So before we get started, here are some issues you should be aware of:

All code must be well-formed

All tags must be closed

No overlapping tags are allowed

Case matters (lowercase your tags)

Quote your attributes

Use a single root

Escape script blocks

[see msdn's description of well-formedness]
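
One way to check these rules before handing a template to an XSL processor is to run the markup through any XML parser. The sketch below uses Python's standard library; the helper name is_well_formed is our own, and the sample strings are made up to break one rule each.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

samples = {
    "closed, nested, single root": "<p><b>bold</b><br /></p>",
    "unclosed tag": "<p>unclosed<br></p>",
    "overlapping tags": "<b><i>overlap</b></i>",
    "case mismatch": "<P>case</p>",
    "unquoted attribute": "<p class=note>x</p>",
    "more than one root": "<p>one</p><p>two</p>",
}

for label, markup in samples.items():
    print(label, "->", is_well_formed(markup))
```

Only the first sample passes. Each of the others breaks one rule from the list above, which is why HTML that a browser would tolerate can still stop a stylesheet from loading.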

Hello world example
The following introduces a basic XSL stylesheet to produce the boring (but classic) "Hello world" example.

[hello.xml -- input]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="hello.xsl"?>

<greeting>Hello cruel world!</greeting>

The basic XML document consists of 3 parts: the XML declaration, the processing instruction and the tagged data.

<?xml version="1.0"?>

This is the XML declaration and version number. It is standard on all XML documents.

<?xml-stylesheet type="text/xsl" href="hello.xsl"?>

This is a processing instruction. It tells the application reading the XML file (here, the browser) which stylesheet to apply and what type it is.

<greeting>Hello cruel world!</greeting>

Finally, this is the xml-tagged data.
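
To see how a parser views these three parts, the following sketch loads the same document with Python's standard xml.dom.minidom module. This is only an illustration of the document structure; the browser's own parser is what actually acts on the processing instruction.

```python
from xml.dom import minidom

HELLO = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="hello.xsl"?>'
    '<greeting>Hello cruel world!</greeting>'
)

doc = minidom.parseString(HELLO)

# The XML declaration is consumed by the parser. The processing
# instruction and the root element are children of the document node.
pi = doc.firstChild
greeting = doc.documentElement

print(pi.target)                 # name of the processing instruction
print(pi.data)                   # its pseudo-attributes, as one string
print(greeting.tagName, "=", greeting.firstChild.data)
```

The processing instruction is not part of the data. It sits beside the root element as a hint to whichever application reads the file.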

[hello.xsl -- output/stylesheet]

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
<html>
<body>
<xsl:value-of select="greeting"/><br />
</body>
</html>
</xsl:template>
</xsl:stylesheet>

The xsl document consists of 2 main parts in addition to the standard XML declaration: a match or pattern match that identifies the node which the stylesheet is acting on, and the formatting, styling and processing statements.

<?xml version='1.0'?>

Start with the XML declaration because an XSL document is still an XML document.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
...
</xsl:stylesheet>

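Identify the document as a stylesheet. The xmlns:xsl attribute is an XML Namespace declaration indicating that the xsl prefix will be used. The WD-xsl namespace shown here selects the working-draft dialect implemented in IE 5; stylesheets written to the XSLT 1.0 Recommendation declare http://www.w3.org/1999/XSL/Transform instead.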

<xsl:template match="/">
...
</xsl:template>

The xsl:template defines a template rule that should be triggered on the node which matches the pattern or location indicated. The XSL searches the XML document tree for this match. The convention for naming the tree structure follows the directory naming convention used in UNIX systems, where the initial "/" specifies the root, and words following slashes specify directories beneath the root.

In the structure of the XML document, the initial "/" indicates the root of the document and any words following a slash indicate nodes.
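
The directory analogy can be sketched in a few lines of Python. The select function below is our own toy resolver, not an XSL or XPATH implementation, and the sample document is hypothetical. It handles only plain slash-separated element names, which is enough to show how "/" and the names after it map onto the tree.

```python
from xml.dom import minidom

# Hypothetical document used only for this illustration.
doc = minidom.parseString(
    "<site><section><page>Home</page><page>News</page></section></site>"
)

def select(document, path):
    """Resolve a simple absolute path such as "/" or "/site/section/page"."""
    nodes = [document]  # "/" on its own is the document root
    for step in filter(None, path.split("/")):
        # Descend one level: keep child elements whose name matches the step.
        nodes = [
            child
            for node in nodes
            for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE and child.tagName == step
        ]
    return nodes

print(select(doc, "/") == [doc])
print([p.firstChild.data for p in select(doc, "/site/section/page")])
print(select(doc, "/page"))
```

As with a file system, a path that skips a level (such as "/page" here) matches nothing, because each step only looks at the children of the nodes found so far.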

Getting started with XPATH
Recall that XPATH is the expression language for interrogating/querying XML document structures and scripting the XSL to manipulate XML data.

In the example below, we have an XML file with data about 3 books. The XML file has a node called "booklist" with child nodes called "book". We want to be able to format this list as HTML by walking the "booklist" and printing out each "book". To do this, we use the XPATH language to convert the XML file.

[example.xml]

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
<booklist xmlns:dt="urn:schemas-microsoft-com:datatypes">
<book>
<title>My name is Jon. I have a gambling problem.</title>
<author_first_name>Jon</author_first_name>
<author_last_name>Tobey</author_last_name>
<abstract>Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat.</abstract>
</book>
<book>
<title>My life without Survivor</title>
<author_first_name>Bob</author_first_name>
<author_last_name>Burns</author_last_name>
<abstract>Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat.</abstract>
</book>
<book>
<title>Zen of woodworking</title>
<author_first_name>Michael</author_first_name>
<author_last_name>Angeles</author_last_name>
<abstract>Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat.</abstract>
</book>
</booklist>

example.xml contains 3 book nodes in the booklist root. Each book node contains child nodes that describe the author, the title and the abstract for each book. For this example, we want to use XPATH to transform the XML into an HTML page. XPATH will allow us to "walk the tree" of the XML document (i.e. parse the document node by node) and execute formatting commands with a loop.

[example.xsl]

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
<html><head><title>booklist test</title></head><body bgcolor="#ffffff"> <b>Booklist</b><br />
<xsl:for-each select="booklist/book" order-by="+author_last_name">
<p><b><xsl:value-of select="title"/></b><br />

<xsl:value-of select="author_last_name"/>,
<xsl:value-of select="author_first_name"/><br />

<xsl:value-of select="abstract"/></p>
</xsl:for-each>
</body></html>
</xsl:template>
</xsl:stylesheet>

<xsl:template match="/">
...
</xsl:template>

Match the document root, so the template applies to the whole document.

<xsl:for-each select="booklist/book" order-by="+author_last_name">
<p><b><xsl:value-of select="title"/></b><br />
<xsl:value-of select="author_last_name"/>,
<xsl:value-of select="author_first_name"/><br />
<xsl:value-of select="abstract"/></p>
</xsl:for-each>

This is the loop that says:

Walk through the "booklist" tree in the document root and for each occurrence of "book", perform some formatting task. Finally, sort the "book" elements in ascending order by the element named "author_last_name".

OK. Let's go line by line. The first part of that first line tells the processor that this is a loop. This is indicated by the XSL element for-each.

The second part of the for-each element tells the processor which nodes to loop over. This is indicated by the select attribute, whose value is a path expression. Here it says to walk through the "booklist" tree and, for each "book" node it finds, apply the formatting that follows up to the closing </xsl:for-each>. The loop ends when there are no more "book" nodes in the booklist tree of the document root.

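The last part of the for-each line sorts the selected nodes before they are processed. This is indicated by the order-by attribute, which says to sort in ascending order (+) by the node named "author_last_name". To sort in descending order, precede the sort node with a minus sign (-). Note that order-by belongs to the working-draft dialect identified by the WD-xsl namespace; the XSLT 1.0 Recommendation uses a nested xsl:sort element instead.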

The formatting lines access the content of an element as a string using the value-of element. Its select attribute names the element to read, relative to the current "book" node.
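
For readers more comfortable with procedural code, the loop described above can be approximated in Python (standard library only). This is a sketch of what the processor does with the for-each, not a replacement for the stylesheet. The render function is our own name, and the booklist is a shortened copy of example.xml with the abstracts left out.

```python
import xml.etree.ElementTree as ET
from html import escape

# Shortened copy of example.xml (abstracts omitted to keep the sketch small).
BOOKLIST = """
<booklist>
  <book>
    <title>My name is Jon. I have a gambling problem.</title>
    <author_first_name>Jon</author_first_name>
    <author_last_name>Tobey</author_last_name>
  </book>
  <book>
    <title>My life without Survivor</title>
    <author_first_name>Bob</author_first_name>
    <author_last_name>Burns</author_last_name>
  </book>
  <book>
    <title>Zen of woodworking</title>
    <author_first_name>Michael</author_first_name>
    <author_last_name>Angeles</author_last_name>
  </book>
</booklist>
"""

def render(xml_text):
    root = ET.fromstring(xml_text)   # root element is <booklist>
    books = root.findall("book")     # like select="booklist/book"
    # Like order-by="+author_last_name": ascending sort on a child element.
    books.sort(key=lambda b: b.findtext("author_last_name"))
    parts = []
    for book in books:               # the body of the for-each
        parts.append(
            "<p><b>{}</b><br />{}, {}</p>".format(
                escape(book.findtext("title")),      # like value-of select="title"
                escape(book.findtext("author_last_name")),
                escape(book.findtext("author_first_name")),
            )
        )
    return "\n".join(parts)

page = render(BOOKLIST)
print(page)
```

The books come out in the order Angeles, Burns, Tobey, regardless of their order in the source file, because the sort is applied to the selected nodes before any formatting happens.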

That is a basic example using XPATH. For further study, delve into the Neil Bradley book, XSL Companion [see references].

Points of transformation
The transformation of XML into another format such as HTML can be performed on the client or on the server. As mentioned above, the client-side approach assumes an XML-aware browser, so transforming on the server lets us reach a wider pool of users.

Client-side transformation can occur right in the browser if the browser is enabled with an XML parser that can perform XSL transformation -- such is the case with MSIE 5 with the MSXML Parser. You can additionally use JavaScript in IE to load the XML and XSL files. JavaScript allows us to check for non-XML-aware browsers and treat these browsers differently. In IE we create 2 new instances of the MS XML parser (XMLDOM) and load the XML and the XSL into memory. Then we use the XSL document to transform the XML to HTML.

<script type="text/javascript">
var msg = 'Sorry. You must be using MS Internet Explorer 5 and the MSXML Parser to view this page.';
if (!document.all) {
  // Show error if not using IE, and skip the ActiveX calls below
  document.write(msg);
} else {
  // Load XML
  var xml = new ActiveXObject("Microsoft.XMLDOM");
  xml.async = false;
  xml.load("stock.xml");
  // Load the XSL
  var xsl = new ActiveXObject("Microsoft.XMLDOM");
  xsl.async = false;
  xsl.load("stock.xsl");
  // Transform
  document.write(xml.transformNode(xsl));
}
</script>

The best solution for cross-browser delivery of XML data is to transform documents on the server side. Server-side transformation can be done using any server side parser application such as Microsoft's MSXML Parser or Michael Kay's Saxon, written in Java for Unix or Windows [see references]. You can additionally employ ASP using a method similar to the one used with JS above, by creating instances of the XML parser. With the above approaches, the browser never has to be XML-aware.

<%
'Load the XML
set xml = Server.CreateObject("Microsoft.XMLDOM")
xml.async = false
xml.load(Server.MapPath("cd_catalog.xml"))
'Load the XSL
set xsl = Server.CreateObject("Microsoft.XMLDOM")
xsl.async = false
xsl.load(Server.MapPath("cd_catalog.xsl"))
'Transform and write the resulting HTML to the response
Response.Write(xml.transformNode(xsl))
%>

Conclusions

We are finally seeing the promise of XML on the presentation side with the introduction of XSL. Being able to port data to any format has significant implications for Site Development: our offering can now deliver content easily across a wider spectrum of media. Site Development as a practice prides itself on the quality of execution of the presentation layer, using whatever technologies and tools emerge in the digital space. Clearly, we have to be able to leverage the power of this new tool to add value to our offering.

Resources

XML and XSL, General Information

W3C. XML
http://www.w3.org/XML/
-- definitions, working drafts, links to tools, resources

W3C. XSL
http://www.w3.org/Style/XSL/
-- definitions, working drafts, links to tools, resources, tutorials

Oasis. XML Cover Pages
http://oasis.oasis-open.org/cover/xsl.html
-- The XML Cover Pages is a comprehensive online reference work for the Extensible Markup Language (XML). The XSL section gives comprehensive coverage of XSL developments and has an extensive bibliography of books, articles, and papers.

XSL Info
http://xslinfo.com
-- links to XSL sites, tutorials, book information

Mulberry Technologies. XSL-List
http://www.mulberrytech.com/xsl/xsl-list/
-- XSL listserv

Books
Bradley, Neil. The XSL Companion. Addison-Wesley Pub. Co, 2000.

Harold, Elliotte Rusty. The XML Bible. IDG Books Worldwide, 1999.
Available from World Wide Web (http://www.ibiblio.org/xml/books/bible/updates/15.html)
-- the complete chapter, "XSL Formatting Objects", is available via the Web.

Kay, Michael. XSLT Programmer's Reference. WROX Press, 2000.
Available from World Wide Web (http://www.wrox.com/Consumer/Store/Books/3129/contents.htm)
-- an abridged version of this is available via the Web. Introduces "...the purpose of XSLT and the task it was designed to perform. It's about what kind of language it is and how it fits in with all the other technologies that you are likely to use in a typical web-based application."

Software

Clark, James. XT.
http://www.jclark.com/xml/xt.html

Kay, Michael. Saxon and Instant Saxon.
http://users.iclway.co.uk/mhkay/saxon/

Microsoft. MSXML Parser.
http://msdn.microsoft.com/downloads/default.asp?URL=/code/sample.asp?url=/msdn-files/027/000/541/msdncompositedoc.xml

VBXML. XSL Debugger.
http://www.vbxml.com/xsldebugger/

Whitehill Technologies. <xsl>Composer
http://www.whitehill.com/products/prod4.html

XSL Tutorials

XML 101. XSL
http://www.xml101.com:8081/xsl/default.asp
-- high-level introduction to XSL, XSL template creation and implementation

Microsoft. Getting Started with XSL
http://msdn.microsoft.com/library/psdk/xmlsdk/xslp817w.htm
-- Microsoft tutorial for using XSL with IE. Covers XSL template creation, doing conditional checking, accessing and formatting attributes, and pattern matching

Microsoft. Advanced XSL Features
http://msdn.microsoft.com/library/psdk/xmlsdk/xslp13g3.htm
-- introduces methods for dealing with irregular data, advanced pattern matching, filtering, debugging stylesheets

Microsoft. XSL Patterns
http://msdn.microsoft.com/library/psdk/xmlsdk/xslp0x2r.htm
-- rules for using the XSL query language to identify nodes in XML documents

26 September 2000