Comparing XSLT and Mathematica
In many situations, you need to transform a document from one XML format into another. One popular technique for this purpose is XSLT. However, Mathematica's pattern-matching and transformation capabilities let you perform similar transformations by importing the original document and then manipulating the resulting SymbolicXML expression. This section gives examples of some basic XSLT transformations and shows how to do the equivalent transformations in Mathematica.
A Simple Template
Let's consider a very simple example. Say our XML dialect uses the code tag to enclose program code. Typically, this is displayed in a monospace font. If we were to convert such a document to XHTML, we would probably want to use the pre tag for code. The following XSLT template would do this.
<xsl:template match="code">
<pre class="code">
<xsl:value-of select="."/>
</pre>
</xsl:template>
In Mathematica, you can create a function to do the same.
In[58]:=
In[59]:=
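A minimal sketch of such a function follows. It assumes the code element contains exactly one string child, and the names codeToPre and doc are hypothetical, chosen only for illustration.

```mathematica
(* Hypothetical helper: rewrite <code>text</code> as <pre class="code">text</pre>. *)
(* Assumes a single string child; attributes of the code element are ignored. *)
codeToPre[XMLElement["code", _, {str_String}]] :=
  XMLElement["pre", {"class" -> "code"}, {str}]

(* A small SymbolicXML expression to try it on. *)
doc = XMLElement["p", {}, {"Run ", XMLElement["code", {}, {"f[x]"}], " now."}];

(* Apply the helper to every code element, at any depth. *)
doc /. e : XMLElement["code", _, _] :> codeToPre[e]
```

A code element with mixed content would not match this definition and would be left as an unevaluated codeToPre[...]; a fuller version would first flatten the content to a string, as xsl:value-of does.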
Inserting Attribute Values
Now consider an XML application that uses the termdef element to mark the definition of a new term. Again, we will convert this to XHTML. We would like to anchor the definition with an a element so that we can link directly to that location in the document. Assuming we have templates to handle whatever formatting appears inside the termdef element, we can use the following XSLT.
<xsl:template match="termdef">
<span class="termdef">
<a name="{@id}">[Definition:] </a>
<xsl:apply-templates/>
</span>
</xsl:template>
Notice that the name attribute in the resultant XHTML gets the value of the id attribute of the original termdef element. In Mathematica, you can do the following.
In[60]:=
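One way this could look is sketched below. The name termdefToSpan is hypothetical, and the sketch copies the children of termdef unchanged instead of applying further templates to them.

```mathematica
(* Hypothetical helper mirroring the XSLT template above. *)
(* The id attribute may appear anywhere in the attribute list. *)
termdefToSpan[
   XMLElement["termdef", {___, "id" -> id_String, ___}, contents_List]] :=
  XMLElement["span", {"class" -> "termdef"},
   {XMLElement["a", {"name" -> id}, {"[Definition:] "}],
    Sequence @@ contents}]  (* children are spliced in as-is *)

termdefToSpan[
  XMLElement["termdef", {"id" -> "dt-doc"},
   {"A ", XMLElement["b", {}, {"document"}], " is a data object."}]]
```

The pattern variable id plays the role of the {@id} attribute value template: whatever value it matched is inserted into the name attribute of the new a element.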
Using Predicates
Consider a more complicated example that uses XPath predicates. Assume we would like to match a note element, but only if it has a role attribute set to example or contains an eg element as a child. Let us look at an XSLT template and then explain what it does.
<xsl:template match="note[@role='example' or child::eg]">
<div class="exampleOuter">
<div class="exampleHeader">Example</div>
<xsl:if test="*[1][self::p]">
<div class="exampleWrapper">
<xsl:apply-templates select="*[1]"/>
</div>
</xsl:if>
<div class="exampleInner">
<xsl:apply-templates select="eg"/>
</div>
<xsl:if test="*[position()>1 and self::p]">
<div class="exampleWrapper">
<xsl:apply-templates
select="*[position()>1 and self::p]"/>
</div>
</xsl:if>
</div>
</xsl:template>
The first xsl:if element checks whether the first child element is a p element. If it is, xsl:apply-templates is called on that child. Calling xsl:apply-templates with a select expression is similar to calling Map across the results of Cases. The eg children are then processed inside the exampleInner div. The second xsl:if element checks whether there are p child elements beyond the first child. If so, xsl:apply-templates is called on those. Here is the corresponding Mathematica code.
In[61]:=
In[62]:=
Out[62]=
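The following is a sketch of how the predicates might be expressed with Cases and MemberQ. The names exampleNoteQ and noteToDiv are hypothetical, and selected children are copied unchanged where the XSLT version would apply further templates.

```mathematica
(* note[@role='example' or child::eg] *)
exampleNoteQ[XMLElement["note", attrs_List, children_List]] :=
  MemberQ[attrs, "role" -> "example"] ||
   MemberQ[children, XMLElement["eg", _, _]]
exampleNoteQ[_] := False

noteToDiv[note : XMLElement["note", _, children_List]] /; exampleNoteQ[note] :=
  Module[{elems, firstP, laterPs, egs},
   (* XPath's * selects element children only, so drop text nodes. *)
   elems = Cases[children, _XMLElement];
   (* *[1][self::p]: the first element child, kept only if it is a p. *)
   firstP = Cases[Take[elems, Min[1, Length[elems]]], XMLElement["p", _, _]];
   (* *[position()>1 and self::p]: p elements after the first element child. *)
   laterPs = Cases[Drop[elems, Min[1, Length[elems]]], XMLElement["p", _, _]];
   egs = Cases[elems, XMLElement["eg", _, _]];
   XMLElement["div", {"class" -> "exampleOuter"},
    Join[
     {XMLElement["div", {"class" -> "exampleHeader"}, {"Example"}]},
     (* Each wrapper div is emitted only when its selection is non-empty. *)
     If[firstP === {}, {},
      {XMLElement["div", {"class" -> "exampleWrapper"}, firstP]}],
     {XMLElement["div", {"class" -> "exampleInner"}, egs}],
     If[laterPs === {}, {},
      {XMLElement["div", {"class" -> "exampleWrapper"}, laterPs]}]]]]

noteToDiv[
  XMLElement["note", {"role" -> "example"},
   {XMLElement["p", {}, {"Intro"}],
    XMLElement["eg", {}, {"x = 1"}],
    XMLElement["p", {}, {"Outro"}]}]]
```

The Condition (/;) on the definition stands in for the predicate in the match attribute, and the two If expressions stand in for the xsl:if tests.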
Traversing Upwards
So far, all the examples we have given in XSLT have had a very simple implementation in Mathematica using SymbolicXML. In each of these cases, however, we were selecting expressions that were nested inside of the given expression. What if we wanted to select an ancestor or sibling? Let us see how this can be done.
To clarify the problem and find a solution, we have to realize that an XML document is just a stream of characters that follows a grammar. Tools for manipulating XML documents treat XML according to some model. In the case of XSLT (and its path-selection language, XPath), this model is that of a tree. Since Mathematica is a list-based language, it treats XML as nested expression lists.
While these two models are similar, they have important differences. Most notably, in nested lists you do not inherently have any concept of the containing list. Technically, any transformation that can be done with axis types like ancestor can also be done without them. However, it is often convenient to traverse up the XML document.
Let us look at an example and then discuss how to implement the same behavior in Mathematica. While it will involve a slightly different technique than we have used above, it will nonetheless be rather simple. Consider the following XML document.
In[63]:=
Out[63]=
Suppose we want a template that matches bibref elements and replaces each one with the text inside the corresponding bibl element. In XSLT, we would write the following template.
<xsl:template match="bibref">
<xsl:param name="ref">
<xsl:value-of select="@ref"/>
</xsl:param>
<xsl:value-of select="/bibliography/bibl[@id = $ref][1]"/>
</xsl:template>
The problem with using the same approach in Mathematica is that once we have matched a bibref element, we no longer have any information about the elements containing it. As a remedy, we will instead pass an expression containing the entire SymbolicXML expression. Notice that the bibref element in question can be obtained from the full document with a Part expression.
In[64]:=
Out[64]=
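As a sketch of how such a Part expression can be found, Position returns the index path of every matching element. The document bibDoc below is a hypothetical stand-in, not the tutorial's own example.

```mathematica
(* Hypothetical stand-in document. *)
bibDoc = XMLElement["bibliography", {}, {
    XMLElement["bibl", {"id" -> "XML"}, {"Extensible Markup Language 1.0"}],
    XMLElement["p", {},
     {"See ", XMLElement["bibref", {"ref" -> "XML"}, {}], "."}]}];

(* Index path of every bibref element in the document. *)
Position[bibDoc, XMLElement["bibref", _, _]]
(* {{3, 2, 3, 2}} *)

(* Part with that path returns the element itself. *)
bibDoc[[3, 2, 3, 2]]
```

Each level of element nesting contributes two indices: 3 selects the child list of an XMLElement, and the next index selects a child within it. A document obtained from Import is wrapped in XMLObject["Document"][...], so its paths carry additional leading indices.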
Rather than pass the XMLElement expression, we can pass this expression wrapped in Hold. That way, we can easily obtain the bibref element by calling ReleaseHold, and we can access ancestors by dropping indices from the Part expression. However, we will need to write a pattern-matching function so that we can match these in definitions of functions.
In[65]:=
In[66]:=
Out[66]=
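The idea of a held Part expression can be sketched as follows. The names heldElementQ and parentOf and the document bibDoc are all hypothetical, and the helpers assume the document is stored in a symbol and that every index pair has the form 3, k described above.

```mathematica
(* Hypothetical stand-in document. *)
bibDoc = XMLElement["bibliography", {}, {
    XMLElement["bibl", {"id" -> "XML"}, {"Extensible Markup Language 1.0"}],
    XMLElement["p", {},
     {"See ", XMLElement["bibref", {"ref" -> "XML"}, {}], "."}]}];

(* True when the held Part expression evaluates to an element with this tag. *)
heldElementQ[held_Hold, tag_String] :=
  MatchQ[ReleaseHold[held], XMLElement[tag, _, _]]

(* Drop the last two indices (child list, child position) to reach the parent. *)
parentOf[Hold[doc_[[most___, 3, _]]]] := Hold[doc[[most]]]

held = Hold[bibDoc[[3, 2, 3, 2]]];

heldElementQ[held, "bibref"]   (* the held expression points at a bibref *)
ReleaseHold[parentOf[held]]    (* the enclosing p element *)
```

Because Hold keeps the Part expression unevaluated, the index path survives as data that a pattern can take apart, which is what makes the ancestors reachable.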
The Mathematica transformation then becomes relatively simple.
In[67]:=
In[68]:=
In[69]:=
Out[69]=
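Putting the pieces together, a transformation along these lines might look like the sketch below. All names (bibDoc, heldPart, bibrefText, resolveBibrefs) are hypothetical, and the sketch assumes the document is stored in a symbol whose root element holds the bibl elements as direct children, as in the XPath /bibliography/bibl.

```mathematica
(* Hypothetical stand-in document. *)
bibDoc = XMLElement["bibliography", {}, {
    XMLElement["bibl", {"id" -> "XML"}, {"Extensible Markup Language 1.0"}],
    XMLElement["p", {},
     {"See ", XMLElement["bibref", {"ref" -> "XML"}, {}], "."}]}];

(* Build a held Part expression from a document symbol and an index path. *)
SetAttributes[heldPart, HoldFirst]
heldPart[doc_Symbol, {idx__Integer}] := Hold[doc[[idx]]]

(* Text of the first bibl child of the root whose id equals the bibref's ref. *)
bibrefText[held : Hold[doc_[[__Integer]]]] :=
  With[{ref = "ref" /. ReleaseHold[held][[2]]},
   FirstCase[doc[[3]],
    XMLElement["bibl", {___, "id" -> ref, ___}, {text_String}] :> text,
    Missing["NotFound"]]]

(* Replace every bibref in the document by the text it refers to. *)
SetAttributes[resolveBibrefs, HoldFirst]
resolveBibrefs[doc_Symbol] :=
  ReplacePart[doc,
   Map[# -> bibrefText[heldPart[doc, #]] &,
    Position[doc, XMLElement["bibref", _, _]]]]

resolveBibrefs[bibDoc]
```

The HoldFirst attribute is there so that the document's symbol, and not its value, ends up inside Hold. A bibref with no matching bibl is replaced by Missing["NotFound"] in this sketch; a production version would need to decide how to handle that case.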