Representing Other Objects
The XMLObject expression is used as a container for parts of an XML document other than elements, such as comments, processing instructions, or declarations. It is also used as a container for the entire document itself. This structure has the syntax XMLObject[object][data], where object describes the type of object being represented and data specifies the details of the object. There are six types of objects that can be specified as the first argument, each corresponding to a specific type of XML construct.
Declaration
The XMLObject["Declaration"] expression is used to represent the XML declaration that typically appears at the start of an XML document. This has the following syntax.
XMLObject["Declaration"]["Version" -> "1.0", option -> value]
There are two options allowed:
- "Standalone": this takes the value "yes" if the document does not depend on external markup declarations (such as an external DTD) and "no" otherwise.
- "Encoding": this specifies the character encoding used in the document. Not all encodings are honored on export. If the specified encoding is one that Mathematica cannot export, an error message is produced and the encoding is changed in the document.
Here is a typical XML declaration.
<?xml version="1.0" encoding="ascii" standalone="yes"?>
Here is the corresponding SymbolicXML expression.
XMLObject["Declaration"]["Version" -> "1.0", "Encoding" -> "ascii", "Standalone" -> "yes"]
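Because the declaration is an ordinary expression whose arguments are rules, its settings can be read back with standard rule replacement. The following is a minimal sketch; the names decl and opts are illustrative only and are not part of SymbolicXML.

```wolfram
(* The declaration from the example above, with the rules written out *)
decl = XMLObject["Declaration"]["Version" -> "1.0",
   "Encoding" -> "ascii", "Standalone" -> "yes"];

(* Replace the head XMLObject["Declaration"] with List to get a list of rules *)
opts = List @@ decl;

(* Look up individual settings by name *)
"Encoding" /. opts
"Standalone" /. opts
```

With the expression as written here, the two lookups give "ascii" and "yes".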
Comment
The XMLObject["Comment"] expression is used to represent XML comments. It has the following syntax.
XMLObject["Comment"]["string"]
Here is an example of an XML comment.
<!-- Created on 3/6/02. -->
Here is the corresponding SymbolicXML expression.
XMLObject["Comment"][" Created on 3/6/02. "]
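Comment objects can be collected from a larger SymbolicXML expression by pattern matching. The sketch below uses a small hand-built tree; the element names and comment strings are made up for illustration.

```wolfram
(* A hand-built fragment containing two comments (all names are hypothetical) *)
tree = XMLElement["notes", {}, {
    XMLObject["Comment"]["first draft"],
    XMLElement["note", {}, {"some text"}],
    XMLObject["Comment"]["end of notes"]}];

(* Extract the text of every comment at any depth of the expression *)
Cases[tree, XMLObject["Comment"][text_String] :> text, Infinity]
```

For this tree, the result is the list {"first draft", "end of notes"}.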
Document
The most important XMLObject is XMLObject["Document"]. It is used as a container for the entire document and has the following syntax.
XMLObject["Document"][{prolog}, document tree, {epilog}]
The prolog may contain an XMLObject["Declaration"], followed by optional processing instructions and DTD declarations. The epilog contains either processing instructions or comments.
Here is an example of a simple document consisting of an XML declaration, a comment, and a single element.
<?xml version='1.0'?>
<!--this is a sample file-->
<root/>
Here is the corresponding SymbolicXML expression.
XMLObject["Document"][{XMLObject["Declaration"]["Version" -> "1.0"], XMLObject["Comment"]["this is a sample file"]}, XMLElement["root", {}, {}], {}]
The only option for XMLObject["Document"] is "Valid". This option is set automatically by the parser. If the document was validated on import and validation succeeded, then the option "Valid" -> True is included in the XMLObject expression. If validation was attempted but failed, then "Valid" -> False is included. If validation was not attempted, then the option is omitted from the XMLObject expression.
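The three cases for "Valid" can be told apart by pattern matching on the document expression. The following sketch makes two assumptions: that the option appears as a trailing argument after the epilog, and that a placeholder string is an acceptable way to report "not attempted". The helper validationStatus is hypothetical and is not a built-in function.

```wolfram
(* The sample document from above, with the rules written out *)
doc = XMLObject["Document"][
   {XMLObject["Declaration"]["Version" -> "1.0"],
    XMLObject["Comment"]["this is a sample file"]},
   XMLElement["root", {}, {}],
   {}];

(* Hypothetical helper: return the "Valid" setting if present,
   or a placeholder string if validation was not attempted.
   Assumes any options follow the epilog. *)
validationStatus[XMLObject["Document"][_List, _, _List, opts___]] :=
  Replace[Cases[{opts}, ("Valid" -> v_) :> v],
   {{v_, ___} :> v, {} -> "not attempted"}]

validationStatus[doc]
validationStatus[Append[doc, "Valid" -> True]]

(* Serialize the hand-built expression back to XML text *)
ExportString[doc, "XML"]
```

The first call gives "not attempted" because doc carries no option, and the second gives True.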
Doctype
The XMLObject["Doctype"] expression is used to represent XML document type declarations. It has the following syntax.
XMLObject["Doctype"][name, option -> value]
There are three options allowed:
- "System": specifies the location of the DTD (the system identifier), either as a relative pathname or a URI.
- "Public": specifies a standardized name that is used to publicly identify the DTD.
- "InternalSubset": specifies an internal DTD subset. Its value is a string that contains the data in the internal DTD subset.
Here is a Doctype declaration that has both a formal public identifier and a specific location for the DTD, along with an internal DTD subset.
<!DOCTYPE catalog PUBLIC "-//FOO//DTD catalog 1.1//EN" "www.foo.com/example/catalog.dtd"
[internal DTD stuff]>
Here is the corresponding SymbolicXML expression.
XMLObject["Doctype"]["catalog", "Public" -> "-//FOO//DTD catalog 1.1//EN", "System" -> "www.foo.com/example/catalog.dtd", "InternalSubset" -> "internal DTD stuff"]
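Since any of the three options may be absent, code that inspects a Doctype expression usually needs a default. Here is one possible sketch; the helper doctypeValue is hypothetical, and returning None for a missing identifier is just a convention chosen for this example.

```wolfram
(* The Doctype expression from the example above *)
doctype = XMLObject["Doctype"]["catalog",
   "Public" -> "-//FOO//DTD catalog 1.1//EN",
   "System" -> "www.foo.com/example/catalog.dtd",
   "InternalSubset" -> "internal DTD stuff"];

(* Hypothetical helper: look up one identifier by name,
   falling back to None when the option is not present *)
doctypeValue[XMLObject["Doctype"][_String, opts___], key_String] :=
  Replace[key, Append[{opts}, _ -> None]]

doctypeValue[doctype, "System"]
doctypeValue[XMLObject["Doctype"]["catalog"], "Public"]
```

The first call gives the system identifier string, and the second gives None because that expression has no options.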
For more details on XML Doctype declarations, see the W3C XML specification.
ProcessingInstruction
The XMLObject["ProcessingInstruction"] expression is used to represent XML processing instructions. This has the following syntax.
XMLObject["ProcessingInstruction"][target string, optional data string]
It is common to use attribute-like syntax in processing instructions. These pseudo-attributes are not parsed but are returned as raw strings. Here is a processing instruction that specifies a style sheet.
<?xml-stylesheet href="mystyle.css" type="text/css"?>
Here is the corresponding SymbolicXML expression. Notice that the double quotes around the attribute values are escaped, to distinguish them from the double quotes around the argument as a whole.
XMLObject["ProcessingInstruction"]["xml-stylesheet",
"href=\"mystyle.css\" type=\"text/css\""]
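Because the pseudo-attributes arrive as a single raw string, any code that needs them individually has to split the string itself. The following is a minimal sketch using string patterns. The helper pseudoAttributes is hypothetical, and the pattern assumes double-quoted values with no embedded quotes and names made only of letters and digits.

```wolfram
(* The processing instruction from the example above *)
pi = XMLObject["ProcessingInstruction"]["xml-stylesheet",
   "href=\"mystyle.css\" type=\"text/css\""];

(* Hypothetical helper: turn name="value" pairs in the data string into rules.
   Assumes double-quoted values and purely alphanumeric names. *)
pseudoAttributes[XMLObject["ProcessingInstruction"][_, data_String]] :=
  StringCases[data,
   name : (WordCharacter ..) ~~ "=\"" ~~ value : (Except["\""] ...) ~~ "\"" :>
    (name -> value)]

pseudoAttributes[pi]
```

For this processing instruction, the result is {"href" -> "mystyle.css", "type" -> "text/css"}.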
CDATASection
The XMLObject["CDATASection"] expression is used to represent CDATA sections. CDATA is a W3C abbreviation for Character Data. CDATA sections are used in an XML document as a wrapper for raw character data, to avoid having to escape special characters such as " and <. Outside a CDATA section, these characters would normally have to be written as &quot; and &lt;, respectively. CDATA sections are therefore typically used to enclose character data that would otherwise require a lot of escaping, such as programs or math expressions.
Here is a simple fragment from an XML document containing a CDATA section.
<![CDATA[ 5 < 7 << 2*10^123]]>
Here is the corresponding SymbolicXML expression.
XMLObject["CDATASection"][" 5 < 7 << 2*10^123"]
By default, CDATASection object wrappers are not preserved on import, and only the contents of the CDATA section are retained. To preserve the CDATASection wrappers, you must explicitly set the option "PreserveCDATASections" -> True.
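The two import behaviors can be compared side by side. This sketch assumes that the option can be passed directly to ImportString, as in recent versions; older versions may instead expect it inside a ConversionOptions setting. The wrapping element name expr is made up for the example.

```wolfram
(* A small document whose only content is a CDATA section;
   the element name "expr" is hypothetical *)
xml = "<expr><![CDATA[ 5 < 7 << 2*10^123]]></expr>";

(* Default import: only the character data inside the section is kept *)
ImportString[xml, "XML"]

(* Ask the importer to keep the XMLObject["CDATASection"] wrapper.
   Assumption: the option is accepted directly by ImportString. *)
ImportString[xml, "XML", "PreserveCDATASections" -> True]
```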
For more information on the conversion options for importing XML, see Import Conversion Options.