XML and Web Services
January 30, 2017 | Author: vijayakumar | Category: N/A
UNIT I
XML: Benefits, Advantages of XML over HTML, EDI, Databases
XML-Based Standards
Structuring with Schemas: DTD, XML Schemas
XML Processing: DOM
XML Processing: SAX
Presentation Technologies: XSL, XForms, XHTML
Transformation: XSLT, XLink, XPath, XQuery
XML INTRODUCTION
XML stands for eXtensible Markup Language. XML is designed to transport and store data.

XML Document Example

    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

Introduction to XML
XML was designed to transport and store data. HTML was designed to display data.

What is XML?
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to carry data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation

The Difference between XML and HTML
XML is not a replacement for HTML. XML and HTML were designed with different goals:
• XML was designed to transport and store data, with focus on what data is.
• HTML was designed to display data, with focus on how data looks.
HTML is about displaying information, while XML is about carrying information.

XML Does Not DO Anything
Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store, and transport information.
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

The note above is quite self-descriptive. It has sender and receiver information, and it also has a heading and a message body. But still, this XML document does not DO anything. It is just pure information wrapped in tags. Someone must write a piece of software to send, receive or display it.

XML is Just Plain Text
XML is nothing special. It is just plain text. Software that can handle plain text can also handle XML. However, XML-aware applications can handle the XML tags specially. The functional meaning of the tags depends on the nature of the application.

With XML You Invent Your Own Tags
The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document, because the XML language has no predefined tags. The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define his own tags and his own document structure.

XML is Not a Replacement for HTML
XML is a complement to HTML. It is important to understand that XML is not a replacement for HTML. In most web applications, XML is used to transport data, while HTML is used to format and display the data. My best description of XML is this: XML is a software- and hardware-independent tool for carrying information.

XML is a W3C Recommendation
XML became a W3C Recommendation on 10 February 1998.

XML is Everywhere
We have been participating in XML development since its creation. It has been amazing to see how quickly the XML standard has developed, and how quickly a large number of software vendors have adopted the standard. XML is now as important for the Web as HTML was to the foundation of the Web. XML is everywhere. It is the most common tool for data transmission between all sorts of applications, and is becoming more and more popular for storing and describing information.
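The point that "someone must write a piece of software" to display the note can be sketched in a few lines. This is only an illustration in Python using the standard xml.etree.ElementTree module; the function name format_note and the output layout are assumptions made for this example, not part of the original notes.

```python
import xml.etree.ElementTree as ET

# The note document from the text, with its tags written out.
NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def format_note(xml_text: str) -> str:
    """Turn the note markup into a human-readable message (hypothetical layout)."""
    note = ET.fromstring(xml_text)  # parse the plain text into an element tree
    return (
        f"To: {note.findtext('to')}\n"
        f"From: {note.findtext('from')}\n"
        f"{note.findtext('heading')}: {note.findtext('body')}"
    )


if __name__ == "__main__":
    print(format_note(NOTE_XML))
```

The XML itself stays pure data; every decision about how it looks lives in the program that reads it.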
How Can XML be Used? XML is used in many aspects of web development, often to simplify data storage and sharing.
XML Separates Data from HTML
If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the data changes. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for layout and display, and be sure that changes in the underlying data will not require any changes to the HTML. With a few lines of JavaScript, you can read an external XML file and update the data content of your HTML.

XML Simplifies Data Sharing
In the real world, computer systems and databases contain data in incompatible formats. XML data is stored in plain text format. This provides a software- and hardware-independent way of storing data, which makes it much easier to create data that different applications can share.

XML Simplifies Data Transport
With XML, data can easily be exchanged between incompatible systems. One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet. Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.

XML Simplifies Platform Changes
Upgrading to new systems (hardware or software platforms) is always very time consuming. Large amounts of data must be converted, and incompatible data is often lost. XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.

XML Makes Your Data More Available
Since XML is independent of hardware, software and application, XML can make your data more available and useful. Different applications can access your data, not only in HTML pages, but also from XML data sources. With XML, your data can be available to all kinds of "reading machines" (handheld computers, voice machines, news feeds, etc.), and can be made more accessible to blind people or people with other disabilities.
XML is Used to Create New Internet Languages
A lot of new Internet languages are created with XML. Here are some examples:
• XHTML, a reformulation of HTML as an XML application
• WSDL for describing available web services
• WML, the markup language used with WAP on handheld devices
• RSS for news feeds
• RDF and OWL for describing resources and ontologies
• SMIL for describing multimedia for the web

If Developers Have Sense
If they DO have sense, future applications will exchange their data in XML.
The future might give us word processors, spreadsheet applications and databases that can read each other's data in a pure text format, without any conversion utilities in between. We can only pray that all the software vendors will agree.

XML Tree
XML documents form a tree structure that starts at "the root" and branches to "the leaves".

An Example XML Document
XML documents use a self-describing and simple syntax:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set). The next line describes the root element of the document (like saying: "this document is a note"):

    <note>

The next 4 lines describe 4 child elements of the root (to, from, heading, and body):

    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>

And finally the last line defines the end of the root element:

    </note>

You can assume, from this example, that the XML document contains a note to Tove from Jani. Don't you agree that XML is pretty self-descriptive?

XML Documents Form a Tree Structure
XML documents must contain a root element. This element is "the parent" of all other elements. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub elements (child elements):

    <root>
      <child>
        <subchild>.....</subchild>
      </child>
    </root>

The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters). All elements can have text content and attributes (just like in HTML).
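The parent/child/sibling vocabulary maps directly onto how a parser exposes a document. The following is a small sketch in Python with xml.etree.ElementTree; the element names root, child and subchild and the helper outline are placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder document: one root, two children (siblings), three leaves.
DOC = """<root>
  <child>
    <subchild>first leaf</subchild>
    <subchild>second leaf</subchild>
  </child>
  <child>
    <subchild>third leaf</subchild>
  </child>
</root>"""


def outline(element, depth=0):
    """Return one 'depth:tag' entry per element, parents before their children."""
    entries = [f"{depth}:{element.tag}"]
    for child in element:  # iterating an element yields its direct children
        entries.extend(outline(child, depth + 1))
    return entries


root = ET.fromstring(DOC)
tree_outline = outline(root)

# The direct children of one parent are each other's siblings.
root_children = [child.tag for child in root]

if __name__ == "__main__":
    print("\n".join(tree_outline))
```

The recursion in outline mirrors the tree: each call handles one element and then descends into its children, so the depth number grows from the root toward the leaves.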
The image above represents one book in the XML below:

    <bookstore>
      <book>
        <title>Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
      </book>
      <book>
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
      </book>
      <book>
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
      </book>
    </bookstore>

The root element in the example is <bookstore>. All <book> elements in the document are contained within <bookstore>. The <book> element has 4 children: <title>, <author>, <year>, <price>.

XML Syntax Rules
The syntax rules of XML are very simple and logical. The rules are easy to learn, and easy to use.

All XML Elements Must Have a Closing Tag
In HTML, you will often see elements that don't have a closing tag:

    <p>This is a paragraph
    <p>This is another paragraph

In XML, it is illegal to omit the closing tag. All elements must have a closing tag:

    <p>This is a paragraph</p>
    <p>This is another paragraph</p>

Note: You might have noticed from the previous example that the XML declaration did not have a closing tag. This is not an error. The declaration is not a part of the XML document itself, and it has no closing tag.

XML Tags are Case Sensitive
XML elements are defined using XML tags. XML tags are case sensitive. With XML, the tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case:

    <Message>This is incorrect</message>
    <message>This is correct</message>

Note: "Opening and closing tags" are often referred to as "Start and end tags". Use whatever you prefer. It is exactly the same thing.

XML Elements Must be Properly Nested
In HTML, you might see improperly nested elements:

    <b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

    <b><i>This text is bold and italic</i></b>

In the example above, "properly nested" simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.

XML Documents Must Have a Root Element
XML documents must contain one element that is the parent of all other elements. This element is called the root element.

    <root>
      <child>
        <subchild>.....</subchild>
      </child>
    </root>

XML Attribute Values Must be Quoted
XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct:

    <note date=10/01/2008>
      <to>Tove</to>
      <from>Jani</from>
    </note>

    <note date="10/01/2008">
      <to>Tove</to>
      <from>Jani</from>
    </note>

The error in the first document is that the date attribute in the note element is not quoted.
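Each of the syntax rules above can be checked mechanically, because a conforming parser must reject a document that breaks any of them. Below is a minimal sketch in Python using xml.etree.ElementTree; the helper is_well_formed and the sample strings (including the date value) are illustrative assumptions rather than material from the original notes.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; only the last one obeys all of them.
SAMPLES = {
    "missing closing tag": "<note><p>This is a paragraph</note>",
    "mismatched case": "<Message>This is incorrect</message>",
    "improper nesting": "<p><b><i>bold and italic</b></i></p>",
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "unquoted attribute": "<note date=10/01/2008><to>Tove</to></note>",
    "well formed": '<note date="10/01/2008"><to>Tove</to></note>',
}

results = {label: is_well_formed(text) for label, text in SAMPLES.items()}

if __name__ == "__main__":
    for label, ok in results.items():
        print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Note that this checks well-formedness only. Whether the elements are the ones a DTD or schema allows is a separate validation step.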
Entity References
Some characters have a special meaning in XML. If you place a character like "<" inside an XML element, it will generate an error, because the parser interprets it as the start of a new element. Write such characters as entity references instead, for example &lt; for "<" and &amp; for "&".

    <bookstore>
      <book category="CHILDREN">
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
      </book>
      <book>
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
      </book>
    </bookstore>

In the example above, <bookstore> and <book> have element contents, because they contain other elements. <author> has text content because it contains text. In the example above only <book> has an attribute (category="CHILDREN").

XML Naming Rules
XML elements must follow these naming rules:
• Names can contain letters, numbers, and other characters
• Names cannot start with a number or punctuation character
• Names cannot start with the letters xml (or XML, or Xml, etc)
• Names cannot contain spaces
Any name can be used, no words are reserved.

Best Naming Practices
Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>.
Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>.
Avoid "-" characters. If you name something "first-name," some software may think you want to subtract name from first.
Avoid "." characters. If you name something "first.name," some software may think that "name" is a property of the object "first."
Avoid ":" characters. Colons are reserved to be used for something called namespaces (more later).
XML documents often have a corresponding database. A good practice is to use the naming rules of your database for the elements in the XML documents.
Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software vendor doesn't support them.

XML Elements are Extensible
XML elements can be extended to carry more information. Look at the following XML example:

    <note>
      <to>Tove</to>
      <from>Jani</from>
      <body>Don't forget me this weekend!</body>
    </note>
Let's imagine that we created an application that extracted the , , and elements from the XML document to produce this output: MESSAGE To: Tove From: Jani Don't forget me this weekend! Imagine that the author of the XML document added some extra information to it: 2008‐01‐10 Tove Jani Reminder Don't forget me this weekend! Should the application break or crash? No. The application should still be able to find the , , and elements in the XML document and produce the same output. One of the beauties of XML, is that it can often be extended without breaking applications. XML Attributes XML elements can have attributes in the start tag, just like HTML. Attributes provide additional information about elements. XML Attributes From HTML you will remember this: . The "src" attribute provides additional information about the element. In HTML (and in XML) attributes provide additional information about elements: Attributes often provide information that is not a part of the data. In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element: computer.gif XML Attributes Must be Quoted Attribute values must always be enclosed in quotes, but either single or double quotes can be used. For a person's sex, the person tag can be written like this: or like this: If the attribute value itself contains double quotes you can use single quotes, like in this example: or you can use character entities:
XML Elements vs. Attributes Take a look at these examples: Anna Smith female Anna Smith In the first example sex is an attribute. In the last, sex is an element. Both examples provide the same information. There are no rules about when to use attributes and when to use elements. Attributes are handy in HTML. In XML my advice is to avoid them. Use elements instead. My Favorite Way The following three XML documents contain exactly the same information: A date attribute is used in the first example: Tove Jani Reminder Don't forget me this weekend! A date element is used in the second example: 10/01/2008 Tove Jani Reminder Don't forget me this weekend! An expanded date element is used in the third: (THIS IS MY FAVORITE): 10 01 2008 Tove Jani Reminder Don't forget me this weekend!
Avoid XML Attributes? Some of the problems with using attributes are: • attributes cannot contain multiple values (elements can) • attributes cannot contain tree structures (elements can) • attributes are not easily expandable (for future changes) Attributes are difficult to read and maintain. Use elements for data. Use attributes for information that is not relevant to the data.Don't end up like this: XML Attributes for Metadata Sometimes ID references are assigned to elements. These IDs can be used to identify XML elements in much the same way as the ID attribute in HTML. This example demonstrates this: Tove Jani Reminder Don't forget me this weekend! Jani Tove Re: Reminder I will not The ID above is just an identifier, to identify the different notes. It is not a part of the note itself.What I'm trying to say here is that metadata (data about data) should be stored as attributes, and that data itself should be stored as elements.
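The split between metadata in attributes and data in elements is easy to see from the reading side. The sketch below is in Python with xml.etree.ElementTree; the wrapper element messages, the id values 501 and 502, and the function notes_by_id are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

# Two notes; the id attribute identifies each note but is not part of its content.
MESSAGES = """<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not</body>
  </note>
</messages>"""


def notes_by_id(xml_text: str) -> dict:
    """Index the notes by their id attribute, keeping the heading as the value."""
    root = ET.fromstring(xml_text)
    return {note.get("id"): note.findtext("heading") for note in root.iter("note")}


if __name__ == "__main__":
    for note_id, heading in notes_by_id(MESSAGES).items():
        print(note_id, heading)
```

The attribute is read with get() and serves only as a lookup key, while everything a reader of the note cares about comes from child elements.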
XML Validation XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid" XML. Well Formed XML Documents A "Well Formed" XML document has correct XML syntax. The syntax rules were described in the previous chapters: • XML documents must have a root element • XML elements must have a closing tag • XML tags are case sensitive • XML elements must be properly nested • XML attribute values must be quoted 13
Tove Jani Reminder Don't forget me this weekend! Valid XML Documents A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD): Tove Jani Reminder Don't forget me this weekend! The DOCTYPE declaration in the example above, is a reference to an external DTD file. The content of the file is shown in the paragraph below. XML DTD The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements: ]> XML Schema W3C supports an XML‐based alternative to DTD, called XML Schema:
A General XML Validator To help you check the syntax of your XML files, we have created an XML validator to syntax‐ check your XML. Viewing XML Files Raw XML files can be viewed in all major browsers. Don't expect XML files to be displayed as HTML pages.
Viewing XML Files ‐ Tove Jani Reminder Don't forget me this weekend! Look at this XML file: note.xml The XML document will be displayed with color‐coded root and child elements. A plus (+) or minus sign (‐) to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and ‐ signs), select "View Page Source" or "View Source" from the browser menu. Note: In Chrome, Opera, and Safari, only the element text will be displayed. To view the raw XML, you must right click the page and select "View Source"
Why Does XML Display Like This? XML documents do not carry information about how to display the data. Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like describes an HTML table or a dining table. Without any information about how to display the data, most browsers will just display the XML document as it is. In the next chapters, we will take a look at different solutions to the display problem, using CSS, XSLT and JavaScript. Displaying XML with CSS With CSS (Cascading Style Sheets) you can add display information to an XML document. Displaying your XML Files with CSS? It is possible to use CSS to format an XML document. Below is an example of how to use a CSS style sheet to format an XML document: Below is a fraction of the XML file. The second line links the XML file to the CSS file: 15
Empire Burlesque Bob Dylan USA Columbia 10.90 1985 Hide your heart Bonnie Tyler UK CBS Records 9.90 1988 . . . Formatting XML with CSS is not the most common method. W3C recommend using XSLT instead. See the next chapter. Displaying XML with XSLT With XSLT you can transform an XML document into HTML. Displaying XML with XSLT XSLT is the recommended style sheet language of XML.XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS. XSLT can be used to transform XML into HTML, before it is displayed by a browser: Display XML with XSLT
Transforming XML with XSLT on the Server
In the example above, the XSLT transformation is done by the browser, when the browser reads the XML file. Different browsers may produce different results when transforming XML with XSLT. To reduce this problem, the XSLT transformation can be done on the server.

The XMLHttpRequest Object
With an XMLHttpRequest you can communicate with your server from inside a web page.

What is the XMLHttpRequest Object?
The XMLHttpRequest object is the developer's dream, because you can:
• Update a web page with new data without reloading the page
• Request and receive new data from a server after the page has loaded
• Communicate with a server in the background
XMLHttpRequest Example
A typical use is a suggestion box: as the user types in an input field, an HTTP request is sent to the server and matching name suggestions from a name list are returned.

Creating an XMLHttpRequest Object
Creating an XMLHttpRequest object is done with one single line of JavaScript.

In all modern browsers:

    var xmlhttp = new XMLHttpRequest();

In older Microsoft browsers (IE 5 and 6):

    var xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");

In the next chapter, we will use the XMLHttpRequest object to retrieve XML information from a server.
The XMLHttpRequest object is supported in all modern browsers
Is the XMLHttpRequest Object a W3C Standard? The XMLHttpRequest object is not specified in any W3C recommendation. However, the W3C DOM Level 3 "Load and Save" specification contains some similar functionality, but these are not implemented in any browsers yet.
XML Parser Most browsers have a built‐in XML parser to read and manipulate XML. The parser converts XML into a JavaScript accessible object (the XML DOM). XML Parser The XML DOM contains methods (functions) to traverse XML trees, access, insert, and delete nodes. However, before an XML document can be accessed and manipulated, it must be loaded into an XML DOM object. An XML parser reads XML, and converts it into an XML DOM object that can be accessed with JavaScript. Most browsers have a built‐in XML parser. Load an XML Document The following JavaScript fragment loads an XML document ("books.xml"): Example 17
    var xhttp;
    if (window.XMLHttpRequest) {
      xhttp = new XMLHttpRequest();
    } else {
      // Internet Explorer 5/6
      xhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    xhttp.open("GET", "books.xml", false); // false = synchronous request
    xhttp.send("");
    var xmlDoc = xhttp.responseXML;

Code explained:
• Create an XMLHTTP object
• Open the XMLHTTP object
• Send an XML HTTP request to the server
• Set the response as an XML DOM object
Load an XML String
The following code loads and parses an XML string:

Example

    var parser, xmlDoc;
    if (window.DOMParser) {
      parser = new DOMParser();
      xmlDoc = parser.parseFromString(text, "text/xml");
    } else {
      // Internet Explorer
      xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
      xmlDoc.async = false; // load synchronously; a boolean, not the string "false"
      xmlDoc.loadXML(text);
    }

Note: Internet Explorer uses the loadXML() method to parse an XML string, while other browsers use the DOMParser object.
Access Across Domains
For security reasons, modern browsers block access across domains by default. This means that both the web page and the XML file it tries to load must be located on the same server, unless the other server explicitly permits the request. The examples on W3Schools all open XML files located on the W3Schools domain. If you want to use the example above on one of your web pages, the XML files you load must be located on your own server.

The XML DOM
In the next chapter of this tutorial, you will learn how to access and retrieve data from the XML document object (the XML DOM).
XML DOM The DOM (Document Object Model) defines a standard way for accessing and manipulating documents. The XML DOM The XML DOM (XML Document Object Model) defines a standard way for accessing and manipulating XML documents. The DOM views XML documents as a tree‐structure. All elements can be accessed through the DOM tree. Their content (text and attributes) can be modified or deleted, and new elements can be created. The elements, their text, and their attributes are all known as nodes. In the examples below we use the following DOM reference to get the text from the element: xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue • xmlDoc ‐ the XML document created by the parser. • getElementsByTagName("to")[0] ‐ the first element • childNodes[0] ‐ the first child of the element (the text node) • nodeValue ‐ the value of the node (the text itself) The HTML DOM The HTML DOM (HTML Document Object Model) defines a standard way for accessing and manipulating HTML documents. All HTML elements can be accessed through the HTML DOM. In the examples below we use the following DOM reference to change the text of the HTML element where id="to": document.getElementById("to").innerHTML= • document ‐ the HTML document • getElementById("to") ‐ the HTML element where id="to" • innerHTML ‐ the inner text of the HTML element Load an XML File ‐ A Cross browser Example The following code loads an XML document ("note.xml") into the XML parser: Example W3Schools Internal Note To: From: Message: if (window.XMLHttpRequest) { xhttp=new XMLHttpRequest() } 19
else { xhttp=new ActiveXObject("Microsoft.XMLHTTP") } xhttp.open("GET","note.xml",false); xhttp.send(""); xmlDoc=xhttp.responseXML; document.getElementById("to").innerHTML= xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue; document.getElementById("from").innerHTML= xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue; document.getElementById("message").innerHTML= xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
Important Note To extract the text "Jani" from the XML, the syntax is: getElementsByTagName("from")[0].childNodes[0].nodeValue In the XML example there is only one tag, but you still have to specify the array index [0], because the XML parser method getElementsByTagName() returns an array of all nodes. Load an XML String ‐ A Cross browser Example The following code loads and parses an XML string: Example W3Schools Internal Note To: From: Message: text=""; text=text+"Tove"; text=text+"Jani"; text=text+"Reminder"; text=text+"Don't forget me this weekend!"; text=text+""; if (window.DOMParser) { parser=new DOMParser(); xmlDoc=parser.parseFromString(text,"text/xml"); } 20
else // Internet Explorer { xmlDoc=new ActiveXObject("Microsoft.XMLDOM"); xmlDoc.async="false"; xmlDoc.loadXML(text); } document.getElementById("to").innerHTML= xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue; document.getElementById("from").innerHTML= xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue; document.getElementById("message").innerHTML= xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue; Note: Internet Explorer uses the loadXML() method to parse an XML string, while other browsers use the DOMParser object.
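The DOM calls used above are not specific to browsers. As a hedged illustration, Python's standard xml.dom.minidom module exposes the same getElementsByTagName, childNodes and nodeValue members, so the lookup chain can be tried without a web page. The helper first_text is a name invented for this sketch.

```python
from xml.dom.minidom import parseString

# The note built up as a string, as in the JavaScript example.
text = (
    "<note>"
    "<to>Tove</to>"
    "<from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body>"
    "</note>"
)

xml_doc = parseString(text)  # parse the string into a DOM document


def first_text(doc, tag):
    """Follow the chain: first <tag> element -> its first child (a text node) -> value."""
    return doc.getElementsByTagName(tag)[0].childNodes[0].nodeValue


fields = {tag: first_text(xml_doc, tag) for tag in ("to", "from", "body")}

if __name__ == "__main__":
    for tag, value in fields.items():
        print(f"{tag}: {value}")
```

As in the browser, getElementsByTagName returns a list even when only one element matches, which is why the index [0] is still required.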
XML to HTML This chapter explains how to display XML data as HTML. Display XML Data in HTML In the example below, we loop through an XML file (cd_catalog.xml), and display each CD element as an HTML table row: Example if (window.XMLHttpRequest) { xhttp=new XMLHttpRequest(); } else // Internet Explorer 5/6 { xhttp=new ActiveXObject("Microsoft.XMLHTTP"); } xhttp.open("GET","cd_catalog.xml",false); xhttp.send(""); xmlDoc=xhttp.responseXML; document.write(""); var x=xmlDoc.getElementsByTagName("CD"); for (i=0;i In the example above, everything inside the CDATA section is ignored by the parser. Notes on CDATA sections: A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed. The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks. XML Encoding XML documents can contain non ASCII characters, like Norwegian æ ø å , or French ê è é. To avoid errors, specify the XML encoding, or save XML files as Unicode. XML Encoding Errors If you load an XML document, you can get two different errors indicating encoding problems: An invalid character was found in text content. You get this error if your XML contains non ASCII characters, and the file was saved as single‐ byte ANSI (or ASCII) with no encoding specified. Single byte XML file with encoding attribute. Same single byte XML file with no encoding attribute. Switch from current encoding to specified encoding not supported. You get this error if your XML file was saved as double‐byte Unicode (or UTF‐16) with a single‐ byte encoding (Windows‐1252, ISO‐8859‐1, UTF‐8) specified. You also get this error if your XML file was saved with single‐byte ANSI (or ASCII), with double‐ byte encoding (UTF‐16) specified. Double byte XML file without encoding. Same double byte XML file with single byte encoding. Windows Notepad Windows Notepad save files as single‐byte ANSI (ASCII) by default. 28
If you select "Save as...", you can specify double‐byte Unicode (UTF‐16). Save the XML file below as Unicode (note that the document does not contain any encoding attribute): Jani Tove Norwegian: æøå. French: êèé The file above, note_encode_none_u.xml will NOT generate an error. But if you specify a single‐ byte encoding it will. The following encoding (open it), will give an error message: The following encoding (open it), will give an error message: The following encoding (open it), will give an error message: The following encoding (open it), will NOT give an error: Conclusion • Always use the encoding attribute • Use an editor that supports encoding • Make sure you know what encoding the editor uses • Use the same encoding in your encoding attribute
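The first encoding error described above can be reproduced outside a browser. The sketch below uses Python's xml.etree.ElementTree and simulates an editor saving the same characters as single-byte ISO-8859-1, once with and once without an encoding attribute. The element names and the helper try_parse are assumptions for this example, and the error wording of a browser will differ from the Python parser's.

```python
import xml.etree.ElementTree as ET

BODY = "<note><body>Norwegian: æøå. French: êèé</body></note>"

# Bytes as an editor would write them when saving in ISO-8859-1 (Latin-1).
declared = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY).encode("iso-8859-1")
undeclared = BODY.encode("iso-8859-1")  # same characters, but no encoding declared
as_utf8 = BODY.encode("utf-8")          # UTF-8 is the default when nothing is declared


def try_parse(data: bytes):
    """Return the body text, or None if the parser rejects the bytes."""
    try:
        return ET.fromstring(data).findtext("body")
    except ET.ParseError:
        return None


if __name__ == "__main__":
    for label, data in (("declared", declared), ("undeclared", undeclared), ("utf-8", as_utf8)):
        print(label, "->", try_parse(data))
```

Without a declaration the parser assumes UTF-8, and the Latin-1 bytes for æøå are not valid UTF-8, so the document is rejected. This is why the conclusion recommends always stating the encoding the editor actually used.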
XML on the Server XML files are plain text files just like HTML files. XML can easily be stored and generated by a standard web server. Storing XML Files on the Server XML files can be stored on an Internet server exactly the same way as HTML files. Start Windows Notepad and write the following lines: Jani Tove Remember me this weekend Save the file on your web server with a proper name like "note.xml". Generating XML with ASP XML can be generated on a server without any installed XML software. 29
To generate an XML response from the server ‐ simply write the following code and save it as an ASP file on the web server: Note that the content type of the response must be set to "text/xml". Generating XML with PHP To generate an XML response from the server using PHP, use following code: Note that the content type of the response header must be set to "text/xml". Generating XML From a Database XML can be generated from a database without any installed XML software. To generate an XML database response from the server, simply write the following code and save it as an ASP file on the web server: Transforming XML with XSLT on the Server This ASP transforms an XML file to XHTML on the server: `person', `parts' ‐> `part'). The order of the fields is again immaterial. The field A field is represented as an element node with a data node as its only child: d
The DTD specifies that the gatech_student element has two child elements, name and age, that contain character data, as well as a gtnum attribute that contains character data.

XML Data Reduced (XDR): DTDs proved to be inadequate for the needs of users of XML for a number of reasons. The main criticisms of DTDs were that they use a different syntax from XML and that they have no support for datatypes. XDR, a recommendation for XML schemas, was submitted to the W3C by the Microsoft Corporation as a potential XML schema standard but was eventually rejected. XDR tackled some of the problems of DTDs by being XML based as well as supporting a number of datatypes analogous to those used in relational database management systems and popular programming languages. Below is an XML schema, using XDR, for the above XML fragment.

[Listing: XDR FOR SAMPLE XML FRAGMENT]

The above schema specifies types for a name element that contains a string as its content, an age element that contains an unsigned integer value of size one byte (i.e. between 0 and 255), and a gtnum attribute that is a string value. It also specifies a gatech_student element that has one occurrence each of a name and an age element in sequence as well as a gtnum attribute.

XML Schema Definitions (XSD): The W3C XML schema recommendation provides a sophisticated means of describing the structure and constraints on the content model of XML documents. W3C XML schema supports more datatypes than XDR, allows for the creation of custom data types, and supports object oriented programming concepts like inheritance and polymorphism. Currently XDR is used more widely than W3C XML schema, but this is primarily because the XML Schema recommendation is fairly new and will thus take time to become accepted by the software industry.
[Listing: XSD FOR SAMPLE XML FRAGMENT]

The above schema specifies a gatech_student complex type (meaning it can have elements as children) that contains a name and an age element in sequence as well as a gtnum attribute. The name element has to have a string as content, the age element has an unsigned integer value, while the gtnum attribute has to be matched by a regular expression that matches the letters "gt" followed by 3 digits and a letter.

The above examples show that DTDs give the least control over how one can constrain and structure data within an XML document while W3C XML schemas give the most.

XML Querying: XPath and XQuery
It is sometimes necessary to extract subsets of the data stored within an XML document. A number of languages have been created for querying XML documents including Lorel, Quilt, UnQL, XDuce, XML-QL, XPath, XQL, XQuery and YaTL. Since XPath is already a W3C recommendation while XQuery is on its way to becoming one, the focus of this section will be on both these languages. Both languages can be used to retrieve and manipulate data from an XML document.

XML Path Language (XPath): XPath is a language for addressing parts of an XML document that utilizes a syntax resembling the hierarchical paths used to address parts of a filesystem or URL. XPath also supports the use of functions for interacting with the selected data from the document. It provides functions for accessing information about document nodes as well as for the manipulation of strings, numbers and booleans. XPath is extensible with regard to functions, which allows developers to add functions that manipulate the data retrieved by an XPath query to the library of functions available by default.
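Path expressions of this kind can be tried with Python's standard library, with one caveat: xml.etree.ElementTree implements only a subset of XPath, and its paths are written relative to the element being searched rather than from the document root. The sketch below is an approximation under those limits. The wrapper element students, the student names, and the value gt000x are hypothetical data made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one student with a gtnum attribute, one without.
DOC = """<students>
  <gatech_student gtnum="gt000x">
    <name>George Burdell</name>
    <age>21</age>
  </gatech_student>
  <gatech_student>
    <name>Unregistered Visitor</name>
    <age>30</age>
  </gatech_student>
</students>"""

root = ET.fromstring(DOC)  # root is the <students> element

# Child steps: name elements that are children of gatech_student elements.
names = [e.text for e in root.findall("./gatech_student/name")]

# Descendant step: age elements at any depth below the root.
ages = [e.text for e in root.findall(".//age")]

# Wildcard: every child element of the first gatech_student.
child_tags = [e.tag for e in root.findall("./gatech_student[1]/*")]

# Attribute predicate: selects the ELEMENTS that carry a gtnum attribute.
registered = root.findall("./gatech_student[@gtnum]")
gtnums = [e.get("gtnum") for e in registered]

if __name__ == "__main__":
    print(names, ages, child_tags, gtnums)
```

The predicate [@gtnum] filters elements and returns them, not the attribute itself; the attribute value is then read from each matching element with get().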
XPath uses a compact, non-XML syntax in order to facilitate the use of XPath within URIs and XML attribute values (this is important for other W3C recommendations like XML Schema and XSLT that use XPath within attributes). XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath is designed to operate on a single XML document, which it views as a tree of nodes, and the values returned by an XPath query are considered conceptually to be nodes. The types of nodes that exist in the XPath data model of a document are text nodes, element nodes, attribute nodes, root nodes, namespace nodes, processing instruction nodes, and comment nodes.
Sample XPath Queries Against Sample XML Fragment
a. /gatech_student/name
Selects all name elements that are children of the root element gatech_student.
b. //age
Selects all age elements in the document.
c. /gatech_student/*
Selects all child elements of the root element gatech_student.
d. /gatech_student[@gtnum]
Selects the root element gatech_student if it has a gtnum attribute. (The attribute itself would be selected by /gatech_student/@gtnum.)
e. //*[name()='age']
Selects all elements that are named "age".
f. /gatech_student/age/ancestor::*
Selects all ancestors of all the age elements that are children of the gatech_student element (which should select the gatech_student element).
2. XML Query Language (XQuery): XQuery is an attempt to provide a query language with the same breadth of functionality and underlying formalism that SQL provides for relational databases. XQuery is a functional language where each query is an expression. XQuery expressions fall into seven broad types: path expressions, element constructors, FLWR expressions, expressions involving operators and functions, conditional expressions, quantified expressions, and expressions that test or modify datatypes. The syntax and semantics of the different kinds of XQuery expressions vary significantly, which is a testament to the numerous influences on the design of XQuery. XQuery has a sophisticated type system based on XML Schema datatypes and, unlike XPath, supports the manipulation of document nodes. Also the data model of XQuery is
not only designed to operate on a single XML document but also on a well-formed fragment of a document, a sequence of documents, or a sequence of document fragments. The W3C is also working on an alternate version of XQuery, called XQueryX, that has the same semantics but uses an XML based syntax.
Sample XQuery Queries and Expressions Taken From W3C Working Draft
a. path expressions: XQuery supports path expressions that are a superset of those currently being proposed for the next version of XPath.
    //emp[name="Fred"]/salary * 12
From a document that contains employees and their monthly salaries, extract the annual salary of the employee named "Fred".
    document("zoo.xml")//chapter[2 TO 5]//figure
Find all the figures in chapters 2 through 5 of the document named "zoo.xml".
b. element constructors: In some situations, it is necessary for a query to create or generate elements. Such elements can be embedded directly into a query in an expression called an element constructor.
    {$name}
    {$job}
Generate an element that has an "empid" attribute. The value of the attribute and the content of the element are specified by variables that are bound in other parts of the query.
c. FLWR expressions: A FLWR (pronounced "flower") expression is a query construct composed of FOR, LET, WHERE, and RETURN clauses. A FOR clause is an iteration construct that binds a variable to a sequence of values returned by a query (typically a path expression). A LET clause similarly binds variables to values, but instead of a series of bindings only one binding occurs, similar to an assignment statement in a programming language. A WHERE clause contains one or more predicates that are applied to the nodes returned by preceding LET or FOR clauses. The RETURN clause generates the output of the FLWR expression, which may be any sequence of nodes or primitive values. The RETURN clause is executed once for each node returned by the FOR and LET clauses that passes the WHERE clause.
The results of these multiple executions are concatenated and returned as the result of the expression.
    FOR $b IN document("bib.xml")//book
    WHERE $b/publisher = "Morgan Kaufmann"
    AND $b/year = "1998"
    RETURN $b/title
List the titles of books published by Morgan Kaufmann in 1998.
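The FOR/WHERE/RETURN pattern maps naturally onto a loop with a filter. The Python sketch below mimics the query above over a small, invented bib.xml-style document (the titles and publishers are placeholders, not data from the W3C draft, and titles_by is a hypothetical helper). It illustrates the evaluation order only; it is not XQuery.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for "bib.xml".
BIB = """
<bib>
  <book><title>Book A</title><publisher>Morgan Kaufmann</publisher><year>1998</year></book>
  <book><title>Book B</title><publisher>Morgan Kaufmann</publisher><year>2000</year></book>
  <book><title>Book C</title><publisher>Other Press</publisher><year>1998</year></book>
</bib>
"""

def titles_by(doc, publisher, year):
    """Return the titles of books that match the given publisher and year."""
    return [
        b.findtext("title")                       # RETURN $b/title
        for b in doc.iter("book")                 # FOR $b IN ...//book
        if b.findtext("publisher") == publisher   # WHERE $b/publisher = ...
        and b.findtext("year") == year            # AND $b/year = ...
    ]

doc = ET.fromstring(BIB)
result = titles_by(doc, "Morgan Kaufmann", "1998")
print(result)
```

As in the FLWR description, the results of each passing iteration are collected into a single sequence.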
    {
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")//book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p
    }
List the publishers who have published more than 100 books.
d. conditional expressions: A conditional expression evaluates a test expression and then returns one of two result expressions. If the value of the test expression is true, the value of the first result expression is returned; otherwise, the value of the second result expression is returned.
    FOR $h IN //holding
    RETURN
    {$h/title,
    IF ($h/@type = "Journal")
    THEN $h/editor
    ELSE $h/author
    }
    SORTBY (title)
Make a list of holdings, ordered by title. For journals, include the editor, and for all other holdings, include the author.
e. quantified expressions: XQuery has constructs that are equivalent to the quantifiers used in mathematics and logic. The SOME clause is an existential quantifier used to test whether a series of values contains at least one node that satisfies a predicate. The EVERY clause is a universal quantifier used to test whether all nodes in a series of values satisfy a predicate.
    FOR $b IN //book
    WHERE SOME $p IN $b//para SATISFIES
    (contains($p, "sailing") AND contains($p, "windsurfing"))
    RETURN $b/title
Find titles of books in which both sailing and windsurfing are mentioned in the same paragraph.
    FOR $b IN //book
    WHERE EVERY $p IN $b//para SATISFIES
    contains($p, "sailing")
    RETURN $b/title
Find titles of books in which sailing is mentioned in every paragraph.
f. expressions involving user defined functions: Besides providing a core library of functions similar to those in XPath, XQuery also allows user defined functions to be used to extend the core function library.
    NAMESPACE xsd = "http://www.w3.org/2001/XMLSchema"
    DEFINE FUNCTION depth($e) RETURNS xsd:integer
    {
      # An empty element has depth 1
      # Otherwise, add 1 to max depth of children
      IF (empty($e/*)) THEN 1
      ELSE max(depth($e/*)) + 1
    }
    depth(document("partlist.xml"))
Find the maximum depth of the document named "partlist.xml".
XML and Databases
As was mentioned in the introduction, there is a dichotomy in how XML is used in industry. On one hand there is the document-centric model of XML, where XML is typically used as a means of creating semi-structured documents with irregular content that are meant for human consumption. An example of document-centric usage of XML is XHTML, which is the XML based successor to HTML.
SAMPLE XHTML DOCUMENT
Sample Web Page
My Sample Web Page
All XHTML documents must be well-formed and valid.
The other primary usage of XML is in a data-centric model. In a data-centric model, XML is used as a storage or interchange format for data that is structured, appears in a
regular order and is most likely to be machine processed instead of read by a human. In a data-centric model, the fact that the data is stored or transferred as XML is typically incidental, since it could be stored or transferred in a number of other formats which may or may not be better suited for the task depending on the data and how it is used. An example of a data-centric usage of XML is SOAP. SOAP is an XML based protocol used for exchanging information in a decentralized, distributed environment. A SOAP message consists of three parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined datatypes, and a convention for representing remote procedure calls and responses.
SAMPLE SOAP MESSAGE TAKEN FROM W3C SOAP RECOMMENDATION
DIS
In both models where XML is used, it is sometimes necessary to store the XML in some sort of repository or database that allows for more sophisticated storage and retrieval of the data, especially if the XML is to be accessed by multiple users. Below is a description of storage options based on which model of XML usage is required.
7. Data-centric model: In a data-centric model where data is stored in a relational database or similar repository, one may want to extract data from a database as XML, store XML into a database, or both. For situations where one only needs to extract XML from the database, one may use a middleware application or component that retrieves data from the database and returns it as XML. Middleware components that transform relational data to XML and back vary widely in the functionality they provide and how they provide it.
8. Document-centric model: Content management systems are typically the tool of choice when considering storing, updating and retrieving various XML documents in a shared repository.
A content management system typically consists of a repository that stores a variety of XML documents, an editor, and an engine that provides one or more of the following features:
• version, revision and access control
• ability to reuse documents in different formats
• collaboration
• web publishing facilities
• support for a variety of text editors (e.g. Microsoft Word, Adobe FrameMaker, etc.)
• indexing and search capabilities
Content management systems have been primarily of benefit for workflow management in corporate environments where information sharing is vital, and as a way to manage the creation of web content in a modular fashion, allowing web developers and content creators to perform their tasks with less interdependence than exists in a traditional web authoring environment. Examples of XML based content management systems are SyCOMAX, Content@, Frontier, Entrepid, XDisect, and SiberSafe.
9. Hybrid model: In situations where both document-centric and data-centric models of XML usage will occur, the best data storage choice is usually a native XML database. What actually constitutes a native XML database has been a topic of some debate in various fora, compounded by the blurred lines that many see between XML-enabled databases, XML query engines, XML servers and native XML databases. The most coherent definition so far is one reached by consensus amongst members of the XML:DB mailing list, which defines a native XML database as a database that has an XML document as its fundamental unit of (logical) storage, defines a (logical) model for an XML document (as opposed to the data in that document), and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order.
Described below are two examples of native XML databases, with the intent of showing the breadth of functionality and variety that can be expected in the native XML database arena.
Querying the XML documents within the system is done using XPath, and the documents can be indexed to improve query performance. dbXML is written in Java but supports access from other languages by exposing a CORBA API, thus allowing interaction with any language that supports a CORBA binding. It also ships with a Java implementation of the XML:DB XML Database API, which is designed to be a vendor neutral API for XML databases.
A number of command line tools for managing documents and collections are also provided. dbXML is mostly still in development (version at time of writing was 1.0 beta 2) and does not currently support transactions or the use of schemas, but these features are being developed for future versions.
XML has evolved into a viable alternative for representing data. As more applications use XML, the big question becomes how to combine XML with relational databases. Let's dive deeper into the issues involved in combining XML with databases and look at how all that data can be stored and queried.
XML database types
There are two categories to consider when deciding which type of XML database fits a particular application:
• Data-centric: Products that actually store the data or content in non-XML format
• Document-centric: Products that store complete XML documents in relational tables or on disk in file structures
Data-centric databases store data separately from the XML schema, usually just transforming the original content into relational tables. These products are referred to as XML-enabled databases. If an XML document is needed, the data stored in relational tables can be queried and an XML document created. Most major relational databases (Sybase, Oracle, and SQL Server) fall into this category.
Document-centric databases store the entire XML document in a relational, text, or proprietary format. These are called native XML databases. Two popular native XML databases are Xindice (zeen-dee-chay), an open source product from Apache, and eXist, which is also open source.
Querying, XML style
Support for XPath or XML queries is a primary feature in XML databases. The major relational database vendors provide XPath support, while native XML databases provide support for querying with XPath, usually via the XML:DB API. Finding developers, let alone database administrators, who understand XPath is a problem. However, for a simply structured or hierarchical database, or for XML documents, XPath is more efficient than SQL. Unfortunately, the necessary string and date functions to manipulate the results don't exist in SQL (String and Date functions are used in the XSLT code). For more complete queries, XML Query is more like SQL but is less supported.
For example, the SQL query below can't be represented with XPath:
    SELECT left(name,3) from employees
However, the following SQL:
    SELECT * FROM employees WHERE left(name,3) = 'hoo'
can be queried with something like:
    //employees/name[starts-with(last,'hoo')]
and with this XML Query:
    for $t in document("employeeList.xml")//(employee)/name
    where contains($t/text(), "hoo")
    return $t
Bear in mind that XML:DB and XPath are more efficient at querying XML documents, not relational data structures.
XML support in relational databases
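An XML-enabled database, as described earlier, keeps data in relational tables and generates XML on request. A minimal sketch of that idea follows. The employees table, its columns, the sample rows, and the rows_to_xml helper are all assumptions made for illustration; sqlite3 merely stands in for any relational database, and the element names are arbitrary choices.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(conn):
    """Build an <employees> document from the rows of the employees table."""
    root = ET.Element("employees")
    query = "SELECT id, name FROM employees ORDER BY id"
    for emp_id, name in conn.execute(query):
        emp = ET.SubElement(root, "employee", id=str(emp_id))
        ET.SubElement(emp, "name").text = name
    return root

# Hypothetical in-memory table with two invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(1, "hoover"), (2, "smith")])

doc = rows_to_xml(conn)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)

# Once the data is XML, it can be filtered with a path expression.
matches = [e.text for e in doc.findall("employee/name")
           if e.text.startswith("hoo")]
print(matches)
```

Real products perform this mapping inside the database engine or in middleware; the sketch only shows the direction of the transformation.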
XML-enabled applications support creating information as XML, and reading XML is an important feature. Large vendors like Microsoft, Oracle, and IBM (and more) have succeeded in transforming relational data into XML and have XPath or XML Query implementations. Each platform also offers tools to complement its database offerings. Programming is often required to maintain XML content, and SDKs (Software Development Kits) are available.
Most systems (excluding XML databases) lack methods to directly import or read XML documents. You can program SQL Server 2000 with stored procedures to import XML content directly into one or more tables. DB2 and Oracle have similar functionality. XML-centric applications, such as BizTalk Server and XML Spy, do a much better job of reading XML documents since they act as a bridge between the XML and the database. However, these programs require a serious commitment and substantial investment.
One reason that these applications import XML so easily is that they support XML Schemas and document type definitions (DTDs). As more mature XML applications are developed, translating XML data (reading and writing) based on a DTD or Schema will allow much more flexibility in how the data can be used, because the DTD or Schema is easily mapped to relational tables or the needed data model.
Document content and Web pages
Content delivered on Web sites is still basically stored in static HTML pages or relational databases, even though this type of "informational" content is probably best suited for XML. One of the more popular products that stores such content in XML is Cocoon from Apache. Enhydra is another Java/XML-based application server, and the eXist and Xindice database products easily integrate with Cocoon. For catalogs, documents, and other data, XML delivers on the promise of an efficient data store and transport. More content is available with native XML databases or XML-enabled relational databases.
Web sites and online content will benefit most from a native XML database. For more information, check out the XML:DB Web site.
Is XML a database?
An XML document is a collection of data, and in that sense it is no different from any other file that stores data. As a database format, XML is self-describing and portable, and it can describe data in a tree or graph structure. XML can be seen as a sort of Database Management System (DBMS): it provides storage, schemas, query languages, programming interfaces and so on. However, it lacks features that a real database provides, such as triggers, queries across multiple documents, and multi-user access. The main advantage of XML is that the data is portable and can have nested entries. XML allows you to preserve physical document structure, supports document-level transactions, and lets you execute queries in an XML query language.
Data is transferred between XML documents and a database by mapping the XML document schema to the database schema. Mappings between document schemas and database schemas are performed on attributes and text. Two mappings are generally used to map an XML document schema to the database schema:
i) Table-based mapping
ii) Object-relational mapping
Native XML databases are designed especially to store XML documents. It is always possible to store the data from XML documents in a native XML database, and this is usually done when the data is semi-structured. Although this kind of data can be stored in object-oriented and hierarchical databases, it is usually better to store it in a native XML database, which can retrieve such data much faster than a relational database. Another reason to store data in a native XML database is to exploit XML-specific capabilities, such as executing XML queries.
Advantages of web services built on XML based standards
Web services built on XML based standards have many benefits over web services that are based on RPC. RPCs are platform dependent, whereas web services built using XML standards are platform and language independent. This means they can be used for communication between any types of application residing on any platform. The invocation information is passed to the service provider in the form of an XML document and hence is platform independent. The protocol used for the transfer is HTTP, which is supported by all browsers, so the information about the object to be executed can be passed through the browser itself using HTTP. This is one of the major advantages of XML based web services, as the traffic can easily pass through firewalls.
Standard
Description and Comments
XML 1.0 (4th Ed.)
The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document from the W3C. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.
Namespaces in XML 1.0 (2nd Ed.)
XML namespaces provide a simple method for qualifying element and attribute names used in Extensible Markup Language (XML) documents by associating them with namespaces identified by URI references.
XML Infoset 1.0 (2nd Ed.)
XML Information Set (Infoset) provides a set of definitions for use in other specifications that need to refer to the information in an XML document.
XSLT 1.0
XSLT 1.0 is designed for use as part of XSL, which is a stylesheet language for XML. In addition to XSLT, XSL includes an XML vocabulary for specifying formatting. XSL specifies the styling of an XML document by using XSLT to describe how the document is transformed into another XML document that uses the formatting vocabulary. The rarely used elements xsl:strip‐space and xsl:preserve‐space are currently ignored.
XSLT 2.0
XSLT 2.0 is the long‐awaited upgrade to XSLT 1.0 and includes important new schema‐aware functions, grouping, aggregation, node‐set, "for" loops, and much more. For a detailed description of the new capabilities, please see this comparison. The rarely used elements xsl:strip‐space and xsl:preserve‐space are currently ignored. Also, the attribute input‐type‐annotations is not yet supported.
XPath 1.0
XPath 1.0 is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer.
XPath 2.0
XPath 2.0 is a superset of [XPath 1.0], with the added capability to support a richer set of data types, and to take advantage of the type information that becomes available when documents are validated using XML Schema. For a detailed description of the new capabilities, please see this comparison.
XQuery 1.0
An extension of the XPath 2.0 specification, XQuery is a language for extracting information from XML documents and databases.
XInclude 1.0 (2nd Ed.)
XInclude specifies a processing model and syntax for general purpose inclusion. Inclusion is accomplished by merging a number of XML information sets into a single composite infoset. Specification of the XML documents (infosets) to be merged and control over the merging process is expressed in XML-friendly syntax (elements, attributes, URI references).
XPointer 1.0
XML Pointer Language (XPointer) is the language to be used as the basis for a fragment identifier for any URI reference that locates a resource whose Internet media type is one of text/xml, application/xml, text/xml‐external‐parsed‐entity, or application/xml‐external‐parsed‐entity.
XML Schema 1.0 (2nd Ed.)
XML Schema specifies the XML Schema definition language, which offers facilities for describing the structure and constraining the contents of XML 1.0 documents, including those which exploit the XML Namespace facility. The schema language, which is itself represented in XML 1.0 and uses namespaces, substantially reconstructs and considerably extends the capabilities found in XML 1.0 document type definitions (DTDs).
SOAP 1.2
SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML based protocol that consists of three parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application‐defined datatypes, and a convention for representing remote procedure calls and responses. SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in this document describe how to use SOAP in combination with HTTP and HTTP Extension Framework.
WSDL 1.1
WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate; however, the only bindings described in this document describe how to use WSDL in conjunction with SOAP 1.1, HTTP GET/POST, and MIME.
Resource Description Framework (RDF) is a family of W3C specifications originally designed as a metadata data model, that has come to be used as a general method of modeling information through a variety of syntax formats.
The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine interpretability of Web content than that supported by XML, RDF, and RDF Schema (RDF‐S) by providing additional vocabulary along with a formal semantics. OWL has three increasingly‐expressive sublanguages: OWL Lite, OWL DL, and OWL Full.
XML Catalogs
In order to make optimal use of the information about an XML external resource, there needs to be some interoperable way to map the information in an XML external identifier into a URI reference for the desired resource. This OASIS XML Catalog Standard defines an entity catalog that handles two simple cases: • Mapping an external entity's public identifier and/or system identifier to a URI reference. • Mapping the URI reference of a resource (a namespace name, stylesheet, image, etc.) to another URI reference.
Unicode 4.1.0
The Unicode Standard, Version 4.1.0, defined by: The Unicode Standard, Version 4.0 (Boston, MA, Addison‐Wesley, 2003. ISBN 0‐321‐18578‐1), as amended by Unicode 4.0.1 and Unicode 4.1.0.
UML 2.2
UML is a graphical language for organizing, analyzing, and planning object-oriented or component-based software projects. The UML 2.2 specification defines thirteen major different diagram types and over one thousand graphical and textual language elements, as well as additional extension mechanisms.
XMI 2.1
XMI is a model driven XML Integration framework for defining, interchanging, manipulating and integrating XML data and objects. XMI‐based standards are in use for integrating tools, repositories, applications and data warehouses. XMI provides rules by which a schema can be generated for any valid XMI‐transmissible MOF‐based metamodel.
BPMN 1.0
The Business Process Modeling Notation (BPMN) is a graphical notation that depicts the steps in a business process. BPMN depicts the end to end flow of a business process. The notation has been specifically designed to coordinate the sequence of processes and the messages that flow between different process participants in a related set of activities.
CSS 2.1
CSS 2.1 is a style sheet language that allows authors and users to attach style (e.g., fonts and spacing) to structured documents (e.g., HTML documents and XML applications). By separating the presentation style of documents from the content of documents, CSS 2.1 simplifies Web authoring and site maintenance.
HTML 4.01
HTML 4 supports more multimedia options, scripting languages, style sheets, better printing facilities, and documents that are more accessible to users with disabilities. HTML 4 also takes great strides towards the internationalization of documents, with the goal of making the Web truly World Wide.
JavaScript is a scripting language that is often used for client‐side Web development to write functions that are embedded or included from HTML pages for dynamic presentation features such as pop‐up windows, form validation, and mouse‐over effects. JavaScript is a superset of the ECMA‐262 Edition 3 (ECMAScript) standard scripting language, with only mild differences from the published standard.
EDIFACT D 1993A ‐ D 2007B
EDIFACT is a set of United Nations rules for Electronic Data Interchange for Administration, Commerce and Transport. They comprise a set of internationally agreed standards, directories and guidelines for the electronic interchange of structured data, and in particular that related to trade in goods and services between independent, computerized information systems.
X12 3040 ‐ 5030
ASC X12 brings together business and industry professionals in a cross‐industry forum to develop and support electronic data exchange standards and related documents for the national and international marketplace to enhance business processes, reduce costs and expand organizational reach.
WebDAV stands for "Web‐based Distributed Authoring and Versioning". It is a set of extensions to the HTTP protocol which allows users to collaboratively edit and manage files on remote web servers.
ISO/IEC 9075 defines the SQL database language. The scope of SQL is the definition of data structure and the operations on data stored in that structure. ISO/IEC 9075‐1, ‐2 and ‐11 encompass the minimum requirements of the language. Other parts define extensions.
Output formats: Standard Description and Comments
RTF 1.9
The Rich Text Format (RTF) Specification provides a format for text and graphics interchange that can be used with different output devices, operating environments, and operating systems. Version 1.9.1 of the specification contains the latest updates introduced by Microsoft Office Word 2007.
PDF 1.7
PDF is now a formal open standard known as ISO 32000. Maintained by the International Organization for Standardization, ISO 32000 will continue to be developed with the objective of protecting the integrity and longevity of PDF, providing an open standard for the more than one billion PDF files in existence today.
XSL:FO, is a markup language for XML document formatting which is most often used to generate PDFs. XSL:FO is part of XSL, a set of W3C technologies designed for the transformation and formatting of XML data. The other parts of XSL are XSLT and XPath.
DTD - DOCUMENT TYPE DEFINITION
The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an XML document. A DTD defines the document structure with a list of legal elements and attributes.
DTD Newspaper Example
Introduction to DTD
A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference.
Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
    <!DOCTYPE root-element [element-declarations]>
Example XML document with an internal DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note [
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
    ]>
    <note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
    </note>
The DTD above is interpreted like this:
• !DOCTYPE note defines that the root element of this document is note
• !ELEMENT note defines that the note element contains four elements: "to,from,heading,body"
• !ELEMENT to defines the to element to be of type "#PCDATA"
• !ELEMENT from defines the from element to be of type "#PCDATA"
• !ELEMENT heading defines the heading element to be of type "#PCDATA"
• !ELEMENT body defines the body element to be of type "#PCDATA"
External DTD Declaration
If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
    <!DOCTYPE root-element SYSTEM "filename">
This is the same XML document as above, but with an external DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
    </note>
And this is the file "note.dtd" which contains the DTD:
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
Why Use a DTD?
With a DTD, each of your XML files can carry a description of its own format. With a DTD, independent groups of people can agree to use a standard DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.
DTD - XML Building Blocks
The main building blocks of both XML and HTML documents are elements.
The Building Blocks of XML Documents
Seen from a DTD point of view, all XML documents (and HTML documents) are made up of the following building blocks:
• Elements
• Attributes
• Entities
• PCDATA
• CDATA
Elements
Elements are the main building blocks of both XML and HTML documents. Examples of HTML elements are "body" and "table". Examples of XML elements could be "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img". Examples:
    <body>some text</body>
    <message>some text</message>
Attributes
Attributes provide extra information about elements. Attributes are always placed inside the opening tag of an element. Attributes always come in name/value pairs. The following "img" element has additional information about a source file:
    <img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".
Entities
Some characters have a special meaning in XML, like the less than sign (<)
Jani Tove Re: Reminder I will not!
The ID in these examples is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data. In other words, metadata (data about data) should be stored as attributes, and the data itself should be stored as elements.
DTD - Entities
Entities are variables used to define shortcuts to standard text or special characters.
• Entity references are references to entities
• Entities can be declared internal or external
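Entity expansion can be observed with any XML parser that reads the internal DTD subset. The sketch below uses Python's xml.etree.ElementTree, which expands internal entities but does not validate the document against the DTD. The entity names writer and copyright, their replacement text, and the author element are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring two internal entities.
# The replacement texts are placeholders chosen for this sketch.
DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Jane Doe. ">
  <!ENTITY copyright "Copyright Example Corp.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)

# The parser replaces each entity reference with its declared text.
print(author.text)
```

Because entity expansion happens inside the parser, the application only ever sees the replacement text, never the &name; reference itself.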
An Internal Entity Declaration
Syntax DTD Example: XML example: &writer;&copyright;
Note: An entity has three parts: an ampersand (&), an entity name, and a semicolon (;).
An External Entity Declaration
Syntax Example DTD Example: XML example: &writer;&copyright;
DTD Validation
With Internet Explorer 5+ you can validate your XML against a DTD.
Validating With the XML Parser
If you try to open an XML document, the XML Parser might generate an error. By accessing the parseError object, you can retrieve the error code, the error text, or even the line that caused the error.
Note: The load( ) method is used for files, while the loadXML( ) method is used for strings.
Example
// Requires Internet Explorer (MSXML ActiveX parser).
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // load synchronously
xmlDoc.validateOnParse = true;   // validate against the DTD while parsing
xmlDoc.load("note_dtd_error.xml");
document.write("Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("Error Line: ");
document.write(xmlDoc.parseError.line);
Turn Validation Off
Validation can be turned off by setting the XML parser's validateOnParse property to false.
Example
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // load synchronously
xmlDoc.validateOnParse = false;  // parse without validating against the DTD
xmlDoc.load("note_dtd_error.xml");
document.write("Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("Error Line: ");
document.write(xmlDoc.parseError.line);
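The snippets above depend on the MSXML ActiveX parser, which exists only in Internet Explorer. As a rough, parser-neutral illustration of the same idea (load a document, then report an error code, a reason, and a line), here is a minimal Python sketch built on the standard-library xml.etree.ElementTree module. It is a stand-in under stated assumptions: ElementTree checks well-formedness only and does not validate against a DTD, and the function name parse_report and both sample documents are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def parse_report(xml_text):
    """Return a dict shaped loosely like MSXML's parseError object.

    Only well-formedness is checked; no DTD validation takes place.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, _column = err.position  # line is 1-based
        return {"errorCode": err.code, "reason": str(err), "line": line}
    return {"errorCode": 0, "reason": "", "line": 0}


# Hypothetical sample documents (not taken from the original tutorial).
GOOD_NOTE = "<note><to>Tove</to><from>Jani</from></note>"
BAD_NOTE = "<note>\n<to>Tove</to>\n<from>Jani</Ffrom>\n</note>"

if __name__ == "__main__":
    for label, doc in (("good", GOOD_NOTE), ("bad", BAD_NOTE)):
        report = parse_report(doc)
        print(label, report["errorCode"], report["reason"], report["line"])
```

A validating parser would additionally compare the document against its DTD; that step needs a third-party library and is left out of this sketch.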
DTD ‐ Examples from the internet TV Schedule DTD By David Moisan. Copied from http://www.davidmoisan.org/ ]>
Newspaper Article DTD Copied from http://www.vervet.com/
]> Product Catalog DTD Copied from http://www.vervet.com/ ]>
DTD Summary
This tutorial has taught you how to describe the structure of an XML document. You have learned how to use a DTD to define the legal elements of an XML document, and how a DTD can be declared inside your XML document, or as an external reference. You have learned how to declare the legal elements, attributes, entities, and CDATA sections for XML documents. You have also seen how to validate an XML document against a DTD.
Now You Know DTD, What's Next?
The next step is to learn about XML Schema. XML Schema is used to define the legal elements of an XML document, just like a DTD. We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. XML Schema is an XML-based alternative to DTD. Unlike DTDs, XML Schemas have support for data types and namespaces.
Limitations / Problems with DTD
Top 15 reasons for avoiding DTD:
1. not itself using XML syntax (the SGML heritage can be very unintuitive + if using XML, DTDs could potentially themselves be syntax checked with a "meta DTD")
2. mixed into the XML 1.0 spec (would be much less confusing if specified separately + even non-validating processors must look at the DTD)
3. no constraints on character data (if character data is allowed, any character data is allowed)
4. too simple attribute value models (enumerations are clearly insufficient)
5. cannot mix character data and regexp content models (and the content models are generally hard to use for complex requirements)
6. no support for Namespaces (of course, XML 1.0 was defined before Namespaces)
7. very limited support for modularity and reuse (the entity mechanism is too low-level)
8. no support for schema evolution, extension, or inheritance of declarations (difficult to write, maintain, and read large DTDs, and to define families of related schemas)
9. limited white-space control (xml:space is rarely used)
10. no embedded, structured self-documentation (<!-- comments --> are not enough)
11. content and attribute declarations cannot depend on attributes or element context (many XML languages use that, but their DTDs have to "allow too much")
12. too simple ID attribute mechanism (no points-to requirements, uniqueness scope, etc.)
13. only defaults for attributes, not for elements (but that would often be convenient)
14. cannot specify "any element" or "any attribute" (useful for partial specifications and during schema development)
15. defaults cannot be specified separate from the declarations (would be convenient to have defaults in separate modules)
XML Schema An XML Schema describes the structure of an XML document. In this tutorial you will learn how to create XML Schemas, why XML Schemas are more powerful than DTDs, and how to use XML Schema in your application. XML Schema Example
Introduction to XML Schema
XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an XML document. The XML Schema language is also referred to as XML Schema Definition (XSD).
What You Should Already Know
Before you continue you should have a basic understanding of the following:
• HTML / XHTML
• XML and XML Namespaces
• A basic understanding of DTD
What is an XML Schema?
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
XML Schemas are the Successors of DTDs
We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons:
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
XML Schema is a W3C Recommendation
XML Schema became a W3C Recommendation on 2 May 2001.
Why Use XML Schemas?
XML Schemas are much more powerful than DTDs.
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is the support for data types. With support for data types:
• It is easier to describe allowable document content
• It is easier to validate the correctness of data
• It is easier to work with data from a database
• It is easier to define data facets (restrictions on data)
• It is easier to define data patterns (data formats)
• It is easier to convert data between different data types
XML Schemas use XML Syntax
Another great strength of XML Schemas is that they are written in XML. Some benefits of schemas being written in XML:
• You don't have to learn a new language
• You can use your XML editor to edit your Schema files
• You can use your XML parser to parse your Schema files
• You can manipulate your Schema with the XML DOM
• You can transform your Schema with XSLT
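Because a schema is itself an XML document, any ordinary XML parser can read it. The following minimal Python sketch illustrates that point with the standard-library xml.etree.ElementTree module. The schema text is an assumption: it is reconstructed from the element names used for the "note" document in this tutorial (note, to, from, heading, body) and is not a copy of the original listing. The helper name declared_elements is likewise made up.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed reconstruction of a schema for the "note" document.
NOTE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_elements(xsd_text):
    """List (name, type) for every xs:element declaration, in document order."""
    root = ET.fromstring(xsd_text)
    return [(el.get("name"), el.get("type"))
            for el in root.iter("{%s}element" % XS)]


if __name__ == "__main__":
    for name, type_name in declared_elements(NOTE_XSD):
        print(name, type_name)
```

The "note" declaration reports no type attribute because its complex type is defined inline rather than by name.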
XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element whose data type is "date", with content such as 2004-03-11, ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
XML Schemas are Extensible
XML Schemas are extensible, because they are written in XML. With an extensible Schema definition you can:
• Reuse your Schema in other Schemas
• Create your own data types derived from the standard types
• Reference multiple schemas in the same document
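To make the earlier date example concrete, here is a small Python sketch of the "YYYY-MM-DD" rule. It is a simplification written for this illustration, not a full xs:date implementation: the real type also permits an optional time zone suffix and years with more than four digits, which the hypothetical helper parse_xs_date ignores.

```python
import re
from datetime import date

_XS_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_xs_date(text):
    """Parse a YYYY-MM-DD string; return None if it does not fit that shape."""
    match = _XS_DATE.fullmatch(text.strip())
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # for example month 13 or day 32
        return None


if __name__ == "__main__":
    print(parse_xs_date("2004-03-11"))  # unambiguous: year, month, day
    print(parse_xs_date("03-11-2004"))  # rejected instead of being guessed at
```

The ambiguous form is refused outright, which is the behaviour the paragraph on data communication relies on.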
Well-Formed is not Enough
A well-formed XML document is a document that conforms to the XML syntax rules, like:
• it should begin with the XML declaration (optional in XML 1.0, but recommended)
• it must have one unique root element
• start-tags must have matching end-tags
• element names are case sensitive
• all elements must be closed
• all elements must be properly nested
• all attribute values must be quoted
• entities must be used for special characters
Even if documents are well-formed they can still contain errors, and those errors can have serious consequences. Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers. With XML Schemas, most of these errors can be caught by your validating software.
XSD How To?
XML documents can have a reference to a DTD or to an XML Schema.
A Simple XML Document
Look at this simple XML document called "note.xml": Tove Jani Reminder Don't forget me this weekend!
A DTD File
The following example is a DTD file called "note.dtd" that defines the elements of the XML document above ("note.xml"): The first line defines the note element to have four child elements: "to, from, heading, body". Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".
An XML Schema The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"): The note element is a complex type because it contains other elements. The other elements (to, from, heading, body) are simple types because they do not contain other elements. You will learn more about simple and complex types in the following chapters. A Reference to a DTD This XML document has a reference to a DTD: Tove Jani Reminder Don't forget me this weekend! 94
A Reference to an XML Schema
This XML document has a reference to an XML Schema: Tove Jani Reminder Don't forget me this weekend!
XSD - The <schema> Element
The <schema> element is the root element of every XML Schema: ... ...
The <schema> element may contain some attributes. A schema declaration often looks something like this: ... ...
The following fragment: xmlns:xs="http://www.w3.org/2001/XMLSchema" indicates that the elements and data types used in the schema come from the "http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs: This fragment: targetNamespace="http://www.w3schools.com" indicates that the elements defined by this schema (note, to, from, heading, body.) come from the "http://www.w3schools.com" namespace. This fragment: xmlns="http://www.w3schools.com" indicates that the default namespace is "http://www.w3schools.com". This fragment: elementFormDefault="qualified" indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified. Referencing a Schema in an XML Document This XML document has a reference to an XML Schema: Tove Jani Reminder Don't forget me this weekend! The following fragment: 96
xmlns="http://www.w3schools.com" specifies the default namespace declaration. This declaration tells the schema-validator that all the elements used in this XML document are declared in the "http://www.w3schools.com" namespace. Once you have the XML Schema Instance namespace available: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" you can use the schemaLocation attribute. The attribute value has two parts, separated by whitespace. The first is the namespace to use. The second is the location of the XML schema to use for that namespace: xsi:schemaLocation="http://www.w3schools.com note.xsd"
XSD Simple Elements
XML Schemas define the elements of your XML files.
What is a Simple Element?
A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes. However, the "only text" restriction is quite misleading. The text can be of many different types. It can be one of the types included in the XML Schema definition (boolean, string, date, etc.), or it can be a custom type that you can define yourself. You can also add restrictions (facets) to a data type in order to limit its content, or you can require the data to match a specific pattern.
Defining a Simple Element
The syntax for defining a simple element is: <xs:element name="xxx" type="yyy"/> where xxx is the name of the element and yyy is the data type of the element. XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example Here are some XML elements: Refsnes 36 1970‐03‐27 And here are the corresponding simple element definitions: Default and Fixed Values for Simple Elements Simple elements may have a default value OR a fixed value specified. A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red": A fixed value is also automatically assigned to the element, and you cannot specify another value. In the following example the fixed value is "red": XSD Attributes All attributes are declared as simple types. What is an Attribute? Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.
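Returning to simple elements, the default and fixed rules described above can be sketched as one small Python function. This is an illustration under stated assumptions, not a schema processor: the name effective_value is invented here, an empty string stands for "no value specified", and values are compared as plain strings, whereas a real validator compares them after type conversion.

```python
def effective_value(text, default=None, fixed=None):
    """Resolve a simple element's value under a default OR fixed constraint."""
    if default is not None and fixed is not None:
        raise ValueError("an element may declare default or fixed, not both")
    content = text or ""
    if fixed is not None:
        # A fixed value is assigned automatically; any other value is an error.
        if content not in ("", fixed):
            raise ValueError("value %r differs from fixed value %r"
                             % (content, fixed))
        return fixed
    if content == "" and default is not None:
        # A default is used only when no other value is specified.
        return default
    return content


if __name__ == "__main__":
    print(effective_value("", default="red"))      # default applies
    print(effective_value("blue", default="red"))  # explicit value wins
    print(effective_value("", fixed="red"))        # fixed value applies
```

Calling effective_value("blue", fixed="red") raises ValueError, mirroring the rule that a fixed value cannot be overridden.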
How to Define an Attribute?
The syntax for defining an attribute is: <xs:attribute name="xxx" type="yyy"/> where xxx is the name of the attribute and yyy specifies the data type of the attribute. XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example Here is an XML element with an attribute: Smith And here is the corresponding attribute definition: Default and Fixed Values for Attributes Attributes may have a default value OR a fixed value specified. A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN": A fixed value is also automatically assigned to the attribute, and you cannot specify another value. In the following example the fixed value is "EN":
Optional and Required Attributes Attributes are optional by default. To specify that the attribute is required, use the "use" attribute: Restrictions on Content When an XML element or attribute has a data type defined, it puts restrictions on the element's or attribute's content. If an XML element is of type "xs:date" and contains a string like "Hello World", the element will not validate. With XML Schemas, you can also add your own restrictions to your XML elements and attributes. These restrictions are called facets. You can read more about facets in the next chapter. XSD Restrictions/Facets Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets. Restrictions on Values The following example defines an element called "age" with a restriction. The value of age cannot be lower than 0 or greater than 120: Restrictions on a Set of Values To limit the content of an XML element to a set of acceptable values, we would use the enumeration constraint. 100
The example below defines an element called "car" with a restriction. The only acceptable values are: Audi, Golf, BMW: The example above could also have been written like this: Note: In this case the type "carType" can be used by other elements because it is not a part of the "car" element.
Restrictions on a Series of Values
To limit the content of an XML element to a defined series of numbers or letters, we would use the pattern constraint. The example below defines an element called "letter" with a restriction. The only acceptable value is ONE of the LOWERCASE letters from a to z: The next example defines an element called "initials" with a restriction. The only acceptable value is THREE of the UPPERCASE letters from A to Z:
The next example also defines an element called "initials" with a restriction. The only acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a to z: The next example defines an element called "choice" with a restriction. The only acceptable value is ONE of the following letters: x, y, OR z: The next example defines an element called "prodid" with a restriction. The only acceptable value is FIVE digits in a sequence, and each digit must be in a range from 0 to 9: Other Restrictions on a Series of Values The example below defines an element called "letter" with a restriction. The acceptable value is zero or more occurrences of lowercase letters from a to z: 102
The next example also defines an element called "letter" with a restriction. The acceptable value is one or more pairs of letters, each pair consisting of a lower case letter followed by an upper case letter. For example, "sToP" will be validated by this pattern, but not "Stop" or "STOP" or "stop": The next example defines an element called "gender" with a restriction. The only acceptable value is male OR female: The next example defines an element called "password" with a restriction. There must be exactly eight characters in a row and those characters must be lowercase or uppercase letters from a to z, or a number from 0 to 9: 103
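The pattern facets described in this section behave like regular expressions that must match the whole value. The Python sketch below shows that behaviour with re.fullmatch. The pattern strings are assumptions reconstructed from the prose descriptions, because the original schema fragments are missing from this copy; the dictionary keys and the helper name matches are also invented. XML Schema has its own regular-expression dialect, but for expressions this simple it agrees with Python's.

```python
import re

# Patterns reconstructed from the descriptions in the text (assumed).
PATTERNS = {
    "letter": r"[a-z]",              # one lowercase letter
    "initials": r"[A-Z][A-Z][A-Z]",  # three uppercase letters
    "choice": r"[xyz]",              # one of x, y, or z
    "prodid": r"[0-9]{5}",           # five digits in a sequence
    "letters": r"([a-z])*",          # zero or more lowercase letters
    "pairs": r"([a-z][A-Z])+",       # one or more lower/upper pairs
    "gender": r"male|female",        # male OR female
    "password": r"[a-zA-Z0-9]{8}",   # exactly eight letters or digits
}


def matches(name, value):
    """True if value satisfies the named pattern as a whole."""
    return re.fullmatch(PATTERNS[name], value) is not None


if __name__ == "__main__":
    for candidate in ("sToP", "Stop", "STOP", "stop"):
        print(candidate, matches("pairs", candidate))
```

Only "sToP" passes the pair pattern, which agrees with the description given for that example.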
Restrictions on Whitespace Characters To specify how whitespace characters should be handled, we would use the whiteSpace constraint. This example defines an element called "address" with a restriction. The whiteSpace constraint is set to "preserve", which means that the XML processor WILL NOT remove any white space characters: This example also defines an element called "address" with a restriction. The whiteSpace constraint is set to "replace", which means that the XML processor WILL REPLACE all white space characters (line feeds, tabs, spaces, and carriage returns) with spaces: This example also defines an element called "address" with a restriction. The whiteSpace constraint is set to "collapse", which means that the XML processor WILL REMOVE all white space characters (line feeds, tabs, spaces, carriage returns are replaced with spaces, leading and trailing spaces are removed, and multiple spaces are reduced to a single space): Restrictions on Length To limit the length of a value in an element, we would use the length, maxLength, and minLength constraints. 104
This example defines an element called "password" with a restriction. The value must be exactly eight characters: This example defines another element called "password" with a restriction. The value must be minimum five characters and maximum eight characters:
Restrictions for Datatypes
Constraint        Description
enumeration       Defines a list of acceptable values
fractionDigits    Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length            Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive      Specifies the upper bounds for numeric values (the value must be less than this value)
maxInclusive      Specifies the upper bounds for numeric values (the value must be less than or equal to this value)
maxLength         Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive      Specifies the lower bounds for numeric values (the value must be greater than this value)
minInclusive      Specifies the lower bounds for numeric values (the value must be greater than or equal to this value)
minLength         Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern           Defines the exact sequence of characters that are acceptable
totalDigits       Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace        Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
XSD Complex Elements
A complex element contains other elements and/or attributes.
What is a Complex Element?
A complex element is an XML element that contains other elements and/or attributes. There are four kinds of complex elements:
• empty elements
• elements that contain only other elements
• elements that contain only text
• elements that contain both other elements and text
Note: Each of these elements may contain attributes as well! Examples of Complex Elements A complex XML element, "product", which is empty: A complex XML element, "employee", which contains only other elements: John Smith A complex XML element, "food", which contains only text: Ice cream A complex XML element, "description", which contains both elements and text: It happened on 03.03.99 .... 106
How to Define a Complex Element
Look at this complex XML element, "employee", which contains only other elements: John Smith
We can define a complex element in an XML Schema in two different ways:
1. The "employee" element can be declared directly by naming the element, like this: If you use the method described above, only the "employee" element can use the specified complex type. Note that the child elements, "firstname" and "lastname", are surrounded by the <sequence> indicator. This means that the child elements must appear in the same order as they are declared. You will learn more about indicators in the XSD Indicators chapter.
2. The "employee" element can have a type attribute that refers to the name of the complex type to use: If you use the method described above, several elements can refer to the same complex type, like this:
You can also base a complex element on an existing complex element and add some elements, like this: XSD Empty Elements An empty complex element cannot have contents, only attributes. Complex Empty Elements An empty XML element: The "product" element above has no content at all. To define a type with no content, we must define a type that allows elements in its content, but we do not actually declare any elements, like this: 108
In the example above, we define a complex type with a complex content. The complexContent element signals that we intend to restrict or extend the content model of a complex type, and the restriction of integer declares one attribute but does not introduce any element content. However, it is possible to declare the "product" element more compactly, like this: Or you can give the complexType element a name, and let the "product" element have a type attribute that refers to the name of the complexType (if you use this method, several elements can refer to the same complex type): XSD Elements Only An "elements‐only" complex type contains an element that contains only other elements. Complex Types Containing Elements Only An XML element, "person", that contains only other elements: John Smith 109
You can define the "person" element in a schema, like this: Notice the <xs:sequence> tag. It means that the elements defined ("firstname" and "lastname") must appear in that order inside a "person" element. Or you can give the complexType element a name, and let the "person" element have a type attribute that refers to the name of the complexType (if you use this method, several elements can refer to the same complex type):
XSD Text-Only Elements
A complex text-only element can contain text and attributes.
Complex Text-Only Elements
This type contains only simple content (text and attributes), therefore we add a simpleContent element around the content. When using simple content, you must define an extension OR a restriction within the simpleContent element, like this: .... ....
OR .... .... Tip: Use the extension/restriction element to expand or to limit the base simple type for the element. Here is an example of an XML element, "shoesize", that contains text‐only: 35 The following example declares a complexType, "shoesize". The content is defined as an integer value, and the "shoesize" element also contains an attribute named "country": We could also give the complexType element a name, and let the "shoesize" element have a type attribute that refers to the name of the complexType (if you use this method, several elements can refer to the same complex type): 111
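From the reading side, a text-only complex element such as "shoesize" is simply an element with character data and an attribute, and no children. The Python sketch below shows how an application might read one with the standard-library ElementTree module. The instance document is an assumption (the attribute value "france" is illustrative), and read_shoesize is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed instance; the attribute value is illustrative only.
SHOESIZE_XML = '<shoesize country="france">35</shoesize>'


def read_shoesize(xml_text):
    """Return (size, country) from a text-only element with an attribute."""
    element = ET.fromstring(xml_text)
    if len(element) != 0:
        raise ValueError("simple content must not contain child elements")
    size = int(element.text)          # the content is an integer value
    country = element.get("country")  # the attribute added by the extension
    return size, country


if __name__ == "__main__":
    print(read_shoesize(SHOESIZE_XML))
```

The int() conversion plays the role of the integer base type: non-numeric content raises an error instead of being silently accepted.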
XSD Mixed Content
A mixed complex type element can contain attributes, elements, and text.
Complex Types with Mixed Content
An XML element, "letter", that contains both text and other elements: Dear Mr.John Smith. Your order 1032 will be shipped on 2001-07-13.
The following schema declares the "letter" element: Note: To enable character data to appear between the child-elements of "letter", the mixed attribute must be set to "true". The <xs:sequence> tag means that the elements defined (name, orderid and shipdate) must appear in that order inside a "letter" element. We could also give the complexType element a name, and let the "letter" element have a type attribute that refers to the name of the complexType (if you use this method, several elements can refer to the same complex type):
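To see what mixed content looks like to a program, the Python sketch below parses a "letter" element with the standard-library ElementTree module. The markup is an assumption: it wraps the three values named in the text (name, orderid, shipdate) around the sentence shown above, since the original tags were lost from this copy. In ElementTree, character data before the first child is stored in .text and character data following each child in that child's .tail.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the "letter" instance from the text.
LETTER_XML = (
    "<letter>Dear Mr.<name>John Smith</name>. Your order "
    "<orderid>1032</orderid> will be shipped on "
    "<shipdate>2001-07-13</shipdate>.</letter>"
)

letter = ET.fromstring(LETTER_XML)

leading_text = letter.text                                # text before the first child
child_names = [child.tag for child in letter]             # child elements, in order
text_after = {child.tag: child.tail for child in letter}  # text following each child
full_text = "".join(letter.itertext())                    # all character data, flattened

if __name__ == "__main__":
    print(leading_text)
    print(child_names)
    print(text_after)
    print(full_text)
```

The order of child_names is what the sequence indicator constrains, while the interleaved text is what mixed="true" permits.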
XSD Indicators
We can control HOW elements are to be used in documents with indicators.
Indicators
There are seven indicators:
Order indicators:
• All
• Choice
• Sequence
Occurrence indicators:
• maxOccurs
• minOccurs
Group indicators:
• Group name
• attributeGroup name
Order Indicators
Order indicators are used to define the order of the elements.
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that each child element must occur only once:
Note: When using the <all> indicator you can set the <minOccurs> indicator to 0 or 1 and the <maxOccurs> indicator can only be set to 1 (the <minOccurs> and <maxOccurs> are described later).
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific order:
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group name, and group reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
The <maxOccurs> indicator specifies the maximum number of times an element can occur:
The example above indicates that the "child_name" element can occur a minimum of one time (the default value for minOccurs is 1) and a maximum of ten times in the "person" element.
minOccurs Indicator
The <minOccurs> indicator specifies the minimum number of times an element can occur: The example above indicates that the "child_name" element can occur a minimum of zero times and a maximum of ten times in the "person" element.
Tip: To allow an element to appear an unlimited number of times, use the maxOccurs="unbounded" statement:
A working example: An XML file called "Myfamily.xml": Hege Refsnes Cecilie Tove Refsnes Hege Stale Jim Borge
Stale Refsnes The XML file above contains a root element named "persons". Inside this root element we have defined three "person" elements. Each "person" element must contain a "full_name" element and it can contain up to five "child_name" elements. Here is the schema file "family.xsd": Group Indicators Group indicators are used to define related sets of elements. Element Groups Element groups are defined with the group declaration, like this: ...
You must define an all, choice, or sequence element inside the group declaration. The following example defines a group named "persongroup", that defines a group of elements that must occur in an exact sequence: After you have defined a group, you can reference it in another definition, like this: Attribute Groups Attribute groups are defined with the attributeGroup declaration, like this: ... The following example defines an attribute group named "personattrgroup": 117
After you have defined an attribute group, you can reference it in another definition, like this:
XSD The <any> Element
The <any> element enables us to extend the XML document with elements not specified by the schema.
The following example is a fragment from an XML schema called "family.xsd". It shows a declaration for the "person" element. By using the <any> element we can extend (after <lastname>) the content of "person" with any element: Now we want to extend the "person" element with a "children" element. In this case we can do so, even if the author of the schema above never declared any "children" element. Look at this schema file, called "children.xsd": The XML file below (called "Myfamily.xml"), uses components from two different schemas; "family.xsd" and "children.xsd": Hege Refsnes Cecilie Stale Refsnes
The XML file above is valid because the schema "family.xsd" allows us to extend the "person" element with an optional element after the "lastname" element. The <any> and <anyAttribute> elements are used to make EXTENSIBLE documents! They allow documents to contain additional elements that are not declared in the main XML schema.
XSD The <anyAttribute> Element
The <anyAttribute> element enables us to extend the XML document with attributes not specified by the schema.
The following example is a fragment from an XML schema called "family.xsd". It shows a declaration for the "person" element. By using the <anyAttribute> element we can add any number of attributes to the "person" element: Now we want to extend the "person" element with a "gender" attribute. In this case we can do so, even if the author of the schema above never declared any "gender" attribute. Look at this schema file, called "attribute.xsd": The XML file below (called "Myfamily.xml"), uses components from two different schemas; "family.xsd" and "attribute.xsd": Hege Refsnes Stale Refsnes
The XML file above is valid because the schema "family.xsd" allows us to add an attribute to the "person" element. The <any> and <anyAttribute> elements are used to make EXTENSIBLE documents! They allow documents to contain additional elements and attributes that are not declared in the main XML schema.
XSD Element Substitution
With XML Schemas, one element can substitute another element.
Element Substitution
Let's say that we have users from two different countries: England and Norway. We would like the ability to let the user choose whether he or she would like to use the Norwegian element names or the English element names in the XML document. To solve this problem, we could define a substitutionGroup in the XML schema. First, we declare a head element and then we declare the other elements which state that they are substitutable for the head element. In the example below, the "name" element is the head element and the "navn" element is substitutable for "name". Look at this fragment of an XML schema:
A valid XML document (according to the schema above) could look like this: John Smith or like this: John Smith Blocking Element Substitution To prevent other elements from substituting with a specified element, use the block attribute: Look at this fragment of an XML schema: A valid XML document (according to the schema above) looks like this: 122
John Smith BUT THIS IS NO LONGER VALID: John Smith

Using substitutionGroup
The type of the substitutable elements must be the same as, or derived from, the type of the head element. If the type of the substitutable element is the same as the type of the head element, you do not have to specify the type of the substitutable element.

Note that all elements in the substitutionGroup (the head element and the substitutable elements) must be declared as global elements; otherwise substitution will not work.

What are Global Elements?
Global elements are elements that are immediate children of the "schema" element. Local elements are elements nested within other elements.

An XSD Example
This chapter demonstrates how to write an XML Schema. You will also learn that a schema can be written in different ways.

An XML Document
Let's have a look at this XML document called "shiporder.xml":

John Smith Ola Nordmann Langgt 23 4000 Stavanger Norway Empire Burlesque Special Edition 1 10.90 Hide your heart 1 9.90

The XML document above consists of a root element, "shiporder", that has a required attribute called "orderid". The "shiporder" element contains three different child elements: "orderperson", "shipto" and "item". The "item" element appears twice, and it contains a "title", an optional "note" element, a "quantity", and a "price" element.

The line above: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" declares the XML Schema instance namespace, which tells the XML parser that this document should be validated against a schema. The line: xsi:noNamespaceSchemaLocation="shiporder.xsd" specifies WHERE the schema resides (here it is in the same folder as "shiporder.xml").

Create an XML Schema
Now we want to create a schema for the XML document above. We start by opening a new file that we will call "shiporder.xsd". To create the schema we could simply follow the structure in the XML document and define each element as we find it. We will start with the standard XML declaration followed by the xs:schema element that defines a schema: ...

In the schema above we use the standard namespace (xs), and the URI associated with this namespace is the Schema language definition, which has the standard value of http://www.w3.org/2001/XMLSchema.

Next, we have to define the "shiporder" element. This element has an attribute and it contains other elements, therefore we consider it a complex type. The child elements of the "shiporder" element are surrounded by an xs:sequence element that defines an ordered sequence of sub elements: ...

Then we have to define the "orderperson" element as a simple type (because it does not contain any attributes or other elements). The type (xs:string) is prefixed with the namespace prefix associated with XML Schema, which indicates a predefined schema data type:

Next, we have to define two elements that are of the complex type: "shipto" and "item". We start by defining the "shipto" element:

With schemas we can define the number of possible occurrences for an element with the maxOccurs and minOccurs attributes. maxOccurs specifies the maximum number of occurrences for an element and minOccurs specifies the minimum number of occurrences for an element. The default value for both maxOccurs and minOccurs is 1!

Now we can define the "item" element. This element can appear multiple times inside a "shiporder" element. This is specified by setting the maxOccurs attribute of the "item" element to "unbounded", which means that there can be as many occurrences of the "item" element as the author wishes. Notice that the "note" element is optional. We have specified this by setting the minOccurs attribute to zero:
We can now declare the attribute of the "shiporder" element. Since this is a required attribute we specify use="required". Note: The attribute declarations must always come last:

Here is the complete listing of the schema file called "shiporder.xsd":
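As a hedged illustration of how such a schema is put to work, the sketch below validates a shiporder document from Java with the standard javax.xml.validation API. The schema string is reconstructed from the steps described above rather than copied from the original listing, the "shipto" element is left out for brevity, and the class name, the method name and the order id value "A1" are assumptions made only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ShipOrderValidation {
    // Assumption: a reduced schema rebuilt from the prose (no "shipto" element).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='shiporder'><xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name='orderperson' type='xs:string'/>"
      + "<xs:element name='item' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
      + "<xs:element name='title' type='xs:string'/>"
      + "<xs:element name='note' type='xs:string' minOccurs='0'/>"
      + "<xs:element name='quantity' type='xs:positiveInteger'/>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence>"
      // Attribute declarations come after the content model.
      + "<xs:attribute name='orderid' type='xs:string' use='required'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:schema>";

    // Hypothetical sample document; the second item omits the optional note.
    static final String SAMPLE =
        "<shiporder orderid='A1'>"
      + "<orderperson>John Smith</orderperson>"
      + "<item><title>Empire Burlesque</title><note>Special Edition</note>"
      + "<quantity>1</quantity><price>10.90</price></item>"
      + "<item><title>Hide your heart</title>"
      + "<quantity>1</quantity><price>9.90</price></item>"
      + "</shiporder>";

    // Returns true when the document satisfies the schema, false otherwise.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // the validator reports violations as SAXException
        }
    }

    public static void main(String[] args) {
        System.out.println("sample valid: " + isValid(SAMPLE));
        // Dropping the required attribute should make the document invalid.
        System.out.println("without orderid: "
            + isValid(SAMPLE.replace(" orderid='A1'", "")));
    }
}
```

The same schema could equally be loaded from a file such as "shiporder.xsd"; a string is used here only to keep the sketch self-contained.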
Divide the Schema
The previous design method is very simple, but it can be difficult to read and maintain when documents are complex. The next design method is based on defining all elements and attributes first, and then referring to them using the ref attribute.

Here is the new design of the schema file ("shiporder.xsd"):

Using Named Types
The third design method defines classes or types, which enables us to reuse element definitions. This is done by naming the simpleType and complexType elements, and then pointing to them through the type attribute of the element.

Here is the third design of the schema file ("shiporder.xsd"):

The restriction element indicates that the datatype is derived from a W3C XML Schema namespace datatype. So, the following fragment means that the value of the element or attribute must be a string value:

The restriction element is more often used to apply restrictions to elements. Look at the following lines from the schema above:
This indicates that the value of the element or attribute must be a string, that it must be exactly six characters in a row, and that each of those characters must be a digit from 0 to 9.

XSD String Data Types
String data types are used for values that contain character strings.

String Data Type
The string data type can contain characters, line feeds, carriage returns, and tab characters.

The following is an example of a string declaration in a schema:

An element in your document might look like this: John Smith

Or it might look like this: John Smith

Note: The XML processor will not modify the value if you use the string data type.

NormalizedString Data Type
The normalizedString data type is derived from the string data type. The normalizedString data type also contains characters, but the XML processor will replace line feeds, carriage returns, and tab characters with spaces.

The following is an example of a normalizedString declaration in a schema:

An element in your document might look like this: John Smith

Or it might look like this: John Smith
Note: In the example above the XML processor will replace the tabs with spaces.

Token Data Type
The token data type is also derived from the string data type. The token data type also contains characters, but the XML processor will replace line feeds, carriage returns, and tabs with spaces, remove leading and trailing spaces, and collapse multiple spaces into one.

The following is an example of a token declaration in a schema:

An element in your document might look like this: John Smith

Or it might look like this: John Smith

Note: In the example above the XML processor will remove the tabs.

String Data Types
Note that all of the data types below derive from the String data type (except for string itself)!

Name               Description
ENTITIES
ENTITY
ID                 A string that represents the ID attribute in XML (only used with schema attributes)
IDREF              A string that represents the IDREF attribute in XML (only used with schema attributes)
IDREFS
language           A string that contains a valid language id
Name               A string that contains a valid XML name
NCName
NMTOKEN            A string that represents the NMTOKEN attribute in XML (only used with schema attributes)
NMTOKENS
normalizedString   A string that does not contain line feeds, carriage returns, or tabs
QName
string             A string
token              A string that does not contain line feeds, carriage returns, tabs, leading or trailing spaces, or multiple spaces
Restrictions on String Data Types
Restrictions that can be used with String data types:
• enumeration
• length
• maxLength
• minLength
• pattern (NMTOKENS, IDREFS, and ENTITIES cannot use this constraint)
• whiteSpace
XSD Date and Time Data Types
Date and time data types are used for values that contain date and time.

Date Data Type
The date data type is used to specify a date. The date is specified in the following form "YYYY-MM-DD" where:
• YYYY indicates the year
• MM indicates the month
• DD indicates the day

Note: All components are required!

The following is an example of a date declaration in a schema:

An element in your document might look like this: 2002-09-24

Time Zones
To specify a time zone, you can either enter a date in UTC time by adding a "Z" behind the date, like this: 2002-09-24Z

or you can specify an offset from the UTC time by adding a positive or negative time behind the date, like this: 2002-09-24-06:00 or 2002-09-24+06:00

Time Data Type
The time data type is used to specify a time. The time is specified in the following form "hh:mm:ss" where:
• hh indicates the hour
• mm indicates the minute
• ss indicates the second

Note: All components are required!

The following is an example of a time declaration in a schema:

An element in your document might look like this: 09:00:00

Or it might look like this: 09:30:10.5

Time Zones
To specify a time zone, you can either enter a time in UTC time by adding a "Z" behind the time, like this: 09:30:10Z

or you can specify an offset from the UTC time by adding a positive or negative time behind the time, like this: 09:30:10-06:00 or 09:30:10+06:00

DateTime Data Type
The dateTime data type is used to specify a date and a time. The dateTime is specified in the following form "YYYY-MM-DDThh:mm:ss" where:
• YYYY indicates the year
• MM indicates the month
• DD indicates the day
• T indicates the start of the required time section
• hh indicates the hour
• mm indicates the minute
• ss indicates the second

Note: All components are required!

The following is an example of a dateTime declaration in a schema:

An element in your document might look like this: 2002-05-30T09:00:00

Or it might look like this: 2002-05-30T09:30:10.5

Time Zones
To specify a time zone, you can either enter a dateTime in UTC time by adding a "Z" behind the time, like this: 2002-05-30T09:30:10Z

or you can specify an offset from the UTC time by adding a positive or negative time behind the time, like this: 2002-05-30T09:30:10-06:00 or 2002-05-30T09:30:10+06:00
Duration Data Type
The duration data type is used to specify a time interval. The time interval is specified in the following form "PnYnMnDTnHnMnS" where:
• P indicates the period (required)
• nY indicates the number of years
• nM indicates the number of months
• nD indicates the number of days
• T indicates the start of a time section (required if you are going to specify hours, minutes, or seconds)
• nH indicates the number of hours
• nM indicates the number of minutes
• nS indicates the number of seconds

The following is an example of a duration declaration in a schema:

An element in your document might look like this: P5Y
The example above indicates a period of five years.

Or it might look like this: P5Y2M10D
The example above indicates a period of five years, two months, and 10 days.

Or it might look like this: P5Y2M10DT15H
The example above indicates a period of five years, two months, 10 days, and 15 hours.

Or it might look like this: PT15H
The example above indicates a period of 15 hours.

Negative Duration
To specify a negative duration, enter a minus sign before the P: -P10D
The example above indicates a period of minus 10 days.

Date and Time Data Types

Name         Description
date         Defines a date value
dateTime     Defines a date and time value
duration     Defines a time interval
gDay         Defines a part of a date - the day (DD)
gMonth       Defines a part of a date - the month (MM)
gMonthDay    Defines a part of a date - the month and day (MM-DD)
gYear        Defines a part of a date - the year (YYYY)
gYearMonth   Defines a part of a date - the year and month (YYYY-MM)
time         Defines a time value

Restrictions on Date Data Types
Restrictions that can be used with Date data types:
• enumeration
• maxExclusive
• maxInclusive
• minExclusive
• minInclusive
• pattern
• whiteSpace
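The lexical forms above can be tried out from Java. The following is a minimal sketch that uses the standard javax.xml.datatype package to parse date, time, dateTime and duration values; the class and method names are assumptions chosen for this example, and the values are the ones used in the text.

```java
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.Duration;
import javax.xml.datatype.XMLGregorianCalendar;

class XsdDateTimeSketch {
    private static DatatypeFactory factory() {
        try {
            return DatatypeFactory.newInstance();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses a date, time or dateTime value in its XML Schema lexical form.
    // A malformed value causes an IllegalArgumentException.
    static XMLGregorianCalendar parseCalendar(String lexical) {
        return factory().newXMLGregorianCalendar(lexical);
    }

    // Parses a duration such as "P5Y2M10DT15H" or "-P10D".
    static Duration parseDuration(String lexical) {
        return factory().newDuration(lexical);
    }

    public static void main(String[] args) {
        XMLGregorianCalendar stamp = parseCalendar("2002-05-30T09:30:10-06:00");
        // getXMLSchemaType() names the schema type the value maps to;
        // getTimezone() is the offset from UTC in minutes.
        System.out.println(stamp.getXMLSchemaType().getLocalPart()
            + ", offset in minutes: " + stamp.getTimezone());

        Duration period = parseDuration("P5Y2M10DT15H");
        System.out.println(period.getYears() + " years, " + period.getMonths()
            + " months, " + period.getDays() + " days, " + period.getHours() + " hours");

        // The sign is -1 for a negative duration.
        System.out.println("sign of -P10D: " + parseDuration("-P10D").getSign());
    }
}
```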
XSD Numeric Data Types
Decimal data types are used for numeric values.

Decimal Data Type
The decimal data type is used to specify a numeric value.

The following is an example of a decimal declaration in a schema:

An element in your document might look like this: 999.50

Or it might look like this: +999.5450

Or it might look like this: -999.5230

Or it might look like this: 0

Or it might look like this: 14

Note: Schema processors are required to support at least 18 decimal digits, so do not rely on more digits than that.

Integer Data Type
The integer data type is used to specify a numeric value without a fractional component.

The following is an example of an integer declaration in a schema:

An element in your document might look like this: 999

Or it might look like this: +999

Or it might look like this: -999

Or it might look like this: 0

Numeric Data Types
Note that all of the data types below derive from the Decimal data type (except for decimal itself)!

Name                 Description
byte                 A signed 8-bit integer
decimal              A decimal value
int                  A signed 32-bit integer
integer              An integer value
long                 A signed 64-bit integer
negativeInteger      An integer containing only negative values (..,-2,-1)
nonNegativeInteger   An integer containing only non-negative values (0,1,2,..)
nonPositiveInteger   An integer containing only non-positive values (..,-2,-1,0)
positiveInteger      An integer containing only positive values (1,2,..)
short                A signed 16-bit integer
unsignedLong         An unsigned 64-bit integer
unsignedInt          An unsigned 32-bit integer
unsignedShort        An unsigned 16-bit integer
unsignedByte         An unsigned 8-bit integer

Restrictions on Numeric Data Types
Restrictions that can be used with Numeric data types:
• enumeration
• fractionDigits
• maxExclusive
• maxInclusive
• minExclusive
• minInclusive
• pattern
• totalDigits
• whiteSpace
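One way to see what each numeric type accepts is to let a schema validator decide. The sketch below builds a one-element schema for a given built-in type and checks a value against it; the helper name "conforms" and the element name "value" are assumptions for this example, and only standard javax.xml.validation calls are used.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdTypeCheck {
    // Returns true when the text is a legal value of the given built-in
    // type, for example conforms("xs:integer", "999").
    static boolean conforms(String xsdType, String value) {
        String xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='value' type='" + xsdType + "'/>"
          + "</xs:schema>";
        String xml = "<value>" + value + "</value>";
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the value violates the declared type
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[][] checks = {
            {"xs:decimal", "+999.5450"},
            {"xs:integer", "999.50"},
            {"xs:byte", "128"},
            {"xs:unsignedByte", "255"},
            {"xs:negativeInteger", "0"},
        };
        for (String[] check : checks) {
            System.out.println(check[0] + " accepts " + check[1] + ": "
                + conforms(check[0], check[1]));
        }
    }
}
```

The same helper works for the other built-in types in this chapter, such as xs:boolean or xs:date, because the validator applies the type rules rather than any logic in the sketch itself.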
XSD Miscellaneous Data Types
Other miscellaneous data types are boolean, base64Binary, hexBinary, float, double, anyURI, QName, and NOTATION.

Boolean Data Type
The boolean data type is used to specify a true or false value.

The following is an example of a boolean declaration in a schema: <xs:attribute name="disabled" type="xs:boolean"/>

An element in your document might look like this: <prize disabled="true">999</prize>

Note: Legal values for boolean are true, false, 1 (which indicates true), and 0 (which indicates false).

Binary Data Types
Binary data types are used to express binary-formatted data. We have two binary data types:
• base64Binary (Base64-encoded binary data)
• hexBinary (hexadecimal-encoded binary data)

The following is an example of a hexBinary declaration in a schema:

AnyURI Data Type
The anyURI data type is used to specify a URI.

The following is an example of an anyURI declaration in a schema:

An element in your document might look like this:

Note: If a URI has spaces, replace them with %20.

Miscellaneous Data Types

Name
anyURI
base64Binary
boolean
double
float
hexBinary
NOTATION
QName

Restrictions on Miscellaneous Data Types
Restrictions that can be used with the other data types:
• enumeration (a Boolean data type cannot use this constraint)
• length (a Boolean data type cannot use this constraint)
• maxLength (a Boolean data type cannot use this constraint)
• minLength (a Boolean data type cannot use this constraint)
• pattern
• whiteSpace
XML Editors
If you are serious about XML, you will benefit from using a professional XML Editor.

XML is Text-based
XML is a text-based markup language. One great thing about XML is that XML files can be created and edited using a simple text editor like Notepad. However, when you start working with XML, you will soon find that it is better to edit XML documents using a professional XML editor.

Why Not Notepad?
Many web developers use Notepad to edit both HTML and XML documents because Notepad is included with the most common OS and it is simple to use. Personally I often use Notepad for quick editing of simple HTML, CSS, and XML files. But if you use Notepad for XML editing, you will soon run into problems. Notepad does not know that you are writing XML, so it will not be able to assist you.

Why an XML Editor?
Today XML is an important technology, and development projects use XML-based technologies like:
• XML Schema to define XML structures and data types
• XSLT to transform XML data
• SOAP to exchange XML data between applications
• WSDL to describe web services
• RDF to describe web resources
• XPath and XQuery to access XML data
• SMIL to define multimedia presentations

To be able to write error-free XML documents, you will need an intelligent XML editor!

XML Editors
Professional XML editors will help you to write error-free XML documents, validate your XML against a DTD or a schema, and force you to stick to a valid XML structure. An XML editor should be able to:
• Add closing tags to your opening tags automatically
• Force you to write valid XML
• Verify your XML against a DTD
• Verify your XML against a Schema
• Color code your XML syntax
XML Processing
• SAX (Simple API for XML): a low-level approach that views an XML document as a sequence of tags to which actions are assigned.
• DOM (Document Object Model): views a document as a hierarchy of elements.
• XSLT (Extensible Stylesheet Language Transformations): provides a template-oriented rather than procedural approach.
• JDOM (Java Document Object Model): a variant of DOM streamlined for Java.
• JAXB (Java Architecture for XML Binding): translates XML into Java classes.
import java.awt.*;
import java.io.File;
import javax.swing.*;
import javax.xml.parsers.*;
import org.xml.sax.Attributes;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class MySVGBrowser {
    public static void main(String[] args) {
        new WebPage(args[0]);
    }
}

// Paints each rect and circle element as the parser reports it.
class MyContentHandler extends DefaultHandler {
    private final Graphics g;

    MyContentHandler(Graphics g) {
        this.g = g;
    }

    // Integer.getInteger and Color.getColor look up system properties,
    // so the attribute text is parsed with parseInt and Color.decode instead.
    private static int intAttr(Attributes atts, String name) {
        return Integer.parseInt(atts.getValue(name));
    }

    @Override
    public void startElement(String namespace, String localName,
                             String qName, Attributes atts) {
        if (qName.equals("rect")) {
            // Color.decode expects a numeric form such as "#ff0000".
            g.setColor(Color.decode(atts.getValue("fill")));
            g.fillRect(intAttr(atts, "x"), intAttr(atts, "y"),
                       intAttr(atts, "width"), intAttr(atts, "height"));
        } else if (qName.equals("circle")) {
            g.setColor(Color.decode(atts.getValue("fill")));
            int r = intAttr(atts, "r");
            // The bounding box of the circle starts at (cx - r, cy - r).
            g.fillOval(intAttr(atts, "cx") - r, intAttr(atts, "cy") - r, 2 * r, 2 * r);
        }
    }
}

class WebPage extends JFrame {
    private final String fileName;

    WebPage(String fileName) {
        this.fileName = fileName;
        setSize(200, 200);
        setVisible(true);
    }

    // The file is parsed again on every repaint; each SAX event draws a shape.
    @Override
    public void paint(Graphics g) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser saxParser = factory.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            xmlReader.setContentHandler(new MyContentHandler(g));
            xmlReader.parse(new File(fileName).toURI().toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The painting component is similar to that in the following program.

import java.awt.*;
import javax.swing.*;

class JavaPaint {
    public static void main(String[] args) {
        new Pic();
    }
}

class Pic extends JFrame {
    Pic() {
        setSize(200, 200);
        setVisible(true);
    }

    @Override
    public void paint(Graphics g) {
        g.fillRect(0, 0, 50, 60);
    }
}

Specifications
• Entities and Unicode: data representation
• XML namespaces: mixed vocabularies
• DTDs, XML schemas, RELAX NG: structural constraints and data types
• XLinks, XPointers, XPath: linking and addressing
• CSS, XSL-FO: presentation of XML
• Web Accessibility
How SAX processing works
SAX analyzes an XML stream as it goes by, much like an old tickertape. Consider the following XML code snippet:

    <samples>
        <server>UNIX</server>
        <monitor>color</monitor>
    </samples>

A SAX processor analyzing this code snippet would generate, in general, the following events:

Start document
Start element (samples)
Characters (white space)
Start element (server)
Characters (UNIX)
End element (server)
Characters (white space)
Start element (monitor)
Characters (color)
End element (monitor)
Characters (white space)
End element (samples)

The SAX API allows a developer to capture these events and act on them. SAX processing involves the following steps:
* Create an event handler.
* Create the SAX parser.
* Assign the event handler to the parser.
* Parse the document, sending each event to the handler.

The pros and cons of event-based processing
The advantages of this kind of processing are much like the advantages of streaming media; analysis can get started immediately, rather than having to wait for all of the data to be processed. Also, because the application is simply examining the data as it goes by, it doesn't need to store it in memory. This is a huge advantage when it comes to large documents. In general, SAX is also much faster than the alternative, the Document Object Model.

On the other hand, because the application is not storing the data in any way, it is impossible to make changes to it using SAX, or to move "backward" in the data stream.

Presented by developerWorks, your source for great tutorials ibm.com/developerWorks Understanding SAX

DOM and tree-based processing
The Document Object Model, or DOM, is the "traditional" way of handling XML data. With DOM the data is loaded into memory in a tree-like structure.

For instance, the same document used as an example in the preceding panel would be represented as a tree of nodes. In the original figure, rectangular boxes represent element nodes and ovals represent text nodes.

DOM uses a root node and parent-child relationships. For instance, in this case, samples would be the root node with five children: three text nodes (the white space), and the two element nodes, server and monitor.

One important thing to realize is that the server and monitor nodes actually have values of null. Instead, they have text nodes for children, UNIX and color.

Pros and cons of tree-based processing
DOM, and by extension tree-based processing, has several advantages. First, because the tree is persistent in memory, it can be modified so an application can make changes to the data and the structure. It can also work its way up and down the tree at any time, as opposed to the "one-shot deal" of SAX. DOM can also be much simpler to use.

On the other hand, there is a lot of overhead involved in building these trees in memory. It's not unusual for large files to completely overrun a system's capacity. In addition, creating a DOM tree can be a very slow process.

How to choose between SAX and DOM
Whether you choose DOM or SAX is going to depend on several factors:
* Purpose of the application: If you are going to have to make changes to the data and output it as XML, then in most cases, DOM is the way to go. This is particularly true if the changes are to the data itself, as opposed to a simple structural change that can be accomplished with XSL transformations.
* Amount of data: For large files, SAX is a better bet.
* How the data will be used: If only a small amount of the data will actually be used, you may be better off using SAX to extract it into your application. On the other hand, if you know that you will need to refer back to information that has already been processed, SAX is probably not the right choice.
* The need for speed: SAX implementations are normally faster than DOM implementations.

It's important to remember that SAX and DOM are not mutually exclusive. You can use DOM to create a SAX stream of events, and you can use SAX to create a DOM tree. In fact, most parsers used to create DOM trees are actually using SAX to do it!

Disadvantages of SAX are:
* Easily forgets previous elements it worked on
* Not easy to re-order elements
* Cannot validate an XML document
* Cannot easily verify ID-REF links
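The two processing styles can be compared on the small samples document described earlier. The following Java sketch is an illustration only: the class and method names are assumptions, the document text is rebuilt from the event list above, and character data is buffered before being reported because a SAX parser is allowed to deliver one run of text in several pieces.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {
    static final String XML =
        "<samples>\n  <server>UNIX</server>\n  <monitor>color</monitor>\n</samples>";

    private static InputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // SAX: create a handler, create the parser, hand the handler to the
    // parser, and collect the events as the document streams past.
    static List<String> saxEvents() {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private void flushText() {
                if (text.length() > 0) {
                    String s = text.toString();
                    events.add("Characters ("
                        + (s.trim().isEmpty() ? "white space" : s) + ")");
                    text.setLength(0);
                }
            }
            @Override public void startDocument() {
                events.add("Start document");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                flushText();
                events.add("Start element (" + qName + ")");
            }
            @Override public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
            @Override public void endElement(String uri, String localName,
                                             String qName) {
                flushText();
                events.add("End element (" + qName + ")");
            }
            @Override public void endDocument() {
                events.add("End document");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(input(), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    // DOM: load the whole document into a tree and return its root element.
    static Element domRoot() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(input()).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        saxEvents().forEach(System.out::println);
        Element root = domRoot();
        System.out.println("children of samples: " + root.getChildNodes().getLength());
        // An element node has no value of its own; the text lives in a child node.
        System.out.println("value of server: "
            + root.getElementsByTagName("server").item(0).getNodeValue());
    }
}
```

The SAX half never holds more than the current run of text, while the DOM half can be revisited and modified after parsing, which is the trade-off the list above describes.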
DOM versus SAX parsing: Practical differences are the following

1. DOM APIs map the XML document into an internal tree structure and allow you to refer to the nodes and the elements in any way you want and as many times as you want. This usually means less programming and planning ahead, but it also means worse performance in terms of memory or CPU cycles.

2. SAX APIs, on the other hand, are event based, i.e. they traverse the XML document and allow you to trap the events as the parser passes through the document. You can trap the start of the document, the start of an element and the start of any characters within an element. This usually means more programming and planning on your part, but it is compensated by the fact that it takes less memory and fewer CPU cycles.

3. DOM performance may not be an issue if it is used in a batch environment, because the performance impact will be felt once and may be negligible compared to the rest of the batch process.

4. DOM performance may become an issue in an online transaction processing environment, because the performance impact will be felt for each and every transaction. It may not be negligible compared to the rest of the online processing, since such transactions are by nature short-lived processes.

5. Elapsed time difference in DOM vs SAX
An XML document 13 KB long with 2354 elements or tags. This message represents accounting G/L entries sent from one banking system to another.

Environment                          SAX version   DOM version
Windows 2000 running on a Pentium    1 sec         4 secs
IBM mainframe under CICS 1.3         2 secs        10 secs
IBM mainframe under CICS 2.2         1 sec         2 secs

The significant reduction under CICS 2.2 is due to the fact that the JVM is reusable and that it uses JDK 1.3 instead of JDK 1.1.
6. Examples of the difference in coding
Sample XML Document 3R Computer XML Help Page

PRESENTATION TECHNOLOGIES IN XML

The Extensible Stylesheet Language Family (XSL)
XSL is a family of recommendations for defining XML document transformation and presentation. It consists of three parts:

XSL Transformations (XSLT): a language for transforming XML
The XML Path Language (XPath): an expression language used by XSLT to access or refer to parts of an XML document. (XPath is also used by the XML Linking specification)
XSL Formatting Objects (XSL-FO): an XML vocabulary for specifying formatting semantics

An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO.

For a more detailed explanation of how XSL works, see the What Is XSL page. For background information on style sheets, see the Web style sheets resource page.

XSL is developed by the W3C XSL Working Group (members only) whose charter is to develop the next version of XSL. XSL is part of W3C's XML Activity, whose work is described in the XML Activity Statement.

Sample XSL file
Sample ShoXS from sample XSL file
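To make the transformation step concrete, here is a minimal Java sketch that applies an XSLT stylesheet with the standard javax.xml.transform API. The stylesheet, the note document with its to, from and heading elements, and the class and method names are all assumptions written for this illustration; they are not the sample files referred to above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical input document.
    static final String XML =
        "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading></note>";

    // Hypothetical stylesheet: one template turns a note into an XHTML-style
    // paragraph. XPath expressions (heading, from, to) select the parts.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/note'>"
      + "<p><b><xsl:value-of select='heading'/></b>"
      + " from <xsl:value-of select='from'/>"
      + " to <xsl:value-of select='to'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML));
    }
}
```

The Java code contains no formatting logic of its own; changing the presentation means editing the stylesheet, which is the template-oriented approach the section describes.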
XFORMS
XForms is the next generation of HTML forms. XForms uses XML to create input forms on the Web.

From XForms 1.1: XForms is an XML application that represents the next generation of forms for the Web. XForms is not a free-standing document type, but is intended to be integrated into other markup languages, such as XHTML, ODF or SVG. An XForms-based web form gathers and processes XML data using an architecture that separates presentation, purpose and content. The underlying data of a form is organized into instances of data schema (though formal schema definitions are not required). An XForm allows processing of data to occur using three mechanisms:
• a declarative model composed of formulae for data calculations and constraints, data type and other property declarations, and data submission parameters
• a view layer composed of intent-based user interface controls
• an imperative controller for orchestrating data manipulations, interactions between the model and view layers, and data submissions.
What Is XForms?
• XForms is the next generation of HTML forms
• XForms is richer and more flexible than HTML forms
• XForms was designed to be the forms standard in XHTML 2.0
• XForms is platform and device independent
• XForms separates data and logic from presentation
• XForms uses XML to define form data
• XForms stores and transports data in XML documents
• XForms contains features like calculations and validations of forms
• XForms reduces or eliminates the need for scripting
• XForms is a W3C Recommendation
XForms Is The Successor Of HTML Forms
Forms are an important part of many web applications today. An HTML form makes it possible for web applications to accept input from a user. Today, ten years after HTML forms became a part of the HTML standard, web users do complex transactions that are starting to exceed the limitations of standard HTML forms. XForms provides a richer, more secure, and device independent way of handling web input. We should expect future web solutions to demand the use of XForms-enabled browsers (all future browsers should support XForms).

XForms Separate Data From Presentation
XForms uses XML for data definition and HTML or XHTML for data display. XForms separates the data logic of a form from its presentation. This way the XForms data can be defined independently of how the end-user will interact with the application.

XForms Uses XML To Define Form Data
With XForms, the rules for describing and validating data are expressed in XML.

XForms Uses XML To Store And Transport Data
With XForms, the data displayed in a form is stored in an XML document, and the data submitted from the form is transported over the internet using XML. The data content is coded in, and transported as, Unicode bytes.

XForms Is Device Independent
Separating data from presentation makes XForms device independent, because the data model can be used for all devices. The presentation can be customized for different user interfaces, like mobile phones, handheld devices, and Braille readers for the blind. Since XForms is device independent and based on XML, it is also possible to add XForms elements directly into other XML applications like VoiceXML (speaking web data), WML (Wireless Markup Language), and SVG (Scalable Vector Graphics).

The XForms Framework
The purpose of an HTML form is to collect data. XForms has the same purpose. With XForms, input data is described in two different parts:
• The XForms model: defines what the form is, what it should do, and what data it contains
• The XForms user interface: defines the input fields and how they should be displayed
The XForms Model
The XForms model describes the data. The XForms model defines a data model inside a model element:

In the example above, the XForms model uses an instance element to define the XML template for the data to be collected, and a submission element to describe how to submit the data.

Note: The XForms model does not say anything about the visual part of the form (the user interface).

XForms Namespace
If you are missing the XForms namespace in these examples, or if you don't know what a namespace is, it will be introduced in the next chapter.

The instance Element
The instance element defines the data to be collected. XForms is always collecting data for an XML document. The instance element in the XForms model defines the XML document. In the example above the "data instance" (the XML document) the form is collecting data for looks like this:

After collecting the data, the XML document might look like this: John Smith

The submission Element
The submission element describes how to submit the data. The submission element defines a form and how it should be submitted. In the example above, the id="form1" attribute identifies a form, the action="submit.asp" attribute defines the URL to which the form should be submitted, and the method="get" attribute defines the method to use when submitting the form data.

The XForms User Interface
The XForms user interface defines the input fields and how they should be displayed. The user interface elements are called controls (or input controls):

First Name Last Name Submit

In the example above the two <input> elements define two input fields. The ref="fname" and ref="lname" attributes point to the <fname> and <lname> elements in the XForms model. The <submit> element has a submission="form1" attribute which refers to the <submission> element in the XForms model. A submit element is usually displayed as a button.

Notice the <label> elements in the example. With XForms every input control element has a required <label> element.

XForms Example
You can test XForms with Internet Explorer (XForms will not work in IE prior to version 5).

Example
First Name Last Name Submit
HTML/XHTML Forms and XForms

Function: Validation and Calculation
    HTML/XHTML Forms: Heavy reliance on scripting languages, both client-side (JavaScript) and server-side
    XForms: XPath, W3C XML Schema
Function: User Feedback
    HTML/XHTML Forms: Scripting languages
    XForms: XML form model
Function: Initializing Data
    HTML/XHTML Forms: Server-side process to dynamically generate form
    XForms: XML instance data
Function: Data Representation
    HTML/XHTML Forms: name=value pairs
Function: Host Language
    XForms: XHTML, XHTML Mobile Profile, SVG, etc.
Using XForms
• Browser
    o Native
    o Plugin
    o JavaScript
• XForms Player
• Server-side processing to XHTML/JavaScript/Ajax
HTML Forms
What Are e XForms? Tradition nal HTML We eb forms don't separatee the purposee from the p presentation of a form. XForms, in contrast, are comprissed of separaate sections that describ be what the form does, aand how the form looks. This allows for flexible p presentation n options, including classsic XHTML forms, to o be attached to an XMLL form definition. The following illustraates how a single device‐independen nt XML form m definition, called the XForms M Model, has tthe capabilitty to work w with a varietyy of standard d or propriettary user interfacees:
The XForrms User Intterface provides a standard set of visual controls that are taargeted towaard replacingg today's XHTTML form co ontrols. Thesse form conttrols are direectly usable inside XHTM ML and otheer XML documents, like SSVG. Other ggroups, such h as the Voicce Browser W Working Group, may also o independen ntly develop p user interfaace components for XFo orms. An imporrtant concep pt in XFormss is that form ms collect data, which is expressed aas XML instance data. Am mong other d duties, the XForms Modeel describes the structurre of the insttance data. TThis is importtant, since likke XML, form ms represent a structureed interchan nge of data. W Workflow, aauto‐ fill, and p pre‐fill form applicationss are supporrted through h the use of iinstance datta. Finally, th here needs tto be a chan nnel for instaance data to flow to and d from the XFForms Proceessor. For this, the XForms Submit Prottocol definees how XForm ms send and d receive datta, including the o suspend an nd resume th he completio on of a form. ability to The following illustraation summaarizes the main aspects o of XForms:
Key Goals of XForms
• Support for structured form data
• Advanced forms logic without server round‐tripping
• Dynamic access to server data sources during form execution
• Decoupled data, logic and presentation
• Seamless integration with other XML tag sets
• Richer user interface to meet the needs of business, consumer and device control applications
• Support for handheld, television, and desktop browsers, plus printers and scanners
• Improved internationalization and accessibility
• Multiple forms per page, and pages per form
• Suspend and Resume capabilities
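The separation between what a form does (the XForms Model with its instance data) and how it looks (the user interface controls) can be made concrete with a small sketch. The element names below (model, instance, submission, input, label, submit) are XForms 1.0 elements; the order structure, the id "send" and the example.org URL are assumptions invented for this illustration, and the Python helper only inspects the document, it is not an XForms processor.

```python
import xml.etree.ElementTree as ET

XF = "http://www.w3.org/2002/xforms"  # XForms namespace

# Hypothetical form: the instance structure, ids and URL are made up.
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms">
  <head>
    <xf:model>
      <xf:instance>
        <order xmlns="">
          <name>Tove</name>
          <drink>tea</drink>
        </order>
      </xf:instance>
      <xf:submission id="send" method="post"
                     action="http://example.org/orders"/>
    </xf:model>
  </head>
  <body>
    <xf:input ref="name"><xf:label>Name</xf:label></xf:input>
    <xf:input ref="drink"><xf:label>Drink</xf:label></xf:input>
    <xf:submit submission="send"><xf:label>Send</xf:label></xf:submit>
  </body>
</html>
"""


def describe_form(xml_text):
    """Return (instance data, input controls) found in an XForms document."""
    root = ET.fromstring(xml_text)
    # The model holds the data: the first child of <xf:instance> is its root.
    instance = root.find(f".//{{{XF}}}instance")
    data_root = instance[0]
    data = {child.tag: child.text for child in data_root}
    # The user interface only points at the data through "ref" bindings.
    controls = [(ctrl.get("ref"), ctrl.find(f"{{{XF}}}label").text)
                for ctrl in root.iter(f"{{{XF}}}input")]
    return data, controls


form_data, form_controls = describe_form(DOC)
print(form_data)
print(form_controls)
```

Because the controls carry only a binding and a label, the same model could in principle be paired with a different set of controls (for example, voice components) without touching the instance data.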
1. What is XHTML?
This section is informative.
XHTML is a family of current and future document types and modules that reproduce, subset, and extend HTML 4 [HTML4]. XHTML family document types are XML based, and ultimately are designed to work in conjunction with XML‐based user agents. The details of this family and its evolution are discussed in more detail in [XHTMLMOD].
XHTML 1.0 (this specification) is the first document type in the XHTML family. It is a reformulation of the three HTML 4 document types as applications of XML 1.0 [XML]. It is intended to be used as a language for content that is both XML‐conforming and, if some simple guidelines are followed, operates in HTML 4 conforming user agents. Developers who migrate their content to XHTML 1.0 will realize the following benefits:
• XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools.
• XHTML documents can be written to operate as well or better than they did before in existing HTML 4‐conforming user agents as well as in new, XHTML 1.0 conforming user agents.
• XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model [DOM].
• As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments.
The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility.

1.1. What is HTML 4?
HTML 4 [HTML4] is an SGML (Standard Generalized Markup Language) application conforming to International Standard ISO 8879, and is widely regarded as the standard publishing language of the World Wide Web.
SGML is a language for describing markup languages, particularly those used in electronic document exchange, document management, and document publishing. HTML is an example of a language defined in SGML.
SGML has been around since the middle 1980's and has remained quite stable. Much of this stability stems from the fact that the language is both feature‐rich and flexible. This flexibility, however, comes at a price, and that price is a level of complexity that has inhibited its adoption in a diversity of environments, including the World Wide Web.
HTML, as originally conceived, was to be a language for the exchange of scientific and other technical documents, suitable for use by non‐document specialists. HTML addressed the problem of SGML complexity by specifying a small set of structural and semantic tags suitable for authoring relatively simple documents. In addition to simplifying the document structure, HTML added support for hypertext. Multimedia capabilities were added later.
In a remarkably short space of time, HTML became wildly popular and rapidly outgrew its original purpose. Since HTML's inception, there has been rapid invention of new elements for use within HTML (as a standard) and for adapting HTML to vertical, highly specialized, markets. This plethora of new elements has led to interoperability problems for documents across different platforms.

1.2. What is XML?
XML™ is the shorthand name for Extensible Markup Language [XML].
XML was conceived as a means of regaining the power and flexibility of SGML without most of its complexity. Although a restricted form of SGML, XML nonetheless preserves most of SGML's power and richness, and yet still retains all of SGML's commonly used features.
While retaining these beneficial features, XML removes many of the more complex features of SGML that make the authoring and design of suitable software both difficult and costly.

1.3. Why the need for XHTML?
The benefits of migrating to XHTML 1.0 are described above. Some of the benefits of migrating to XHTML in general are:
• Document developers and user agent designers are constantly discovering new ways to express their ideas through new markup. In XML, it is relatively easy to introduce new elements or additional element attributes. The XHTML family is designed to accommodate these extensions through XHTML modules and techniques for developing new XHTML‐conforming modules (described in the XHTML Modularization specification). These modules will permit the combination of existing and new feature sets when developing content and when designing new user agents.
• Alternate ways of accessing the Internet are constantly being introduced. The XHTML family is designed with general user agent interoperability in mind. Through a new user agent and document profiling mechanism, servers, proxies, and user agents will be able to perform best effort content transformation. Ultimately, it will be possible to develop XHTML‐conforming content that is usable by any XHTML‐conforming user agent.
2. Definitions
This section is normative.

2.1. Terminology
The following terms are used in this specification. These terms extend the definitions in [RFC2119] in ways based upon similar definitions in ISO/IEC 9945‐1:1990 [POSIX.1]:
May
With respect to implementations, the word "may" is to be interpreted as an optional feature that is not required in this specification but can be provided. With respect to Document Conformance, the word "may" means that the optional feature must not be used. The term "optional" has the same definition as "may".
Must
In this specification, the word "must" is to be interpreted as a mandatory requirement on the implementation or on Strictly Conforming XHTML Documents, depending upon the context. The term "shall" has the same definition as "must".
Optional
See "May".
Reserved
A value or behavior is unspecified, but it is not allowed to be used by Conforming Documents nor to be supported by Conforming User Agents.
Shall
See "Must".
Should
With respect to implementations, the word "should" is to be interpreted as an implementation recommendation, but not a requirement. With respect to documents, the word "should" is to be interpreted as recommended programming practice for documents and a requirement for Strictly Conforming XHTML Documents.
Supported
Certain facilities in this specification are optional. If a facility is supported, it behaves as specified by this specification.
Unspecified
When a value or behavior is unspecified, the specification defines no portability requirements for a facility on an implementation even when faced with a document that uses the facility. A document that requires specific behavior in such an instance, rather than tolerating any behavior when using that facility, is not a Strictly Conforming XHTML Document.

2.2. General Terms
Attribute
An attribute is a parameter to an element declared in the DTD. An attribute's type and value range, including a possible default value, are defined in the DTD.
DTD
A DTD, or document type definition, is a collection of XML markup declarations that, as a collection, defines the legal structure, elements, and attributes that are available for use in a document that complies to the DTD.
Document
A document is a stream of data that, after being combined with any other streams it references, is structured such that it holds information contained within elements that are organized as defined in the associated DTD. See Document Conformance for more information.
Element
An element is a document structuring unit declared in the DTD. The element's content model is defined in the DTD, and additional semantics may be defined in the prose description of the element.
Facilities
Facilities are elements, attributes, and the semantics associated with those elements and attributes.
Implementation
See User Agent.
Parsing
Parsing is the act whereby a document is scanned, and the information contained within the document is filtered into the context of the elements in which the information is structured.
Rendering
Rendering is the act whereby the information in a document is presented. This presentation is done in the form most appropriate to the environment (e.g. aurally, visually, in print).
User Agent
A user agent is a system that processes XHTML documents in accordance with this specification. See User Agent Conformance for more information.
Validation
Validation is a process whereby documents are verified against the associated DTD, ensuring that the structure, use of elements, and use of attributes are consistent with the definitions in the DTD.
Well‐formed
A document is well‐formed when it is structured according to the rules defined in Section 2.1 of the XML 1.0 Recommendation [XML].

3. Normative Definition of XHTML 1.0
This section is normative.

3.1. Document Conformance
This version of XHTML provides a definition of strictly conforming XHTML 1.0 documents, which are restricted to elements and attributes from the XML and XHTML 1.0 namespaces. See Section 3.1.2 for information on using XHTML with other namespaces, for instance, to include metadata expressed in RDF within XHTML documents.

3.1.1. Strictly Conforming Documents
A Strictly Conforming XHTML Document is an XML document that requires only the facilities described as mandatory in this specification. Such a document must meet all of the following criteria:
1. It must conform to the constraints expressed in one of the three DTDs found in DTDs and in Appendix B.
2. The root element of the document must be html.
3. The root element of the document must contain an xmlns declaration for the XHTML namespace [XMLNS]. The namespace for XHTML is defined to be http://www.w3.org/1999/xhtml. An example root element might look like:
     <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4. There must be a DOCTYPE declaration in the document prior to the root element. The public identifier included in the DOCTYPE declaration must reference one of the three DTDs found in DTDs using the respective Formal Public Identifier. The system identifier may be changed to reflect local system conventions.
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
5. The DTD subset must not be used to override any parameter entities in the DTD.
An XML declaration is not required in all XML documents; however XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required when the character encoding of the document is other than the default UTF‐8 or UTF‐16 and no encoding was determined by a higher‐level protocol. Here is an example of an XHTML document. In this example, the XML declaration is included.
     <?xml version="1.0" encoding="UTF-8"?>
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
     <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
       <head>
         <title>Virtual Library</title>
       </head>
       <body>
         <p>Moved to <a href="http://example.org/">example.org</a>.</p>
       </body>
     </html>

3.1.2. Using XHTML with other namespaces
The XHTML namespace may be used with other XML namespaces as per [XMLNS], although such documents are not strictly conforming XHTML 1.0 documents as defined above. Work by W3C is addressing ways to specify conformance for documents involving multiple namespaces. For an example, see [XHTML+MathML].
The following example shows the way in which XHTML 1.0 could be used in conjunction with the MathML Recommendation:
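     <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
       <head>
         <title>A Math Example</title>
       </head>
       <body>
         <p>The following is MathML markup:</p>
         <math xmlns="http://www.w3.org/1998/Math/MathML">
           <apply> <log/>
             <logbase>
               <cn> 3 </cn>
             </logbase>
             <ci> x </ci>
           </apply>
         </math>
       </body>
     </html>
The following example shows the way in which XHTML 1.0 markup could be incorporated into another XML namespace:
     <?xml version="1.0" encoding="UTF-8"?>
     <!-- initially, the default namespace is "books" -->
     <book xmlns='urn:loc.gov:books'
         xmlns:isbn='urn:ISBN:0-395-36341-6' xml:lang="en" lang="en">
       <title>Cheaper by the Dozen</title>
       <isbn:number>1568491379</isbn:number>
       <notes>
         <!-- make HTML the default namespace for a hypertext commentary -->
         <p xmlns='http://www.w3.org/1999/xhtml'>
           This is also available <a href="http://www.w3.org/">online</a>.
         </p>
       </notes>
     </book>

3.2. User Agent Conformance
A conforming user agent must meet all of the following criteria:
1. In order to be consistent with the XML 1.0 Recommendation [XML], the user agent must parse and evaluate an XHTML document for well‐formedness. If the user agent claims to be a validating user agent, it must also validate documents against their referenced DTDs according to [XML].
2. When the user agent claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.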
3. When a user agent processes an XHTML document as generic XML, it shall only recognize attributes of type ID (i.e. the id attribute on most XHTML elements) as fragment identifiers.
4. If a user agent encounters an element it does not recognize, it must process the element's content.
5. If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).
6. If a user agent encounters an attribute value it does not recognize, it must use the default attribute value.
7. If it encounters an entity reference (other than one of the entities defined in this recommendation or in the XML recommendation) for which the user agent has processed no declaration (which could happen if the declaration is in the external subset which the user agent hasn't read), the entity reference should be processed as the characters (starting with the ampersand and ending with the semi‐colon) that make up the entity reference.
8. When processing content, user agents that encounter characters or character entity references that are recognized but not renderable may substitute another rendering that gives the same meaning, or must display the document in such a way that it is obvious to the user that normal rendering has not taken place.
9. White space is handled according to the following rules. The following characters are defined in [XML] as white space characters:
   o SPACE (&#x0020;)
   o HORIZONTAL TABULATION (&#x0009;)
   o CARRIAGE RETURN (&#x000D;)
   o LINE FEED (&#x000A;)
   The XML processor normalizes different systems' line end codes into one single LINE FEED character, that is passed up to the application.
   The user agent must use the definition from CSS for processing whitespace characters [CSS2]. Note that the CSS2 recommendation does not explicitly address the issue of whitespace handling in non‐Latin character sets. This will be addressed in a future version of CSS, at which time this reference will be updated.
Note that in order to produce a Canonical XHTML document, the rules above must be applied and the rules in [XMLC14N] must also be applied to the document.

4. Differences with HTML 4
This section is informative.
Due to the fact that XHTML is an XML application, certain practices that were perfectly legal in SGML‐based HTML 4 [HTML4] must be changed.
4.1. Documents must be well‐formed
Well‐formedness is a new concept introduced by [XML]. Essentially this means that all elements must either have closing tags or be written in a special form (as described below), and that all the elements must nest properly.
Although overlapping is illegal in SGML, it is widely tolerated in existing browsers.
CORRECT: nested elements.
     <p>here is an emphasized <em>paragraph</em>.</p>
INCORRECT: overlapping elements
     <p>here is an emphasized <em>paragraph.</p></em>

4.2. Element and attribute names must be in lower case
XHTML documents must use lower case for all HTML element and attribute names. This difference is necessary because XML is case‐sensitive, e.g. <li> and <LI> are different tags.

4.3. For non‐empty elements, end tags are required
In SGML‐based HTML 4 certain elements were permitted to omit the end tag, with the elements that followed implying closure. XML does not allow end tags to be omitted. All elements other than those declared in the DTD as EMPTY must have an end tag. Elements that are declared in the DTD as EMPTY can have an end tag or can use empty element shorthand (see Empty Elements).
CORRECT: terminated elements
     <p>here is a paragraph.</p><p>here is another paragraph.</p>
INCORRECT: unterminated elements
     <p>here is a paragraph.<p>here is another paragraph.

4.4. Attribute values must always be quoted
All attribute values must be quoted, even those which appear to be numeric.
CORRECT: quoted attribute values
     <td rowspan="3">
INCORRECT: unquoted attribute values
     <td rowspan=3>
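The rules in Sections 4.1 to 4.4 are well‐formedness constraints, so any XML parser will reject markup that breaks them. The sketch below uses Python's standard xml.etree.ElementTree parser for this; the is_well_formed helper and the sample fragments are illustrative assumptions, and the wrapping div in two of the fragments is there only because a parsed fragment needs a single root element. A well‐formedness check says nothing about validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments, one correct and one incorrect per rule.
EXAMPLES = {
    "nested elements":
        "<p>here is an emphasized <em>paragraph</em>.</p>",
    "overlapping elements":
        "<p>here is an emphasized <em>paragraph.</p></em>",
    "matching case":
        "<li>item</li>",
    "mismatched case":
        "<li>item</LI>",
    "terminated elements":
        "<div><p>here is a paragraph.</p><p>here is another paragraph.</p></div>",
    "unterminated elements":
        "<div><p>here is a paragraph.<p>here is another paragraph.</div>",
    "quoted attribute values":
        '<td rowspan="3"></td>',
    "unquoted attribute values":
        "<td rowspan=3></td>",
}

results = {label: is_well_formed(markup) for label, markup in EXAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

An HTML 4 browser would typically render all eight fragments; the XML parser accepts only the four correct ones.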
4.5. Attribute Minimization
XML does not support attribute minimization. Attribute‐value pairs must be written in full. Attribute names such as compact and checked cannot occur in elements without their value being specified.
CORRECT: unminimized attributes
     <dl compact="compact">
INCORRECT: minimized attributes
     <dl compact>

4.6. Empty Elements
Empty elements must either have an end tag or the start tag must end with />. For instance, <br/> or <hr></hr>. See HTML Compatibility Guidelines for information on ways to ensure this is backward compatible with HTML 4 user agents.
CORRECT: terminated empty elements
     <br/><hr/>
INCORRECT: unterminated empty elements
     <br><hr>

4.7. White Space handling in attribute values
When user agents process attributes, they do so according to Section 3.3.3 of [XML]:
• Strip leading and trailing white space.
• Map sequences of one or more white space characters (including line breaks) to a single inter‐word space.
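The two rules of Section 4.7 can be written as a short function. This is a minimal sketch that implements only those two rules as stated; the function name is an assumption made for the illustration, and white space is taken to mean the four XML white space characters (space, tab, carriage return, line feed) rather than every Unicode space.

```python
import re

# XML white space: SPACE, HORIZONTAL TABULATION, CARRIAGE RETURN, LINE FEED.
XML_WHITE_SPACE = " \t\r\n"


def normalize_attribute_value(value):
    """Apply the two white space rules of Section 4.7 to an attribute value."""
    # Rule 1: strip leading and trailing white space.
    stripped = value.strip(XML_WHITE_SPACE)
    # Rule 2: map each run of white space (including line breaks)
    # to a single inter-word space.
    return re.sub(r"[ \t\r\n]+", " ", stripped)


normalized = normalize_attribute_value("  one\n   two\tthree ")
print(repr(normalized))
```

A practical consequence is that authors should not rely on line breaks or repeated spaces inside attribute values surviving processing.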
4.8. Script and Style elements
In XHTML, the script and style elements are declared as having #PCDATA content. As a result, [...]

An internal stylesheet example
code { color: green; font-family: monospace; font-weight: bold; }
This is text that uses our internal stylesheet.

C.15. White Space Characters in HTML vs. XML
Some characters that are legal in HTML documents are illegal in XML documents. For example, in HTML the form feed character (U+000C) is treated as white space; in XHTML, due to XML's definition of characters, it is illegal.

C.16. The Named Character Reference &apos;
The named character reference &apos; (the apostrophe, U+0027) was introduced in XML 1.0 but does not appear in HTML. Authors should therefore use &#39; instead of &apos; to work as expected in HTML 4 user agents.

1. Overview
This document defines VoiceXML, the Voice Extensible Markup Language. Its background, basic concepts and use are presented in Section 1. The dialog constructs of form, menu and link, and the mechanism (Form Interpretation Algorithm) by which they are interpreted are then introduced in Section 2. User input using DTMF and speech grammars is covered in Section 3, while Section 4 covers system output using speech synthesis and recorded audio. Mechanisms for manipulating dialog control flow, including variables, events, and executable elements, are explained in Section 5. Environment features such as parameters and properties as well as resource handling are specified in Section 6. The appendices provide additional information including the VoiceXML Schema, a detailed specification of the Form Interpretation Algorithm and timing, audio file formats, and statements relating to conformance, internationalization, accessibility and privacy.
VoiceXML originated in 1995 as an XML‐based dialog design language intended to simplify the speech recognition application development process within an AT&T project called Phone Markup Language (PML). As AT&T reorganized, teams at AT&T, Lucent and Motorola continued working on their own PML‐like languages.
In 1998, W3C hosted a conference on voice browsers. By this time, AT&T and Lucent had different variants of their original PML, while Motorola had developed VoxML, and IBM was developing its own SpeechML. Many other attendees at the conference were also developing similar languages for dialog design, for example HP's TalkML and PipeBeach's VoiceHTML.
The VoiceXML Forum was then formed by AT&T, IBM, Lucent, and Motorola to pool their efforts. The mission of the VoiceXML Forum was to define a standard dialog design language that developers could use to build conversational applications. They chose XML as the basis for this effort because it was clear to them that this was the direction technology was going.
In 2000, the VoiceXML Forum released VoiceXML 1.0 to the public. Shortly thereafter, VoiceXML 1.0 was submitted to the W3C as the basis for the creation of a new international standard. VoiceXML 2.0 is the result of this work based on input from W3C Member companies, other W3C Working Groups, and the public.
Developers familiar with VoiceXML 1.0 are particularly directed to Changes from Previous Public Version which summarizes how VoiceXML 2.0 differs from VoiceXML 1.0.

1.1 Introduction
VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations. Its major goal is to bring the advantages of Web‐based development and content delivery to interactive voice response applications.
Here are two short examples of VoiceXML. The first is the venerable "Hello World":
     <?xml version="1.0" encoding="UTF-8"?>
     <vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
       <form>
         <block>Hello World!</block>
       </form>
     </vxml>
The top‐level element is <vxml>, which is mainly a container for dialogs. There are two types of dialogs: forms and menus. Forms present information and gather input; menus offer choices of what to do next. This example has a single form, which contains a block that synthesizes and presents "Hello World!" to the user. Since the form does not specify a successor dialog, the conversation ends.
Our second example asks the user for a choice of drink and then submits it to a server script:
     <?xml version="1.0" encoding="UTF-8"?>
     <vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
       <form>
         <field name="drink">
           <prompt>Would you like coffee, tea, milk, or nothing?</prompt>
           <grammar src="drink.grxml" type="application/srgs+xml"/>
         </field>
         <block>
           <submit next="http://www.drink.example.com/drink2.asp"/>
         </block>
       </form>
     </vxml>
A field is an input field. The user must provide a value for the field before proceeding to the next element in the form. A sample interaction is:
     C (computer): Would you like coffee, tea, milk, or nothing?
     H (human): Orange juice.
     C: I did not understand what you said. (a platform‐specific default message.)
     C: Would you like coffee, tea, milk, or nothing?
     H: Tea
     C: (continues in document drink2.asp)
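The sample interaction can be mimicked in a few lines of Python to show why the prompt is repeated: the field keeps asking until an utterance matches its grammar. This is only a toy sketch under stated assumptions, not a VoiceXML interpreter; the run_field function, the set standing in for the drink grammar, and the hard‐coded no‐match message are all invented for the illustration.

```python
def run_field(prompt, accepted, utterances,
              nomatch_message="I did not understand what you said."):
    """Replay utterances against one field; return (value, transcript).

    'accepted' stands in for the field's grammar. The value is None if
    no utterance matched before the input ran out.
    """
    transcript = []
    for said in utterances:
        transcript.append(("C", prompt))
        transcript.append(("H", said))
        answer = said.strip().lower()
        if answer in accepted:
            return answer, transcript  # field filled, the form can go on
        # No match: report it and prompt again on the next loop.
        transcript.append(("C", nomatch_message))
    return None, transcript


DRINKS = {"coffee", "tea", "milk", "nothing"}
value, log = run_field("Would you like coffee, tea, milk, or nothing?",
                       DRINKS, ["Orange juice.", "Tea"])
for speaker, text in log:
    print(f"{speaker}: {text}")
print("field value:", value)
```

Once the field holds a value, the form would move on to its next item, which in the second example is the block that submits the result to the server script.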
1.2 Background
This section contains a high‐level architectural model, whose terminology is then used to describe the goals of VoiceXML, its scope, its design principles, and the requirements it places on the systems that support it.

1.2.1 Architectural Model
The architectural model assumed by this document has the following components:
Figure 1: Architectural Model
A document server (e.g. a Web server) processes requests from a client application, the VoiceXML Interpreter, through the VoiceXML interpreter context. The server produces VoiceXML documents in reply, which are processed by the VoiceXML interpreter. The VoiceXML interpreter context may monitor user inputs in parallel with the VoiceXML interpreter. For example, one VoiceXML interpreter context may always listen for a special escape phrase that takes the user to a high‐level personal assistant, and another may listen for escape phrases that alter user preferences like volume or text‐to‐speech characteristics. The implementation platform is controlled by the VoiceXML interpreter context and by the VoiceXML interpreter. For instance, in an interactive voice response application, the VoiceXML interpreter context may be responsible for detecting an incoming call, acquiring the initial VoiceXML document, and answering the call, while the VoiceXML interpreter conducts the dialog after answer. The implementation platform generates events in response to user actions (e.g. spoken or character input received, disconnect) and system events (e.g. timer expiration). Some of these events are acted upon by the VoiceXML interpreter itself, as specified by the VoiceXML document, while others are acted upon by the VoiceXML interpreter context.
1.2.2 Goals of VoiceXML
VoiceXML's main goal is to bring the full power of Web development and content delivery to voice response applications, and to free the authors of such applications from low‐level programming and resource management. It enables integration of voice services with data services using the familiar client‐server paradigm. A voice service is viewed as a sequence of interaction dialogs between a user and an implementation platform. The dialogs are provided by document servers, which may be external to the implementation platform. Document servers maintain overall service logic, perform database and legacy system operations, and produce dialogs. A VoiceXML document specifies each interaction dialog to be conducted by a VoiceXML interpreter. User input affects dialog interpretation and is collected into requests submitted to a document server. The document server replies with another VoiceXML document to continue the user's session with other dialogs.
VoiceXML is a markup language that:
• Minimizes client/server interactions by specifying multiple interactions per document.
• Shields application authors from low‐level, and platform‐specific details.
• Separates user interaction code (in VoiceXML) from service logic (e.g. CGI scripts).
• Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers, and platform providers.
• Is easy to use for simple interactions, and yet provides language features to support complex dialogs.
While VoiceXML strives to accommodate the requirements of a majority of voice response services, services with stringent requirements may best be served by dedicated applications that employ a finer level of control.

1.2.3 Scope of VoiceXML
The language describes the human‐machine interaction provided by voice response systems, which includes:
• Output of synthesized speech (text‐to‐speech).
• Output of audio files.
• Recognition of spoken input.
• Recognition of DTMF input.
• Recording of spoken input.
• Control of dialog flow.
• Telephony features such as call transfer and disconnect.
The language provides means for collecting character and/or spoken input, assigning the input results to document‐defined request variables, and making decisions that affect the interpretation of documents written in the language. A document may be linked to other documents through Universal Resource Identifiers (URIs).
1.2.4 Principles of Design
VoiceXML is an XML application [XML].
1. The language promotes portability of services through abstraction of platform resources.
2. The language accommodates platform diversity in supported audio file formats, speech grammar formats, and URI schemes. While producers of platforms may support various grammar formats, the language requires a common grammar format, namely the XML Form of the W3C Speech Recognition Grammar Specification [SRGS], to facilitate interoperability. Similarly, while various audio formats for playback and recording may be supported, the audio formats described in Appendix E must be supported.
3. The language supports ease of authoring for common types of interactions.
4. The language has well‐defined semantics that preserves the author's intent regarding the behavior of interactions with the user. Client heuristics are not required to determine document element interpretation.
5. The language recognizes semantic interpretations from grammars and makes this information available to the application.
6. The language has a control flow mechanism.
7. The language enables a separation of service logic from interaction behavior.
8. It is not intended for intensive computation, database operations, or legacy system operations. These are assumed to be handled by resources outside the document interpreter, e.g. a document server.
9. General service logic, state management, dialog generation, and dialog sequencing are assumed to reside outside the document interpreter.
10. The language provides ways to link documents using URIs, and also to submit data to server scripts using URIs.
11. VoiceXML provides ways to identify exactly which data to submit to the server, and which HTTP method (GET or POST) to use in the submittal.
12. The language does not require document authors to explicitly allocate and deallocate dialog resources, or deal with concurrency. Resource allocation and concurrent threads of control are to be handled by the implementation platform.

1.2.5 Implementation Platform Requirements
This section outlines the requirements on the hardware/software platforms that will support a VoiceXML interpreter.
Document acquisition. The interpreter context is expected to acquire documents for the VoiceXML interpreter to act on. The "http" URI scheme must be supported. In some cases, the document request is generated by the interpretation of a VoiceXML document, while other requests are generated by the interpreter context in response to events outside the scope of the language, for example an incoming phone call. When issuing document requests via http, the interpreter context identifies itself using the "User‐Agent" header variable with the value "<name>/<version>", for example, "acme‐browser/1.2".
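The document acquisition requirement amounts to an ordinary HTTP request carrying a User‐Agent header of the form name/version. The sketch below builds such a request with Python's standard urllib.request module without sending it; the build_document_request helper and the example.org URL are assumptions made for the illustration, and a real interpreter context would of course use its own HTTP stack.

```python
import urllib.request


def build_document_request(url, name, version):
    """Build (but do not send) an http request for a VoiceXML document."""
    # The interpreter context identifies itself as "<name>/<version>".
    headers = {"User-Agent": f"{name}/{version}"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical document URL; "acme-browser/1.2" is the example from the text.
req = build_document_request("http://example.org/start.vxml",
                             "acme-browser", "1.2")
# urllib stores header names capitalized as "User-agent".
print(req.get_method(), req.full_url)
print(req.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would perform the actual fetch; that step is left out here so the sketch runs without network access.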
Audio output. An implementation platform must support audio output using audio files and text‐to‐speech (TTS). The platform must be able to freely sequence TTS and audio output. If an audio output resource is not available, an error.noresource event must be thrown. Audio files are referred to by a URI. The language specifies a required set of audio file formats which must be supported (see Appendix E); additional audio file formats may also be supported.
Audio input. An implementation platform is required to detect and report character and/or spoken input simultaneously and to control input detection interval duration with a timer whose length is specified by a VoiceXML document. If an audio input resource is not available, an error.noresource event must be thrown.
• It must report characters (for example, DTMF) entered by a user. Platforms must support the XML form of DTMF grammars described in the W3C Speech Recognition Grammar Specification [SRGS]. They should also support the Augmented BNF (ABNF) form of DTMF grammars described in the W3C Speech Recognition Grammar Specification [SRGS].
• It must be able to receive speech recognition grammar data dynamically. It must be able to use speech grammar data in the XML Form of the W3C Speech Recognition Grammar Specification [SRGS]. It should be able to receive speech recognition grammar data in the ABNF form of the W3C Speech Recognition Grammar Specification [SRGS], and may support other formats such as the JSpeech Grammar Format [JSGF] or proprietary formats. Some VoiceXML elements contain speech grammar data; others refer to speech grammar data through a URI. The speech recognizer must be able to accommodate dynamic update of the spoken input for which it is listening through either method of speech grammar data specification.
• It must be able to record audio received from the user. The implementation platform must be able to make the recording available to a request variable. The language specifies a required set of recorded audio file formats which must be supported (see Appendix E); additional formats may also be supported.
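To give a feel for the required XML form of SRGS, here is a small grammar for the drink question used earlier, together with a Python helper that lists the phrases it accepts. The grammar content is an assumption written for this illustration (the grammar, rule, one-of and item elements and the namespace are from SRGS 1.0), and the helper only handles a flat list of alternatives under the root rule; it is not a speech recognizer or a general SRGS processor.

```python
import xml.etree.ElementTree as ET

SRGS = "http://www.w3.org/2001/06/grammar"  # SRGS XML namespace

# Hypothetical grammar for the drink field, in the XML form of SRGS.
DRINK_GRAMMAR = """\
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="drink">
  <rule id="drink">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
      <item>milk</item>
      <item>nothing</item>
    </one-of>
  </rule>
</grammar>
"""


def allowed_phrases(grammar_xml):
    """List the alternatives of the root rule (flat <one-of> lists only)."""
    root = ET.fromstring(grammar_xml)
    root_rule_id = root.get("root")
    for rule in root.iter(f"{{{SRGS}}}rule"):
        if rule.get("id") == root_rule_id:
            return [item.text.strip()
                    for item in rule.iter(f"{{{SRGS}}}item")]
    return []


phrases = allowed_phrases(DRINK_GRAMMAR)
print(phrases)
```

A DTMF grammar has the same shape with mode="dtmf" and key digits as items, which is one reason a single XML grammar format helps interoperability across platforms.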
Transfer. The platform should be able to support making a third party connection through a communications network, such as the telephone.

1.3 Concepts
A VoiceXML document (or a set of related documents called an application) forms a conversational finite state machine. The user is always in one conversational state, or dialog, at a time. Each dialog determines the next dialog to transition to. Transitions are specified using URIs, which define the next document and dialog to use. If a URI does not refer to a document, the current document is assumed. If it does not refer to a dialog, the first dialog in the document is assumed. Execution is terminated when a dialog does not specify a successor, or if it has an element that explicitly exits the conversation.
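The two defaulting rules for transition URIs (no document part means the current document, no dialog part means the first dialog) can be sketched as a small resolver. The function name, the dictionary that maps each document to its ordered dialog names, and the example.org URLs are assumptions made for this illustration; the dialog is taken to be named by the URI fragment.

```python
from urllib.parse import urldefrag, urljoin


def resolve_transition(current_document, dialogs_by_document, target):
    """Return the (document, dialog) pair that a transition URI refers to."""
    doc_part, dialog = urldefrag(target)
    # No document part: stay in the current document.
    document = urljoin(current_document, doc_part) if doc_part else current_document
    # No dialog part: use the first dialog of the target document.
    if not dialog:
        dialog = dialogs_by_document[document][0]
    return document, dialog


# Hypothetical application with two documents and their dialogs in order.
MAIN = "http://example.org/app/main.vxml"
ORDER = "http://example.org/app/order.vxml"
DIALOGS = {MAIN: ["welcome", "confirm"], ORDER: ["drink", "size"]}

print(resolve_transition(MAIN, DIALOGS, "#confirm"))
print(resolve_transition(MAIN, DIALOGS, "order.vxml"))
print(resolve_transition(MAIN, DIALOGS, "order.vxml#size"))
```

Viewed this way, each (document, dialog) pair is one state of the conversational finite state machine and each resolved URI is one of its transitions.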
1.3.1 Dialogs and Subdialogs
There are two kinds of dialogs: forms and menus. Forms define an interaction that collects values for a set of form item variables. Each field may specify a grammar that defines the allowable inputs for that field. If a form‐level grammar is present, it can be used to fill several fields from one utterance. A menu presents the user with a choice of options and then transitions to another dialog based on that choice.
A subdialog is like a function call, in that it provides a mechanism for invoking a new interaction, and returning to the original form. Variable instances, grammars, and state information are saved and are available upon returning to the calling document. Subdialogs can be used, for example, to create a confirmation sequence that may require a database query; to create a set of components that may be shared among documents in a single application; or to create a reusable library of dialogs shared among many applications.

1.3.2 Sessions
A session begins when the user starts to interact with a VoiceXML interpreter context, continues as documents are loaded and processed, and ends when requested by the user, a document, or the interpreter context.

1.3.3 Applications
An application is a set of documents sharing the same application root document. Whenever the user interacts with a document in an application, its application root document is also loaded. The application root document remains loaded while the user is transitioning between other documents in the same application, and it is unloaded when the user transitions to a document that is not in the application. While it is loaded, the application root document's variables are available to the other documents as application variables, and its grammars remain active for the duration of the application, subject to the grammar activation rules discussed in Section 3.1.4.
Figure 2 shows the transition of documents (D) in an application that share a common application root document (root).
Figure 2: Transitioning between documents in an application.
1.3.4 Grammars

Each dialog has one or more speech and/or DTMF grammars associated with it. In machine directed applications, each dialog's grammars are active only when the user is in that dialog. In mixed initiative applications, where the user and the machine alternate in determining what to do next, some of the dialogs are flagged to make their grammars active (i.e., listened for) even when the user is in another dialog in the same document, or on another loaded document in the same application. In this situation, if the user says something matching another dialog's active grammars, execution transitions to that other dialog, with the user's utterance treated as if it were said in that dialog. Mixed initiative adds flexibility and power to voice applications.

1.3.5 Events

VoiceXML provides a form-filling mechanism for handling "normal" user input. In addition, VoiceXML defines a mechanism for handling events not covered by the form mechanism. Events are thrown by the platform under a variety of circumstances, such as when the user does not respond, doesn't respond intelligibly, requests help, etc. The interpreter also throws events if it finds a semantic error in a VoiceXML document. Events are caught by catch elements or their syntactic shorthand. Each element in which an event can occur may specify catch elements. Furthermore, catch elements are also inherited from enclosing elements "as if by copy". In this way, common event handling behavior can be specified at any level, and it applies to all lower levels.

1.3.6 Links

A link supports mixed initiative. It specifies a grammar that is active whenever the user is in the scope of the link. If user input matches the link's grammar, control transfers to the link's destination URI. A link can be used to throw an event or go to a destination URI.

1.4 VoiceXML Elements

Element        Purpose
<assign>       Assign a variable a value
<audio>        Play an audio clip within a prompt
<block>        A container of (non-interactive) executable code
<catch>        Catch an event
<choice>       Define a menu item
<clear>        Clear one or more form item variables
<disconnect>   Disconnect a session
<else>         Used in <if> elements
<elseif>       Used in <if> elements
<enumerate>    Shorthand for enumerating the choices in a menu
<error>        Catch an error event
<exit>         Exit a session
<field>        Declares an input field in a form
<filled>       An action executed when fields are filled
<form>         A dialog for presenting information and collecting data
<goto>         Go to another dialog in the same or different document
<grammar>      Specify a speech recognition or DTMF grammar
<help>         Catch a help event
<if>           Simple conditional logic
<initial>      Declares initial logic upon entry into a (mixed initiative) form
<link>         Specify a transition common to all dialogs in the link's scope
<log>          Generate a debug message
<menu>         A dialog for choosing amongst alternative destinations
<meta>         Define a metadata item as a name/value pair
<metadata>     Define metadata information using a metadata schema
<noinput>      Catch a noinput event
<nomatch>      Catch a nomatch event
<object>       Interact with a custom extension
<option>       Specify an option in a <field>
<param>        Parameter in <object> or <subdialog>
<prompt>       Queue speech synthesis and audio output to the user
<property>     Control implementation platform settings
<record>       Record an audio sample
<reprompt>     Play a field prompt when a field is re-visited after an event
<return>       Return from a subdialog
<script>       Specify a block of ECMAScript client-side scripting logic
<subdialog>    Invoke another dialog as a subdialog of the current one
<submit>       Submit values to a document server
<throw>        Throw an event
<transfer>     Transfer the caller to another destination
<value>        Insert the value of an expression in a prompt
<var>          Declare a variable
<vxml>         Top-level element in each VoiceXML document
Table 1: VoiceXML Elements

1.5 Document Structure and Execution

A VoiceXML document is primarily composed of top-level elements called dialogs. There are two types of dialogs: forms and menus. A document may also have <meta> and <metadata> elements, <var> and <script> elements, <property> elements, <catch> elements, and <link> elements.

1.5.1 Execution within One Document

Document execution begins at the first dialog by default. As each dialog executes, it determines the next dialog. When a dialog doesn't specify a successor dialog, document execution stops.

Here is "Hello World!" expanded to illustrate some of this. It now has a document level variable called "hi" which holds the greeting. Its value is used as the prompt in the first form. Once the first form plays the greeting, it goes to the form named "say_goodbye", which prompts the user with "Goodbye!" Because the second form does not transition to another dialog, it causes the document to be exited.
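The description above can be sketched as the following document. It is a reconstruction from the prose rather than the original listing: the names hi and say_goodbye and the two greetings come from the text, while everything else is an assumption about how such a document would typically be written.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-level variable holding the greeting. -->
  <var name="hi" expr="'Hello World!'"/>

  <!-- First dialog: plays the greeting, then names its successor. -->
  <form>
    <block>
      <value expr="hi"/>
      <goto next="#say_goodbye"/>
    </block>
  </form>

  <!-- Second dialog: no successor, so the document exits after the prompt. -->
  <form id="say_goodbye">
    <block>Goodbye!</block>
  </form>
</vxml>
```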
Alternatively, the two forms can be combined into a single form whose block plays the greeting and then says "Goodbye!".

Attributes of <vxml> include:

version       The version of VoiceXML of this document (required). The current version number is 2.0.
xmlns         The designated namespace for VoiceXML (required). The namespace for VoiceXML is defined to be http://www.w3.org/2001/vxml.
xml:base      The base URI for this document as defined in [XML-BASE]. As in [HTML], a URI which all relative references within the document take as their base.
xml:lang      The language identifier for this document. If omitted, the value is a platform-specific default.
application   The URI of this document's application root document, if any.

Table 2: <vxml> Attributes
Language information is inherited down the document hierarchy: the value of "xml:lang" is inherited by elements which also define the "xml:lang" attribute, such as <grammar> and <prompt>, unless these elements specify an alternative value.

1.5.2 Executing a Multi-Document Application

Normally, each document runs as an isolated application. In cases where you want multiple documents to work together as one application, you select one document to be the application root document, and the rest to be application leaf documents. Each leaf document names the root document in its <vxml> element. When this is done, every time the interpreter is told to load and execute a leaf document in this application, it first loads the application root document if it is not already loaded. The application root document remains loaded until the interpreter is told to load a document that belongs to a different application. Thus one of the following two conditions always holds during interpretation:

• The application root document is loaded and the user is executing in it: there is no leaf document.
• The application root document and a single leaf document are both loaded and the user is executing in the leaf document.
If there is a chain of subdialogs defined in separate documents, then there may be more than one leaf document loaded although execution will only be in one of these documents. When a leaf document load causes a root document load, none of the dialogs in the root document are executed. Execution begins in the leaf document.

There are several benefits to multi-document applications.

• The root document's variables are available for use by the leaf documents, so that information can be shared and retained.
• Root document <property> elements specify default values for properties used in the leaf documents.
• Common ECMAScript code can be defined in root document <script> elements and used in the leaf documents.
• Root document <catch> elements define default event handling for the leaf documents.
• Document-scoped grammars in the root document are active when the user is in a leaf document, so that the user is able to interact with forms, links, and menus in the root document.
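Here is a two-document application illustrating this:

Application root document (app-root.vxml): declares the variable bye and a link whose grammar matches "operator".

Leaf document (leaf.vxml): contains the say_goodbye form, which prompts "Shall we say" followed by the value of bye.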
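A sketch of what these two files might contain is shown below. The file names, the variable bye with the value "Ciao", the form name say_goodbye, and the link target operator_xfer.vxml are taken from the surrounding text. The field name answer, the use of the built-in boolean field type, and the inline grammar are assumptions made so that the sketch is complete. The two documents are separate files, shown in one listing for brevity.

```xml
<!-- ===== File 1: app-root.vxml (application root document) ===== -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Becomes the application variable application.bye in leaf documents. -->
  <var name="bye" expr="'Ciao'"/>
  <!-- Active while any document of the application is loaded. -->
  <link next="operator_xfer.vxml">
    <grammar type="application/srgs+xml" root="root" version="1.0">
      <rule id="root" scope="public">operator</rule>
    </grammar>
  </link>
</vxml>

<!-- ===== File 2: leaf.vxml (application leaf document) ===== -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="app-root.vxml">
  <form id="say_goodbye">
    <!-- Assumption: the platform supports the built-in boolean type. -->
    <field name="answer" type="boolean">
      <prompt>Shall we say <value expr="application.bye"/>?</prompt>
      <filled>
        <if cond="answer">
          <exit/>
        </if>
        <!-- On "no", forget the answer so the field is asked again. -->
        <clear namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
```

Only the leaf document carries an application attribute; the root document does not, which is what marks it as a root.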
In this example, the application is designed so that leaf.vxml must be loaded first. Its application attribute specifies that app-root.vxml should be used as the application root document. So, app-root.vxml is then loaded, which creates the application variable bye and also defines a link that navigates to operator_xfer.vxml whenever the user says "operator". The user starts out in the say_goodbye form:

C: Shall we say Ciao?
H: Si.
C: I did not understand what you said. (a platform-specific default message.)
C: Shall we say Ciao?
H: Ciao
C: I did not understand what you said.
H: Operator.
C: (Goes to operator_xfer.vxml, which transfers the caller to a human operator.)

Note that when the user is in a multi-document application, at most two documents are loaded at any one time: the application root document and, unless the user is actually interacting with the application root document, an application leaf document. A root document's <vxml> element does not have an application attribute specified. A leaf document's <vxml> element does have an application attribute specified. An interpreter always has an application root document loaded; it does not always have an application leaf document loaded.

The name of the interpreter's current application is the application root document's absolute URI. The absolute URI includes a query string, if present, but it does not include a fragment identifier. The interpreter remains in the same application as long as the name remains the same. When the name changes, a new application is entered and its root context is initialized. The application's root context consists of the variables, grammars, catch elements, scripts, and properties in application scope.

During a user session an interpreter transitions from one document to another as requested by <choice>, <goto>, <link>, <subdialog>, and <submit> elements. Some transitions are within an application, others are between applications.
The preservation or initialization of the root context depends on the type of transition:

Root to Leaf Within Application
A root to leaf transition within the same application occurs when the current document is a root document and the target document's application attribute's value resolves to the same absolute URI as the name of the current application. The application root document and its context are preserved.

Leaf to Leaf Within Application
A leaf to leaf transition within the same application occurs when the current document is a leaf document and the target document's application attribute's value resolves to the same absolute URI as the name of the current application. The application root document and its context are preserved.

Leaf to Root Within Application
A leaf to root transition within the same application occurs when the current document is a leaf document and the target document's absolute URI is the same as the name of the current application. The current application root document and its context are preserved when the transition is caused by a <choice>, <goto>, or <link> element. The root context is initialized when a <submit> element causes the leaf to root transition, because a <submit> always results in a fetch of its URI.

Root to Root
A root to root transition occurs when the current document is a root document and the target document is a root document, i.e. it does not have an application attribute. The root context is initialized with the application root document returned by the caching policy in Section 6.1.2. The caching policy is consulted even when the name of the target application and the current application are the same.

Subdialog
A subdialog invocation occurs when a root or leaf document executes a <subdialog> element. As discussed in Section 2.3.4, subdialog invocation creates a new execution context. The application root document and its context in the calling document's execution context are preserved untouched during subdialog execution, and are used again once the subdialog returns. A subdialog's new execution context has its own root context and, possibly, leaf context. When the subdialog is invoked with a non-empty URI reference, the caching policy in Section 6.1.2 is used to acquire the root and leaf documents that will be used to initialize the new root and leaf contexts. If a subdialog is invoked with an empty URI reference and a fragment identifier, e.g. "#sub1", the root and leaf documents remain unchanged, and therefore the current root and leaf documents will be used to initialize the new root and leaf contexts.

Inter-Application Transitions
All other transitions are between applications, which causes the application root context to be initialized with the next application's root document.

If a document refers to a non-existent application root document, an error.badfetch event is thrown. If a document's application attribute refers to a document that also has an application attribute specified, an error.semantic event is thrown.

The following diagrams illustrate the effect of the transitions between root and leaf documents on the application root context. In these diagrams, boxes represent documents, box texture changes identify root context initialization, solid arrows symbolize transitions to the URI in the arrow's label, and dashed vertical arrows indicate an application attribute whose URI is the arrow's label.
Figure 3: Transitions that Preserve the Root Context
In this diagram, all the documents belong to the same application. The transitions are identified by the numbers 1-4 across the top of the figure. They are:

1. A transition to URI A results in document 1; the application context is initialized from document 1's content. Assume that this is the first document in the session. The current application's name is A.
2. Document 1 specifies a transition to URI B, which yields document 2. Document 2's application attribute equals URI A. The root is document 1 with its context preserved. This is a root to leaf transition within the same application.
3. Document 2 specifies a transition to URI C, which yields another leaf document, document 3. Its application attribute also equals URI A. The root is document 1 with its context preserved. This is a leaf to leaf transition within the same application.
4. Document 3 specifies a transition to URI A using a <choice>, <goto>, or <link>. Document 1 is used with its root context intact. This is a leaf to root transition within the same application.

The next diagram illustrates transitions which initialize the root context.
Figure 4: Transitions that Initialize the Root Context
5. Document 1 specifies a transition to its own URI A. The resulting document 4 does not have an application attribute, so it is considered a root document, and the root context is initialized. This is a root to root transition.
6. Document 4 specifies a transition to URI D, which yields a leaf document 5. Its application attribute is different: URI E. A new application is being entered. URI E produces the root document 6. The root context is initialized from the content of document 6. This is an inter-application transition.
7. Document 5 specifies a transition to URI A. The cache check returns document 4 which does not have an application attribute and therefore belongs to application A, so the root context is initialized. Initialization occurs even though this application and this root document were used earlier in the session. This is an inter-application transition.

1.5.3 Subdialogs

A subdialog is a mechanism for decomposing complex sequences of dialogs to better structure them, or to create reusable components. For example, the solicitation of account information may involve gathering several pieces of information, such as account number and home telephone number. A customer care service might be structured with several independent applications that could share this basic building block, thus it would be reasonable to construct it as a subdialog. This is illustrated in the example below. The first document, app.vxml, seeks to adjust a customer's account, and in doing so must get the account information and then the adjustment level. The account information is obtained by using a subdialog element that invokes another VoiceXML document to solicit the user input. While the second document is being executed, the calling dialog is suspended, awaiting the return of information. The second document provides the results of its user interactions using a <return> element, and the resulting values are accessed through the variable defined by the name attribute on the <subdialog> element.
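The calling pattern just described can be sketched as follows. To keep the sketch a single well-formed document, the account-information dialog is placed in the same document and referenced by fragment (src="#basic"), whereas the text's example keeps it in a separate file, acct_info.vxml. The three prompts come from the text; the form, field, and variable names, the built-in field types, and the submit URI /cgi-bin/updateaccount are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Calling dialog: suspended while the subdialog runs. -->
  <form id="billing_adjustment">
    <var name="account_number"/>
    <var name="home_phone"/>
    <subdialog name="accountinfo" src="#basic">
      <filled>
        <!-- The returned values are properties of the variable
             named by the subdialog's name attribute. -->
        <assign name="account_number" expr="accountinfo.acctnum"/>
        <assign name="home_phone" expr="accountinfo.acctphone"/>
      </filled>
    </subdialog>
    <field name="adjustment_amount" type="currency">
      <prompt>What is the value of your account adjustment?</prompt>
      <filled>
        <submit next="/cgi-bin/updateaccount"
                namelist="account_number home_phone adjustment_amount"/>
      </filled>
    </field>
  </form>

  <!-- Reusable dialog: collects two values and returns them. -->
  <form id="basic">
    <field name="acctnum" type="digits">
      <prompt>What is your account number?</prompt>
    </field>
    <field name="acctphone" type="phone">
      <prompt>What is your home telephone number?</prompt>
      <filled>
        <return namelist="acctnum acctphone"/>
      </filled>
    </field>
  </form>
</vxml>
```

Moving the basic form into its own file would only require changing the src attribute to something like acct_info.vxml#basic.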
Customer Service Application (app.vxml): prompts "What is the value of your account adjustment?"

Document Containing Account Information Subdialog (acct_info.vxml): prompts "What is your account number?" and "What is your home telephone number?"

Subdialogs add a new execution context when they are invoked. The subdialog could be a new dialog within the existing document, or a new dialog within a new document.

Subdialogs can be composed of several documents. Figure 5 shows the execution flow where a sequence of documents (D) transitions to a subdialog (SD) and then back.
Figure 5: Subdialog composed of several documents returning from the last subdialog document.
The execution context in dialog D2 is suspended when it invokes the subdialog SD1 in document sd1.vxml. This subdialog specifies that execution is to be transferred to the dialog in sd2.vxml (using <goto>). Consequently, when the dialog in sd2.vxml returns, control is returned directly to dialog D2.

Figure 6 shows an example of a multi-document subdialog where control is transferred from one subdialog to another.
Figure 6: Subdialog composed of several documents returning from the first subdialog document.
The subdialog in sd1.vxml specifies that control is to be transferred to a second subdialog, SD2, in sd2.vxml. When executing SD2, there are two suspended contexts: the dialog context in D2 is suspended awaiting the return of SD1, and the dialog context in SD1 is suspended awaiting the return of SD2. When SD2 returns, control is returned to SD1. It in turn returns control to dialog D2.

1.5.4 Final Processing

Under certain circumstances (in particular, while the VoiceXML interpreter is processing a disconnect event) the interpreter may continue executing in the final processing state after there is no longer a connection to allow the interpreter to interact with the end user. The purpose of this state is to allow the VoiceXML application to perform any necessary final cleanup, such as submitting information to the application server. For example, a <catch> element whose event attribute is connection.disconnect.hangup will catch that event and execute in the final processing state.

While in the final processing state the application must remain in the transitioning state and may not enter the waiting state (as described in Section 4.1.8). Thus for example the application should not enter <field>, <record>, or <transfer> while in the final processing state. The VoiceXML interpreter must exit if the VoiceXML application attempts to enter the waiting state while in the final processing state.
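A minimal sketch of such a handler is shown below. The event name connection.disconnect.hangup comes from the text; the variable call_result, the survey form, and the URI /cgi-bin/call-ended are hypothetical. The handler only logs and submits, so it stays within the restriction that no <field>, <record>, or <transfer> is entered during final processing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="call_result" expr="'incomplete'"/>

  <!-- Runs in the final processing state after the caller hangs up. -->
  <catch event="connection.disconnect.hangup">
    <log>Caller hung up; reporting call result.</log>
    <!-- Cleanup only: no user interaction is attempted here. -->
    <submit next="/cgi-bin/call-ended" namelist="call_result"/>
  </catch>

  <form id="survey">
    <field name="rating" type="digits">
      <prompt>Please rate our service from one to five.</prompt>
      <filled>
        <assign name="call_result" expr="'rated ' + rating"/>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```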
Aside from this restriction, execution of the VoiceXML application continues normally while in the final processing state. Thus for example the application may transition between documents while in the final processing state, and the interpreter must exit if no form item is eligible to be selected (as described in Section 2.1.1).

1 Introduction

This specification defines the syntax and semantics of the XSLT language. A transformation in the XSLT language is expressed as a well-formed XML document [XML] conforming to the Namespaces in XML Recommendation [XML Names], which may include both elements that are defined by XSLT and elements that are not defined by XSLT. XSLT-defined elements are distinguished by belonging to a specific XML namespace (see [2.1 XSLT Namespace]), which is referred to in this specification as the XSLT namespace. Thus this specification is a definition of the syntax and semantics of the XSLT namespace.

A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern is matched against elements in the source tree. A template is instantiated to create part of the result tree. The result tree is separate from the source tree. The structure of the result tree can be completely different from the structure of the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added.

A transformation expressed in XSLT is called a stylesheet. This is because, in the case when XSLT is transforming into the XSL formatting vocabulary, the transformation functions as a stylesheet.

This document does not specify how an XSLT stylesheet is associated with an XML document. It is recommended that XSL processors support the mechanism described in [XML Stylesheet].
When this or any other mechanism yields a sequence of more than one XSLT stylesheet to be applied simultaneously to an XML document, then the effect should be the same as applying a single stylesheet that imports each member of the sequence in order (see [2.6.2 Stylesheet Import]).

A stylesheet contains a set of template rules. A template rule has two parts: a pattern which is matched against nodes in the source tree and a template which can be instantiated to form part of the result tree. This allows a stylesheet to be applicable to a wide class of documents that have similar source tree structures.

A template is instantiated for a particular source element to create part of the result tree. A template can contain elements that specify literal result element structure. A template can also contain elements from the XSLT namespace that are instructions for creating result tree fragments. When a template is instantiated, each instruction is executed and replaced by the result tree fragment that it creates. Instructions can select and process descendant source elements. Processing a descendant element creates a result tree fragment by finding the applicable template rule and instantiating its template. Note that elements are only processed when they have been selected by the execution of an instruction. The result tree is constructed by finding the template rule for the root node and instantiating its template.

In the process of finding the applicable template rule, more than one template rule may have a pattern that matches a given element. However, only one template rule will be applied. The method for deciding which template rule to apply is described in [5.5 Conflict Resolution for Template Rules].

A single template by itself has considerable power: it can create structures of arbitrary complexity; it can pull string values out of arbitrary locations in the source tree; it can generate structures that are repeated according to the occurrence of elements in the source tree. For simple transformations where the structure of the result tree is independent of the structure of the source tree, a stylesheet can often consist of only a single template, which functions as a template for the complete result tree. Transformations on XML documents that represent data are often of this kind (see [D.2 Data Example]). XSLT allows a simplified syntax for such stylesheets (see [2.3 Literal Result Element as Stylesheet]).

When a template is instantiated, it is always instantiated with respect to a current node and a current node list. The current node is always a member of the current node list. Many operations in XSLT are relative to the current node. Only a few instructions change the current node list or the current node (see [5 Template Rules] and [8 Repetition]); during the instantiation of one of these instructions, the current node list changes to a new list of nodes and each member of this new list becomes the current node in turn; after the instantiation of the instruction is complete, the current node and current node list revert to what they were before the instruction was instantiated.
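The interplay of template rules, selection, and the current node list can be sketched with a small stylesheet. The source vocabulary (a list element containing item and note children) is invented for illustration; only the XSLT elements themselves are defined by the specification.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Template rule for the root node: instantiated first. -->
  <xsl:template match="/">
    <xsl:text>To-do list&#10;</xsl:text>
    <!-- Selects only item elements; they form the new current
         node list, and each becomes the current node in turn. -->
    <xsl:apply-templates select="list/item"/>
  </xsl:template>

  <!-- Template rule whose pattern matches each selected item. -->
  <xsl:template match="item">
    <!-- position() is relative to the current node list. -->
    <xsl:value-of select="position()"/>
    <xsl:text>. </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Applied to a source such as `<list><item>a</item><note>skipped</note><item>b</item></list>`, the note element is never processed because no instruction selects it, and the items are numbered 1 and 2 by their position in the selected list rather than by their position among all children.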
XSLT makes use of the expression language defined by [XPath] for selecting elements for processing, for conditional processing and for generating text.

XSLT provides two "hooks" for extending the language, one hook for extending the set of instruction elements used in templates and one hook for extending the set of functions used in XPath expressions. These hooks are both based on XML namespaces. This version of XSLT does not define a mechanism for implementing the hooks. See [14 Extensions].

NOTE: The XSL WG intends to define such a mechanism in a future version of this specification or in a separate specification.

The element syntax summary notation used to describe the syntax of XSLT-defined elements is described in [18 Notation].

The MIME media types text/xml and application/xml [RFC2376] should be used for XSLT stylesheets. It is possible that a media type will be registered specifically for XSLT stylesheets; if and when it is, that media type may also be used.
2 Stylesheet Structure

2.1 XSLT Namespace

The XSLT namespace has the URI http://www.w3.org/1999/XSL/Transform.

NOTE: The 1999 in the URI indicates the year in which the URI was allocated by the W3C. It does not indicate the version of XSLT being used, which is specified by attributes (see [2.2 Stylesheet Element] and [2.3 Literal Result Element as Stylesheet]).

XSLT processors must use the XML namespaces mechanism [XML Names] to recognize elements and attributes from this namespace. Elements from the XSLT namespace are recognized only in the stylesheet, not in the source document. The complete list of XSLT-defined elements is specified in [B Element Syntax Summary]. Vendors must not extend the XSLT namespace with additional elements or attributes. Instead, any extension must be in a separate namespace. Any namespace that is used for additional instruction elements must be identified by means of the extension element mechanism specified in [14.1 Extension Elements].

This specification uses a prefix of xsl: for referring to elements in the XSLT namespace. However, XSLT stylesheets are free to use any prefix, provided that there is a namespace declaration that binds the prefix to the URI of the XSLT namespace.

An element from the XSLT namespace may have any attribute not from the XSLT namespace, provided that the expanded-name of the attribute has a non-null namespace URI. The presence of such attributes must not change the behavior of XSLT elements and functions defined in this document. Thus, an XSLT processor is always free to ignore such attributes, and must ignore such attributes without giving an error if it does not recognize the namespace URI. Such attributes can provide, for example, unique identifiers, optimization hints, or documentation.

It is an error for an element from the XSLT namespace to have attributes with expanded-names that have null namespace URIs (i.e. attributes with unprefixed names) other than attributes defined for the element in this document.

NOTE: The conventions used for the names of XSLT elements, attributes and functions are that names are all lower-case, use hyphens to separate words, and use abbreviations only if they already appear in the syntax of a related language such as XML or HTML.

2.2 Stylesheet Element
A stylesheet is represented by an xsl:stylesheet element in an XML document. xsl:transform is allowed as a synonym for xsl:stylesheet.

An xsl:stylesheet element must have a version attribute, indicating the version of XSLT that the stylesheet requires. For this version of XSLT, the value should be 1.0. When the value is not equal to 1.0, forwards-compatible processing mode is enabled (see [2.5 Forwards-Compatible Processing]).

The xsl:stylesheet element may contain the following types of elements:

• xsl:import
• xsl:include
• xsl:strip-space
• xsl:preserve-space
• xsl:output
• xsl:key
• xsl:decimal-format
• xsl:namespace-alias
• xsl:attribute-set
• xsl:variable
• xsl:param
• xsl:template
An element occurring as a child of an xsl:stylesheet element is called a top-level element. This example shows the structure of a stylesheet. Ellipses (...) indicate where attribute values or content have been omitted. Although this example shows one of each type of allowed element, stylesheets may contain zero or more of each of these elements.
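As a concrete illustration of this structure, here is a small stylesheet that uses several of the top-level elements with real values in place of ellipses. It is a sketch rather than the specification's own skeleton: xsl:import and xsl:include are left out because they would need additional files, and the key name item-by-id, the variable and parameter names, and the item/@id source vocabulary are all assumptions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Top-level elements: children of xsl:stylesheet. -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="text"/>
  <xsl:key name="item-by-id" match="item" use="@id"/>
  <xsl:variable name="separator" select="', '"/>
  <xsl:param name="wanted" select="'a1'"/>

  <!-- Template rule for the root node. -->
  <xsl:template match="/">
    <xsl:value-of select="$wanted"/>
    <xsl:value-of select="$separator"/>
    <!-- Looks up the item whose id attribute equals $wanted. -->
    <xsl:value-of select="key('item-by-id', $wanted)"/>
  </xsl:template>
</xsl:stylesheet>
```

With a source document such as `<items><item id="a1">apple</item></items>`, the template writes the requested id, the separator, and the text of the matching item.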
The order in which the children of the xsl:stylesheet element occur is not significant except for xsl:import elements and for error recovery. Users are free to order the elements as they prefer, and stylesheet creation tools need not provide control over the order in which the elements occur.

In addition, the xsl:stylesheet element may contain any element not from the XSLT namespace, provided that the expanded-name of the element has a non-null namespace URI. The presence of such top-level elements must not change the behavior of XSLT elements and functions defined in this document; for example, it would not be permitted for such a top-level element to specify that xsl:apply-templates was to use different rules to resolve conflicts. Thus, an XSLT processor is always free to ignore such top-level elements, and must ignore a top-level element without giving an error if it does not recognize the namespace URI. Such elements can provide, for example,

• information used by extension elements or extension functions (see [14 Extensions]),
• information about what to do with the result tree,
• information about how to obtain the source tree,
• metadata about the stylesheet,
• structured documentation for the stylesheet.
2.3 Literal Result Element as Stylesheet

A simplified syntax is allowed for stylesheets that consist of only a single template for the root node. The stylesheet may consist of just a literal result element (see [7.1.1 Literal Result Elements]). Such a stylesheet is equivalent to a stylesheet with an xsl:stylesheet element containing a template rule containing the literal result element; the template rule has a match pattern of /. For example, a stylesheet that is a single literal result element producing a page titled "Expense Report Summary" with the line "Total Amount:" followed by a value taken from the source has the same meaning as an xsl:stylesheet whose only template rule matches / and contains that same literal result element.

A literal result element that is the document element of a stylesheet must have an xsl:version attribute, which indicates the version of XSLT that the stylesheet requires. For this version of XSLT, the value should be 1.0; the value must be a Number. Other literal result elements may also have an xsl:version attribute. When the xsl:version attribute is not equal to 1.0, forwards-compatible processing mode is enabled (see [2.5 Forwards-Compatible Processing]).

The allowed content of a literal result element when used as a stylesheet is no different from when it occurs within a stylesheet. Thus, a literal result element used as a stylesheet cannot contain top-level elements.
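A sketch of the simplified syntax for the expense report example follows. The title and the "Total Amount:" text come from the section above; the select path expense-report/total is an assumption about the source document's structure.

```xml
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <head>
    <title>Expense Report Summary</title>
  </head>
  <body>
    <!-- The only XSLT instruction: pulls one value from the source. -->
    <p>Total Amount: <xsl:value-of select="expense-report/total"/></p>
  </body>
</html>
```

The equivalent full form would wrap this same html element in `<xsl:template match="/">` inside an `<xsl:stylesheet version="1.0">` element, with the xsl:version attribute on html no longer needed.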
In some situations, the only way that a system can recognize that an XML document needs to be processed by an XSLT processor as an XSLT stylesheet is by examining the XML document itself. Using the simplified syntax makes this harder.

NOTE: For example, another XML language (AXL) might also use an axl:version on the document element to indicate that an XML document was an AXL document that required processing by an AXL processor; if a document had both an axl:version attribute and an xsl:version attribute, it would be unclear whether the document should be processed by an XSLT processor or an AXL processor.

Therefore, the simplified syntax should not be used for XSLT stylesheets that may be used in such a situation. This situation can, for example, arise when an XSLT stylesheet is transmitted as a message with a MIME media type of text/xml or application/xml to a recipient that will use the MIME media type to determine how the message is processed.

2.4 Qualified Names

The name of an internal XSLT object, specifically a named template (see [6 Named Templates]), a mode (see [5.7 Modes]), an attribute set (see [7.1.4 Named Attribute Sets]), a key (see [12.2 Keys]), a decimal-format (see [12.3 Number Formatting]), a variable or a parameter (see [11 Variables and Parameters]) is specified as a QName. If it has a prefix, then the prefix is expanded into a URI reference using the namespace declarations in effect on the attribute in which the name occurs. The expanded-name consisting of the local part of the name and the possibly null URI reference is used as the name of the object. The default namespace is not used for unprefixed names.
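The effect of these rules on named templates can be sketched as follows. The namespace URIs http://example.com/util and http://example.com/default and the template names are hypothetical. Because the default namespace is not applied to unprefixed object names, the two templates below have different expanded-names and do not clash.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:util="http://example.com/util"
    xmlns="http://example.com/default">
  <xsl:output method="text"/>

  <!-- Expanded-name: URI http://example.com/util, local part greet. -->
  <xsl:template name="util:greet">hello from util:greet</xsl:template>

  <!-- Unprefixed: null namespace URI, local part greet.
       The default namespace declared above is not used. -->
  <xsl:template name="greet">hello from greet</xsl:template>

  <xsl:template match="/">
    <xsl:call-template name="util:greet"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:call-template name="greet"/>
  </xsl:template>
</xsl:stylesheet>
```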
2.5 Forwards-Compatible Processing

An element enables forwards-compatible mode for itself, its attributes, its descendants and their attributes if either it is an xsl:stylesheet element whose version attribute is not equal to 1.0, or it is a literal result element that has an xsl:version attribute whose value is not equal to 1.0, or it is a literal result element that does not have an xsl:version attribute and that is the document element of a stylesheet using the simplified syntax (see [2.3 Literal Result Element as Stylesheet]). A literal result element that has an xsl:version attribute whose value is equal to 1.0 disables forwards-compatible mode for itself, its attributes, its descendants and their attributes.

If an element is processed in forwards-compatible mode, then:

• if it is a top-level element and XSLT 1.0 does not allow such elements as top-level elements, then the element must be ignored along with its content;
• if it is an element in a template and XSLT 1.0 does not allow such elements to occur in templates, then if the element is not instantiated, an error must not be signaled, and if the element is instantiated, the XSLT processor must perform fallback for the element as specified in [15 Fallback];
• if the element has an attribute that XSLT 1.0 does not allow the element to have or if the element has an optional attribute with a value that XSLT 1.0 does not allow the attribute to have, then the attribute must be ignored.
Thus, any XSLT 1.0 processor must be able to process the following stylesheet without error, although the stylesheet includes elements from the XSLT namespace that are not defined in this specification:
[Example stylesheet: markup lost in extraction; surviving text: "XSLT 1.1 required", "Sorry, this stylesheet requires XSLT 1.1."]
NOTE: If a stylesheet depends crucially on a top‐level element introduced by a version of XSL after 1.0, then the stylesheet can use an xsl:message element with terminate="yes" (see [13 Messages]) to ensure that XSLT processors implementing earlier versions of XSL will not silently ignore the top‐level element. For example,
[Example stylesheet: markup lost in extraction; surviving text: "Sorry, this stylesheet requires XSLT 1.1."]
If an expression occurs in an attribute that is processed in forwards‐compatible mode, then an XSLT processor must recover from errors in the expression as follows:
• if the expression does not match the syntax allowed by the XPath grammar, then an error must not be signaled unless the expression is actually evaluated;
• if the expression calls a function with an unprefixed name that is not part of the XSLT library, then an error must not be signaled unless the function is actually called;
• if the expression calls a function with a number of arguments that XSLT does not allow or with arguments of types that XSLT does not allow, then an error must not be signaled unless the function is actually called.
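The rules for when forwards‐compatible mode is in effect can be illustrated with a small Python sketch built on the standard library's ElementTree. It is a simplification under stated assumptions: it covers only the explicit version and xsl:version cases (not the simplified‐syntax document element without xsl:version), and the element xsl:future-instruction is invented for the example.

```python
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"
STYLESHEET_TAG = "{%s}stylesheet" % XSL
VERSION_ATTR = "{%s}version" % XSL  # xsl:version on a literal result element


def forwards_compatible(elem, inherited):
    """Return whether forwards-compatible mode is in effect for elem."""
    if elem.tag == STYLESHEET_TAG:
        return float(elem.get("version")) != 1.0
    if not elem.tag.startswith("{%s}" % XSL):  # literal result element
        version = elem.get(VERSION_ATTR)
        if version is not None:
            # A value other than 1.0 enables the mode; 1.0 switches it off.
            return float(version) != 1.0
    return inherited  # otherwise the mode comes from the parent


def modes(elem, inherited=False):
    """Yield (local name, mode) for elem and all of its descendants."""
    mode = forwards_compatible(elem, inherited)
    yield elem.tag.rsplit("}", 1)[-1], mode
    for child in elem:
        yield from modes(child, mode)


# Hypothetical stylesheet: xsl:future-instruction is not a real XSLT element.
SOURCE = """\
<xsl:stylesheet version="1.1"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:future-instruction/>
    <html xsl:version="1.0"><body/></html>
  </xsl:template>
</xsl:stylesheet>
"""

result = dict(modes(ET.fromstring(SOURCE)))
```

In this sketch the mode is switched on by version="1.1" on xsl:stylesheet, inherited by the template and the unknown instruction, and switched off again for the html subtree by xsl:version="1.0".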
2.6 Combining Stylesheets
XSLT provides two mechanisms to combine stylesheets:
• an inclusion mechanism that allows stylesheets to be combined without changing the semantics of the stylesheets being combined, and
• an import mechanism that allows stylesheets to override each other.
2.6.1 Stylesheet Inclusion
An XSLT stylesheet may include another XSLT stylesheet using an xsl:include element. The xsl:include element has an href attribute whose value is a URI reference identifying the stylesheet to be included. A relative URI is resolved relative to the base URI of the xsl:include element (see [3.2 Base URI]). The xsl:include element is only allowed as a top‐level element.
The inclusion works at the XML tree level. The resource located by the href attribute value is parsed as an XML document, and the children of the xsl:stylesheet element in this document replace the xsl:include element in the including document. The fact that template rules or definitions are included does not affect the way they are processed.
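Tree‐level inclusion can be sketched as follows. This is a minimal illustration, not how any particular XSLT processor is implemented: the documents dictionary stands in for URI resolution (href values are used directly as keys), the file names are hypothetical, and the simplified syntax is not handled.

```python
import xml.etree.ElementTree as ET

XSL = "http://www.w3.org/1999/XSL/Transform"
INCLUDE_TAG = "{%s}include" % XSL


def resolve_includes(href, documents, active=()):
    """Parse documents[href] and splice every xsl:include in place."""
    if href in active:
        raise ValueError("stylesheet includes itself: %s" % href)
    root = ET.fromstring(documents[href])
    spliced = []
    for child in root:
        if child.tag == INCLUDE_TAG:
            included = resolve_includes(child.get("href"), documents, active + (href,))
            spliced.extend(included)  # children of the included xsl:stylesheet
        else:
            spliced.append(child)
    root[:] = spliced  # the xsl:include elements are now replaced
    return root


# Hypothetical in-memory stylesheets keyed by their href.
documents = {
    "main.xsl": """<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template name="first"/>
  <xsl:include href="shared.xsl"/>
  <xsl:template name="last"/>
</xsl:stylesheet>""",
    "shared.xsl": """<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template name="shared"/>
</xsl:stylesheet>""",
}

merged = resolve_includes("main.xsl", documents)
template_names = [child.get("name") for child in merged]
```

The included template ends up between the two templates of the including stylesheet, at the position where the xsl:include element stood. The active tuple tracks the chain of stylesheets currently being resolved, so direct or indirect self‐inclusion raises an error.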
The included stylesheet may use the simplified syntax described in [2.3 Literal Result Element as Stylesheet]. The included stylesheet is treated the same as the equivalent xsl:stylesheet element.
It is an error if a stylesheet directly or indirectly includes itself.
NOTE: Including a stylesheet multiple times can cause errors because of duplicate definitions. Such multiple inclusions are less obvious when they are indirect. For example, if stylesheet B includes stylesheet A, stylesheet C includes stylesheet A, and stylesheet D includes both stylesheet B and stylesheet C, then A will be included indirectly by D twice. If all of B, C and D are used as independent stylesheets, then the error can be avoided by separating everything in B other than the inclusion of A into a separate stylesheet B' and changing B to contain just inclusions of B' and A, similarly for C, and then changing D to include A, B', C'.
2.6.2 Stylesheet Import
An XSLT stylesheet may import another XSLT stylesheet using an xsl:import element. Importing a stylesheet is the same as including it (see [2.6.1 Stylesheet Inclusion]) except that definitions and template rules in the importing stylesheet take precedence over template rules and definitions in the imported stylesheet; this is described in more detail below. The xsl:import element has an href attribute whose value is a URI reference identifying the stylesheet to be imported. A relative URI is resolved relative to the base URI of the xsl:import element (see [3.2 Base URI]).
The xsl:import element is only allowed as a top‐level element. The xsl:import element children must precede all other element children of an xsl:stylesheet element, including any xsl:include element children. When xsl:include is used to include a stylesheet, any xsl:import elements in the included document are moved up in the including document to after any existing xsl:import elements in the including document. For example,
[Example stylesheet: markup lost in extraction; surviving text: "italic"]
The xsl:stylesheet elements encountered during processing of a stylesheet that contains xsl:import elements are treated as forming an import tree. In the import tree, each xsl:stylesheet element has one import child for each xsl:import element that it contains. Any xsl:include elements are resolved before constructing the import tree. An xsl:stylesheet element in the import tree is defined to have lower import precedence than another xsl:stylesheet element in the import tree if it would be visited before that xsl:stylesheet element in a post‐order traversal of the import tree (i.e. a traversal of the import tree in which an xsl:stylesheet element is visited after its import children). Each definition and template rule has import precedence determined by the xsl:stylesheet element that contains it.
For example, suppose
• stylesheet A imports stylesheets B and C in that order;
• stylesheet B imports stylesheet D;
• stylesheet C imports stylesheet E.
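The import structure just listed can be ordered mechanically with a post‐order traversal. The following Python sketch assumes the import tree is given as a plain dictionary from a stylesheet to its imports in document order; the function name and that representation are choices made for this illustration.

```python
def import_precedence(root, imports):
    """Return stylesheets in post-order, i.e. lowest import precedence first."""
    order = []

    def visit(sheet):
        for imported in imports.get(sheet, []):
            visit(imported)  # import children are visited before the parent
        order.append(sheet)

    visit(root)
    return order


# The example import tree: A imports B and C, B imports D, C imports E.
imports = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
order = import_precedence("A", imports)
```

A stylesheet that appears earlier in order has lower import precedence than one that appears later.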
Then the order of import precedence (lowest first) is D, B, E, C, A.
NOTE: Since xsl:import elements are required to occur before any definitions or template rules, an implementation that processes imported stylesheets at the point at which it encounters the xsl:import element will encounter definitions and template rules in increasing order of import precedence.
In general, a definition or template rule with higher import precedence takes precedence over a definition or template rule with lower import precedence. This is defined in detail for each kind of definition and for template rules.
It is an error if a stylesheet directly or indirectly imports itself. Apart from this, the case where a stylesheet with a particular URI is imported in multiple places is not treated specially. The import tree will have a separate xsl:stylesheet for each place that it is imported.
NOTE: If xsl:apply-imports is used (see [5.6 Overriding Template Rules]), the behavior may be different from the behavior if the stylesheet had been imported only at the place with the highest import precedence.
2.7 Embedding Stylesheets
Normally an XSLT stylesheet is a complete XML document with the xsl:stylesheet element as the document element. However, an XSLT stylesheet may also be embedded in another resource. Two forms of embedding are possible:
• the XSLT stylesheet may be textually embedded in a non‐XML resource, or
• the xsl:stylesheet element may occur in an XML document other than as the document element.
To facilitate the second form of embedding, the xsl:stylesheet element is allowed to have an ID attribute that specifies a unique identifier.
NOTE: In order for such an attribute to be used with the XPath id function, it must actually be declared in the DTD as being an ID.
The following example shows how the xml-stylesheet processing instruction [XML Stylesheet] can be used to allow a document to contain its own stylesheet. The URI reference uses a relative URI with a fragment identifier to locate the xsl:stylesheet element:
...
NOTE: A stylesheet that is embedded in the document to which it is to be applied or that may be included or imported into a stylesheet that is so embedded typically needs to contain a template rule that specifies that xsl:stylesheet elements are to be ignored.
3 Data Model
The data model used by XSLT is the same as that used by XPath with the additions described in this section. XSLT operates on source, result and stylesheet documents using the same data model. Any two XML documents that have the same tree will be treated the same by XSLT.
Processing instructions and comments in the stylesheet are ignored: the stylesheet is treated as if neither processing instruction nodes nor comment nodes were included in the tree that represents the stylesheet.
3.1 Root Node Children
The normal restrictions on the children of the root node are relaxed for the result tree. The result tree may have any sequence of nodes as children that would be possible for an element node. In particular, it may have text node children, and any number of element node children. When written out using the XML output method (see [16 Output]), it is possible that a result tree will not be a well‐formed XML document; however, it will always be a well‐formed external general parsed entity.
When the source tree is created by parsing a well‐formed XML document, the root node of the source tree will automatically satisfy the normal restrictions of having no text node children and exactly one element child. When the source tree is created in some other way, for example by using the DOM, the usual restrictions are relaxed for the source tree as for the result tree.
3.2 Base URI
Every node also has an associated URI called its base URI, which is used for resolving attribute values that represent relative URIs into absolute URIs. If an element or processing instruction occurs in an external entity, the base URI of that element or processing instruction is the URI of the external entity; otherwise, the base URI is the base URI of the document. The base URI of the document node is the URI of the document entity. The base URI for a text node, a comment node, an attribute node or a namespace node is the base URI of the parent of the node.
3.3 Unparsed Entities
The root node has a mapping that gives the URI for each unparsed entity declared in the document's DTD. The URI is generated from the system identifier and public identifier specified in the entity declaration. The XSLT processor may use the public identifier to generate a URI for the entity instead of the URI specified in the system identifier. If the XSLT processor does not use the public identifier to generate the URI, it must use the system identifier; if the system identifier is a relative URI, it must be resolved into an absolute URI using the URI of the resource containing the entity declaration as the base URI [RFC2396].
3.4 Whitespace Stripping
After the tree for a source document or stylesheet document has been constructed, but before it is otherwise processed by XSLT, some text nodes are stripped. A text node is never stripped unless it contains only whitespace characters. Stripping the text node removes the text node from the tree. The stripping process takes as input a set of element names for which whitespace must be preserved. The stripping process is applied to both stylesheets and source documents, but the set of whitespace‐preserving element names is determined differently for stylesheets and for source documents.
A text node is preserved if any of the following apply:
• The element name of the parent of the text node is in the set of whitespace‐preserving element names.
• The text node contains at least one non‐whitespace character. As in XML, a whitespace character is #x20, #x9, #xD or #xA.
• An ancestor element of the text node has an xml:space attribute with a value of preserve, and no closer ancestor element has xml:space with a value of default.
Otherwise, the text node is stripped.
The xml:space attributes are not stripped from the tree.
NOTE: This implies that if an xml:space attribute is specified on a literal result element, it will be included in the result.
For stylesheets, the set of whitespace‐preserving element names consists of just xsl:text.
For source documents, the set of whitespace‐preserving element names is specified by xsl:strip-space and xsl:preserve-space top‐level elements. These elements each have an elements attribute whose value is a whitespace‐separated list of NameTests. Initially, the set of whitespace‐preserving element names contains all element names. If an element name matches a NameTest in an xsl:strip-space element, then it is removed from the set of whitespace‐preserving element names. If an element name matches a NameTest in an xsl:preserve-space element, then it is added to the set of whitespace‐preserving element names. An element matches a NameTest if and only if the NameTest would be true for the element as an XPath node test. Conflicts between matches to xsl:strip-space and xsl:preserve-space elements are resolved the same way as conflicts between template rules (see [5.5 Conflict Resolution for Template Rules]). Thus, the applicable match for a particular element name is determined as follows:
• First, any match with lower import precedence than another match is ignored.
• Next, any match with a NameTest that has a lower default priority than the default priority of the NameTest of another match is ignored.
It is an error if this leaves more than one match. An XSLT processor may signal the error; if it does not signal the error, it must recover by choosing, from amongst the matches that are left, the one that occurs last in the stylesheet.
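The three preservation conditions for a text node can be written down as a small Python predicate. This sketch assumes the set of whitespace‐preserving element names has already been computed (the NameTest matching and conflict resolution described above are not modelled), and it represents the xml:space values of the ancestors as a simple list ordered from the parent outward; both are simplifications chosen for the illustration.

```python
XML_WHITESPACE = "\x20\x09\x0D\x0A"


def is_whitespace_only(text):
    """True if every character is one of the four XML whitespace characters."""
    return all(ch in XML_WHITESPACE for ch in text)


def text_node_is_preserved(text, parent_name, preserving_names, ancestor_xml_space=()):
    """Decide whether a text node survives whitespace stripping.

    ancestor_xml_space lists the xml:space value of each ancestor element,
    nearest first, with None where the attribute is absent.
    """
    if not is_whitespace_only(text):
        return True  # contains a non-whitespace character
    if parent_name in preserving_names:
        return True  # parent is a whitespace-preserving element
    for value in ancestor_xml_space:
        if value == "preserve":
            return True  # nearest xml:space says preserve
        if value == "default":
            return False  # a closer ancestor reset it to default
    return False


stripped = text_node_is_preserved("\n  ", "para", {"pre"})
kept_by_name = text_node_is_preserved("\n  ", "pre", {"pre"})
kept_by_content = text_node_is_preserved(" x ", "para", set())
kept_by_attr = text_node_is_preserved("  ", "para", set(), [None, "preserve"])
reset_by_default = text_node_is_preserved("  ", "para", set(), ["default", "preserve"])
```

For a stylesheet document the set passed as preserving_names would contain only xsl:text; for a source document it would be the result of applying the xsl:strip-space and xsl:preserve-space declarations.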
4 Expressions
XSLT uses the expression language defined by XPath [XPath]. Expressions are used in XSLT for a variety of purposes including:
• selecting nodes for processing;
• specifying conditions for different ways of processing a node;
• generating text to be inserted in the result tree.
An expression must match the XPath production Expr.
Expressions occur as the value of certain attributes on XSLT‐defined elements and within curly braces in attribute value templates.
In XSLT, an outermost expression (i.e. an expression that is not part of another expression) gets its context as follows:
• the context node comes from the current node
• the context position comes from the position of the current node in the current node list; the first position is 1
• the context size comes from the size of the current node list
• the variable bindings are the bindings in scope on the element which has the attribute in which the expression occurs (see [11 Variables and Parameters])
• the set of namespace declarations are those in scope on the element which has the attribute in which the expression occurs; this includes the implicit declaration of the prefix xml required by the XML Namespaces Recommendation [XML Names]; the default namespace (as declared by xmlns) is not part of this set
• the function library consists of the core function library together with the additional functions defined in [12 Additional Functions] and extension functions as described in [14 Extensions]; it is an error for an expression to include a call to any other function
1 Introduction
This specification defines the XML Linking Language (XLink), which allows elements to be inserted into XML documents in order to create and describe links between resources.
XLink provides a framework for creating both basic unidirectional links and more complex linking structures. It allows XML documents to:
• Assert linking relationships among more than two resources
• Associate metadata with a link
• Express links that reside in a location separate from the linked resources
An important application of XLink is in hypermedia systems that have hyperlinks. A simple case of a hyperlink is an HTML A element, which has these characteristics:
• The hyperlink uses URIs as its locator technology.
• The hyperlink is expressed at one of its two ends.
• The hyperlink identifies the other end (although a server may have great freedom in finding or dynamically creating that destination).
• Users can initiate traversal only from the end where the hyperlink is expressed to the other end.
• The hyperlink's effect on windows, frames, go‐back lists, style sheets in use, and so on is determined by user agents, not by the hyperlink itself. For example, traversal of A links normally replaces the current view, perhaps with a user option to open a new window.
This set of characteristics is powerful, but the model that underlies them limits the range of possible hyperlink functionality. The model defined in this specification shares with HTML the use of URI technology, but goes beyond HTML in offering features, previously available only in dedicated hypermedia systems, that make hyperlinking more scalable and flexible. Along with providing linking data structures, XLink provides a minimal link behavior model; higher‐level applications layered on XLink will often specify alternate or more sophisticated rendering and processing treatments. Integrated treatment of specialized links used in other technical domains, such as foreign keys in relational databases and reference values in programming languages, is outside the scope of this specification.
1.1 Origin and Goals
The design of XLink has been informed by knowledge of established hypermedia systems and standards. The following standards have been especially influential:
• HTML [HTML]: Defines several element types that represent links.
• HyTime [ISO/IEC 10744]: Defines inline and inbound and third‐party link structures and some semantic features, including traversal control and presentation of objects.
• Text Encoding Initiative Guidelines [TEI]: Provides structures for creating links, aggregate objects, and link collections.
Many other linking systems have also informed the design of XLink, especially [Dexter], [FRESS], [OHS], [MicroCosm], and [Intermedia].
See the XLink Requirements Document [XLREQ] for a thorough explanation of requirements for the design of XLink.
2 XLink Concepts
This section describes the terms and concepts that are essential to understanding XLink, without discussing the syntax used to create XLink constructs. A few additional terms are introduced in later parts of this specification.
2.1 Links and Resources
[Definition: An XLink link is an explicit relationship between resources or portions of resources.] [Definition: It is made explicit by an XLink linking element, which is an XLink‐conforming XML element that asserts the existence of a link.] There are six XLink elements; only two of them are considered linking elements. The others provide various pieces of information that describe the characteristics of a link. (The term "link" as used in this specification refers only to an XLink link, though nothing prevents non‐XLink constructs from serving as links.)
The notion of resources is universal to the World Wide Web. [Definition: As discussed in [IETF RFC 2396], a resource is any addressable unit of information or service.] Examples include files, images, documents, programs, and query results. The means used for addressing a resource is a URI (Uniform Resource Identifier) reference (described more in 5.4 Locator Attribute (href)). It is possible to address a portion of a resource. For example, if the whole resource is an XML document, a useful portion of that resource might be a particular element inside the document. Following a link to it might result, for example, in highlighting that element or scrolling to that point in the document.
[Definition: When a link associates a set of resources, those resources are said to participate in the link.] Even though XLink links must appear in XML documents, they are able to associate all kinds of resources, not just XML‐encoded ones.
One of the common uses of XLink is to create hyperlinks. [Definition: A hyperlink is a link that is intended primarily for presentation to a human user.] Nothing in XLink's design, however, prevents it from being used with links that are intended solely for consumption by computers.
2.2 Arcs, Traversal, and Behavior
[Definition: Using or following a link for any purpose is called traversal.]
Even though some kinds of link can associate arbitrary numbers of resources, traversal always involves a pair of resources (or portions of them); [Definition: the source from which traversal is begun is the starting resource] and [Definition: the destination is the ending resource]. Note that the term "resource" used in this fashion may at times apply to a resource portion, not a whole resource.
[Definition: Information about how to traverse a pair of resources, including the direction of traversal and possibly application behavior information as well, is called an arc]. If two arcs in a link specify the same pair of resources, but they switch places as starting and ending resources, then the link is multidirectional, which is not the same as merely "going back" after traversing a link.
2.3 Resources in Relation to the Physical Location of a Linking Element
[Definition: A local resource is an XML element that participates in a link by virtue of having as its parent, or being itself, a linking element]. [Definition: Any resource or resource portion that participates in a link by virtue of being addressed with a URI reference is considered a remote resource, even if it is in the same XML document as the link, or even inside the same linking element.] Put another way, a local resource is specified "by value," and a remote resource is specified "by reference."
[Definition: An arc that has a local starting resource and a remote ending resource goes outbound, that is, away from the linking element.] (Examples of links with such an arc are the HTML A element, HyTime "clinks," and Text Encoding Initiative XREF elements.) [Definition: If an arc's ending resource is local but its starting resource is remote, then the arc goes inbound.] [Definition: If neither the starting resource nor the ending resource is local, then the arc is a third‐party arc.] Though it is not required, any one link typically specifies only one kind of arc throughout, and thus might be referred to as an inbound, outbound, or third‐party link.
To create a link that emanates from a resource to which you do not have (or choose not to exercise) write access, or from a resource that offers no way to embed linking constructs, it is necessary to use an inbound or third‐party arc. When such arcs are used, the requirements for discovery of the link are greater than for outbound arcs. [Definition: Documents containing collections of inbound and third‐party links are called link databases, or linkbases.]
3 XLink Processing and Conformance
This section details processing and conformance requirements on XLink applications and markup.
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].]
3.1 Processing Dependencies
XLink processing depends on [XML], [XML Names], [XML Base], and [IETF RFC 2396] (as updated by [IETF RFC 2732]).
3.2 Markup Conformance
An XML element conforms to XLink if:
1. it has a type attribute from the XLink namespace whose value is one of "simple", "extended", "locator", "arc", "resource", "title", or "none", and
2. it adheres to the conformance constraints imposed by the chosen XLink element type, as prescribed in this specification.
This specification imposes no particular constraints on DTDs; conformance applies only to elements and attributes.
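The first markup conformance condition can be checked mechanically. The Python sketch below uses the standard library's ElementTree; the element names doc and ref, the prefix xl, and the file name students.xml are made up for the example, and the second condition (the per‐type constraints) is not checked at all.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
TYPE_ATTR = "{%s}type" % XLINK  # the type attribute in the XLink namespace
ALLOWED_TYPES = {"simple", "extended", "locator", "arc", "resource", "title", "none"}


def xlink_type(elem):
    """Return the XLink element type if condition 1 holds, else None."""
    value = elem.get(TYPE_ATTR)
    return value if value in ALLOWED_TYPES else None


# Hypothetical document; the prefix "xl" is arbitrary, only the URI matters.
SOURCE = """\
<doc xmlns:xl="http://www.w3.org/1999/xlink">
  <ref xl:type="simple" xl:href="students.xml">Current List of Students</ref>
  <ref type="simple" href="students.xml">unqualified attributes</ref>
  <ref xl:type="bogus">unknown type value</ref>
</doc>
"""

results = [xlink_type(ref) for ref in ET.fromstring(SOURCE)]
```

Only the first ref element passes: the second uses unqualified attributes that are not in the XLink namespace, and the third carries a value outside the permitted list.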
3.3 Application Conformance
An XLink application is any software module that interprets well‐formed XML documents containing XLink elements and attributes, or XML information sets [XIS] containing information items and properties corresponding to XLink elements and attributes. (This document refers to elements and attributes, but all specifications herein apply to their information set equivalents as well.) Such an application is conforming if:
1. it observes the mandatory conditions for applications ("must") set forth in this specification, and
2. for any optional conditions ("should" and "may") it chooses to observe, it observes them in the way prescribed, and
3. it performs markup conformance testing according to all the conformance constraints appearing in this specification.
4 XLink Markup Design
This section describes the design of XLink's markup vocabulary.
Link markup needs to be recognized reliably by XLink applications in order to be traversed and handled properly. XLink uses the mechanism described in the Namespaces in XML Recommendation [XML Names] to accomplish recognition of the constructs in the XLink vocabulary.
The XLink namespace defined by this specification has the following URI:
http://www.w3.org/1999/xlink
As dictated by [XML Names], the use of XLink elements and attributes requires declaration of the XLink namespace. For example, the following declaration would make the prefix xlink available within the myElement element to represent the XLink namespace:
<myElement xmlns:xlink="http://www.w3.org/1999/xlink"> ... </myElement>
Note: Most code examples in this specification do not show an XLink namespace declaration. The xlink prefix is used throughout to stand for the declaration of the XLink namespace on elements in whose scope the so‐marked attribute appears (on the same element that bears the attribute or on some ancestor element), whether or not an XLink namespace declaration is present in the example.
XLink's namespace provides global attributes for use on elements that are in any arbitrary namespace. The global attributes are type, href, role, arcrole, title, show, actuate, label, from, and to. Document creators use the XLink global attributes to make the elements in their own namespace, or even in a namespace they do not control, recognizable as XLink elements. The type attribute indicates the XLink element type (simple, extended, locator, arc, resource, or title); the element type dictates the XLink‐imposed constraints that such an element must follow and the behavior of XLink applications on encountering the element.
Following is an example of a crossReference element from a non‐XLink namespace that has XLink global attributes:
[Example markup lost in extraction; only the element content "Current List of Students" survives.]
Using global attributes always requires the use of namespace prefixes on the individual attributes and the use of the type attribute on the element.
4.1 XLink Attribute Usage Patterns
While the XLink attributes are considered global by virtue of their use of the namespace mechanism, their allowed combinations on any one XLink element type depend greatly on the value of the special type attribute (see 5.3 XLink Element Type Attribute (type) for more information) for the element on which they appear. The conformance constraint notes in this specification detail their allowed usage patterns. Following is a summary of the element types (columns) on which the global attributes (rows) are allowed, with an indication of whether a value is required (R) or optional (O); a dash means the attribute is not allowed on that element type:
          simple  extended  locator  arc  resource  title
type      R       R         R        R    R         R
href      O       -         R        -    -         -
role      O       O         O        -    O         -
arcrole   O       -         -        O    -         -
title     O       O         O        O    O         -
show      O       -         -        O    -         -
actuate   O       -         -        O    -         -
label     -       -         O        -    O         -
from      -       -         -        O    -         -
to        -       -         -        O    -         -
(See also B Sample DTD for a non‐normative DTD that illustrates the allowed patterns of attributes.)
This specification uses the convention "xxx‐type element" to refer to elements that must adhere to a named set of constraints associated with an XLink element type, no matter what name the element actually has. For example, "locator‐type element" would refer to all of the following elements:
[Example elements lost in extraction.]
4.2 XLink Element Type Relationships
Various XLink element types have special meanings dictated by this specification when they appear as direct children of other XLink element types. Following is a summary of the child element types that play a significant role in particular parent element types. (Other combinations have no XLink‐dictated significance.)
Parent type   Significant child types
simple        none
extended      locator, arc, resource, title
locator       title
arc           title
resource      none
title         none
4.3 Attribute Value Defaulting
Using XLink potentially involves using a large number of attributes for supplying important link information. In cases where the values of the desired XLink attributes are unchanging across individual instances in all the documents of a certain type, attribute value defaults (fixed or not) may be added to a DTD so that the attributes do not have to appear physically on element start‐tags. For example, if attribute defaults were provided for the xmlns:xlink, xmlns:my, type, show, and actuate attributes in the example in the introduction to 4 XLink Markup Design, the example would look as follows:
[Example markup lost in extraction; only the element content "Current List of Students" survives.]
Information sets that have been created under the control of a DTD have all attribute values filled in.
4.4 Integrating XLink Usage with Other Markup
This specification defines only attributes and attribute values in the XLink namespace. There is no restriction on using non‐XLink attributes alongside XLink attributes. In addition, most XLink attributes are optional and the choice of simple or extended link is up to the markup designer or document creator, so a DTD that uses XLink features need not use or declare the entire set of XLink's attributes. Finally, while this specification identifies the minimum constraints on XLink markup, DTDs that use XLink are free to tighten these constraints. The use of XLink does not absolve a valid document from conforming to the constraints expressed in its governing DTD.
Following is an example of a crossReference element with both XLink and non‐XLink attributes:
[Example markup lost in extraction; only the element content "Current List of Students" survives.]
4.5 Using XLink with Legacy Markup
Because XLink's global attributes require the use of namespace prefixes, non‐XLink‐based links in legacy documents generally do not serve as conforming XLink constructs as they stand, even if attribute value defaulting is used. For example, XHTML 1.0 has an a element with an href attribute, but because the attribute is a local one attached to the a element in the XHTML namespace, it is not the same as an xlink:href global attribute in the XLink namespace.
5 XLink Elements and Attributes
XLink offers two kinds of links:
Extended links
Extended links offer full XLink functionality, such as inbound and third‐party arcs, as well as links that have arbitrary numbers of participating resources. As a result, their structure can be fairly complex, including elements for pointing to remote resources, elements for containing local resources, elements for specifying arc traversal rules, and elements for specifying human‐readable resource and arc titles. XLink defines a way to give an extended link special semantics for finding linkbases; used in this fashion, an extended link helps an XLink application process other links.
Simple links
Simple links offer shorthand syntax for a common kind of link, an outbound link with exactly two participating resources (into which category HTML‐style A and IMG links fall). Because simple links offer less functionality than extended links, they have no special internal structure. While simple links are conceptually a subset of extended links, they are syntactically different. For example, to convert a simple link into an extended link, several structural changes would be needed.
The following sections define the XLink elements and attributes.
5.1 Extended Links (extended‐Type Element)
[Definition: An extended link is a link that associates an arbitrary number of resources. The participating resources may be any combination of remote and local.]
The only kind of link that is able to have inbound and third‐party arcs is an extended link. Typically, extended linking elements are stored separately from the resources they associate (for example, in entirely different documents). Thus, extended links are important for situations where the participating resources are read‐only, or where it is expensive to modify and update them but inexpensive to modify and update a separate linking element, or where the resources are in formats with no native support for embedded links (such as many multimedia formats).
The following diagram shows an extended link that associates five remote resources. This could represent, for example, information about a student's course load: one resource being a description of the student, another being a description of the student's academic advisor, two resources representing courses that the student is attending, and the last resource representing a course that the student is auditing.
Without the extended link, the resources might be entirely unrelated; for example, they might be in five separate documents. The lines emanating from the extended link represent the association it creates among the resources. However, notice that the lines do not have directionality. Directionality is expressed with traversal rules; without such rules being provided, the resources are associated in no particular order, with no implication as to whether and how individual resources are accessed. The following diagram shows an extended link that associates five remote resources and one local resource (a special element inside the extended link element). This could represent the same sort of course‐load example as described above, with the addition of the student's grade point average stored locally. Again, the lines represent mere association of the six resources, without traversal directions or behaviors implied. 223
The XLink element type for extended links is any element with an attribute in the XLink namespace called type with a value of "extended". The extended‐type element may contain a mixture of the following elements in any order, possibly along with other content and markup:
• locator‐type elements that address the remote resources participating in the link
• arc‐type elements that provide traversal rules among the link's participating resources
• title‐type elements that provide human‐readable labels for the link
• resource‐type elements that supply local resources that participate in the link
It is not an error for an extended‐type element to associate fewer than two resources. If the link has only one participating resource, or none at all, it is simply untraversable. Such a link may still be useful, for example, to associate properties with a single resource by means of XLink attributes, or to provide a placeholder for link information that will be populated eventually. Subelements of the simple or extended type anywhere inside a parent extended‐type element have no XLink‐specified meaning. Subelements of the locator, arc, or resource type that are not direct children of an extended‐type element have no XLink‐specified meaning. The extended‐type element may have the semantic attributes role and title (see 5.5 Semantic Attributes (role, arcrole, and title)). They supply semantic information about the link as a whole; the role attribute indicates a property that the entire link has, and the title attribute indicates a human‐readable description of the entire link. If other XLink attributes are present on the element, they have no XLink‐specified relationship to the link. If both a title attribute and one or more title‐type elements are present, they have no XLink‐specified relationship; a higher‐level application built on XLink will likely want to specify appropriate treatment (for example, precedence) in this case.
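The rules above (only direct children of the locator, arc, or resource type participate, and role and title describe the link as a whole) can be illustrated with a short Python sketch using the standard library's ElementTree. The element names (courseload, person, course, gpa, go), the labels, and the URLs are hypothetical; only the XLink namespace URI and the attribute names come from XLink itself.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical instance: element names, labels, and URLs are illustrative only.
DOC = """
<courseload xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended"
            xlink:title="Course Load for Pat Jones">
  <person xlink:type="locator" xlink:href="students/patjones62.xml"
          xlink:label="student"/>
  <person xlink:type="locator" xlink:href="profs/jaysmith7.xml"
          xlink:label="advisor"/>
  <course xlink:type="locator" xlink:href="courses/cs101.xml"
          xlink:label="course"/>
  <gpa xlink:type="resource" xlink:label="gpa">3.5</gpa>
  <go xlink:type="arc" xlink:from="student" xlink:to="course"/>
  <nested><stray xlink:type="locator" xlink:href="ignored.xml"/></nested>
</courseload>
"""


def xlink(attr):
    """Return the expanded name of an attribute in the XLink namespace."""
    return "{%s}%s" % (XLINK_NS, attr)


def summarize_extended_link(link):
    """Collect the participating subelements of an extended-type element."""
    if link.get(xlink("type")) != "extended":
        raise ValueError("not an extended-type element")
    summary = {"title": link.get(xlink("title")),
               "locator": [], "resource": [], "arc": []}
    # Iterating over an element visits direct children only, which matches
    # the rule that deeper locator/arc/resource elements have no XLink meaning.
    for child in link:
        kind = child.get(xlink("type"))
        if kind == "locator":
            summary["locator"].append(child.get(xlink("href")))
        elif kind == "resource":
            summary["resource"].append(child.get(xlink("label")))
        elif kind == "arc":
            summary["arc"].append(
                (child.get(xlink("from")), child.get(xlink("to"))))
    return summary


summary = summarize_extended_link(ET.fromstring(DOC))
```

The stray locator nested one level deeper is deliberately skipped, because it is not a direct child of the extended-type element.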
Example: Sample extended‐Type Element Declarations and Instance Following is a non‐normative set of declarations for an extended‐type element and its subelements. Parts of this example are reused throughout this specification. Note that the type attribute and some other attributes are defaulted in the DTD in order to highlight the attributes that are changing on a per‐instance basis. Following is how XML elements using these declarations might look. Course Load for Pat Jones 3.5 5.2 Simple Links (simple‐Type Element) [Definition: A simple link is a link that associates exactly two resources, one local and one remote, with an arc going from the former to the latter. Thus, a simple link is always an outbound link.] The purpose of a simple link is to be a convenient shorthand for the equivalent extended link. A single simple linking element combines the basic functions of an extended‐type element, a locator‐type element, an arc‐type element, and a resource‐type element. The following diagram shows the characteristics of a simple link; it associates one local and one remote resource, and implicitly provides a single traversal arc from the local resource to the remote one. This could represent, for example, the name of a student appearing in text which, when clicked, leads to information about the student.
Example: Simple Link Functionality Done with an Extended Link A simple link could be represented by an extended link in approximately the following way: Pat Jones A simple link combines all the features above (except for the types and labels) into a single element. In cases where only this subset of features is required, the XLink simple linking element is available as an alternative to the extended linking element. The features missing from simple links are as follows:
• Supplying arbitrary numbers of local and remote resources
• Specifying an arc from its remote resource to its local resource
• Associating a title with the single hardwired arc
• Associating a role or title with the local resource
• Associating a role or title with the link as a whole
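The correspondence between the two link kinds can be sketched in Python. The function below expands a simple-type element into a roughly equivalent extended link: the element content becomes the local resource, href, role and title go to a locator, and show, actuate and arcrole go to the single arc. The element names (studentlink, extendedlink, loc, rem, go), the labels, and the URL are assumptions made for illustration, and the sketch assumes an href is present.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def q(name):
    """Return the expanded name of an attribute in the XLink namespace."""
    return "{%s}%s" % (XLINK_NS, name)


def simple_to_extended(simple):
    """Expand a simple-type element into a roughly equivalent extended link."""
    if simple.get(q("type")) != "simple":
        raise ValueError("expected xlink:type='simple'")
    ext = ET.Element("extendedlink", {q("type"): "extended"})

    # The simple element and its content act as the local resource.
    local = ET.SubElement(ext, "loc",
                          {q("type"): "resource", q("label"): "local"})
    local.text = simple.text

    # href, role and title describe the remote resource (locator side).
    remote_attrs = {q("type"): "locator", q("label"): "remote"}
    for name in ("href", "role", "title"):
        if simple.get(q(name)) is not None:
            remote_attrs[q(name)] = simple.get(q(name))
    ET.SubElement(ext, "rem", remote_attrs)

    # show, actuate and arcrole describe the single implicit arc.
    arc_attrs = {q("type"): "arc", q("from"): "local", q("to"): "remote"}
    for name in ("show", "actuate", "arcrole"):
        if simple.get(q(name)) is not None:
            arc_attrs[q(name)] = simple.get(q(name))
    ET.SubElement(ext, "go", arc_attrs)
    return ext


simple = ET.fromstring(
    '<studentlink xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" xlink:href="students/patjones62.xml" '
    'xlink:show="new">Pat Jones</studentlink>'
)
extended = simple_to_extended(simple)
```

Calling ET.tostring(extended) shows the three subelements that a single simple linking element replaces.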
The XLink element for simple links is any element with an attribute in the XLink namespace called type with a value of "simple". The simple equivalent of the above extended link would be as follows: Pat Jones The simple‐type element may have any content. The simple‐type element itself, together with all of its content, is the local resource of the link, as if the element were a resource‐type element. If a simple‐type element contains nested XLink elements, such contained elements
have no XLink‐specified relationship to the parent link. It is possible for a simple‐type element to have no content; in cases where the link is expected to be traversed on request, interactive XLink applications will typically generate some content in order to give the user a way to initiate the traversal. The simple‐type element effectively takes the locator attribute href and the semantic attributes role and title from the locator‐type element, and the behavior attributes show and actuate and the single semantic attribute arcrole from the arc‐type element. It is not an error for a simple‐type element to have no locator (href) attribute value. If a value is not provided, the link is simply untraversable. Such a link may still be useful, for example, to associate properties with the resource by means of XLink attributes. Example: Sample simple‐Type Element Declarations and Instance Following is a non‐normative set of declarations for a simple‐type element. Following is how an XML document might use these declarations. ..., and Pat Jones is popular around the student union. 5.3 XLink Element Type Attribute (type) The attribute that identifies XLink element types is type. Constraint: type Value 229
The value of the type attribute must be supplied. The value must be one of "simple", "extended", "locator", "arc", "resource", "title", or "none". When the value of the type attribute is "none", the element has no XLink‐specified meaning, and any XLink‐related content or attributes have no XLink‐specified relationship to the element. Example: Sample type Attribute Declarations Following is a non‐normative attribute‐list declaration for type on an element intended to be simple‐type. For an element that serves as an XLink element only on some occasions, one declaration might be as follows, where the document creator sets the value to "simple" in some circumstances and "none" in others. The use of "none" might be useful in helping XLink applications to avoid checking for the presence of an href value. 1 Introduction XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations [XSLT] and XPointer [XPointer]. The primary purpose of XPath is to address parts of an XML [XML] document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a compact, non‐XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document. In addition to its use for addressing, XPath is also designed so that it has a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT. XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string‐value for each type of node. Some types of nodes also have names. 
XPath fully supports XML Namespaces [XML Names]. Thus, the name of a node is modeled as a pair consisting of a local part and a possibly null namespace URI; this is called an expanded‐name. The data model is described in detail in [5 Data Model]. The primary syntactic construct in XPath is the expression. An expression matches the production Expr. An expression is evaluated to yield an object, which has one of the following four basic types:
• node‐set (an unordered collection of nodes without duplicates)
• boolean (true or false)
• number (a floating‐point number)
• string (a sequence of UCS characters)
Expression evaluation occurs with respect to a context. XSLT and XPointer specify how the context is determined for XPath expressions used in XSLT and XPointer respectively. The context consists of:
• a node (the context node)
• a pair of non‐zero positive integers (the context position and the context size)
• a set of variable bindings
• a function library
• the set of namespace declarations in scope for the expression
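A few of these context components can be observed with Python's standard ElementTree module, which implements only a limited subset of XPath 1.0 (no variable bindings and no full function library). The sketch below shows the context node, the prefix-to-URI namespace mapping, positional predicates, and the expanded-name of an element. The document, the c prefix, and the http://example.org/courses URI are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only to illustrate the evaluation context.
DOC = """
<school xmlns:c="http://example.org/courses">
  <student id="s1"><c:course>CS101</c:course><c:course>MA201</c:course></student>
  <student id="s2"><c:course>PH150</c:course></student>
</school>
"""
root = ET.fromstring(DOC)

# Namespace declarations in scope for the expression: prefix -> namespace URI.
ns = {"c": "http://example.org/courses"}

# Context node: the same relative path gives different results depending on
# the element it is evaluated against.
first, second = root.findall("student")
courses_first = [e.text for e in first.findall("c:course", ns)]
courses_second = [e.text for e in second.findall("c:course", ns)]

# Positional predicates depend on the position of each node among the
# selected siblings and on how many there are (last()).
second_course = first.find("c:course[2]", ns).text
last_course = first.find("c:course[last()]", ns).text

# Expanded-name: ElementTree stores names as "{namespace URI}local part".
expanded_name = first[0].tag
```

For the first student, the second course and the last course coincide because exactly two course elements are selected.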
The context position is always less than or equal to the context size. The variable bindings consist of a mapping from variable names to variable values. The value of a variable is an object, which can be of any of the types that are possible for the value of an expression, and may also be of additional types not specified here. The function library consists of a mapping from function names to functions. Each function takes zero or more arguments and returns a single result. This document defines a core function library that all XPath implementations must support (see [4 Core Function Library]). For a function in the core function library, arguments and result are of the four basic types. Both XSLT and XPointer extend XPath by defining additional functions; some of these functions operate on the four basic types; others operate on additional data types defined by XSLT and XPointer. The namespace declarations consist of a mapping from prefixes to namespace URIs. The variable bindings, function library and namespace declarations used to evaluate a subexpression are always the same as those used to evaluate the containing expression. The context node, context position, and context size used to evaluate a subexpression are sometimes different from those used to evaluate the containing expression. Several kinds of expressions change the context node; only predicates change the context position and context size (see [2.4 Predicates]). When the evaluation of a kind of expression is described, it will always be explicitly stated if the context node, context position, and context size change for the 231
evaluation of subexpressions; if nothing is said about the context node, context position, and context size, they remain unchanged for the evaluation of subexpressions of that kind of expression. XPath expressions often occur in XML attributes. The grammar specified in this section applies to the attribute value after XML 1.0 normalization. So, for example, if the grammar uses the character , , and have a similar meaning. In addition, XLANG provides support for looping with the element, which specifies that a given fragment of the process definition is executed until a specified condition is no longer true. This is particularly useful to support ebXML collaboration patterns such as review or modify which may have recurrent business transactions. Like in BPML, XLANG provides semantics to specify exceptions and exception handlers, with the construct. XLANG has introduced the notion of a context for local declaration of correlation sets and port references, exception handling, and transactional behaviour. A context provides and limits the scope over which declarations, exceptions, and transactions apply. 504
XLANG supports open transactions, but unlike BPML, it does not support coordinated transactions. XLANG transactions follow the model of long‐running transactions, which are associated with compensating actions in case the transaction fails. There is often an issue when specifying the outgoing port addresses: it is rarely possible to know in advance the address of the outgoing message. XLANG allows us to specify that the address bound to these outgoing ports will be supplied dynamically. As with the correlation set, we are confronted with the problem of locating this information in the content of documents. XLANG enables us to bind the address (and other parameters if necessary) to a property definition on a document. If this mechanism is more generic, it is also trickier, since it will strongly depend on the document formats, which may not have been designed to support the corresponding information. For instance, a purchase order will carry the contact information of the buyer, but it may not carry the URL to which the "acknowledge purchase order" should be sent. In general we recommend treating the ebXML header as a document and assigning all correlation sets and binding parameters to the ebXML header whenever possible. We also recommend to bring the corresponding CPA within the business process instance context in order to leverage at run‐time its information. Business Process Contracts This part of XLANG overlaps with ebXML BPSS. However, unlike their name would suggest, contracts do not support any business related semantics. It is merely a mapping between two port types which interact together. There is no notion of business transaction, non‐repudiation, or legally binding transactions. The concept is actually fairly difficult to use in real life since the two port types need to support unidirectional messages in order to establish a contract. 
Consequently, if your business relationship requires a request followed by a response, the two cannot belong to the same contract. A contract can only map ports that are "unidirectional": an input‐only port maps to an output‐only port, and conversely.
In the rare cases where this is applicable a contract definition would look like this: WSFL Introduction The Web Services Flow Language (WSFL) is an XML language for the description of Web Services compositions. WSFL considers two types of Web Services compositions: The first type specifies the appropriate usage pattern of a collection of Web Services, in such a way that the resulting composition describes how to achieve a particular business goal; typically, the result is a description of a business process. The second type specifies the interaction pattern of a collection of Web Services; in this case, the result is a description of the overall partner interactions. Flow Models 506
In the first case, a composition is created by describing how to use the functionality provided by the collection of composed Web Services. This is also known as flow composition, orchestration, or choreography of Web Services. WSFL models these compositions as specifications of the execution sequence of the functionality provided by the composed Web Services. Execution orders are specified by defining the flow of control and data between Web Services. For this reason, in this document, we will also use the term flow model to refer to the first type of Web Services compositions. Flow models can especially be used to model business processes or workflows based on Web Services. Global Models In the second case, no specification of an execution sequence is provided. Instead, the composition provides a description of how the composed Web Services interact with each other. The interactions are modeled as links between endpoints of the Web Services’ interfaces, each link corresponding to the interaction of one Web Service with an operation of another Web Service’s interface. Because of the decentralized or distributed nature of these interactions, we will use the term global model in this document to refer to this type of Web Services composition. Recursive Composition WSFL provides extensive support for the recursive composition of services: In WSFL, every Web Service composition (a flow model as well as a global model) can itself become a new Web Service, and can thus be used as a component of new compositions. The ability to do 507
recursive composition of Web Services provides scalability to the language and support for top‐down progressive refinement design as well as for bottom‐up aggregation. For these reasons, recursive composition has been a central requirement in the design of the WSFL language. Hierarchical and Peer‐to‐Peer Interaction WSFL compositions support a broad spectrum of interaction patterns between the partners participating in a business process. In particular, both hierarchical interactions and peer‐to‐peer interactions between partners are supported. Hierarchical interactions are often found in more stable, long‐term relationships between partners, while peer‐to‐peer interactions reflect relationships that are often established dynamically on a per‐instance basis. Language Overview Before getting into a more detailed description of WSFL, we will sketch two use cases for the application of Web Services composition. 2.1 Use Cases In the first use case, an enterprise wants to implement a business process for processing purchase orders using a set of Web Services. They would identify the:
• Business process (for example, check credit history of the customer, reject order, process order, ship goods)
• Business rules for sequencing of these steps (for example, first check credit, then depending on the outcome, either reject the order or process the order followed by shipment of the goods)
• Flow of information between the process steps (for example, take purchase order as input to the process, pass it on to check credit, and so on).
In this “bottom‐up” development scenario, they would find Web Services already offered by other vendors and companies that can be used to realize the various processing steps (for example, a credit‐checking service offered by a financial institution, a goods‐production service offered by their favorite supplier, and a shipping service). They would then use WSFL to formally define the new business process.
A WSFL flow model defines the structure of the business process: WSFL activities (circles in the figure above) describe the processing steps, and WSFL data and control links represent the sequencing rules and information flows (eventually performing necessary data mapping) between these activities. For each activity, they would identify the WSFL service provider
responsible for the execution of the process step (for example, services offered by shipping company A or by goods‐supplier company B) and define the association between activities in the flow model and operations offered by the service provider using WSFL export and plug link elements. The resulting flow model is shown in the center of the figure above with ”swim lanes” representing the association of activities with service provider roles. The second use case is a variant of the previous example. Here, an enterprise wants to offer a Web Service that mediates between service requesters (customers) who want to order goods and service providers who produce and deliver goods. As in the previous example, the enterprise would define the business process for handling purchase orders as a WSFL flow model. In this case, however, they would not bind the activities to particular service providers. Instead they would identify the kind of service provider (role) they want for each activity (for example, some goods supplier for activity process order, some shipping service for activity ship goods). They would then define the WSDL Web Service interface of the flow model, that is, the WSFL Service Provider Type of the flow model. This interface has two facets: One facet defines the interface that a customer would use when requesting processing of a purchase order, that is, the operations that the Web Service provides for use by service requesters. For example, the new service would provide an operation that takes a purchase order as input and passes it on (through a WSFL flow source) to the activities in the flow model for processing. The other facet identifies the operations that the service requires from the other service providers. 510
For each activity, there is one (proxy) operation on the external interface of the flow model that the service would use to interact with a service provider implementing that activity. The resulting Web Service is depicted as the dark shape around the flow model in the figure above. This Web Service can now be advertised in a service repository where it would attract two kinds of parties: those who want to use services provided by the Web Service (in our case, customers who want to place orders) and those who want to play the role of a service provider (in our example, a shipping or a goods supplying service). To make this model work, the activities in the flow model must be connected to operations that actually perform the process steps represented by each activity. This is done by a WSFL global model (the outermost box in the figure above), which describes the interaction between service providers and requesters. Our enterprise would use WSFL service provider locators to define criteria for selection of a particular service provider and WSFL plug links to associate operations on service provider elements with the service‐requesting operations on the interface of the flow model. A Quick Tour of WSFL The purpose of a WSFL document is to define the composition of Web Services as a flow model or a global model. Both models have a declared public interface and an internal compositional structure. The composition assumes that the Web Services being composed support certain public interfaces, which can be specified as a single port type or as a collection of port types. We call this collection a service provider type. The following code is a simplified example of a WSFL service composition defining a flow 511
model called totalSupplyFlow. The syntax of many elements has been abbreviated in the interest of conciseness. The example assumes a set of WSDL port type and operation definitions as public interface of the service provider types referred to: the supplier and shipper service provider types are somehow assumed by the flow model; the totalSupply service provider type appears to be defined by the flow model, but it has been already defined somewhere else, which is perfectly valid. Note that the flow model imposes “sequencing constraints” for the execution of operations of the totalSupply service provider type. IBM Software Group Web Services Flow Language 10 512
The totalSupplyFlow flow model specifies how to collaborate with two service provider types in order to offer to their joint customers a complete business process. Each of the two service providers used within the flow model is represented by a separate element. One service provider is of type supplier and is referred to as mySupplier in the flow model. The other service provider is of type shipper and is called myShipper. Both service providers contain “binding” information as well. This information is provided by means of a element, which specifies the actual service that will be used when the model is instantiated. In this case, binding information is “static,” but more dynamic binding schemes are possible. The business process represented by the totalSupplyFlow flow model consists of three business tasks, called activities, that have to be performed in order to successfully complete the business process: A purchase order has to be processed, a shipment request must be 513
accepted, and money has to be received. Each of these activities is specified by a separate element.
In our code example, the activities cannot be performed in any order, but there is a sequencing constraint between them: the processing of the purchase order by the supplier must precede the acceptance of the shipping request by the shipper; the money can be received at any time. The precedence rule is specified by simply connecting the two corresponding activities. Two kinds of connections are established, a control connection (through a element), and a data connection (through a element). While the first connects the completion of one activity to the execution of another, the second connection represents a data exchange between the two. Note the element nested inside the data link: it specifies what information needs to be transferred between the two 514
linked activities. Also note that the separation of control flow and data flow is very helpful. For example, a service might only be enabled after the completion of another service without explicitly passing data from the former to the latter. Web Services interact in a peer‐to‐peer manner. This pattern is immediately reflected by the interacting operations. For example, if a flow sends out a message via a notification operation, this operation corresponds to a one‐way operation at a service provider. Pairs of corresponding operations in this sense are referred to as dual operations. In our example, the activity processPO has to send out a process order. For this purpose, the totalSupply service provider type declared by the flow model is assumed to include a port type totalSupplyPT with a sendProcOrder operation, which implements the activity. An element establishes this relation between an activity and its implementing operation. The service provider who is supposed to interact with an activity’s implementation (for example, to process the message sent) is defined through a element. To define the public interface of the composition, the element includes a declaration of the supported service provider type as an attribute of the flow model, and a mapping of operations of the port types of this service provider type to activities of the flow model. As indicated in the following figure, this mapping is specified by an element, which relates an activity of the flow model and an operation of its public interface. This mapping defines the effect of each operation by relating it to the execution of the internal composition. The public interface defines the interaction of a flow model with the “outside,” that is, it specifies which messages are sent and which are used. 515
Service Composition Metamodel This section describes at the conceptual level how Web Services are wired together into flows that represent business processes (see Section 3.1 “Flow Metamodel”). Section 3.2 “Lifecycle Interface” describes how instances of such a business process are manipulated as a whole. In Sections 3.3 “Business Process Lifecycle” and 3.4 “Activity Lifecycle,” we sketch a minimum set of states and the transitions between them that further describe a business process and each of its encompassed activities. Finally, Section 3.5 “Recursive Composition Metamodel” gives an overview on how new Web Services are composed out of other Web Services. 3.1 Flow Metamodel This section describes the main concepts of the metamodel underlying WSFL for specifying flows. This is done by describing its syntax as a special kind of directed graph (Section 3.1.1 “Activities”) and its semantics by showing how each of the syntax elements is to be interpreted in concert with the other syntax element (see Section 3.1.2 “Operational Semantics”). 3.1.1 Syntax This section describes the various ingredients of the metamodel in detail and explains their operational semantics. Activities Operations of Web Services are used within business processes as implementations of activities. An activity represents a business task to be performed as a single step within the context of a business process contributing to the overall business goal to be achieved. The operation used may be perceived as the concrete implementation of the abstract activity to 516
be performed. Refer to Section 3.5.4 “Which Operation Is the Activity Implementation?” for more details. Activities correspond to nodes in a graph. Each activity has a signature that is related to the signature of the operation that is used as the implementation of the activity. Thus, an activity can have an input message, an output message, and multiple fault messages. Each message can have multiple parts, and each part is further defined in some type system.
The figure above depicts an activity A with input message M and output message M’. Input message M has three message parts called µ1, µ2, and µ3. Output message M’ has two message parts, called µ4 and µ5. Message part µ3 is defined through an XML schema the root of which is a that contains some other complex type, a decimal simple type and a simple type that may hold multiple string fields. Control Links Activities are wired together through control links. A control link is a directed edge that 517
prescribes the order in which activities will have to be performed (that is, the potential “control flow” between the activities of the business process). The endpoints of the set of all control links that leave a given activity A represent the possible follow‐on activities A1, …, An of activity A. Transition Conditions Which of the activities A1, …, An actually have to be performed in the concrete instance of the business process (that is, the concrete business context or business situation) is determined by so‐called transition conditions. A transition condition is a Boolean expression that is associated with a control link. The formal parameters of this expression can refer to messages that have been produced by some of the activities that preceded the source of the control link in the flow. When an activity A completes, exactly those control links originating at A whose transition conditions evaluate to true are followed to their endpoints. This set of activities is referred to as the “actual follow‐on activities” of A, in contrast to the full set {A1, …, An} of “possible follow‐on activities.” It is said that “control flows from A to the actual successors of A,” or that the “control flow visits the actual successors of A,” or that “navigation proceeds from A to its actual successors,” or something similar. In the following figure, activity B might need to be performed after activity A completes. The transition condition of the corresponding control link is specified as an XPath expression that references the output message of A: Activity B will be performed (“control flows to B” or “navigation proceeds to B”) if, and only if, the integer value returned by A is greater than 42.
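The selection of actual follow-on activities can be sketched in Python. This is a minimal illustration, not WSFL syntax: transition conditions are modeled as plain Python functions rather than XPath expressions, and the extra activities C and D, their conditions, and the message part name result are assumptions added for the example.

```python
from typing import Callable, Dict, List, Tuple

# A control link: (source activity, target activity, transition condition).
# The condition is a Boolean function of the source activity's output message.
ControlLink = Tuple[str, str, Callable[[Dict[str, int]], bool]]

control_links: List[ControlLink] = [
    ("A", "B", lambda out: out["result"] > 42),   # the condition from the text
    ("A", "C", lambda out: out["result"] <= 42),  # hypothetical alternative
    ("A", "D", lambda out: True),                 # hypothetical, always taken
]


def possible_follow_ons(activity: str, links: List[ControlLink]) -> List[str]:
    """Targets of all control links that leave the given activity."""
    return [target for source, target, _ in links if source == activity]


def actual_follow_ons(activity: str, output: Dict[str, int],
                      links: List[ControlLink]) -> List[str]:
    """Targets of the links whose transition condition evaluates to true."""
    return [target for source, target, condition in links
            if source == activity and condition(output)]


# A returns 50: control flows to B and D, but not to C.
visited = actual_follow_ons("A", {"result": 50}, control_links)
```

The output message of A is the only input to the decision, which is why the data returned by an activity determines which activities run next.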
The Origin of Flow Dynamics Note especially that this mechanism is the origin of the whole dynamics within the control flow of business processes: Activities produce actual data values for their output messages, and these values will be substituted as actual parameters of the formal parameters of transition conditions. Exactly those control links will be followed whose transition conditions evaluate to true in their actual parameters. And exactly the endpoints of those control links are the activities that have to be performed next “in the current business context.” Thus, whenever an activity completes, that is, the operation of the Web Service that implements the activity returns data, this actual data can be made the basis for deciding which activities have to be performed next. And these activities are typically highly dependent on the data returned. Control Links As Edges Control links are the first kind of edges in the graph structure that we use to represent models of business processes, or simply, flows. First of all, such an edge is directed,
pointing from its source activity to its target activity, that is, from an activity to its (or one of its) potential successor activities. Next, such an edge is “weighted” by a transition condition, determining the actual flow of control. We allow at most one control link between two different activities. Finally, the resulting directed graph must be acyclic, that is, we do not allow loops within the control structure of a flow (however, see Section “Loops,” on how loops are supported in a controlled manner). Note that tools supporting the graphical construction of WSFL‐compliant flow models can choose to support drawing loops. But it must be possible to transform the loops supported by the tool into the restricted variant of loops supported by WSFL. This restricted variant basically corresponds to “do until” loops.

Forks And Parallelism

An activity (like activity A in the following figure) is called a fork activity if it has more than one outgoing control link. When activity A completes, all control links leaving A will be determined and all associated transition conditions (pAB and pAC in the figure) will be evaluated in their actual parameters. The target activities of all control links whose transition conditions evaluated to true are exactly the activities that are to be performed next within the flow. For example, if pAB evaluated to true but pAC evaluated to false, exactly activity B will be scheduled to be performed; if pAB evaluated to false and pAC evaluated to true, exactly C is to be performed next. If both pAB and pAC evaluate to true based on the actual parameters, both activities B and C will have to be performed next. (We will explain later what happens along paths that are determined by a control link whose transition condition
evaluated to false. See “Death‐Path Elimination”). In particular, it is very easy to achieve parallelism in the execution of flows: Simply introduce a fork activity and the “subgraphs” that are spawned‐off by the control links with a true transition condition will be performed in parallel.
Joins and Synchronization

Typically, parallel work has to be synchronized at a later time. Synchronization is done through join activities. An activity is called a join activity (like activity F in the figure above) if it has more than one incoming control link. By default, the decision whether a join activity is to be performed or not is deferred until all parallel work that can finally reach the join activity has actually reached it (see “Join Conditions” for potential deviations from this default behavior). In the figure above, when pAB and pAC have both evaluated to true, B and D can be performed in parallel with C, and F cannot be performed until control has passed from C to F and from D to F. At that time, the truth‐values of the transition conditions pDF and pCF are known; based on these truth‐values it can be specified whether F should be performed if, and only
if, both parallel executions successfully reached F (“pDF AND pCF”), or whether it suffices that at least one of the parallel executions reached F successfully (“pDF OR pCF”), and so on.

Join Conditions

Thus, the truth‐values of transition conditions of control links that enter a join activity allow for a more fine‐grained mechanism of synchronizing parallel work at join activities. This mechanism is introduced through join conditions: A join condition is a Boolean expression associated with a join activity, and the formal parameters of this expression refer to the transition conditions of the incoming control links of the join activity. Work along parallel paths reaches a join activity at different points in time. For example, activity C in the preceding figure might complete quickly, so that the transition condition pCF is evaluated while B is still running; the transition condition pDF is then evaluated at a later point in time. By default, the decision whether F is to be performed or not is deferred until pDF has also been evaluated, even if, for example, the join condition is “pDF OR pCF” and is known to be true long before the truth‐value of pDF is known. Thus, join conditions are really a means to synchronize parallel work, that is, to wait until parallel work comes to an end so that decisions can then be made on how to proceed. Sometimes, a weaker semantics of synchronization is appropriate, and it is supported by the metamodel of WSFL: As soon as the truth‐value of a join condition is known, the associated join activity is dealt with accordingly (that is, either performed or skipped). Control flow that reaches the corresponding join activity at a later time is simply ignored.

Start and End Activities

But what about activities that have no incoming control connector (like A, B, and X in the following figure), or no outgoing control connector (like H, I, J, and X)? These kinds of activities are called start activities and end activities, respectively.
In the following figure, activities A, B, and X are start activities, and activities H, I, J, and X are end activities.
Conceptually, each activity has an associated join condition: A node with a single incoming control link can be perceived as having a join condition that consists of the transition condition of the incoming control link. A start activity can be perceived as having a trivial join condition that consists of the constant “true” predicate. With this convention in mind, an activity can be started whenever its join condition is fulfilled. In particular, the join condition of an activity with no incoming control link is fulfilled when the flow model is “started”; thus, the corresponding activities are “start activities” also from that perspective. When a flow model is instantiated, all of its start activities are determined and scheduled to be performed. Based on the start activities of a flow, the “regular” navigation through the graph representing the flow model continues. That means that when a start activity completes, its actual successors are determined based on the control links originating at the completed start activity. When an end activity completes, navigation stops at this point because there is no possible follow‐on activity and thus no actual successor to determine. But navigation might continue in other parts of the graph, so many activities of the overall flow might still be awaiting their execution. Once all end activities within the graph have been reached, the overall flow is done. When the last end activity completes, the output of the overall flow is determined and returned to its invoker; and then, the flow ceases to exist.
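The navigation rules described so far (start activities, transition conditions, deferred join decisions, end activities) can be put together in a small simulation. The following Python sketch is an illustration under several assumptions, not a WSFL engine: the function `run_flow` and its data layout are invented for the example, activities run sequentially rather than in parallel, a join activity without an explicit join condition defaults to an OR over its incoming links, and a skipped activity marks all of its outgoing links as false so that later joins can still be decided.

```python
from typing import Callable, Dict, List, Tuple

Link = Tuple[str, str, Callable[[dict], bool]]  # (source, target, transition condition)


def run_flow(activities, links, join_conditions, implementations):
    """Navigate an acyclic flow and return the activities performed, in order."""
    incoming = {a: [l for l in links if l[1] == a] for a in activities}
    outgoing = {a: [l for l in links if l[0] == a] for a in activities}
    link_values: Dict[Tuple[str, str], bool] = {}
    outputs: dict = {}
    performed: List[str] = []

    # Start activities have no incoming control link: trivial join condition "true".
    ready = [(a, True) for a in activities if not incoming[a]]
    while ready:
        activity, perform = ready.pop(0)
        if perform:
            outputs[activity] = implementations[activity](outputs)
            performed.append(activity)
        for source, target, condition in outgoing[activity]:
            # Assumed rule: links leaving a skipped activity count as false.
            link_values[(source, target)] = perform and condition(outputs)
            # Default semantics: decide only once every incoming link is evaluated.
            if all((s, t) in link_values for s, t, _ in incoming[target]):
                values = {s: link_values[(s, t)] for s, t, _ in incoming[target]}
                default_join = lambda v: any(v.values())  # assumed default
                join = join_conditions.get(target, default_join)
                ready.append((target, join(values)))
    # End activities have no outgoing links; the flow is done when nothing is ready.
    return performed


def example_flow(amount):
    """Hypothetical flow: A forks to B and C, B leads to D, and D and C join at F."""
    activities = ["A", "B", "C", "D", "F"]
    links = [
        ("A", "B", lambda out: out["A"]["amount"] > 10),  # pAB
        ("A", "C", lambda out: True),                      # pAC
        ("B", "D", lambda out: True),
        ("D", "F", lambda out: True),                      # pDF
        ("C", "F", lambda out: True),                      # pCF
    ]
    implementations = {name: (lambda outputs: {"done": True}) for name in activities}
    implementations["A"] = lambda outputs: {"amount": amount}
    return activities, links, implementations


acts, flow_links, impls = example_flow(amount=5)  # pAB is false, so B and D are skipped
print(run_flow(acts, flow_links, {"F": lambda v: v["D"] and v["C"]}, impls))  # ['A', 'C']
print(run_flow(acts, flow_links, {"F": lambda v: v["D"] or v["C"]}, impls))   # ['A', 'C', 'F']
```

With the join condition “pDF AND pCF”, F is skipped because only the path through C reached it successfully; with “pDF OR pCF”, F is performed. In both cases the decision about F is made only after both incoming links have a truth‐value.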
Exit Conditions

The following figure summarizes the flow‐relevant fine structure of an activity introduced so far. An activity is linked to the operation of a port type as its implementation, and if the activity is a join activity, it has an associated join condition. What is also shown is the exit condition associated with an activity: An exit condition is a Boolean expression, the purpose of which is to determine whether or not the execution of the implementation of the activity completed the business task represented by the activity. The expression can refer to the output message of its associated activity or even to the output of any activity that ran before on the control path of the subject activity; the expression of an exit condition is provided in XPath syntax, just as the expression of a transition condition is. The exit condition is evaluated once the operation of the implementing port type terminates. If the exit condition evaluates to true, the activity is treated as “completed.” If the activity is completed, navigation continues and the next activities to be performed are determined based on the just‐completed activity; otherwise, the activity is executed again.
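The re‐execution behavior of exit conditions can be sketched as a simple loop. The Python below is a minimal illustration: `perform_activity`, the return code field `rc`, and its values are invented for the example, and the `max_attempts` guard is an addition of this sketch to keep the example from looping forever, not something the metamodel prescribes.

```python
from typing import Callable


def perform_activity(implementation: Callable[[], dict],
                     exit_condition: Callable[[dict], bool],
                     max_attempts: int = 10) -> dict:
    """Invoke the implementation until its exit condition evaluates to true."""
    for _ in range(max_attempts):
        output = implementation()      # the operation of the port type returns
        if exit_condition(output):     # did the business task actually complete?
            return output              # activity is "completed"; navigation continues
    raise RuntimeError("exit condition never became true")


# Hypothetical implementation that is rolled back twice before it succeeds.
attempts = iter([{"rc": "DEADLOCK"}, {"rc": "DEADLOCK"}, {"rc": "OK"}])
calls = []


def flaky_implementation() -> dict:
    output = next(attempts)
    calls.append(output["rc"])
    return output


result = perform_activity(flaky_implementation, lambda out: out["rc"] == "OK")
print(result, len(calls))  # {'rc': 'OK'} 3
```

The loop separates the two events: every return of `flaky_implementation` is an implementation event, but only the third one satisfies the exit condition and counts as completion of the activity.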
For example, the exit condition can check particular reason codes or return codes of the activity implementation. In doing so, the activity can be retried if a code indicates an implementation problem (for example, “automatic rollback due to detected deadlock”). Or the application already aggregates lower‐level reason codes and provides a return code that basically says whether the implementation
executed correctly or not. Or the exit condition checks a field that is implicitly set by a user (“The customer did not answer the phone call–I’ll try at a later time”). As all of these examples show, the exit condition allows two events to be distinguished: the event signaling that the activity implementation returned, and the event signaling that the associated piece of work (the business task) completed successfully. And navigation typically should continue only if the business task completed and not if the implementation has been interrupted for whatever reason.

Loops

But there is another important use of exit conditions, namely for looping: An activity is iterated until its exit condition is met. Often, this mechanism for realizing do‐until loops is used when an activity is implemented by another flow, that is, by means of the call lifecycle operation (see Section 3.3 “Business Process Lifecycle” and “Lifecycle Operation call”). Because the metamodel does not support cyclic graphs, cycles must be realized by separate flows that are iterated based on exit conditions. This enforces a block‐oriented specification of loops well known from structured programming. Supporting arbitrary loops would allow specifying situations that are ambiguous, difficult to model unambiguously, and much more difficult to comprehend.

The following figure shows a cyclic graph. Assume that control flows from A to B to C, and that D and E are actually executed. We further assume that when D completes, navigation can proceed to B again. When B completes the second time, control flows to C, and may continue to E and D again. Many problems and questions come up, for example:

B is a join node. When control flows from A to B (the first time), the truth‐value of the transition condition of the control link from D to B is unknown. The join condition of B must be an expression in ternary logic to specify the appropriate behavior.
When C completes the second time, should control really flow to E again? Or does the intended loop just consist of B, C, and D? If the control flow should proceed to E, it might happen that E is still running because of its first invocation. What should happen in this situation? Should E be immediately interrupted and started again, or should the completion of E be awaited before its next invocation?

When D completes and control flows back to B, and could also flow to F, should F really be started? Or should only the “backward control link” be honored? If F should be started, the same questions occur as for E before.
Data Links

There is a second kind of directed edges in the graphs of the metamodel, the so‐called data links. A data link specifies that its source activity passes data to the flow engine, which in turn has to pass (some of) this data to the target activity of the data link. For example, the next figure depicts that activity A expects input data from activity B, which is indicated by a dashed directed edge (while we use solid edges to draw control links). To make this meaningful, a data link can be specified only if the target of the data link is reachable from the source of the data link through a path of (directed) control links. Thus, “data always flows along control links.” This ensures in an easy manner that a number of error‐prone situations are avoided. The spectrum of such situations extends from trying to consume data that has not been produced yet, to deadlock situations in which one activity requires data from another activity as input but the latter activity needs the output of the former as its own input. It is not required that data always be passed to an immediate successor of its producer. Many different activities might be visited along the path of control links from the source of a data link to the target of the data link. An activity might be the target of multiple data links. For example, this allows aggregating input from multiple sources, or it allows specifying alternative input from activities on alternative parallel paths. To facilitate this, data links are weighted by so‐called map specifications. A map prescribes how a field in a message part of the target’s input message of a data link is constructed from a
field in the output message’s message part of the source of the data link. It even allows multiple maps to be defined for the same target message part. This is needed, for example, when alternative paths in the control flow are specified and data needed further on can be produced along each of the paths.
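The rule that data always flows along control links amounts to a reachability check on the control graph, and a map amounts to copying a field from a source output message into the target input message. The Python below is a sketch under assumptions: the helper names `reachable` and `build_input`, the tuple layout of a map, and the rule that a later map overrides an earlier one for the same target field are all invented for illustration and are not taken from the WSFL specification.

```python
from typing import Dict, List, Tuple


def reachable(source: str, target: str, control_links: List[Tuple[str, str]]) -> bool:
    """Return True if `target` can be reached from `source` via directed control links."""
    successors: Dict[str, List[str]] = {}
    for s, t in control_links:
        successors.setdefault(s, []).append(t)
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, []):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def build_input(maps: List[Tuple[str, str, str]], outputs: Dict[str, dict]) -> dict:
    """Construct a target input message from (source activity, source field, target field) maps.
    Only sources that actually produced output contribute."""
    message = {}
    for source, source_field, target_field in maps:
        if source in outputs:
            message[target_field] = outputs[source][source_field]
    return message


# Hypothetical control graph: A forks to B and D, which both lead to C.
control_links = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
print(reachable("A", "C", control_links))  # True: a data link from A to C is allowed
print(reachable("C", "A", control_links))  # False: C cannot pass data back to A
print(reachable("B", "D", control_links))  # False: B and D are on alternative paths

# Two maps fill the same target field from alternative paths; here only D ran.
maps = [("B", "price", "amount"), ("D", "quote", "amount")]
print(build_input(maps, {"D": {"quote": 99}}))  # {'amount': 99}
```

The first call also shows that the target of a data link does not have to be an immediate successor of its source: C is reached from A through B or D.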