
THE ESSENTIAL GUIDE TO

XML

BY SHARON L. HOFFMAN, AUGUST 2005

XML is a key technology for sharing data between business entities because it bridges different ways of storing and referencing data. Although XML can be described as a language, the extensible nature of XML means that it’s more correctly classified as a standard. Many interrelated standards (for a list, see “Essential XML Standards”) complement XML and expand its capabilities. XML is also a fundamental building block for other standards. For example, many Web-services standards, such as Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL), are based on XML. To give you a sense of how you might use XML in your own applications, let’s start with a quick look at XML syntax and how XML compares with languages used for related tasks.

XML in Context

An XML document is made up of XML elements. Each element contains a starting tag, an ending tag, and (usually) data nested between the two tags. By choosing descriptive names for elements, you can make your XML documents more human-readable and therefore self-documenting. In Figure 1, the highlighted line is a single element called product_code. If a document contains more than one element of the same type, the tags will be repeated for each element, as shown for the product_code and requested_qty elements in Figure 1. For more information about XML syntax, see “Essential XML Syntax and Terminology.”

Figure 1: Sample XML document (markup not preserved; surviving element values: bike component availability, 9/1/2005, Acme Company, Sharon Hoffman, [email protected], 12345, 5, 67892, 25)

Repeating the data description for every element means that XML documents are entirely self-contained; you won’t need to refer to a database layout, for example. However, the overhead of repeating all the element-description information quickly becomes unwieldy. As a result, most developers prefer using data description languages (e.g., SQL, DDS) to define databases. However, XML shines in data-transfer applications that involve relatively small amounts of data (these are typically single transactions such as an inventory inquiry or a purchase order).

Data transfer is by far the most common XML application in iSeries environments. However, you can also use XML to add meaning to text within documents. Used in this way, XML becomes a powerful

SUPPLEMENT TO iSeries NEWS © 2005

tool for organizing information and improving search capabilities.

To understand the benefits of an XML-encoded document, you should consider the differences between XML and HTML. Although the two languages are syntactically similar because they have the same antecedents (see “Essential XML History” for information), they have different strengths. HTML is best used to format information for display, while the descriptive information in XML tags makes it easier to deal with document content. For example, suppose you have a document that lists PC printers and describes the features of each printer model. If the document is stored in HTML, it’s difficult to create a search that finds all printers that support color printing and duplex printing and can print at least 10 pages per minute. Conversely, if you store the same document using XML, you would probably create separate elements for each important feature (e.g., maximum_print_speed) and could easily develop an application that searches for all printers that meet your criteria. Of course, a database is ideal for such a search, but XML provides database-like search capabilities for information that is stored in documents such as user manuals or marketing brochures. As you’ll see in the following section, the XML data can easily be converted into HTML for display purposes.

Because XML documents are plain text, you can write XML using any text editor (e.g., Notepad). However, as you begin working with XML, you’ll quickly find that an XML-aware editor is a big time-saver. An XML editor should help you write XML by providing syntax-checking and document-generation capabilities. For example, if you begin to create a new element, some editors will automatically generate the ending tag for you.

An XML document can stand entirely on its own, without any related documents. More often, though, an XML document is part of a larger application architecture that includes components that define the structure required for a particular type of XML document, solutions that reformat XML data (e.g., create an HTML document for display using data from an XML document), and applications that process

ESSENTIAL XML SYNTAX AND TERMINOLOGY

1. XML is case sensitive.

2. Generally, white space (e.g., indents, blank lines) in an XML document is ignored.

3. You can choose any element names you like as long as they conform to a few basic rules:
• Element names cannot contain spaces.
• Element names must begin with a letter or an underscore.
• After the first character, element names can contain numbers, hyphens, periods, colons, letters, and underscores. (Colons are usually avoided in element names because they have special meaning within XML.)
• Element names cannot begin with the letters xml, regardless of case (i.e., xml, XML, xMl, and Xml are all invalid).

4. Elements can contain one or more attributes. In many cases, the XML designer may choose whether to use elements or attributes to define a particular structure. As a rule of thumb, attributes should be used for information that is not integral to the element.

5. An element cannot contain more than one attribute with the same name.

6. Both starting and ending tags are required for all elements except empty elements. Empty elements occur most often when an element is completely defined by its attributes.

7. Elements must be properly nested (i.e., once an inner element tag is opened, it must be closed before any outer tags). The original sidebar illustrated this rule with three examples built around the name Sharon Hoffman: a nesting that is correct, a nesting that is syntactically correct although it doesn’t make much sense, and a nesting that is syntactically incorrect. (The example markup is not preserved in this copy.)

8. The outermost element in any XML document is referred to as the root element.

9. The root element may be preceded by a document declaration and processing instructions.

10. Built-in XML entities are used to include a character that has special meaning in XML (e.g., a greater-than sign) within XML content. You can also define additional entities as shorthand for text and structures that you use repeatedly.

11. An XML document that has correct syntax is well formed.

12. An XML document that conforms to the structure defined by its Document Type Definition (DTD) or schema is valid. It is possible for an XML document to be well formed but invalid, but the reverse is not possible.
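Several of the rules in this sidebar can be checked mechanically by handing a snippet to an XML parser. The following is a minimal sketch in Python, assuming the standard library's ElementTree parser; the snippets and the helper name is_well_formed are made up for illustration and are not the sidebar's original examples. Note that this parser checks only whether a document is well formed, not whether it is valid against a DTD or schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, written for this sketch only.
properly_nested = "<customer_name><first_name>Sharon</first_name></customer_name>"
badly_nested = "<customer_name><first_name>Sharon</customer_name></first_name>"
case_mismatch = "<first_name>Sharon</First_Name>"    # names are case sensitive
duplicate_attr = '<item unit="each" unit="box"/>'    # attribute names must be unique
empty_element = '<item product_code="12345"/>'       # empty element: no ending tag needed

# Built-in entities stand in for characters that have special meaning in XML.
safe_text = escape("qty > 5 & qty < 25")
note = f"<note>{safe_text}</note>"

for label, snippet in [
    ("properly nested", properly_nested),
    ("badly nested", badly_nested),
    ("case mismatch", case_mismatch),
    ("duplicate attribute", duplicate_attr),
    ("empty element", empty_element),
    ("escaped text", note),
]:
    print(f"{label}: well formed = {is_well_formed(snippet)}")
```

The parser turns the entities back into the original characters when the note element is read, so application code never has to handle the escaped form directly.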


XML documents. Understanding how these pieces work together is vital to understanding XML.

The Big Picture

An XML document is almost always associated with a second document that defines the valid structure for a particular type of document. For example, an XML document might contain a particular inventory inquiry from XYZ Company, but the structural-definition document would define the format for all inventory inquiry documents. There are two standards for these structural-definition documents: DTD is the older and simpler standard, whereas XML schema is the newer standard. DTDs and schemas serve the same purpose, but their complexity and capabilities vary significantly.

Figure 2 contains a DTD that you could use to define the XML document in Figure 1, and Figure 3 contains the schema for the same document. Both the DTD and the schema were generated using an XML editor (WebSphere Development Studio Client for iSeries, or WDSc, in this case). You’ll find that creating a sample document (e.g., an inventory inquiry) and using it to generate an initial version of the DTD or schema is often the simplest way to create a structural-definition document. While you may need to clean up the generated code, it will give you a good starting point for developing the DTD or schema.

Figure 2: A DTD generated by WDSc for the XML document in Figure 1

Whether you use a DTD or a schema, there is typically a one-to-many relationship between the DTD or schema and the XML documents. For example, you could publish a DTD or a schema (or both) specifying the format for incoming inventory inquiries and, hopefully, many of your customers would then begin to send you inventory inquiries in XML format. DTDs and schemas for external documents (versus documents that are internal to a particular company) are usually published online so that they can be shared more easily. Ideally, everybody would use the same structure for the same type of document (e.g., inventory inquiries), but that’s not always the case, not even within a single industry. Fortunately, many industry groups are working on standards that should help alleviate some of the Tower-of-Babel aspects of XML. You’ll find the latest information on industry-specific XML structures online at xml.org.

In addition to DTDs and schemas, other components can be associated with XML documents. For example, if you plan to display an XML document in a Web page, you’ll probably want to first convert the XML document into an HTML document. Similarly, you often might need to create multiple XML documents that contain the same general information but use slightly different structures. If you need to convert lots of documents between the same two structures, it makes sense to automate the process. The simplest way to do this is via an Extensible Stylesheet Language Transformations (XSLT) document that defines how input elements should be formatted in the output (XML or HTML) document. For example, if several of your vendors accept inventory inquiries in XML, but each uses a slightly different schema, you could develop a generic XML inventory inquiry, then create the variations using XSLT. As with DTDs and schemas, your XML editor should include tools to help you create XSLT documents.

An XSLT document works in conjunction with an XSLT

ESSENTIAL XML STANDARDS

XML itself is a standard, but it also involves many related standards. Here are some of the most widely used XML standards.

◗ XLink is a standard for defining hyperlinks in XML.
◗ XML Namespaces make it possible to create unique element names.
◗ XML Schemas define the rules for the specialized XML documents used to define the structure of other XML documents.
◗ XPath addresses each part of an XML document via a hierarchical structure (e.g., first_name within customer_name within quote_request).
◗ XQuery is a relatively new standard that provides SQL-like query capabilities for XML documents.
◗ Extensible Stylesheet Language (XSL) formats XML documents for display. There are two components of the XSL standard: XSL Transformations (XSLT) and XSL Formatting Objects (XSL-FO).
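To make the hierarchical addressing described for XPath more concrete, here is a minimal sketch in Python. It assumes the standard library's ElementTree module, which supports only a limited subset of XPath. The element names quote_request, customer_name, and first_name come from the sidebar; the rest of the document (last_name, item, and the values) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only quote_request, customer_name, and first_name
# are named in the sidebar; everything else is made up for this sketch.
QUOTE_REQUEST = """\
<quote_request>
  <customer_name>
    <first_name>Sharon</first_name>
    <last_name>Hoffman</last_name>
  </customer_name>
  <item>
    <product_code>12345</product_code>
    <requested_qty>5</requested_qty>
  </item>
  <item>
    <product_code>67892</product_code>
    <requested_qty>25</requested_qty>
  </item>
</quote_request>
"""

root = ET.fromstring(QUOTE_REQUEST)  # root is the quote_request element

# first_name within customer_name within quote_request
first_name = root.findtext("customer_name/first_name")

# A path can match repeated elements: every product_code inside every item.
product_codes = [element.text for element in root.findall("item/product_code")]

# Paths combine naturally with ordinary application logic.
large_orders = [
    item.findtext("product_code")
    for item in root.findall("item")
    if int(item.findtext("requested_qty")) >= 10
]

print(first_name, product_codes, large_orders)
```

Paths are written relative to the element they are called on, which is why the root element's own name does not appear in the path strings.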


Essential XML History

The histories of individual computer languages are mostly just curiosities, but XML’s history provides a glimpse into its syntax as well. XML is part of the same family of languages as HTML and is based on Standard Generalized Markup Language (SGML). SGML is a direct descendant of Generalized Markup Language, which was developed by IBM researchers in the 1960s. The concept behind markup languages is to separate document content from document structure and display. Thus in both XML and HTML, the tags contain information about data: formatting information in HTML, and context information in XML.

SGML became an ISO standard in 1986. HTML, which evolved somewhat independently but incorporates many SGML concepts, is slowly being brought back into compliance with the larger SGML standard. In 1996, developers began working on a simplified version of SGML that focuses on document structure rather than document format. That project is the basis for XML, which became a World Wide Web Consortium standard in 1998.

The Essential XML Resources

Charles F. Goldfarb’s All the XML Books in Print
Goldfarb, one of the developers of SGML, attempted to list all the XML books in print. Although the list was last updated in early 2004, it’s still a useful resource.
xmlbooks.com

The CoverPages
The XML CoverPages include XML news, background material, and technical tips.
xml.coverpages.org

DevX.com
XML FAQs, articles, discussion groups and more.
devx.com/xml

World Wide Web Consortium XML page
w3.org/XML

XML.com
O’Reilly Media, Inc., a premier technical book publisher, maintains this XML information site.
xml.com

IBM RESOURCES

developerWorks XML site
www-106.ibm.com/developerworks/xml

iSeries XML information home page
www-1.ibm.com/servers/enable/site/xml/iseries/index.html

Two IBM white papers illustrate how to process XML documents using RPG or Cobol:

“Parsing XML documents using the new V5R3 ILE COBOL syntax”
www-1.ibm.com/servers/enable/site/education/abstracts/3db2_abs.html

“XML Interface for RPG maps XML into DB2 UDB for iSeries”
www-1.ibm.com/servers/enable/site/education/ibo/record.html?xmlface

ESSENTIAL XML PARSER CONCEPTS

Although most XML editors include an XML parser, you’ll also need an XML parser for production applications. XML parsers may be part of a Web application server, or they may be available as separate software options. There are two general standards for XML parsers: Document Object Model (DOM) and Simple API for XML (SAX). The only functional difference between DOM parsers and SAX parsers is that DOM parsers can modify an XML document, while SAX parsers are read-only (of course, an application that uses a SAX parser can always write out a new XML document in a different format than the incoming XML document). The other differences between DOM and SAX parsers don’t affect their capabilities, but they can have an impact on ease-of-use, and in some cases, performance.

SAX parsers are event-driven and are best suited for applications that need to choose specific elements from a larger XML document. You’ll find the SAX parsers more intuitive if your programming background includes languages that have event-driven capabilities (e.g., Visual Basic, Java).

DOM parsers read an entire XML document into an application where the elements can be referenced, much as an RPG program might reference fields in a record format. Therefore, DOM parsers have an advantage over SAX parsers when you need to process a high percentage of the elements in an XML document. In addition, DOM parsers generally feel more natural than SAX parsers if your programming background includes procedural languages such as RPG and Cobol.
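The difference between the two parser styles is easier to see side by side. The sketch below uses Python's standard xml.sax and xml.dom.minidom modules as stand-ins; it does not use the iSeries parsers discussed in this guide, and the inventory_inquiry and item element names are assumptions made for illustration. The SAX handler reacts to events and keeps only the product_code values, while the DOM code loads the whole document and modifies it in memory.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical inquiry document used by both parsers.
INQUIRY = """<inventory_inquiry>
  <item><product_code>12345</product_code><requested_qty>5</requested_qty></item>
  <item><product_code>67892</product_code><requested_qty>25</requested_qty></item>
</inventory_inquiry>"""


class ProductCodeHandler(xml.sax.ContentHandler):
    """SAX style: respond to parser events and keep only product_code values."""

    def __init__(self):
        super().__init__()
        self.codes = []
        self._buffer = None  # collects text only while inside product_code

    def startElement(self, name, attrs):
        if name == "product_code":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "product_code":
            self.codes.append("".join(self._buffer))
            self._buffer = None


handler = ProductCodeHandler()
xml.sax.parseString(INQUIRY.encode("utf-8"), handler)
sax_codes = handler.codes

# DOM style: the whole document is in memory, so elements can be referenced
# by name and changed in place (here, every requested quantity is doubled).
dom = minidom.parseString(INQUIRY)
for qty in dom.getElementsByTagName("requested_qty"):
    qty.firstChild.data = str(int(qty.firstChild.data) * 2)

dom_quantities = [q.firstChild.data for q in dom.getElementsByTagName("requested_qty")]
modified_xml = dom.documentElement.toxml()

print(sax_codes, dom_quantities)
```

The SAX version never holds more than one element's text at a time, whereas the DOM version trades memory for the ability to navigate and update the document freely.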


processor, software that applies the rules defined in the XSLT document to an incoming XML document and produces an output document in HTML, XML, or text format. An XSLT processor is typically bundled into a Web application server such as WebSphere Application Server (WAS) and can be accessed by calling APIs in an application. Most XML editors also include an XSLT processor for testing purposes.
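To illustrate the input-to-output idea behind such a transformation, here is a rough sketch in Python. It is not XSLT: Python's standard library does not include an XSLT processor, so the mapping rules that a stylesheet would declare are hand-coded in a hypothetical function, inquiry_to_html. The inventory_inquiry, customer, and item element names are also assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document.
INQUIRY = """<inventory_inquiry>
  <customer>Acme Company</customer>
  <item><product_code>12345</product_code><requested_qty>5</requested_qty></item>
  <item><product_code>67892</product_code><requested_qty>25</requested_qty></item>
</inventory_inquiry>"""


def inquiry_to_html(xml_text: str) -> str:
    """Hand-coded stand-in for a stylesheet: map each item to an HTML table row."""
    source = ET.fromstring(xml_text)

    table = ET.Element("table")
    ET.SubElement(table, "caption").text = source.findtext("customer")

    header = ET.SubElement(table, "tr")
    for title in ("Product code", "Requested quantity"):
        ET.SubElement(header, "th").text = title

    for item in source.findall("item"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = item.findtext("product_code")
        ET.SubElement(row, "td").text = item.findtext("requested_qty")

    # Serialize the new tree using HTML output rules.
    return ET.tostring(table, encoding="unicode", method="html")


html = inquiry_to_html(INQUIRY)
print(html)
```

With a real XSLT processor, the rules inside the function would live in a separate stylesheet document, so the same program could produce different outputs simply by swapping stylesheets.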

From XML to the Database and Vice-Versa

In an iSeries environment, XML projects almost invariably involve extracting data from DB2 UDB for iSeries or moving data from XML documents into the database. While it’s possible to store entire XML documents in iSeries files, more often you’ll need to separate the data for one or more elements from its tags and store the data itself as a field or fields within existing iSeries database records. You’ll also find lots of requirements for the opposite task: creating XML documents using data from one or more database records.

The underlying software that is used to separate an XML document into data and data-description components is an XML parser. An XML parser understands the rules of XML syntax, just as the parser that is part of the RPG compiler understands RPG syntax. For more about XML parsers, see “Essential XML Parser Concepts.” As you begin developing in XML, you might not even realize that you’re using an XML parser. For example, when an XML editor validates an XML document against its associated DTD or schema, an XML parser is invoked to perform the validation. XML parsers, including those for iSeries, are typically free. The iSeries-specific XML parser support is packaged in the no-charge licensed program product, XML Toolkit for iSeries (5733-XT1).

If you’re working with very low document volumes, it may be possible to assemble and disassemble XML documents using the tools built into an XML editor. However, for production processing of XML documents, you’ll usually need to develop code that moves data back and forth between a particular type of XML document (e.g., an inventory inquiry) and the associated database records.

You can create an XML document using a variety of techniques. At one end of the spectrum, you could write an RPG program that creates an XML document as an iSeries database file by hand-coding the tags and their contents. Then, you could convert the database file to a stream file using the CPYTOSTMF (Copy to Stream File) CL command. Other options include using APIs to output a stream file from an RPG program, generating an XML document using the results of an SQL query, or writing a Java application that builds an XML document.

Although you can write custom code to extract data from an XML document, it’s simpler to leverage the capabilities of an XML parser. For example, you might write code that invokes specific parser functions such as reading the data for a particular type of element (e.g., product_code). Java is the language of choice for working with XML because it includes extensive support for accessing parser APIs. However, you can also invoke parser APIs using RPG or Cobol, and products are available that will automate part of the process of assembling or disassembling XML documents.

Figure 3: An XML schema generated by WDSc for the XML document in Figure 1
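The round trip between database records and an XML document, described in “From XML to the Database and Vice-Versa,” can be sketched in a few lines. The example below is in Python and uses an in-memory SQLite database purely as a stand-in for DB2 UDB for iSeries. The table name inquiry_items, the functions rows_to_xml and xml_to_rows, and the inventory_inquiry and item element names are all hypothetical; only product_code and requested_qty appear in this guide.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inquiry_items (product_code TEXT, requested_qty INTEGER)")
conn.executemany(
    "INSERT INTO inquiry_items VALUES (?, ?)",
    [("12345", 5), ("67892", 25)],
)


def rows_to_xml(connection) -> str:
    """Build an XML document from the results of an SQL query."""
    root = ET.Element("inventory_inquiry")
    query = "SELECT product_code, requested_qty FROM inquiry_items ORDER BY product_code"
    for code, qty in connection.execute(query):
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "product_code").text = code
        ET.SubElement(item, "requested_qty").text = str(qty)
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text: str):
    """Use the parser to separate element data from its tags, ready for the database."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("product_code"), int(item.findtext("requested_qty")))
        for item in root.findall("item")
    ]


document = rows_to_xml(conn)
rows = xml_to_rows(document)
print(document)
print(rows)
```

Building the document through the parser library rather than by concatenating strings means that special characters in the data are escaped automatically.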

Explore XML

XML is a powerful tool for communicating data between applications using different databases and running on different platforms, and it is rapidly becoming the medium of choice for transaction-level data transfer. XML can also organize information within a document, thus making it easier to modify and search large amounts of text.

For all its strengths, XML is still a relatively new technology with a maze of confusing, and sometimes competing, standards. To take advantage of XML, it helps to have a clearly defined goal and the flexibility to experiment with various tools and techniques. It’s also useful to understand how other businesses are using XML. To explore the opportunities XML offers, visit the Web sites listed in “Essential XML Resources.”

Sharon L. Hoffman is a senior technical editor for iSeries NEWS.
