
Today the technical media talks a great deal about the Java platform and its importance in creating a ubiquitous Internet execution environment. While most of us have bought into this concept, other technologies that are emerging rapidly promise to smooth out the road to the computing promised land. XML is one of these technologies that needs to be taken seriously. There are many aspects of XML: Document Type Definitions (DTD), Style Sheets (XSL), Viewers, parsers, HTML 4.0 and data. Out of these, perhaps the most promising aspect of XML is its ability to represent data. Its ability to describe its document content via its markup mechanism allows it to behave like a universal data format for any number of applications.

Data representation using XML is a major step toward creating a ubiquitous data environment. XML allows authors to define their own tags, which in turn describe their content and make it possible to define a reusable data layer. Authors can use the structure and meaning of their documents to attach special processing instructions to parts of a document. XML can also be used by two or more entities as an exchange format for transaction protocols. This allows XML documents to be manipulated in batch mode without human interaction. Some examples of exchange formats are defined by the RosettaNet and Microsoft BizTalk standards.

Why XML?
XML by itself has nothing to do with Java and vice versa. So why should the Java community care about XML? The answer lies in the data layer. The Java language alone doesn't provide a mechanism for standardizing data formats. Java programs need to rely on predefined, nonflexible, hard-coded formats for reading information. This makes it difficult to extend or add functionality to a program without breaking the existing code base.

Take a business scenario. Imagine that you do business with two partners, one on the West Coast, the other on the East Coast. The latter expects his purchase orders to contain three fields: part number, quantity and delivery date. The West Coast partner expects her purchase orders to contain part number, quantity, delivery date and preferred shipping carrier. Thus they each have different definitions of a purchase order. How will they converse? While this particular problem doesn't seem that complicated, multiply the number of partners by 10 or 100, each with his or her own definition of a purchase order. Now we have a problem!

The naive approach to dealing with this problem would be to have our Java code deal with the individual partners in a special way. The problem with this approach is that each partner that requires special information forces the modification of the Java code used to implement the business model.

The ideal solution is to create a generic Java program that doesn't have to deal with the individual requirements of each partner. This can be done using XML. A core exchange format can be set up between you and your partners, and the individual information required by each partner can be abstracted in a properties file. The file will be responsible for matching additional information to a specific partner. In this particular scenario each partner will deal with the information he or she understands; the remainder of the information will be ignored. As new partners join your "circle of friends," the only information that needs to be modified is the properties file and the XML data file. This is where the power and flexibility of the XML data format complements the power and flexibility of the Java runtime environment. Furthermore, a neat side effect is that the properties file could have been written using XML.
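To make the properties-file idea concrete, here is a minimal Java sketch. It assumes a properties format in which each partner has one key listing the fields it understands. The key names (partner.east.fields, partner.west.fields), the field names and the PartnerFieldFilter class are all invented for illustration; the article doesn't prescribe any of them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

/** Sketch: a properties file tells generic code which order fields each partner understands. */
final class PartnerFieldFilter {

    /** Hypothetical properties content; keys and field names are illustrative only. */
    static final String PARTNER_PROPERTIES =
            "partner.east.fields=PartNumber,Quantity,DeliveryDate\n"
          + "partner.west.fields=PartNumber,Quantity,DeliveryDate,ShippingCarrier\n";

    /** Returns the field names listed for the given partner (empty if the partner is unknown). */
    static Set<String> fieldsFor(String partner, String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String list = props.getProperty("partner." + partner + ".fields", "");
        Set<String> fields = new LinkedHashSet<>();
        for (String field : list.split(",")) {
            if (!field.trim().isEmpty()) {
                fields.add(field.trim());
            }
        }
        return fields;
    }

    /** Keeps only the fields the partner understands; everything else is ignored. */
    static Map<String, String> viewFor(String partner, Map<String, String> order) {
        Set<String> wanted = fieldsFor(partner, PARTNER_PROPERTIES);
        Map<String, String> view = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : order.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                view.put(entry.getKey(), entry.getValue());
            }
        }
        return view;
    }

    public static void main(String[] args) {
        Map<String, String> order = new LinkedHashMap<>();
        order.put("PartNumber", "A-100");          // made-up sample values
        order.put("Quantity", "500");
        order.put("DeliveryDate", "November 19, 1999");
        order.put("ShippingCarrier", "AnyCarrier");
        System.out.println("east sees: " + viewFor("east", order));
        System.out.println("west sees: " + viewFor("west", order));
    }
}
```

Adding a partner in this sketch means adding one line to the properties text; the Java code stays untouched, which is the point of the approach described above.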

XML Documents and Dynamic Class Loading
When people talk about XML for data representation, the most basic concept they refer to is a document structure with data. This structure, similar to a populated C structure, outlines a tree whose nodes describe the content found on the leaves. Simple documents don't contain any behavior that defines how to access the content on the tree. Thus an XML document can be thought of as a data object with accessor methods. This idea can be used heavily to implement exchange formats for transaction protocols.

More complex XML documents borrow the concept of mobile agents to add behavior. URL links embedded inside the document act as object repositories from which functionality can be downloaded over the Web and used to process specific document tags. It is here that XML draws on the power of Java to extend its data model with behavior. The Java code referenced by the URL links is downloaded via the URL class loader mechanism of the Java platform. Once the class bytecodes are downloaded, a class object is created, and temporary object instances are created and used to evaluate the information contained in the XML file. This enables the dynamic extension of program behavior. Java components can also be used the other way around: mobile agents can be sent to evaluate information stored inside XML files by analyzing the tags contained in the document.

How Does XML Speak Java?
By now I hope you're convinced about the complementary roles of Java and XML. Let's move forward to explain how the two technologies come together. XML parsers written in Java hold the key to the answer. They provide two standard interfaces. The first is a W3C recommendation; the second is a de facto standard developed by members of the XML-DEV mailing list:

  • Document Object Model (DOM): The DOM provides a mechanism that allows users to access the information contained in the document in a tree fashion. Traversing the tree is left to the application writer. Users of the DOM usually care about the hierarchy and structure of the document.
  • Simple API for XML (SAX): SAX provides an event-driven method for traversing the information in the document. Application writers can register callbacks that are invoked when the beginning and end of a tag are parsed. Once inside the callback, the program can decide what to do based on the tag name and its attributes. Users of SAX care about specific tags inside the document, not necessarily its hierarchy.

I'd like to illustrate the power of the SAX and DOM interfaces by providing an example. Imagine you and your partners agree on a format for exchanging purchase requests. The information needed by all partners in order to process the purchase request is buyer name, buyer address, order number, product ID, product name, quantity, delivery date, and requested price. This particular hierarchy identifies the root element of the document as the PurchaseRequest. It contains three additional elements called BuyerName, BuyerAddress and OrderNumber. Multiple OrderNumber tags can be added to the document to represent various orders from the same buyer. BuyerAddress is made up of four elements: StreetName, City, State and Zip Code. OrderNumber is made up of four elements and one attribute. The four elements are called ProductID, ProductName, DeliveryDate and RequestedPrice, and the attribute name is Quantity.

Although the Quantity attribute could have been expressed as an element, for our particular example it's more advantageous to define it as an attribute, because the SAX API hands it directly to the callback in the attribute list of the OrderNumber tag. Figure 1 illustrates the document hierarchy. The XML document format is shown in Listing 1.

Figure 1: Document hierarchy of the purchase request

Some partners may take the orders, process them and notify senders of the status of their order. These partners parse the information contained in the document in a batch manner and create objects that are used by their purchase order systems. Based on the information contained in the document, the purchase order system might create three objects: a purchase request object, a buyer object and an order object. The purchase request object contains the buyer object and a list of order objects. The DOM interface is the correct mechanism to facilitate the creation of these objects from the XML document. Listing 2 shows the use of the DOM Java API to retrieve the document information needed to create the buyer object.
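The same retrieval can be sketched with the DOM classes that ship with current JDKs (javax.xml.parsers and org.w3c.dom) instead of the parser-specific classes used in Listing 2. This is only an illustration: the sample document, the placeholder buyer values, the element name Zip and the small Buyer class below are assumptions made so the example runs on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Sketch: build a buyer object from a purchase request using the standard DOM API. */
final class BuyerFromDom {

    /** Sample document; all values except the street are placeholders. */
    static final String SAMPLE =
            "<?xml version=\"1.0\"?>"
          + "<PurchaseRequest>"
          + "<BuyerName>Example Buyer</BuyerName>"
          + "<BuyerAddress>"
          + "<StreetName>909 E. Las Marias</StreetName>"
          + "<City>Example City</City><State>XX</State><Zip>00000</Zip>"
          + "</BuyerAddress>"
          + "<OrderNumber Quantity=\"500\"><ProductName>Sample PC</ProductName></OrderNumber>"
          + "</PurchaseRequest>";

    /** Hypothetical value class standing in for the purchase order system's buyer object. */
    static final class Buyer {
        final String name, street, city, state, zip;

        Buyer(String name, String street, String city, String state, String zip) {
            this.name = name;
            this.street = street;
            this.city = city;
            this.state = state;
            this.zip = zip;
        }
    }

    /** Returns the trimmed text of the first descendant of parent with the given tag name. */
    static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    /** Walks the tree from the root down to the address elements and builds the buyer. */
    static Buyer readBuyer(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Element address = (Element) root.getElementsByTagName("BuyerAddress").item(0);
            return new Buyer(
                    childText(root, "BuyerName"),
                    childText(address, "StreetName"),
                    childText(address, "City"),
                    childText(address, "State"),
                    childText(address, "Zip"));
        } catch (Exception e) {
            throw new IllegalStateException("could not read purchase request", e);
        }
    }

    public static void main(String[] args) {
        Buyer b = readBuyer(SAMPLE);
        System.out.println(b.name + ", " + b.street + ", " + b.city);
    }
}
```

Note that the application, not the parser, decides how to walk the tree, which matches the division of labor described for the DOM earlier.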

Some partners may want to evaluate requests whose item quantity is greater than or equal to 500. To facilitate processing, the application programmer may wish to evaluate the Quantity attribute of the "OrderNumber" tag independent of any other information in the document tree. In this case we use the SAX interface, which, among other things, allows us to register a document handler as a callback object that's triggered when a document tag is found. When the tag being processed is equal to "OrderNumber," the Quantity attribute will be evaluated against the quantity rule. In this scenario it's irrelevant that the "OrderNumber" tag is contained inside the "PurchaseRequest" tag. Listing 3 shows the use of the SAX Java API to capture the "OrderNumber" tag from the document information and evaluate its Quantity attribute.
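For readers on a current JDK, the quantity rule can be sketched with the SAX 2 classes bundled with the platform (javax.xml.parsers.SAXParserFactory and org.xml.sax.helpers.DefaultHandler) rather than the SAX 1 interfaces used in Listing 3. The class name QuantityFilter and the sample document in main are assumptions for illustration; the threshold of 500 comes from the scenario above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: use SAX callbacks to pick out OrderNumber tags that satisfy the quantity rule. */
final class QuantityFilter {

    static final int MIN_QUANTITY = 500;

    /** Returns the quantities of all OrderNumber tags with Quantity >= MIN_QUANTITY. */
    static List<Integer> qualifyingQuantities(String xml) {
        final List<Integer> accepted = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Only the tag name matters here; where it sits in the tree is irrelevant.
                if ("OrderNumber".equals(qName)) {
                    String raw = atts.getValue("Quantity");
                    if (raw != null) {
                        int qty = Integer.parseInt(raw.trim());
                        if (qty >= MIN_QUANTITY) {
                            accepted.add(qty);
                        }
                    }
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not scan purchase request", e);
        }
        return accepted;
    }

    public static void main(String[] args) {
        String xml = "<PurchaseRequest>"
                + "<OrderNumber Quantity=\"500\"/>"
                + "<OrderNumber Quantity=\"120\"/>"
                + "</PurchaseRequest>";
        System.out.println(qualifyingQuantities(xml));
    }
}
```

Because the handler never builds a tree, small orders are discarded as soon as their start tag is seen.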

In this particular example, orders for fewer than 500 items will not be processed by the system. Orders for 500 items or more will be processed using the DOM API. The SAX API thus acts as a filter: it lets the application writer screen the information first and keep unprofitable requests from overloading the system.

The definition of a standard representation of the purchase order document enables the various partners to manipulate the information as they see fit, independent of each other. This flexibility can be extended by allowing sophisticated partners to add tags to the document hierarchy. As long as the main tag dependencies are kept, those partners can use the additional information in their transactions. One example is a "DeliveryDateOffset" tag that lets a partner identify a range of days from the "DeliveryDate" tag within which the order can be supplied. If the information is present in the purchase request, the partner can use it. If it's not present, the partner system simply proceeds without it.
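A partner system that understands the extra tag might treat it as optional, along the lines of the following sketch. It assumes, purely for illustration, that "DeliveryDateOffset" holds a whole number of days; the article doesn't define its content, and the OptionalTagReader class is hypothetical.

```java
import java.io.StringReader;
import java.util.OptionalInt;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch: read an optional extension tag and carry on when it is absent. */
final class OptionalTagReader {

    /** Returns the offset in days if a DeliveryDateOffset tag is present, otherwise empty. */
    static OptionalInt deliveryDateOffset(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = doc.getElementsByTagName("DeliveryDateOffset");
            if (hits.getLength() == 0) {
                return OptionalInt.empty();   // tag not supplied; nothing to use
            }
            // Assumption: the tag content is an integer number of days.
            return OptionalInt.of(Integer.parseInt(hits.item(0).getTextContent().trim()));
        } catch (Exception e) {
            throw new IllegalStateException("could not read purchase request", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<PurchaseRequest><OrderNumber Quantity=\"500\">"
                + "<DeliveryDate>November 19, 1999</DeliveryDate>"
                + "<DeliveryDateOffset>3</DeliveryDateOffset>"
                + "</OrderNumber></PurchaseRequest>";
        System.out.println(deliveryDateOffset(xml));
    }
}
```

A partner that doesn't know the tag never asks for it, so the same document serves both kinds of systems.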

The URL class-loading capabilities of the Java 2 platform supplement the XML data model by allowing a document to contain behavior in addition to data. This is accomplished by embedding URL links to Java classes inside a document. Class loading coupled with reflection forms a powerful mechanism that allows Java programs to dynamically download functionality from a partner Web site in order to process new XML tags. However, this particular mechanism requires an adapter-based framework similar to the Beans model that allows application developers to dynamically define interactions between their legacy systems and the newly downloaded functionality. (This mechanism will be covered in a separate article.)
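The loading step itself can be sketched with java.net.URLClassLoader and reflection. The TagHandler interface, the UpperCaseHandler class and the localCodeBase helper are hypothetical names introduced here. So that the example runs without a network, the code base is a local directory and the handler class is actually resolved from the local class path through normal parent delegation; in the scenario described above, the URL would instead come from a link in the document and point at a partner's site. This is a sketch of the mechanism, not the adapter framework the article refers to.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

/** Sketch: load a tag handler class by name through a URL class loader and instantiate it. */
final class TagHandlerLoader {

    /** Hypothetical contract that dynamically loaded handlers are assumed to implement. */
    public interface TagHandler {
        String handle(String tagValue);
    }

    /** Stand-in for a partner-supplied class; in this demo it lives on the local class path. */
    public static class UpperCaseHandler implements TagHandler {
        public String handle(String tagValue) {
            return tagValue.toUpperCase();
        }
    }

    /** Demo code base: a local directory used in place of a partner URL. */
    static URL localCodeBase() {
        try {
            return new File(System.getProperty("java.io.tmpdir")).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Loads className via a URLClassLoader rooted at codeBase and creates an instance reflectively. */
    static TagHandler load(URL codeBase, String className) {
        try (URLClassLoader loader = new URLClassLoader(
                new URL[] {codeBase}, TagHandlerLoader.class.getClassLoader())) {
            Class<?> cls = loader.loadClass(className);
            // asSubclass fails fast if the loaded class does not honor the contract.
            return cls.asSubclass(TagHandler.class).getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new IllegalStateException("cannot load handler " + className, e);
        }
    }

    public static void main(String[] args) {
        TagHandler handler = load(localCodeBase(), UpperCaseHandler.class.getName());
        System.out.println(handler.handle("rush"));
    }
}
```

Code fetched from another party runs with the privileges of the host program, so a real deployment would need to decide which sites it trusts before loading anything.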

In this article we discussed the advantages of marrying the XML and Java technologies. XML is to Java as cream is to coffee; it makes the coffee drinkable. While Java provides a great deal of dynamic behavior through its dynamic class loading and reflection mechanisms, by itself it's not the best way to deal with data format issues. XML takes Java to the next level by providing a flexible and extensible tag definition environment that is machine independent. Java applications coupled with XML data formatting are more capable of adapting to data format changes in a generic, nonprogrammatic way. This shortens time to market and gives developers the ability to react more quickly to market changes.

Author Bio
Israel Hilerio, a member of the technical staff at i2 Technologies, Dallas, is a Sun Certified Java programmer with 10 years of programming experience, including three and a half in Java. He has Ph.D. and MS degrees in computer science and a BS in computer engineering. He can be reached at [email protected]


Listing 1:

<?xml version="1.0"?>
<PurchaseRequest>
  <BuyerAddress>
    <StreetName>909 E. Las Marias</StreetName>
  </BuyerAddress>
  <OrderNumber Quantity="500">
    <ProductName>Siega 400MHz Pentium PC</ProductName>
    <DeliveryDate>November 19, 1999</DeliveryDate>
  </OrderNumber>
</PurchaseRequest>

Listing 2: 

Document doc; 
TXElement root; 
Buyer B; 
Parser p = new Parser("XMLParser"); 
try { 
    FileInputStream file = new FileInputStream(documentName); 
    doc = p.readStream(file); 
    root = (TXElement) doc.getDocumentElement(); 
    TXElement name = (TXElement) root.getElementNamed("Buyer 
    TXElement address = (TXElement)root.getElementNamed("Buy 
    TXElement street =   (TXElement)address.getElement 
    TXElement city = (TXElement)address.getElement- 
    TXElement state = (TXElement)address.getElement- 
    TXElement zip = (TXElement) address.getElementNamed("Zip 
    B = new Buyer(name, street, city, state, zip); 
} catch (java.io.IOException e) {
    // the document could not be read
}

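Listing 3:

try {
     Parser parser = ParserFactory.makeParser(parserClass);
     saxTest handler = new saxTest();
     parser.setDocumentHandler(handler);  // register the callback object
     parser.parse(documentName);          // assumed: the same document as in Listing 2
} catch (Exception e) {
     e.printStackTrace();
}

public void startElement(String name, AttributeList atts) {
     if (name.equals("OrderNumber")) {
       int qty = Integer.parseInt(atts.getValue("Quantity"));
       if (qty >= 500) {
         // the order meets the quantity rule and goes on for processing
       }
     }
}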


All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.