What gets most people excited about Web services is the vision of a
future where disparate applications are hooked together in
innovative, as yet undiscovered ways to solve the next generation of
IT problems.
Web services are intended to create synergy among many applications
working together, so that the whole is greater than the sum of its
parts. To do so, they must be resilient to change and quickly
adaptable to new types of users who want to use them on their own
terms and with their own data formats.
These transformational Web services, as I'll call them, will
rely heavily on XML to meet this demand. There's a good reason why
current Web services protocols such as SOAP and WSDL are defined
using XML. XML is an excellent format for storing and managing
information in a standard, flexible, and extensible manner.
In this article, I'll look at why a native XML database can be a
powerful complement to certain types of Web services.
There are two types of Web services: RPC-based and
message-based. RPC-based Web services are synchronous in nature and
functionally similar to making a remote method invocation.
Message-based Web services are asynchronous and used primarily to
pass business messages (or documents) between services.
To make a J2EE analogy, an RPC-based Web service is very similar
to invoking a method on a session bean, and a message-based Web
service is very similar to placing a message on a
JMS queue.
Web services are described using an XML dialect called Web
Services Description Language (WSDL). WSDL provides a vocabulary for
defining exactly how to access a Web service: whether it's RPC-based
or message-based, what parameters to provide, the URL to use, and so
forth.
Middleware providers have quickly adopted Web services and
included tools with their products for managing SOAP messages with
EJBs and servlets. Many of these tools focus on shielding the J2EE
application from the XML as much as possible, minimizing the amount
of SOAP-specific information that the developer must deal with. In
particular, RPC-based Web services can be implemented with EJBs and
servlets that have no idea they're being invoked by SOAP. The tools
provide facilities where an XML-to-Java mapping can take place and
the Web services implementation deals exclusively with Java data
types.
This is a worthy goal, as a lot of SOAP programming is
low-level and tedious. However, it's important to remember that in some
cases there's real value to preserving the XML associated with a SOAP
message. By discarding the XML that composes the SOAP body, a Web
service is throwing away the ability to manage extensible
information. This extensibility (an inherent trait of XML) is key to
building a Web service whose interface is not tightly bound to a
particular document structure.
Message-based Web services provide convenient gateways for
applications that need to submit requests and then get notified of
the response at some point. There's loose coupling between these Web
services and the caller. Unlike RPC-based Web services, where the
caller must know each parameter of the service in detail,
message-style Web services have the flexibility to accept any
structured information in the SOAP message body. This makes them very
attractive for implementing Web services that deal with more complex
data whose structure might change over time or depending upon the
caller.
Unlike RPC-based services, message-based Web services have direct
access to the different parts of the SOAP message that invoked them.
This provides a great deal of flexibility in how the Web service
chooses to process the message. It also means that the service might
be doing a lot of work with raw XML (e.g., DOM programming) if the
document is complex or large.
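To give a feel for the raw XML work mentioned above, here is a minimal sketch that uses the standard JAXP DOM API shipped with the JDK. The class name, the helper methods, and the abbreviated Acme-style sample document are assumptions made for illustration; they are not part of the Workshop service described later.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls single values out of a purchase order with plain DOM calls.
class DomExtractSketch {

    // Parse an XML string into a DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Return the trimmed text of the first element with the given tag name, or null if absent.
    static String firstText(Document doc, String tagName) {
        NodeList matches = doc.getElementsByTagName(tagName);
        if (matches.getLength() == 0) {
            return null;
        }
        return matches.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Abbreviated sample in the style of the Acme purchase order (illustrative only).
        String po = "<PurchaseOrder num='123' supplier='444'>"
                  + "<Line_Items><Item><itemNumber>4183</itemNumber>"
                  + "<quantity>44</quantity></Item></Line_Items></PurchaseOrder>";
        Document doc = parse(po);
        Element root = doc.getDocumentElement();
        System.out.println("PO number: " + root.getAttribute("num"));
        System.out.println("Item:      " + firstText(doc, "itemNumber"));
        System.out.println("Quantity:  " + firstText(doc, "quantity"));
    }
}
```

Even this small extraction is tied to one caller's element and attribute names. A second caller with a different document structure would need its own version of the same code.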
If many different partners use the Web service, it's
cumbersome for the developer to change the business logic of the Web
service every time a new partner is added. In addition, it's a
challenge to store the data found inside the XML document if every
caller is going to provide a document adhering to a different XML
Schema. To make life simpler, many applications simply discard the
XML after parsing out the relevant bits of information and storing
them in a relational database. But sometimes it makes sense to keep
not only the relevant data, but the XML document itself, as parsing
and storing destroy the structure of the document (and add
considerable overhead to the process).
This is where native XML databases fit in. A native XML
database operates very much like a familiar relational database.
However, instead of managing data as rows and columns, it manages
data as XML. A native XML database is built from the ground up to
store and manage XML in a parsed format. This means there's no
conversion taking place from the XML to some external storage format
(either rows and columns or "BLOBs" of text). Native XML databases
typically outperform relational databases when it comes to storing
and retrieving XML because they store the elements of a document in a
preparsed format.
An XML database is a perfect fit for the needs of the
message-based Web service outlined earlier. It provides a
transactional, secure place to hold incoming messages that won't bog
down the application's database of record. But there's another key
reason for using an XML database: native XSLT support.
XSLT is a language that describes stylesheets for
transforming XML into other formats. Most applications that process
XML in some fashion use XSLT primarily to format the XML before
presenting it to a user. XSLT can do far more than simply convert XML
into HTML or the like. XSLT is ideally suited to transform XML from
one dialect to another. When we consider the needs of a Web service
that must handle multiple incoming messages that may be in different
formats, XSLT is the natural choice for solving this problem. A
native XML database that has a built-in XSLT processor has a major
advantage over in-memory transformations, namely the ability to
handle large document sizes without excessive memory consumption.
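As a small illustration of dialect-to-dialect conversion, the sketch below runs a stylesheet through the JAXP API (javax.xml.transform) that ships with the JDK. Note that this is the in-memory approach, used here only because it is self-contained; it is not the database-side processing the article recommends. The stylesheet, the target element names (order, item, qty), and the class name are assumptions invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DialectTransformSketch {

    // Illustrative stylesheet: maps BigRetail-style elements onto a made-up target dialect.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/PURCHASE_ORDER'>"
      + "<order id='{IDENTIFIER}'>"
      + "<xsl:for-each select='ITEM_DETAIL/LINE_ITEM'>"
      + "<item number='{ITEM_NO}'><qty><xsl:value-of select='QUANTITY'/></qty></item>"
      + "</xsl:for-each>"
      + "</order>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the given stylesheet to the given document, entirely in memory.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Abbreviated sample in the style of the BigRetail purchase order.
        String po = "<PURCHASE_ORDER><IDENTIFIER>765_832</IDENTIFIER>"
                  + "<ITEM_DETAIL><LINE_ITEM><ITEM_NO>3829</ITEM_NO>"
                  + "<QUANTITY>3</QUANTITY></LINE_ITEM></ITEM_DETAIL></PURCHASE_ORDER>";
        System.out.println(transform(po, XSLT));
    }
}
```

Because both the source document and the result are held in the JVM's memory here, the cost grows with document size, which is the concern the paragraph above raises about in-memory processing.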
Let's focus on a concrete example. Imagine a warehouse that
serves a variety of retail suppliers. It contains numerous inventory
types and is constantly expanding the types of inventory it contains.
At different times during the year the warehouse will focus on
certain types of inventory more regularly than others, which means
the population of users is constantly shifting. The warehouse exposes
the ability to request an order via a message-based Web service. I've
chosen to implement this example using the beta version of BEA
WebLogic Workshop.
As you can see in Figure 1, the Web service has a few
methods: checkAvailability(), which finds out whether an order can be
filled with current inventory, and requestShipment(), which starts an
order process. For now we can ignore shipNotice(), which is a
callback when the order is ready to be shipped.
Suppliers will invoke the requestShipment() method via SOAP. The
signature of requestShipment() as seen in Workshop is shown in
Listing 1.
This service lets suppliers provide their own identifier
(<supplier>), an identifier for their purchase order (<poNumber>), and, finally, the purchase order itself (<doc>). The
service is intentionally vague about what the purchase order should
look like so it can handle different kinds.
Currently, the warehouse provides inventory for two
suppliers, Acme and BigRetail. The Acme purchase order in its
entirety can be found in Listing 2, the BigRetail purchase order in
Listing 3. Notice that while they're structurally different, both
documents express basically the same information. What is important
is that both suppliers expect the Web service to behave the same way
for their purchase order.
I implement the method in Workshop by accepting a DOM node
for the document, a string for the poNumber, and an integer for the
supplier code:
public void requestShipment( Node doc, String poNumber, int supplier
) throws Exception
This method needs to store the purchase order somewhere and
then begin the process of filling the order (checking inventory
levels, etc.). Since these processes will need to refer to the
particulars of the purchase order, it would be unwieldy to write
parsing code to handle so many different types of documents. A
cleaner approach is to convert each incoming purchase order into a
"canonical" format and operate only on those documents (see Figure 2).
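To illustrate why a single canonical format simplifies downstream processing, here is a minimal sketch that reads a canonical purchase order with the JDK's XPath API. The element names follow the output of the stylesheet in Listing 5; the class name, the method names, and the two-line sample document are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical order-processing helper that only understands the canonical PO format.
class CanonicalPoSketch {

    // Read the PO number from the canonical header.
    static String poNumber(String canonicalXml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/PO/POHeader/PONumber",
                new InputSource(new StringReader(canonicalXml)));
    }

    // Add up the quantities across all order lines.
    static double totalQuantity(String canonicalXml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Double) xpath.evaluate("sum(/PO/ArrayOfPOLine/POLine/quantity)",
                new InputSource(new StringReader(canonicalXml)), XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        // Made-up canonical document with two order lines.
        String canonical = "<PO><POHeader><PONumber>123</PONumber></POHeader>"
                         + "<ArrayOfPOLine>"
                         + "<POLine><lineNumber>4183</lineNumber><quantity>44</quantity></POLine>"
                         + "<POLine><lineNumber>3829</lineNumber><quantity>3</quantity></POLine>"
                         + "</ArrayOfPOLine></PO>";
        System.out.println("PO number:      " + poNumber(canonical));
        System.out.println("Total quantity: " + totalQuantity(canonical));
    }
}
```

The same two XPath expressions work no matter which supplier sent the original document, since supplier-specific structure has already been transformed away by that point.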
Let's examine the requestShipment method in detail (see
Listing 4). First the code takes the incoming purchase order and
serializes it into an XML string using Apache's
org.apache.xml.serialize.XMLSerializer class. This string is then
passed to the XML database's createXMLFile method. I've chosen to use
eXcelon's eXtensible Information Server (XIS) as the XML database.
I've "wrapped" a subset of the XIS's API using a Web service control
named XMLDatabase. This is a technique specific to Workshop and lets
us keep the database-specific API out of our Web service code. The
XIS presents a filesystem metaphor for storing XML in folders and
documents. The createXMLFile() method takes an XML database name as
the first parameter ("Warehouse"), a path to the filename indicating
where to store the XML (using the PO number as the filename in the
"incoming" directory), the XML string itself, and finally an argument
indicating whether or not to trim white space.
Once the original purchase order is stored in the database,
the method can perform a transformation to a canonical format. The
XSLT stylesheet to use for this transformation is chosen based upon
which supplier number was received. The transformed purchase order
document is returned by the applyTransform() method as a string
that's also stored in the database, this time in the "toBeShipped"
directory.
The code then invokes the processOrder() method (the details
of which I have omitted for brevity) that starts the business process
of filling this order. Once the system has started processing the
order, it can make the callback to issue the ship notice for the
client.
One detail I have not discussed is how to write the XSLT that
transforms the Acme and BigRetail purchase orders into our canonical
format. These stylesheets are not terribly complicated to write;
however, they can be rather tedious to construct. The XIS comes with
a tool called Stylus Studio that helps build stylesheets visually.
Figure 3 shows the XML mapper in Stylus for the Acme purchase order;
Listing 5 is the resulting XSL stylesheet code. These tools can be
helpful for large documents with many processing rules; however, for
simple documents, hand-coded XSLT can be more than adequate.
There are several benefits to using an XML database as the
intermediary for these types of Web services:
1. Everything is persistent: Every business document is
preserved in the database in the event of a system failure. This
includes the supplier's original purchase order that can be referred
to as needed if there is a dispute or question after an order has
been shipped.
2. The Web service does not get involved in transformations: The
role of the Web service is straightforward and does not change much
when a new supplier is brought on board. As I have coded it, the
requestShipment() method would need a new case added to its switch
statement for each new supplier. However, even this could be parameterized
by storing the stylesheet names in the database keyed by supplier
identifier.
3. Scalability is greatly enhanced: By offloading the XSLT
processing to an external entity, this Web service can be used by
many suppliers simultaneously without concern for memory consumption
issues. Unlike an in-memory XSLT processor, the database server will
process large documents, not the application server hosting the Web
service. For extremely large (multimegabyte) documents, this can have
a profound effect on scalability.
4. Extensibility is preserved: For example, if BigRetail decided
to change the <IDENTIFIER> element to include children (for instance
<SUB_ID>), this could cause major code changes at the warehouse. To
make matters worse, if BigRetail expected that the warehouse would
respond with an XML invoice using the same element-child structure,
the impact on the code could be widespread. By localizing the XML
transformations into XSLT, such extensions are quickly handled in a
single spot without massive code changes.
5. Knowledge can be gleaned from a supplier's behavior: Related
to the first point on persistence, because everything is retained by
the system, reports and analysis can be performed on the purchase
orders to gain knowledge about the business. For example, if a
particular supplier is repeatedly requesting the same item during the
same week each month, you might want to target that supplier with
discounts on related products the week prior to help increase sales.
Without retaining this information, such opportunities would be lost.
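The second point in the list above suggests keying stylesheet names by supplier identifier rather than hard-coding them in a switch statement. A minimal sketch of that idea follows. The in-memory map is a stand-in for a lookup table stored in the database, and the class and method names are hypothetical; only the stylesheet names and the error message are taken from Listing 4.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry mapping supplier codes to stylesheet names.
class StylesheetRegistrySketch {

    // Stand-in for a table held in the database (assumption for illustration).
    private final Map<Integer, String> stylesheets = new HashMap<>();

    // Bringing a new supplier on board becomes a data change, not a code change.
    void register(int supplier, String stylesheetName) {
        stylesheets.put(supplier, stylesheetName);
    }

    // Look up the stylesheet for a supplier, rejecting unknown codes.
    String stylesheetFor(int supplier) {
        String name = stylesheets.get(supplier);
        if (name == null) {
            throw new IllegalArgumentException("Invalid supplier code " + supplier + " seen.");
        }
        return name;
    }

    public static void main(String[] args) {
        StylesheetRegistrySketch registry = new StylesheetRegistrySketch();
        registry.register(1, "acme.xsl");
        registry.register(2, "bigRetail.xsl");
        System.out.println(registry.stylesheetFor(2));
    }
}
```

With a lookup like this, the switch statement in requestShipment() could shrink to a single call, and the method would not need to be recompiled when a supplier is added.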
Summary
Transformational Web services can be easily constructed by
leveraging the strengths of a native XML database and XSLT. This
architecture lends itself to building Web services that adapt to new
types of incoming messages and are tolerant of change.
While the purchase order example discussed here is relatively
simple, you can imagine more complex
Web services that rely on the transformational power of XSLT and the
persistence of an XML database. For example, imagine a Web service
that provides a unified view of a loan application, taking financial
information from a variety of banks and institutions for an
individual and providing most of the application prepopulated.
Another example might be a Web service that acts as a
middleman and clearinghouse for exchanging documents pertaining to
complex legal transactions. It would be nearly impossible to provide
an infrastructure for such services without the flexible storage and
transformation facilities provided by XSLT and a native XML database.
Author Bio
Bill Dettelback is a systems architect with eXcelon Corporation.
Prior to joining eXcelon, he worked as a senior member of the
technical staff at AT&T building AI-based customer care applications.
Bill holds a BS and an MS in computer science from the New Jersey
Institute of Technology.
billd@exln.com
Listing 1: requestShipment method declaration
<requestShipment xmlns="http://openuri.org/">
  <doc>{doc}</doc>
  <poNumber>{poNumber}</poNumber>
  <supplier>{supplier}</supplier>
</requestShipment>
Listing 2: Acme purchase order
<PurchaseOrder num='123' supplier='444'>
  <PO_Hdr time='10:34:00'>
    <customerID>JONES</customerID>
    <address>123 AnyStreet</address>
    <city>New York</city>
    <state>NY</state>
    <ZIP>10036</ZIP>
    <country>USA</country>
  </PO_Hdr>
  <Line_Items>
    <Item>
      <itemNumber>4183</itemNumber>
      <quantity>44</quantity>
      <unitPrice>12.34</unitPrice>
    </Item>
  </Line_Items>
</PurchaseOrder>
Listing 3: BigRetail purchase order
<PURCHASE_ORDER>
  <SUPP_ID>38203928</SUPP_ID>
  <IDENTIFIER>765_832</IDENTIFIER>
  <DATE_TIME>MARCH 23 2002 10:12:28</DATE_TIME>
  <SUB_TOT>190.47</SUB_TOT>
  <SHIP_INFO>
    <SHIP_ADDR1>653 Frontier Blvd.</SHIP_ADDR1>
    <SHIP_CITY>New York</SHIP_CITY>
    <SHIP_STATE>NY</SHIP_STATE>
    <SHIP_CODE>10036</SHIP_CODE>
    <SHIP_COUNTRY>USA</SHIP_COUNTRY>
  </SHIP_INFO>
  <LINE_COUNT>1</LINE_COUNT>
  <ITEM_DETAIL>
    <LINE_ITEM>
      <LINE_NO>1</LINE_NO>
      <ITEM_NAME>Fetzer Valve</ITEM_NAME>
      <ITEM_NO>3829</ITEM_NO>
      <QUANTITY>3</QUANTITY>
      <MEASURE>EACH</MEASURE>
      <COST>63.49</COST>
      <TOTAL>190.47</TOTAL>
    </LINE_ITEM>
  </ITEM_DETAIL>
  <NOTES>
    CONTACT PHIL UPON ARRIVAL
  </NOTES>
</PURCHASE_ORDER>
Listing 4: requestShipment method definition
public void requestShipment( Node doc, String poNumber, int supplier ) throws Exception
{
    // Store the incoming PO: serialize the DOM node back into an XML string
    OutputFormat format = new OutputFormat();
    StringWriter stringOut = new StringWriter();
    XMLSerializer serial = new XMLSerializer( stringOut, format );
    serial.asDOMSerializer();
    // Take the second child of <doc> as the PO root; index 1 assumes a
    // leading white-space text node sits at index 0
    NodeList childs = doc.getChildNodes();
    Element root = (Element) childs.item(1);
    serial.serialize( root );
    String filePath = "/incoming/" + poNumber.trim() + ".xml";
    XMLDatabase.createXMLFile( "Warehouse", filePath, stringOut.toString(), false );

    // Transform the PO into our internal format & store
    String transformXSLT = null;
    switch (supplier)
    {
        case 1:
            transformXSLT = "acme.xsl";
            break;
        case 2:
            transformXSLT = "bigRetail.xsl";
            break;
        default:
            throw new Exception("Invalid supplier code " + supplier + " seen.");
    }
    String transformedPO = XMLDatabase.applyTransform
        ( "Warehouse", filePath, transformXSLT );
    XMLDatabase.createXMLFile( "Warehouse", "/toBeShipped/"
        + poNumber.trim() + ".xml", transformedPO, false );

    // Start the business process, then issue the ship notice callback
    processOrder( poNumber );
    callback.shipNotice( poNumber );
}
Listing 5: Acme purchase order XSLT generated by Stylus Studio
<?xml version="1.0" encoding="ucs-2"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="PurchaseOrder">
      <PO>
        <xsl:for-each select="PO_Hdr">
          <POHeader>
            <PONumber>
              <xsl:value-of select="../@num"/>
            </PONumber>
            <timestamp>
              <xsl:value-of select="@time"/>
            </timestamp>
            <supplierID>
              <xsl:value-of select="../@supplier"/>
            </supplierID>
            <xsl:for-each select="customerID">
              <customerID>
                <xsl:value-of select="."/>
              </customerID>
            </xsl:for-each>
            <xsl:for-each select="address">
              <shipToLine1>
                <xsl:value-of select="."/>
              </shipToLine1>
            </xsl:for-each>
            <xsl:for-each select="city">
              <shipToCity>
                <xsl:value-of select="."/>
              </shipToCity>
            </xsl:for-each>
            <xsl:for-each select="state">
              <shipToState>
                <xsl:value-of select="."/>
              </shipToState>
            </xsl:for-each>
            <xsl:for-each select="ZIP">
              <shipToPostalCode>
                <xsl:value-of select="."/>
              </shipToPostalCode>
            </xsl:for-each>
            <xsl:for-each select="country">
              <shipToCountry>
                <xsl:value-of select="."/>
              </shipToCountry>
            </xsl:for-each>
          </POHeader>
        </xsl:for-each>
        <xsl:for-each select="Line_Items">
          <ArrayOfPOLine>
            <xsl:for-each select="Item">
              <POLine>
                <xsl:for-each select="itemNumber">
                  <lineNumber>
                    <xsl:value-of select="."/>
                  </lineNumber>
                </xsl:for-each>
                <xsl:for-each select="quantity">
                  <quantity>
                    <xsl:value-of select="."/>
                  </quantity>
                </xsl:for-each>
                <xsl:for-each select="unitPrice">
                  <unitPrice>
                    <xsl:value-of select="."/>
                  </unitPrice>
                </xsl:for-each>
              </POLine>
            </xsl:for-each>
          </ArrayOfPOLine>
        </xsl:for-each>
      </PO>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>