In the past there seemed to be two more or less exclusive routes to
integration: "roll your own" or buy an EAI product. Typically,
developers would choose the first option for maximum flexibility,
while project managers preferred the second, for consistency and
security.
Now, XML and Web services standards can offer lower cost options for
enterprise integration, and have helped to promote the emergence of a
new class of integration tool, the Enterprise Service Bus (ESB). Over
the last year the ESB has emerged as a "middle way" between these two
approaches, providing a developer-friendly integration platform, open
to all the latest Java standards and components, without sacrificing
the benefits of a packaged approach to integration.
This article goes behind the scenes to examine why an ESB can be a
useful part of your integration toolbox, and describes the key
features and qualities that an ESB built using Java technology must
have to fulfill its promise as a capable and cost-effective XML
integration engine.
Growing Demand for Integration
Over the last decade and a half, the integration market has grown
from a few low-level file-transfer and screen-scraping products to a
massive software industry. Message-oriented middleware; extract,
transform and load (ETL) tools; and EAI integration brokers have
emerged, driven by the inexorable spread of the relational database
and packaged business applications.
At first, business applications addressed individual functions in the
organization - accounting, manufacturing, and payroll, for example.
But it didn't take long for IT departments to appreciate the benefits
of sharing information between these separate systems. Programs were
written to transfer key information, at first on a monthly or weekly
basis. Faster and cheaper computing power - the effect described by
Moore's law - has since accelerated those transfers to hourly and
even real-time exchange. The "zero latency enterprise" has arrived.
More and more, today's businesses need agility - they want to be able
to modify systems quickly and cost-effectively by simply changing how
a few components interoperate, rather than by building or buying
whole new systems. Most of that demand is tactical, with rapid
project timescales and lower project costs; return on investment in
months rather than years is the order of the day. And now, even small
businesses can afford well-integrated computer applications -
something that was a costly luxury even for large organizations 20
years ago.
The economics of integration product development have also changed.
Application vendors are increasingly conforming to standards like Web
services and the Java Connector Architecture (JCA); a small industry
has grown up to offer adapters to so-called "Enterprise Information
Systems" (EIS; the generic term for applications and databases).
Removing the burden of EIS adapter development has moved the focus
towards the EAI vendors' traditional weak spots: transformation,
developer-friendly programmatic processing (in Java, for example),
and integration to external technologies. This has opened the
integration market to more nimble newcomers, and has led to the
emergence of the so-called Enterprise Service Bus category of
products.
What Is an ESB?
There is no standard definition of exactly what features an ESB
should support, but there is wide agreement over the general outline.
An ESB will offer most of the following:
Event-driven, document-oriented processing model: Based on XML
standards, it delivers an asynchronous service-oriented architecture
Content-based routing and filtering
Complex transformation capability
Support for several standard interfaces: Drawn from a list including
COM, CICS, .NET, JMS, JCA, JDBC, and, of course, Web services
Distributed operation and management: Rather than a centralized
integration hub
Service-Oriented Architecture
Perhaps the oldest "new thing" in the equation is the
service-oriented architecture. First conceived alongside technologies
like DCE and CORBA, service-oriented architecture separates the core
of an application into a series of fairly coarse-grained components -
chunks of business logic that provide "services" to client programs.
These services can be shared between presentation (GUI) and
integration applications (for example, as steps in a business
process).
Processes communicate only through documented interface "contracts,"
which reinforce the design principles of modularity and
encapsulation. Hiding the details of how each module works - and
tying down its specification - makes it easier for external
developers to reuse the component.
A particular class of service-oriented architecture uses complete
documents - business messages if you like - to trigger each step in
an overall business transaction. That's because there may be no
single database that the process steps can all share. This style of
interaction can be called "document-centric processing," to
distinguish it from the database-centric model commonly used to build
internal systems such as ERP and CRM. The document - the message
passed from system to system - contains all necessary data items, and
takes the same role as input parameters in a well-structured function
call. Indeed, it is more flexible; because the document encapsulates
all necessary data, there is no need for each step to know which
specific data items are needed by the next step.
Using message-oriented middleware, it's easy to plug services
together. And because it's easier, it's cheaper too.
Interfaces tend to be simpler and cleaner. An enterprise can assemble
systems from best-of-breed pieces, which may be hosted in different
departments or even companies. There is a clear separation of
concerns; components - each of which can be owned, developed, and
deployed autonomously - can be separately scaled, replicated, or
replaced. Different services can have different development life
cycles, languages, and platforms. An enterprise can mix and match its
30-year-old mainframe systems - using IBM WebSphere MQ to kick off
CICS transactions - with J2EE/Unix, Windows/.NET, and any other
legacy processing (see Figure 1).
Easier integration also promotes more frequent reuse rather than
redevelopment of components. When millions of dollars are invested in
software assets, the last thing a business wants is to discard them
just because they don't match its current preferred platform.
Content-Based Filtering and Routing
A key requirement of any integration platform is to identify which
data to process and where it needs to be sent. Selection rules -
typically specified using XPath - identify the key sections of an XML
document. These sections can be extracted and routed to processing
components, and the results delivered to the next processing stage,
based on their content. For example:
/order/item[price>10.80]
will select all order items with a price greater than 10.80, while
/order[count(item)=1]
selects all orders that are for just a single item.
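As a minimal sketch, the two selection rules can be tried out with the standard javax.xml.xpath API. The sample order document, its element names, and the class and method names are assumptions made for illustration only. The second rule is written as /order[count(item)=1], because an XPath predicate has to be attached to a step.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathSelectionSketch {

    // Hypothetical sample order; the element names are assumed for illustration.
    static final String ORDER =
        "<order>"
      + "<item><name>widget</name><price>12.50</price></item>"
      + "<item><name>washer</name><price>0.40</price></item>"
      + "<item><name>gasket</name><price>10.80</price></item>"
      + "</order>";

    // Evaluates an XPath expression against an XML string and
    // returns the number of nodes that the expression selects.
    static int countMatches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or document: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Only the 12.50 item is strictly greater than 10.80: prints 1
        System.out.println(countMatches(ORDER, "/order/item[price>10.80]"));
        // The sample order has three items, so it is not selected: prints 0
        System.out.println(countMatches(ORDER, "/order[count(item)=1]"));
    }
}
```

Note that the comparison is strict, so an item priced at exactly 10.80 is not selected by the first rule.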
There's real value in exposing these selection rules as configuration
metadata rather than simply hiding them in code. Business
requirements are changing faster than applications can be created and
modified. Rules offer a way of encapsulating the business semantics
and promoting them to the surface, where they are much easier and
cheaper to manage.
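One way to picture rules held as configuration metadata is a small routing table that maps destination names to XPath expressions. The sketch below is only an illustration: the rule syntax, the destination names (highValueQueue, singleItemQueue), and the class itself are hypothetical, and a real ESB would have its own configuration format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class RoutingRulesSketch {

    // Hypothetical rule text, standing in for an external configuration file.
    // Each line has the form: destination = XPath expression.
    static final String RULES =
        "highValueQueue = /order/item[price>10.80]\n"
      + "singleItemQueue = /order[count(item)=1]\n";

    // Parses the rule text into an ordered map of destination -> expression.
    // Only the first '=' on a line is treated as the separator, so the
    // expression itself may contain '='.
    static Map<String, String> parseRules(String text) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int separator = line.indexOf('=');
            if (separator > 0) {
                rules.put(line.substring(0, separator).trim(),
                          line.substring(separator + 1).trim());
            }
        }
        return rules;
    }

    // Parses the document once, then returns every destination whose rule
    // selects at least one node in it.
    static List<String> route(String xml, Map<String, String> rules) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> destinations = new ArrayList<>();
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                Boolean matched = (Boolean) xpath.evaluate(
                    rule.getValue(), doc, XPathConstants.BOOLEAN);
                if (matched) {
                    destinations.add(rule.getKey());
                }
            }
            return destinations;
        } catch (Exception e) {
            // Sketch only: parser, I/O, and XPath failures are reported alike.
            throw new IllegalArgumentException("Could not route document", e);
        }
    }

    public static void main(String[] args) {
        String order = "<order><item><price>12.50</price></item></order>";
        System.out.println(route(order, parseRules(RULES)));
    }
}
```

Changing where a document goes then means editing a line of rule text rather than recompiling code, which is the point made above.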
Transformation Capability
As it moves from system to system, data needs to be validated,
translated, and enriched by calculations or lookups. These operations
may be performed by the ESB itself, or they may be delegated to
external systems (by synchronous or asynchronous calls). The ESB
offers an environment in which these actions can be processed
independently (and concurrently) using "actions," which are
processing rules written in a range of scripting or declarative
languages. XSLT is one common approach. But although XSLT engines are
easily available at no cost, the engines can be rather slow and XSLT
itself struggles to achieve the kind of complex transformations
needed for application integration. ESB vendors need to add their own
more powerful transformation engines so that documents can be
processed swiftly and efficiently and then reassembled into one or
more output documents that can be routed - based on their content -
to the next step in an orchestrated sequence.
One important transformation capability is "any to XML" - translating
legacy formats (EDI, structured files, and relational data) into XML
for processing and onward transmission, and from XML back to legacy
formats where required. Because of its semantic power, XML provides
an excellent interchange and internal processing format even for
non-XML inputs and outputs. The ESB executes these syntactic
transformations at the edge - as documents enter and leave the engine
- ensuring that a single technology can be used for core processing.
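To make the "any to XML" step concrete, here is a small sketch that turns one delimited legacy record into an XML fragment using the standard DOM and javax.xml.transform APIs. The pipe-delimited layout and the field names (sku, quantity, price) are assumptions chosen for the example, not a format described in this article.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class AnyToXmlSketch {

    // Converts a hypothetical legacy record of the form "sku|quantity|price"
    // into an <item> element serialized as a string.
    static String recordToXml(String record) {
        String[] fields = record.split("\\|");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + record);
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
            Element item = doc.createElement("item");
            doc.appendChild(item);
            appendField(doc, item, "sku", fields[0]);
            appendField(doc, item, "quantity", fields[1]);
            appendField(doc, item, "price", fields[2]);

            // An identity transform serializes the DOM tree; special
            // characters such as '&' are escaped by the serializer.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("XML output failed", e);
        }
    }

    // Adds a child element with the given name and text value.
    private static void appendField(Document doc, Element parent, String name, String value) {
        Element child = doc.createElement(name);
        child.setTextContent(value);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        System.out.println(recordToXml("A100|3|12.50"));
    }
}
```

Doing this conversion at the edge means every later step can rely on XML tools such as XPath, whatever the original format was.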
Support for Standard Interfaces
Of course, the ESB will support Web services interfaces, together
with JMS for enterprise messaging, JCA for applications, and JDBC for
databases. Widely adopted "industry standards" - proprietary
interfaces like IBM WebSphere MQ messaging and CICS transaction
monitor, and Microsoft's MSMQ messaging, for example - may also be
supported. Increasingly, applications and services will communicate
pragmatically, based on widely used protocols such as POP3/SMTP and
instant messaging, as well as HTTP, SOAP, and MOM. This makes it
easier to manage and share infrastructure between applications and
humans.
Once any middleware (such as a MOM) has been deployed, it can be hard
to replace or upgrade. That's why so many organizations have two or
more existing middleware products deployed, and why it is such a big
mistake to tie an ESB to any single vendor's MOM. An ESB needs to
adapt to the realities of today's businesses: bridging between
multiple MOMs, as well as bridging between asynchronous (messaging)
and synchronous (RPC - Remote Procedure Call) domains. An ESB is
there to support and integrate whatever is already in use, not to
replace it.
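Bridging the synchronous and asynchronous domains usually comes down to correlating a reply message with the caller that is waiting for it. The following sketch shows that idea with plain java.util.concurrent classes. The in-memory queue stands in for a MOM destination, and the echo service, class names, and message shape are all hypothetical; a real bridge would sit on JMS or a similar API.

```java
import java.util.Locale;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SyncAsyncBridgeSketch {

    // A message carries a correlation ID so that replies can be matched to requests.
    static final class Message {
        final String correlationId;
        final String body;
        Message(String correlationId, String body) {
            this.correlationId = correlationId;
            this.body = body;
        }
    }

    // In-memory stand-in for an asynchronous request destination.
    private final BlockingQueue<Message> requests = new LinkedBlockingQueue<>();
    // Callers currently blocked, keyed by correlation ID.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Synchronous (RPC-style) facade: sends a message and blocks until the
    // correlated reply arrives or the timeout expires.
    String call(String body, long timeoutMillis) {
        String id = UUID.randomUUID().toString();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(id, reply);
        requests.add(new Message(id, body));
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for reply", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("No reply for request " + id, e);
        } finally {
            pending.remove(id);
        }
    }

    // Asynchronous side: called whenever a reply message arrives.
    void onReply(Message reply) {
        CompletableFuture<String> waiting = pending.get(reply.correlationId);
        if (waiting != null) {
            waiting.complete(reply.body);
        }
    }

    // Hypothetical service that consumes requests and replies in upper case.
    Thread startEchoService() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Message request = requests.take();
                    onReply(new Message(request.correlationId,
                                        request.body.toUpperCase(Locale.ROOT)));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        SyncAsyncBridgeSketch bridge = new SyncAsyncBridgeSketch();
        bridge.startEchoService();
        System.out.println(bridge.call("ping", 2000)); // prints PING
    }
}
```

The timeout matters: a synchronous caller cannot wait forever on a messaging system that offers no delivery-time guarantee.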
Distributed Operation and Management
Early integration products tended to follow a hub-and-spoke
architecture - a straightforward approach that simplifies integration
topology and centralizes configuration and management. In a batch
environment, the slight risk of failure could be insured against by
operating a standby hub for cold failover.
As software pervades the modern business, and interchange frequency
increases, the centralized approach is no longer tenable and the
costs of interruption are too great. Businesses need distributed
networks with no single point of failure - to optimize network
traffic, to provide alternate routes in case of failure, and to
distribute the processing load. Offering services closer to
participating applications - where possible - spreads the processing
load as well as reducing network bandwidth utilization.
Of course this complicates system management, so it must be possible
to manage today's globally distributed ESB from a single operational
console, and to adjust its configuration without bringing down the
entire integration network. Downtime must be kept to a minimum -
remember that "five nines" (99.999%) availability allows for just
over five minutes of downtime per year, and even four nines leaves
less than an hour.
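The availability figures are easy to check. This short sketch computes the yearly downtime budget for a given availability percentage; it assumes a 365-day year, and the class and method names are illustrative.

```java
import java.util.Locale;

final class DowntimeBudgetSketch {

    // Minutes in a 365-day year (leap years ignored for simplicity).
    static final double MINUTES_PER_YEAR = 365.0 * 24 * 60; // 525,600

    // Returns the permitted downtime in minutes per year for an
    // availability expressed as a percentage, for example 99.999.
    static double downtimeMinutesPerYear(double availabilityPercent) {
        return MINUTES_PER_YEAR * (1.0 - availabilityPercent / 100.0);
    }

    public static void main(String[] args) {
        for (double availability : new double[] {99.999, 99.99}) {
            System.out.println(String.format(Locale.ROOT,
                "%.3f%% -> %.2f minutes per year",
                availability, downtimeMinutesPerYear(availability)));
        }
    }
}
```

Five nines works out to roughly 5.26 minutes per year, and four nines to roughly 52.56 minutes.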
What's in It for Developers?
Developers have often disliked integration products, preferring to
use frameworks of their own devising. The main reasons given for this
are:
Original cost of licenses and training: Understandable when products
were inflexible, and product licenses came in at six- or even
seven-figure sums.
Inflexibility of the purchased product: Sometimes it seems
you have to spend as much time working around the product as working
with it.
Poor interoperation with other development and management
infrastructure: When all your deliverables must go through
development and testing into production configurations, you need a
product that's open to development environments and automated
configuration management.
However, there are long-term implications in adopting a "roll your
own" policy. Will the organization be able to support the framework
going forward? Use of a product can deliver significant productivity
gains, reducing development time and cost by allowing the developer
to focus on business logic rather than framework building or
infrastructure coding, so that he or she needs only to write a
fraction of the code normally required. Moreover, the code is fully
open and transferable and no new skills are required, enabling rapid
adoption and return on investment. Now that license costs have
fallen, and products are much easier to work with, there's really no
excuse for taking this kind of risk.
Existing developers and business users can rapidly deliver new
solutions with minimum disruption to existing systems and maximum
leveraging of existing assets and skills. XML-based ESB products can
complement and readily integrate with familiar IDEs and configuration
management systems; this is much less intrusive on the development
process.
Challenges for XML Processing Architectures
One perceived impediment to the exponential uptake of XML is that it
can deliver a rather verbose representation of a chunk of data.
Mushrooming quantities of data being exchanged present a serious
challenge to corporate infrastructure. This is not caused just by the
addition of all those "human readable" tags. XML succeeds precisely
because its structure is so flexible and extensible. Just as the
introduction of the relational model made it easy to add new tables
and columns to a database, so XML's syntactic power allows
organizations to refine and specialize XML Schemas to handle all of
their specific information needs. If necessary, entire document
sub-trees can be added onto a standard schema without breaking
third-party functionality. By encapsulating all contextual
information needed by a series of business processes, XML encourages
the development of loosely coupled, "document-oriented" processing.
New systems are easier to plug in as messages can be extended to
include all required data - there's no need for a multiplicity of
calls back and forth for additional items.
First-generation XML integration pilots successfully demonstrated the
potential of the technology, but they are under increasing strain as
the quantity, size, and complexity of XML documents continue to
grow. XML cannot be adopted for core systems until these problems are
recognized and addressed. Among the causes of poor scalability are:
Whole document parsing: You need to parse entire documents
just to extract a snippet of content for routing and filtering. As
documents get bigger, this results in increasing latency.
Multiple scanning: Documents are often re-parsed at every
stage in a business flow, with the same document being scanned
several times by parsers, XSLT routines, conversion to object
networks, etc. This is exceptionally resource intensive, with a major
impact on performance and throughput. Working around this problem by
passing documents as object networks just tightly couples every step
to the chosen representation, aggravating the cost of development and
maintenance.
Single-threaded execution: A processing step cannot start
until the previous step is completed; latency increases as everything
daisy-chains at the pace of the slowest step.
Cut-and-paste development: Dealing with multiple nonidentical
sources results in identical elements of processing being replicated
into many processing steps; XSLT processing is hard to modularize.
The number of similar transformations adds hugely to the development
and maintenance costs, and impedes new business development by
lengthening time to market.
Routes to Scalability
What is needed is a processing architecture that helps the
architect/developer organize:
Document streaming: Processes XML documents as each element arrives,
which keeps latency very low; this approach handles large messages
as efficiently as small ones.
Selective processing: Dramatic performance improvements come from
processing (and handing around) only relevant fragments rather than
the entire XML document.
Multithreading: The engine manages pipelining of sequential steps,
parallel execution of independent steps, and load balancing of
identical steps while processing these multiple XML fragments (see
Figure 2).
Single scanning: Extracting all interesting document
fragments in one pass up front, rather than repeatedly re-reading the
same document structure.
A broker that can manage these techniques without requiring expert
coding and configuration massively reduces the risks of project
failure and consequent business damage caused by inadequate
performance. Core applications, previously out of bounds by reason of
data volume and complexity, can now be confidently addressed.
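The streaming, selective, single-scan ideas can be sketched with the standard StAX API (javax.xml.stream), which reads a document as a sequence of events rather than building the whole tree. This is an illustration under assumptions, not how any particular ESB is implemented: the order layout, with each item holding a name followed by a price, and the class and method names are hypothetical, and the multithreaded hand-off of fragments is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StreamingSelectionSketch {

    // Hypothetical sample order; the element names are assumed for illustration.
    static final String ORDER =
        "<order>"
      + "<item><name>widget</name><price>12.50</price></item>"
      + "<item><name>washer</name><price>0.40</price></item>"
      + "<item><name>gasket</name><price>10.80</price></item>"
      + "</order>";

    // Makes a single forward pass over the document and returns the names of
    // items priced above the threshold. No in-memory tree is built, and only
    // the two fields of interest are ever read as text.
    static List<String> namesOfItemsAbove(String xml, double threshold) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            String currentName = null;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String tag = reader.getLocalName();
                if ("item".equals(tag)) {
                    currentName = null; // start of a new fragment
                } else if ("name".equals(tag)) {
                    currentName = reader.getElementText();
                } else if ("price".equals(tag)) {
                    double price = Double.parseDouble(reader.getElementText());
                    if (price > threshold) {
                        // A result is available as soon as the fragment is read,
                        // before the rest of the document has arrived.
                        names.add(currentName);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesOfItemsAbove(ORDER, 10.80)); // prints [widget]
    }
}
```

Because each matching fragment is identified during the one pass, a fuller implementation could hand it to a worker thread at that point instead of collecting it in a list.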
Conclusion
There has been a tectonic shift away from complex, proprietary,
centralized, and costly integration brokers toward more lightweight,
distributed, standards-based and inexpensive enterprise service bus
technology; this has been driven by the adoption of open standards
like XML, Web services, and J2EE.
The ESB offers a powerful and extensible integration platform that
supports your development aims without forcing you to adopt
proprietary technology, retrain your staff, or radically change your
development methodology. By leveraging the substantial community
investment in XML and other technologies, ESBs can equal or better
the capabilities of earlier integration brokers at a far lower cost,
bringing integration capabilities to smaller businesses and more
marginal projects. Easy adoption of familiar technologies further
reduces the cost of uptake and helps maximize return on investment
for any size of business.
Author Bios
Nigel Thomas offers independent product marketing consultancy in the
application infrastructure software marketplace. He recently spent
two years as director of product management for SpiritSoft's Java
messaging, caching, and integration products. Prior to that, he spent
five years with EAI pioneer Constellar as product architect, and then
as director of product management for the flagship Constellar Hub
product.
nigel.thomas@lyntonresearch.com.
Warren Buckley is chief technology officer and cofounder of
PolarLake, and has been working in the area of XML and Web services
since 1998. Warren was responsible for the development of a number
of patent-pending approaches to optimizing XML processing that
underpin PolarLake's enterprise-strength application integration
products. He was previously CTO of XIAM Ltd, a leading mobile
middleware company, and systems architect for Bank of Ireland Group
Treasury.
warren.buckley@polarlake.com.
All Rights Reserved
Copyright © 2004 SYS-CON Media, Inc.