In addition, it was suggested that traditional Online Transaction Processing
systems (OLTP) don’t suffer from such limitations, rendering them more suitable
for the emerging e-commerce applications that may require such guarantees.
This article discusses this question and shows that there’s nothing
inherently wrong with these new models that prevents applications from using
them to obtain end-to-end transactionality. However, before addressing the
question of whether or not any specific transaction system can be used to
provide end-to-end transactional guarantees, it’s important to realize the
following: end-to-end transactionality is not some holy grail that people
have been searching for in myths and legends; it’s a solution to one specific
problem area, not a global panacea to all transaction issues. The ability
or lack thereof to guarantee end-to-end transaction integrity does not in
and of itself prevent a specific transaction system provider from tackling
many other equally important issues in today’s evolving world of e-commerce
and mobile applications.
What Is End-to-End Transactionality?
Let’s consider what such end-to-end guarantees are. Atomic transactions
(transactions) are used in application programs to control the manipulation
of persistent (long-lived) objects. Transactions have the following ACID properties:
• Atomic: If interrupted by failure, all effects are undone
(rolled back).
• Consistent: The effects of a transaction preserve invariant
properties.
• Isolated: A transaction’s intermediate states are not
visible to other transactions. Transactions appear to execute serially, even
if they’re performed concurrently.
• Durable: The effects of a completed transaction are persistent;
they’re never lost (except in a catastrophic failure).
A transaction can be terminated in two ways: committed or aborted (rolled
back). When a transaction is committed, all changes made within it are made
durable (forced on to stable storage, e.g., disk). When a transaction is
aborted, all the changes are undone. Atomic transactions can also be nested;
the effects of a nested transaction are provisional upon the commit/abort
of the outermost (top-level) atomic transaction.
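The nesting rule can be illustrated with a minimal in-memory sketch. The class ProvisionalTransaction and its methods are hypothetical names chosen for this illustration, not part of JTA or the OTS, and a plain Map stands in for durable storage.

```java
import java.util.HashMap;
import java.util.Map;

final class NestedTransactionSketch {
    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>(); // stand-in for stable storage
        ProvisionalTransaction top = ProvisionalTransaction.topLevel(store);
        ProvisionalTransaction nested = top.beginNested();
        nested.put("seat", "12A");
        nested.commit(); // provisional: only the parent sees the change
        System.out.println("after nested commit: " + store.containsKey("seat"));
        top.commit();    // the top-level commit makes the change durable
        System.out.println("after top-level commit: " + store.get("seat"));
    }
}

/** Hypothetical transaction whose nested commits are provisional on the top level. */
final class ProvisionalTransaction {
    private final ProvisionalTransaction parent; // null for a top-level transaction
    private final Map<String, String> store;
    private final Map<String, String> writes = new HashMap<>();

    private ProvisionalTransaction(ProvisionalTransaction parent, Map<String, String> store) {
        this.parent = parent;
        this.store = store;
    }

    static ProvisionalTransaction topLevel(Map<String, String> store) {
        return new ProvisionalTransaction(null, store);
    }

    ProvisionalTransaction beginNested() {
        return new ProvisionalTransaction(this, store);
    }

    void put(String key, String value) {
        writes.put(key, value);
    }

    void commit() {
        if (parent != null) {
            parent.writes.putAll(writes); // hand the changes to the enclosing transaction
        } else {
            store.putAll(writes);         // only the top level touches the store
        }
        writes.clear();
    }

    void abort() {
        writes.clear(); // undo everything this transaction (and its committed children) did
    }
}
```

If the top-level transaction aborts instead, the nested transaction’s committed work is discarded along with everything else.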
Commit Protocol
A two-phase commit protocol is required to guarantee that all the transaction
participants either commit or abort any changes made. Figure 1 illustrates
the main aspects of the commit protocol: during phase one, the transaction
coordinator, C, attempts to communicate with all the transaction participants,
A and B, to determine whether they’ll commit or abort. An abort reply from
any participant acts as a veto, causing the entire transaction to abort.
Based on these responses (or the lack of them), the coordinator decides whether to
commit or abort the transaction. If the transaction will commit, the coordinator
records this decision on stable storage and the protocol enters phase two,
where the coordinator forces the participants to carry out the decision.
The coordinator also informs the participants if the transaction aborts.
Figure 1:
When each participant receives the coordinator’s phase-one message,
it records sufficient information on stable storage to either commit or
abort changes made during the transaction. After returning the phase-one
response, each participant that returned a commit response must remain blocked
until it has received the coordinator’s phase-two message. Until it receives
this message, the resources it controls are unavailable for use by other transactions.
If the coordinator fails before this message is delivered, these resources
remain blocked. However, if crashed machines eventually recover, crash recovery
mechanisms can be employed to unblock the protocol and terminate the transaction.
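The two phases described above can be sketched in a few lines of Java. This is an illustration only: TpcCoordinator, TpcParticipant, TpcVote, and VotingParticipant are hypothetical names, an in-memory list stands in for the coordinator’s stable storage, and crash recovery is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TwoPhaseCommitSketch {
    public static void main(String[] args) {
        VotingParticipant a = new VotingParticipant(true);
        VotingParticipant b = new VotingParticipant(false); // B will veto
        boolean committed = new TpcCoordinator().complete(Arrays.asList(a, b));
        System.out.println("committed=" + committed + ", A=" + a.state + ", B=" + b.state);
    }
}

enum TpcVote { COMMIT, ABORT }

/** Hypothetical participant contract: one phase-one call, two phase-two calls. */
interface TpcParticipant {
    TpcVote prepare();
    void commit();
    void rollback();
}

final class TpcCoordinator {
    // Stand-in for stable storage; a real coordinator forces this to disk.
    final List<String> decisionLog = new ArrayList<>();

    boolean complete(List<? extends TpcParticipant> participants) {
        // Phase one: collect votes; any abort (or failure to answer) is a veto.
        boolean commit = true;
        for (TpcParticipant p : participants) {
            TpcVote vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = TpcVote.ABORT; // treat a missing response as an abort vote
            }
            if (vote == TpcVote.ABORT) {
                commit = false;
                break;
            }
        }
        // Record a commit decision before telling anyone about it.
        if (commit) {
            decisionLog.add("COMMIT");
        }
        // Phase two: every participant is told the outcome.
        for (TpcParticipant p : participants) {
            if (commit) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return commit;
    }
}

final class VotingParticipant implements TpcParticipant {
    private final boolean willCommit;
    String state = "ACTIVE";

    VotingParticipant(boolean willCommit) {
        this.willCommit = willCommit;
    }

    public TpcVote prepare() {
        if (!willCommit) {
            return TpcVote.ABORT;
        }
        state = "PREPARED"; // from here on, blocked until phase two arrives
        return TpcVote.COMMIT;
    }

    public void commit() { state = "COMMITTED"; }

    public void rollback() { state = "ROLLED_BACK"; }
}
```

Note that the sketch says nothing about where the coordinator and participants run; they could share a process or sit on different machines.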
A participant’s role depends on the application in which it occurs.
For example, a J2EE JTA XAResource is a participant that typically controls
the fate of work performed on a database (e.g., Oracle) within the scope
of a specific transaction.
Note: The two-phase commit protocol that most transaction systems
use is not client/server-based. It simply talks about a coordinator and participants,
and makes no assumption about where they’re located. Different implementations
of the protocol may impose certain restrictions about locality, but these
are purely implementation choices.
What Is an End-to-End Transaction?
In this section I use the Web as an example of where end-to-end transactional
integrity is required. However, there are other areas (e.g., mobile) that
are equally valid. Transposing the issues mentioned here to these other areas
should be relatively straightforward.
Atomic transactions, with their “all-or-nothing” property, are a well-known
technique for guaranteeing application consistency in the presence of failures.
Although Web applications exist that offer transactional guarantees to users,
these guarantees extend only to resources used at or between Web servers;
clients (browsers) are not included, even though they play a significant
role in certain applications, as mentioned earlier.
Therefore, providing end-to-end transactional integrity between the
browser and the application (server) is important, as it allows work that
involves both the browser and the server to be atomic. However, current techniques
based on CGI scripts can’t provide end-to-end guarantees. As shown in Figure
2, the user selects a URL that references a CGI script on a Web server (message
one), which then performs the transaction and returns a response to the browser
(message two) after the transaction is complete. Returning the message during
the transaction would be incorrect, since the transaction may still fail to commit its changes.
Figure 2:
In a failure-free environment, this mechanism works well. However,
in the presence of failure it’s possible for message two to be lost between
the server and the browser, resulting in work at the server not being atomic
with respect to any browser-related work. Thus, there’s no end-to-end transactional
guarantee.
Is End-to-End Transactionality Possible?
Yes. There’s nothing inherent in any transaction protocol (two-phase,
three-phase, presumed abort, presumed nothing, etc.) that prevents end-to-end
transactionality. The transaction engine (essentially the coordinator) has
very little effect on this, whether or not it’s embedded in a proprietary
service or within an industry standard application server. What does make
end-to-end transactionality difficult is that it requires a transactional
participant to reside at each “end,” but it does not require a transaction
coordinator and its associated baggage to reside at each “end.”
A contract exists between transaction coordinator and participants,
which states (in brief and with many simplifications):
• Once the transaction coordinator has decided to commit or roll back
the transaction, it guarantees delivery of this information to every participant,
regardless of failures. Note: There are various optimizations to this,
such as a presumed abort protocol, but in essence the contract remains the
same.
• Once told to prepare, a participant will durably record sufficient information
for it to either commit or cancel the work that it controls. Until it determines
the final outcome, it should neither commit nor cancel the work itself. If
a failure occurs, or the final outcome of the transaction is slow in arriving,
the resource can typically communicate with the coordinator to determine
the current progress of the transaction.
In most transaction systems the majority of the effort goes into designing
and developing the transaction-coordinator engine, making it as performant
and reliable as possible. However, this by itself is insufficient to provide
a usable system: participants are obviously required. Although any contract-conformant participant implementation can
be plugged into the two-phase protocol, typically the only ones that most
people use are those that control work performed on (distributed) databases,
e.g., the aforementioned XAResource. As a result, many people equate
transactions with databases only, and hence with the significant
resources that such participant implementations require. However,
this need not be the case: a participant need only be as resource hungry as is necessary
in order to fulfill the contract. Thus, a participant could use a local file
system to make state information durable, for example, or it could use nonvolatile
RAM. It depends on what the programmer deems necessary.
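As a rough illustration of how small such a participant can be, the following sketch keeps its prepared state in a single file on the local file system. FileBackedParticipant and its method names are assumptions made for this example rather than any standard API; a real participant would also force the write to the device (for instance with FileChannel.force) and register itself with a coordinator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class FileParticipantSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("participant");
        FileBackedParticipant p = new FileBackedParticipant(dir.resolve("order.txt"));
        p.stage("order confirmed");
        if (p.prepare()) {
            p.commit();
        }
        System.out.println(p.committedValue());
    }
}

/** Hypothetical lightweight participant that uses the local file system for durability. */
final class FileBackedParticipant {
    private final Path committed; // state visible after commit
    private final Path prepared;  // phase-one record; its presence means "in doubt"
    private String pending;

    FileBackedParticipant(Path committed) {
        this.committed = committed;
        this.prepared = committed.resolveSibling(committed.getFileName() + ".prepared");
    }

    void stage(String value) {
        pending = value;
    }

    /** Phase one: write enough to disk to go either way; true means "vote commit". */
    boolean prepare() {
        if (pending == null) {
            return false;
        }
        try {
            Files.write(prepared, pending.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            return false; // cannot make the state durable, so veto
        }
    }

    /** Phase two, commit: promote the prepared record to the committed state. */
    void commit() {
        try {
            Files.move(prepared, committed, StandardCopyOption.REPLACE_EXISTING);
            pending = null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Phase two, abort: discard the prepared record. */
    void rollback() {
        try {
            Files.deleteIfExists(prepared);
            pending = null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** After a restart, a leftover prepared record means the outcome is still unknown. */
    boolean inDoubt() {
        return Files.exists(prepared);
    }

    String committedValue() {
        try {
            return Files.exists(committed)
                    ? new String(Files.readAllBytes(committed), StandardCharsets.UTF_8)
                    : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the prepared record survives a restart, a fresh instance pointed at the same path can detect that it is in doubt and then apply whatever outcome the coordinator reports.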
With the advent of Java it’s possible to empower thin (resource scarce)
clients (e.g., browsers) so they can fully participate within transactional
applications. Transaction participants tailored to the application and environment
where they’ll work can be downloaded (on demand) to the client to ensure
that they can fully participate within the two-phase commit protocol. There’s
nothing special about specific transaction system implementations that makes
them more easily adapted to this kind of environment. The specialization
comes from the end-point resource – the client-side participant.
OLTP vs OO-TP
Are “traditional” online transaction processing (OLTP) engines more
suited to end-to-end transactionality guarantees than “newer” object-oriented
transaction systems? The quick answer is no. Why should they be? It doesn’t
matter whether a transaction system is supported by a proprietary remote
procedure call (RPC) and stub generation techniques or by an open-standard
remote object invocation mechanism such as a CORBA ORB; once the distributed
layers are removed, they all share the same core – a two-phase commit protocol
engine that supports durable storage and failure recovery. How that engine
is invoked, and how it invokes its participants, is immaterial to its overall
workings.
The real benefit of OO-TP over OLTP is openness. Over the past eight
years there’s been a significant move away from proprietary transaction processing
systems and their support infrastructure toward open standards. This move has
been driven by users who traditionally found it extremely difficult to move
applications from one vendor’s product to another, or even between different
versions of a product from the same vendor. The OMG pioneered this approach
with the Object Transaction Service (OTS) in 1995, when IBM, HP, Digital,
and others got together to provide a means whereby their existing products
could essentially be wrapped in a standard veneer; this approach allowed
applications developed with this veneer to be ported from one implementation
to another, and for different implementations to interact (something else
that was extremely difficult to do reliably).
Therefore, it’s inaccurate to conclude that OLTP systems are superior
in any way to OO-TP equivalents because of their architecture, support environment,
or distribution paradigm. OLTP systems are typically monolithic closed systems,
tying users to vendor-specific choices for implementations, languages, etc.
If the experiences gained by the developers of efficient and reliable implementations
of OLTP are transposed to OO-TP, then there’s nothing to prevent such an
OO-TP system from competing well. The advantages should be obvious: open
systems allow customers to pick and choose the components they require to
develop their applications without worrying about vendor lock-in. Such systems
are also more readily ported to new hardware and operating systems, allowing
customers even more choice for deployment.
The OTS
The OTS architecture provides standard interfaces to components that
are possessed by all transaction engines. It doesn’t modify the model underlying
all the existing different transaction monitor implementations; it mandates
a two-phase commit protocol with presumed abort and all implementations of
the OTS must comply with this. It was intended as an adjunct to these systems,
not as a replacement.
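The presumed abort rule is simple enough to sketch: the coordinator only needs a durable record of commit decisions, and any inquiry about a transaction it has no record of is answered with abort. The names CommitDecisionLog and TxOutcome are hypothetical, and an in-memory set stands in for the coordinator’s stable storage; this is not the OTS interface itself.

```java
import java.util.HashSet;
import java.util.Set;

final class PresumedAbortSketch {
    public static void main(String[] args) {
        CommitDecisionLog log = new CommitDecisionLog();
        log.recordCommit("tx-1");
        System.out.println("tx-1 -> " + log.outcomeOf("tx-1"));
        System.out.println("tx-2 -> " + log.outcomeOf("tx-2")); // never logged
    }
}

enum TxOutcome { COMMIT, ABORT }

/** Hypothetical coordinator-side log that records commit decisions only. */
final class CommitDecisionLog {
    // Stand-in for stable storage.
    private final Set<String> committed = new HashSet<>();

    /** Called once the coordinator has decided to commit, before phase two starts. */
    void recordCommit(String txId) {
        committed.add(txId);
    }

    /** Answers a participant's inquiry: no record means the transaction is presumed aborted. */
    TxOutcome outcomeOf(String txId) {
        return committed.contains(txId) ? TxOutcome.COMMIT : TxOutcome.ABORT;
    }
}
```

An in-doubt participant that asks about a transaction after a coordinator crash therefore always gets a definite answer, even though abort decisions were never written down.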
No company that has spent many years building up reliability in such
a critical piece of software as a transaction system would be prepared to start from
scratch and implement it again. In addition, no user of such reliable transaction
software would be prepared to take the risk of transitioning to this new
software, even if it were “open.” In the area of transactions, which are
critical fault-tolerance components, it takes time to convince customers
that new technology is stable and performant enough to replace what they’ve
been using for many years.
Although the CORBA model is typically discussed in terms of client/server
architecture, from the outset its designers did not want to impose any restrictions
on the type of environment in which it could run. There’s no assumption about
how “thin” a client is, or how “fat” a server must be, in order to execute
a CORBA application. Many programmers these days simply use the client/server
model as a convenient way in which to reason about distributed applications.
But at their core these applications never have what would traditionally
be considered a thin client; services that a user requires may well be colocated
with that user within the same process. CORBA was the first open architecture
to support the configurable deployment of services in this way, correctly
seeing this separation of client and service as just that: a deployment issue.
Nothing in the CORBA model requires a client to be thin and functionally
deficient.
The OTS is comparable to CICS, Tuxedo, and DEC ACMS. It differs only
in that it’s a standard and allows interoperation. There’s nothing fundamentally
wrong with the OTS architecture that prevents it from being used in an end-to-end
manner. The OTS supports end-to-end transactionality in exactly the same
way CICS or any other “traditional” OLTP would – through its resources.
To ORB or Not to ORB?
There also appears to be some confusion as to whether CORBA implementations
(ORBs) are sufficient for mission-critical applications. In the mid-’90s
when implementations first appeared on the market, their performance was
not as good as handcrafted solutions. However, that has certainly changed
over the past few years. The footprint of some ORBs is certainly large, but
there are other ORBs that have been tailored specifically for real-time or
embedded use. Companies such as IBM, IONA, and BEA have seen ORBs develop
over the years to become a critical part of the infrastructure that they
have to control and therefore they have their own implementations. Other
companies have licensed ORB implementations from elsewhere.
After a crash failure, an ORB typically can’t recover
automatically and continue applications from where they left off. This is because
the ORB doesn’t have the semantic and syntactic information necessary to automatically
checkpoint state for recovery purposes. However, by using suitably defined
services such as the OTS and the persistence service, it’s possible for applications
to do this themselves or for vendors to use these services to do this for
applications.
It’s been said that OLTP systems provide this kind of feature out-of-the-box,
and it may be true. However, it’s an unfair comparison: an OLTP system does
just one thing and does it well – it manages transactions. A CORBA ORB is
meant to provide support for arbitrary distributed applications, the majority
of which probably won’t even need fault tolerance, let alone transactions.
However, for those applications that do need these capabilities, it’s entirely
possible to provide exactly the same recovery functionality using OMG open
standards.
J2EE
Although the J2EE model is client/server-based, as with CORBA there’s nothing to prevent a client from being
rich in functionality. It’s a deployment choice that’s made at build-time
and runtime (obviously, the capability for being so rich is required to be
built into the client, and even if it were present, it would be up to the
user to determine whether or not such functionality was required or possible).
Note: Although J2EE didn’t start out as an infrastructure that used
CORBA, it quickly became evident that the OMG companies’ experiences in developing
CORBA were extremely important to any distributed system. As a result, over
the past few years J2EE has gotten closer and closer to the CORBA world,
and now requires some critical ORB components in order to run.
The typical way in which J2EE programmers use transactions is through
the Java Transaction API (JTA), which is a mandated part of the specification.
The JTA is intended as a higher-level API for programmers to try to isolate
them from some of the more complex (and sometimes esoteric) aspects of constructing
transactional applications. The JTA does not imply a specific underlying
implementation for a transaction system, so it could be layered on CICS,
Tuxedo, etc. However, because the OTS is now the transaction standard for
most companies and it allows interoperation between different implementations,
it was decided that the preferred implementation would be based on this (called
the JTS, just to place it firmly in the Java domain). The JTS is currently
optional, but it may eventually become a mandated part of the J2EE specification.
Application Server Means Thin Client?
No. As shown above, this is essentially a deployment issue. It’s certainly
correct to say that most J2EE programmers currently use a thin(-ish) client,
with most of the business logic residing within the server; however, this
is simply because this solution matches 90% of the problems. Closer examination
of all application server applications would certainly reveal that although
thin clients are the norm, they are by no means the complete picture.
There’s nothing fundamentally wrong with the application server model
either. Nothing says that the client side of an application server application
has to be wafer-thin. If the client wants to embed functionality such as
a two-phase aware transactional resource within itself, that’s entirely possible.
In fact, a client could just as easily be embedded within an application
server, if the footprint allowed. The reasons for not doing this have more
to do with the footprint size than any architectural issue.
Conclusion
Can end-to-end transactional guarantees be provided by modern transaction
systems such as JTA? Yes, as we’ve shown there’s nothing inherent in these
models that prevents them from providing such guarantees. The two-phase commit
protocol doesn’t know anything about clients or servers, doesn’t make assumptions
about the locality of the coordinator or participants, and doesn’t require
any semantic knowledge of the applications. End-to-end transactional guarantees
are simply a deployment view on the relative locality of different participants.
Are OLTP systems more suited to end-to-end guarantees than their modern
OO-TP cousins? As we’ve shown, since they’re both based on two-phase commit
protocols, there’s nothing in either model that would mean they are any more
or less ideal for any specific problem domain. However, there are obvious
design and implementation decisions that can be made when building a transaction
system using either model, which may mean that specific instances are not
best suited for end-to-end transactional solutions. It’s important to realize
that this is an implementation choice only.
References
• Little, M.C., Shrivastava, S.K., Caughey, S.J., and Ingham, D.B. (1997). “Constructing
Reliable Web Applications Using Atomic Actions.” Proceedings of the Sixth
Web Conference. April.
• Little, M.C. (1997). “Providing End-to-End Transactional Web Applications
Using the Object Transaction Service.” OMG Success Story.
• “CORBAservices: Common Object Services Specification.” OMG Document
Number 95-3-31. March 1995.
• “Java Transaction API 1.0.1 (JTA).” Sun Microsystems. April 1999.
• “Java Transaction Service 1.0 (JTS).” Sun Microsystems. December 1999.
Recently, the question was asked whether or not the models on which current
transaction systems are based (e.g., JTA and JTS) are powerful enough to
support end-to-end transactional guarantees for applications.
Author Bio
Dr. Mark Little is an engineer/architect for HP Arjuna Labs in Newcastle
upon Tyne, England, where he leads the HP-TS and HP-WST teams. He is
HP’s representative on the OTS Revision Task Force and the OASIS Business
Transactions Protocol specification.
mark_little@hp.com