SOAP's Two Messaging Styles
Balancing their abilities against your needs
To RPC, or not to RPC: that is the question. Whether 'tis nobler in the mind to suffer the control and dependency of coupling, or to take arms against a sea of troubles, and by opposing, end them?
The Simple Object Access
Protocol (SOAP) offers two
messaging styles: RPC (Remote
Procedure Call) and document
style. One is for creating tightly coupled,
inter-object style interfaces for
Web services components; the other
is for developing loosely coupled,
application-to-application and system-to-system
interfaces. Some of
you may have questions about the differences
in the styles or the problems they are
designed to solve. My goal here is to answer
those questions. I'll first present the two styles
in enough detail for you to gain an appreciation
of their relative strengths and weaknesses;
I'll then look at guidelines for their use.
The first question you may have is: what is
an RPC? An RPC is a way for an application
running in one execution thread on a system
to call a procedure belonging to another
application running in a different execution
thread on the same or a different system. RPC
interfaces are based on a request-response
model where one program calls, or requests a
service of, another across a tightly coupled
interface. In Web services applications, one
service acts as a client, requesting a service;
the other as a server, responding to
that request. RPC interfaces have two
parts: the call-level interface seen by
the two applications, and the underlying
protocol for moving data from
one application to the other.
The call-level interface to an RPC
procedure looks just like any other
method call in the programming language
being used. It consists of a
method name and a parameter list. The
parameter list is made up of the variables
passed to the called procedure and those
returned as part of its response. This is true on
both sides of the interface. Both sides believe
they are calling, or are being called by, a locally
running procedure. Wiring in between
hides the complexity of moving data between
the two applications.
For Web services, SOAP defines the wiring
between the calling and called procedures. At
the SOAP level, the RPC interface appears as a
series of highly structured XML messages
moving between the client and the server
where the <Body> of each SOAP message contains
an XML representation of the call or
return stack.
The transformation from call-level interface
to XML and back occurs through the
magic of two processes – marshaling and serialization.
Figure 1 illustrates the major components
and steps involved in this process.
The process begins with the client calling a
method implemented as a remote procedure.
The client actually calls a proxy stub
that acts as a surrogate for the real procedure.
The proxy stub presents the same
external interface to the caller as would the
real procedure, but instead of implementing
the procedure's functionality, implements
the processes necessary for preparing
and transporting data across the interface.
The proxy stub gathers the parameters it
receives through its parameter list into a
standard form, in this case, into a SOAP message,
through a process called marshaling.
The proxy stub encodes the parameters as
appropriate during the marshaling process
to ensure the recipient can correctly interpret
their values. Encoding may be as simple
as identifying the correct structure and
data type as attributes on the XML tag
enclosing the parameter's value or as complex
as converting the content to a standard
format such as Base64. The final product of
the marshaling process is a SOAP message
representation of the call stack.
The proxy stub serializes the SOAP message
across the transport layer to the server.
Serialization involves converting the SOAP
message into a TCP/IP buffer stream and
transporting that buffer stream between the
client and the server.
The server goes through the reverse
process to extract the information it needs. A
listener service on the server deserializes the
transport stream and calls a proxy stub on the
server that unmarshals the parameters,
decodes and binds them to internal variables
and data structures, and invokes the called
procedure. The listener process may be, for
example, a J2EE servlet, JSP (JavaServer Page),
or Microsoft ASP (Active Server Page). The
client and server reverse roles and the inverse
process occurs to return the server's response
to the client.
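The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration, not how any particular SOAP toolkit works: the procedure `add`, the type tables, and the `Response` suffix on the reply element are assumptions made for the example, and the "transport" is a plain function call standing in for the network.

```python
import xml.etree.ElementTree as ET

ENV = "http://schemas.xmlsoap.org/soap/envelope/"
XSD = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("env", ENV)
ET.register_namespace("xsi", XSI)

# Illustrative type tables; a real toolkit covers many more types.
ENCODE = {int: "xsd:int", float: "xsd:double", str: "xsd:string"}
DECODE = {"xsd:int": int, "xsd:double": float, "xsd:string": str}


def marshal(element_name, named_values):
    """Marshal (name, value) pairs into a SOAP envelope and serialize it to bytes."""
    envelope = ET.Element(f"{{{ENV}}}Envelope")
    # Declared by hand because the xsd prefix only appears inside xsi:type values.
    envelope.set("xmlns:xsd", XSD)
    body = ET.SubElement(envelope, f"{{{ENV}}}Body")
    struct = ET.SubElement(body, element_name)
    for name, value in named_values:
        child = ET.SubElement(struct, name)
        child.set(f"{{{XSI}}}type", ENCODE[type(value)])
        child.text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def unmarshal(message):
    """Deserialize bytes and decode the first Body child into (name, values)."""
    struct = ET.fromstring(message).find(f"{{{ENV}}}Body")[0]
    values = [DECODE[child.get(f"{{{XSI}}}type")](child.text) for child in struct]
    return struct.tag, values


def add(a, b):
    """Hypothetical remote procedure living on the server."""
    return a + b


PROCEDURES = {"add": add}


def server_listener(request):
    """Server side: unmarshal, invoke the procedure, marshal the response."""
    method, args = unmarshal(request)
    result = PROCEDURES[method](*args)
    return marshal(method + "Response", [("return", result)])


def add_proxy(a, b, transport=server_listener):
    """Client-side proxy stub: same signature as add(), but it only moves data."""
    response = transport(marshal("add", [("a", a), ("b", b)]))
    _, values = unmarshal(response)
    return values[0]


print(add_proxy(100, 20))
```

The caller of `add_proxy` never sees XML, which is the point of the stub. For brevity the sketch compares the `xsd:` prefix literally; a careful implementation would resolve the prefix to its namespace first.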
You may be curious about the distinction I
make between marshaling and serialization,
having seen the terms used interchangeably. I
distinguish between them because with Web
services different standards define the rules
for the two processes. SOAP defines the rules
for marshaling and encoding data into XML
messages, but doesn't specify how data is
actually serialized across the interface. SOAP
can bind to any protocol (usually either HTTP
or Simple Mail Transfer Protocol [SMTP]) for
serialization, which means the specifications
for those protocols actually define the serialization
rules.
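To make the split concrete, here is a hedged sketch of the HTTP side of a SOAP 1.1 exchange using Python's standard `urllib.request`. The endpoint URL and the action URI are hypothetical, and the request is only constructed, never sent.

```python
import urllib.request


def build_soap_http_request(url, soap_action, envelope_xml):
    """Wrap an already-marshaled SOAP 1.1 envelope in an HTTP POST request."""
    payload = envelope_xml.encode("utf-8")  # serialization: text to a byte stream
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # The SOAP 1.1 HTTP binding carries the request's intent in this header.
            "SOAPAction": f'"{soap_action}"',
        },
    )


ENVELOPE = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">'
    "<env:Body><SomeMethod/></env:Body></env:Envelope>"
)

# Hypothetical endpoint and action URI; nothing is sent until urlopen() is called.
request = build_soap_http_request(
    "http://www.xyz.com/soap", "http://www.xyz.com/sm#SomeMethod", ENVELOPE
)
print(request.get_method(), request.full_url)
```

Nothing in the function looks inside the envelope: the marshaling rules belong to SOAP, while the framing and delivery rules belong to HTTP.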
Section 7 of the SOAP 1.1 specification defines
the rules for marshaling RPC calls into XML
messages (the most recent version of the
SOAP 1.2 specification moves this information
to the Adjuncts section, but the rules remain
the same). Section 7 says to encode RPC
method calls and responses as hierarchical
XML elements, or structures, where: the
root-level element name is the method name in
the case of the request and an arbitrary value
in the case of the response; the structure's
child elements are the method's parameters or
return values; and each parameter or return
value's elements are the data value or values it
represents.
Section 5 of the SOAP specification defines
SOAP's built-in rules for encoding data values.
Encoding is necessary any time the recipient
needs to interpret an element's value as something
other than a literal string, for example as an integer,
floating point number, or MIME type.
XML Schema offers an increasingly popular
alternative that has all but made Section
5 encoding obsolete. Listings 1 and 2 illustrate the two
options for a skeletal RPC method call; the
encodingStyle attribute tells the recipient
which scheme is being used.
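One way a recipient might act on the encodingStyle attribute is sketched below. The two sample messages are trimmed versions of the skeletal calls in Listings 1 and 2. The decoder for the `http://www.xyz.com/sm` style is purely an assumption, since the article does not say what that schema declares; here it is taken to mean "every item is an integer".

```python
import xml.etree.ElementTree as ET

ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
SM = "http://www.xyz.com/sm"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def decode_section5(param):
    """Section 5 style: every item announces its own type through xsi:type."""
    casts = {"xsd:int": int, "xsd:double": float}
    return [casts.get(item.get(f"{{{XSI}}}type"), str)(item.text) for item in param]


def decode_sm(param):
    """Schema style: types come from the schema, assumed here to say 'integers'."""
    return [int(item.text) for item in param]


DECODERS = {SOAP_ENC: decode_section5, SM: decode_sm}


def decode_first_param(soap_xml):
    """Pick a decoder from the envelope's encodingStyle and apply it."""
    envelope = ET.fromstring(soap_xml)
    style = envelope.get(f"{{{ENV}}}encodingStyle")
    call = envelope.find(f"{{{ENV}}}Body")[0]
    return DECODERS[style](call[0])


SECTION5_CALL = f"""<env:Envelope xmlns:env="{ENV}" env:encodingStyle="{SOAP_ENC}"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="{XSI}">
  <env:Body><SomeMethod><param>
    <item xsi:type="xsd:int">100</item><item xsi:type="xsd:int">20</item>
  </param></SomeMethod></env:Body>
</env:Envelope>"""

SCHEMA_CALL = f"""<env:Envelope xmlns:env="{ENV}" env:encodingStyle="{SM}">
  <env:Body><sm:SomeMethod xmlns:sm="{SM}"><param>
    <item>100</item><item>20</item>
  </param></sm:SomeMethod></env:Body>
</env:Envelope>"""

print(decode_first_param(SECTION5_CALL), decode_first_param(SCHEMA_CALL))
```

Both calls yield the same two integers. The difference is where the type knowledge lives: in the message itself for Section 5 encoding, or in an external schema that both parties share.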
With this background on RPC style in
place, the next question is: how does document-style
messaging differ? The difference is
primarily in the control you have over the
marshaling process. With RPC-style messaging,
standards govern that process. With document-style
messaging, you make the decisions:
you convert data from internal variables
into XML; you place the XML into the
<Body> element of the encapsulating SOAP
document; you determine the schema(s), if
any, for validating the document's structure;
and you determine the encoding scheme, if
any, for interpreting data item values. The
SOAP document simply becomes a wrapper
containing whatever content you decide. For
example, the SOAP document shown in
Listing 3 contains an XML namespace reference,
http://www.xyz.com/genealogy, that
presumably includes all the information a
receiving program needs for validating the
message's structure and content, and for correctly
interpreting data values.
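A rough sketch of that "wrapper" idea, using Python's standard ElementTree: the application builds whatever document it likes and the SOAP layer simply drops it into the Body. The function names are illustrative, and the document shape follows the genealogy example of Listing 3.

```python
import xml.etree.ElementTree as ET

ENV = "http://schemas.xmlsoap.org/soap/envelope/"
GENEALOGY = "http://www.xyz.com/genealogy"
ET.register_namespace("env", ENV)
ET.register_namespace("xyz", GENEALOGY)


def build_family_document(father, mother):
    """Application-defined marshaling: the author decides the document's shape."""
    family = ET.Element(f"{{{GENEALOGY}}}family")
    parents = ET.SubElement(family, "parents")
    ET.SubElement(parents, "father").text = father
    ET.SubElement(parents, "mother").text = mother
    return family


def wrap_in_envelope(document):
    """Place any XML document, unchanged, inside a SOAP Body."""
    envelope = ET.Element(f"{{{ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{ENV}}}Body").append(document)
    return envelope


message = wrap_in_envelope(build_family_document("Richard", "Kim"))
print(ET.tostring(message, encoding="unicode"))
```

Note that `wrap_in_envelope` knows nothing about genealogy. Unlike the RPC case, no standard dictates how the payload maps to program variables.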
Figure 2 illustrates the steps in a typical
document-style message exchange. If you
compare the steps involved in this process
with those involved in processing an RPC-style
message from Figure 1, you will notice
they are essentially parallel processes.
The SOAP client uses an Extensible
Stylesheet Language Transformation (XSLT)
and the DOM parser, or some other means,
to create an XML document.
The SOAP client places this XML document
into the <Body> of a SOAP message.
The SOAP client optionally includes a
namespace reference in the message that
other applications can use for validating the
encapsulated document's format and content.
The namespace reference may be
included as an attribute either on one of
the SOAP elements or on the XML document's
root element. If the document does
not include a namespace reference, the
client and server must agree on some other
scheme for validating and interpreting the
document's contents.
The SOAP client serializes the message to
the SOAP server across either an HTTP or
SMTP bound interface.
The SOAP server reverses the process,
potentially using a different XSLT, to validate,
extract, and bind the information it needs
from the XML document to its own internal
variables. The roles reverse and the two follow
inverse processes for returning and accessing
any response values. The rules guiding the
marshaling process are the primary difference
between this process and that for RPC-style
messages. With document-style, you as the
SOAP client's author create those rules.
Strengths and Weaknesses
Now that we've looked at both styles in
some detail, we can discuss their relative
strengths and weaknesses.
RPC-style messaging maps to the object-oriented,
component-technology space. It is
an alternative to other component technologies
such as DCOM and CORBA where component
models are built around programmable
interfaces and languages such as Java and
C#. RPC-style messaging's strength in this
space lies in its platform independence. It
offers a standards-based, platform-independent
component technology, implemented over
standard Internet protocols. One of the benefits
of this style's XML layer is that clients and
servers can use different programming languages,
or technologies, to implement their
respective side of the interface, which means
one side can choose one set of technologies,
such as J2EE's JAX-RPC, while the other
chooses a completely different set, such as
.NET's C#. RPC-style messaging's standards
heritage can be an important consideration in
hybrid environments (those using multiple
technologies such as J2EE and .NET) and can
provide a transition path between different
technologies.
RPC-style messaging's weaknesses include:
Strong coupling: If you change the number,
order, or data types of the parameters to the
call-level interface, you must make the
change on both sides of the interface.
Synchronicity: Most programming languages
assume synchronous method calls:
the calling program normally waits for the
called program to execute and return any
results before continuing. Web services are
asynchronous by nature and, in comparison
to technologies such as DCOM and
CORBA, long running. You may want to take
advantage of Web services' asynchronous
nature to avoid the user having to wait for
calls to complete by developing asynchronous
RPC calls, but that adds another level
of complexity to your application. Some
tools hide this complexity using callbacks,
or other techniques, to enable processing
overlap between the request and the
response. Check to see if the tools you're
using let you choose between synchronous
and asynchronous RPC calls.
Marshaling and serialization overhead:
Marshaling and serializing XML is more
expensive than marshaling and serializing a
binary data stream. With XML, at least one
side of the interface, and possibly both,
involves some parsing in order to move
data between internal variables and the
XML document. There is also the cost of
moving encoded text, which can be larger
in size than its binary equivalent, across the
interface.
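The size part of the overhead argument is easy to see with two integers. The sketch below compares a fixed-layout binary encoding with a small XML fragment carrying the same values. The element names mirror the listings, and the `type` attribute is a simplification of `xsi:type`.

```python
import struct
import xml.etree.ElementTree as ET

values = (100, 20)

# Binary marshaling: two 32-bit big-endian integers, 8 bytes in total.
binary = struct.pack(">ii", *values)

# XML marshaling of the same two values, with type information on each item.
param = ET.Element("param")
for value in values:
    item = ET.SubElement(param, "item")
    item.set("type", "xsd:int")  # simplified; real SOAP uses xsi:type
    item.text = str(value)
text = ET.tostring(param)

print(len(binary), len(text))

# Reading the values back: a fixed-layout unpack versus a parse plus conversion.
assert struct.unpack(">ii", binary) == values
assert tuple(int(item.text) for item in ET.fromstring(text)) == values
```

The XML form is several times larger before any SOAP envelope is added, and reading it back requires a parser rather than a fixed-offset copy.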
How do these drawbacks compare to those
found in other component technologies? The
coupling and synchronicity issues are common
to RPC-based component technologies,
so they are really not discriminators when
making comparisons between these technologies.
The marshaling and serialization overhead
is greater for RPC-style messaging and
places this messaging style at a relative disadvantage.
However, with today's high-speed
processors and networks, performance is generally
not an issue.
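The callback technique mentioned under synchronicity can be approximated with Python's standard `concurrent.futures`. This is only a sketch of the pattern: `remote_add` is a stand-in for a blocking proxy-stub call, and the sleep imitates network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_add(a, b):
    """Stand-in for a blocking proxy-stub call; the sleep imitates latency."""
    time.sleep(0.1)
    return a + b


completed = []


def on_response(future):
    """Callback run once the response has arrived."""
    completed.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(remote_add, 100, 20)  # returns immediately
    future.add_done_callback(on_response)
    overlapped = sum(range(1000))  # the caller keeps working in the meantime
    result = future.result()  # block only when the value is actually needed

print(result, completed, overlapped)
```

The extra moving parts (a pool, a future, a callback) are the added level of complexity the article warns about, which is why tool support for asynchronous calls is worth checking.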
Document-style messaging is clearly an
option in any situation where an XML document
is one of the interface parameters. It
is ideal for passing complex business documents,
such as invoices, receipts, customer
orders, or shipping manifests. Documentstyle
messaging uses an XML document
and a stylesheet to specify the content and
structure of the information exchanged
across the interface, making it an obvious
choice in situations where a document's
workflow involves a series of services where
each service processes a subset of the information
within the document. Each service
can use an XSLT to validate, extract, and
transform only the elements it needs from
the larger XML document; with the exception
of those elements, the service is insensitive
to changes in other parts of the document.
The XSLT insulates the service from
changes in the number, order, or type of
data elements being exchanged. As long as
the service creating the document maintains
backwards compatibility, it can add or
rearrange the elements it places into a document
without affecting other services.
Those services can simply ignore any additional data. Document-style messaging is
also agnostic on the synchronicity of the
interface; it works equally well for both synchronous
and asynchronous interfaces.
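The insulation described above can be sketched as follows. Python's standard library has no XSLT processor, so this example swaps in ElementTree path queries to play the role the article gives to an XSLT. The `residence` element in the revised message is a hypothetical addition.

```python
import xml.etree.ElementTree as ET

NS = {
    "env": "http://schemas.xmlsoap.org/soap/envelope/",
    "xyz": "http://www.xyz.com/genealogy",
}

TEMPLATE = """<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <xyz:family xmlns:xyz="http://www.xyz.com/genealogy">{content}</xyz:family>
  </env:Body>
</env:Envelope>"""

ORIGINAL = TEMPLATE.format(
    content='<parents><father age="35">Richard</father><mother>Kim</mother></parents>'
)
# A later version of the document: one element added, two reordered.
REVISED = TEMPLATE.format(
    content="<residence>Virginia</residence>"
    '<parents><mother>Kim</mother><father age="35">Richard</father></parents>'
)


def parent_names(soap_xml):
    """Pull out only the parents' names; every other element is ignored."""
    family = ET.fromstring(soap_xml).find("env:Body/xyz:family", NS)
    return family.findtext("parents/father"), family.findtext("parents/mother")


print(parent_names(ORIGINAL), parent_names(REVISED))
```

Because the service asks for elements by path rather than by position, the added and reordered elements in the revised document do not affect it.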
Document-style messaging's weaknesses include:
No standard service identification mechanism:
With document-style messaging, the
client and server must agree on a service
identification mechanism: a way for a document's
recipient to determine which service(s)
need to process that document. SOAP
header entries offer one option; you can
include information in the document's header
that helps identify the service(s) needed.
WS-Routing makes just such a proposal.
Another option is to name elements in the
<Body> of the message for the services that
need to process the payload the elements
contain. You might ask how that differs from
schema-based RPC-style messaging. You
would be right in assuming there is little or
no difference except possibly in terms of the
number of "calls" that can be made per message.
A third option is to perform either
structure or content analysis as part of a service
selection process in order to identify the
services needed to process the document.
Marshaling and serialization overhead:
Document-style messaging suffers from the
same drawbacks as RPC-style messaging in
this area. However, the problem may be more
severe with document-style messaging.
Document-style messaging incurs overhead
in three areas: in using DOM, or another
technique, to build XML documents; in using
DOM, or SAX, to parse those documents in
order to extract data values; and in mapping
between extracted data values and internal
program variables. Tools generating equivalent
RPC-style interfaces optimize these
transformations. You may have trouble
achieving the same level of efficiency in your
applications using standard tools.
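The first two service identification options listed above might look like this in practice. The header namespace `http://www.xyz.com/routing` and its `service` element are invented for the example and are not WS-Routing syntax. The fallback uses the name of the first Body element.

```python
import xml.etree.ElementTree as ET

ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ROUTING = "http://www.xyz.com/routing"  # hypothetical header namespace


def identify_service(soap_xml):
    """Return the name of the service that should process this message."""
    envelope = ET.fromstring(soap_xml)
    # Option 1: an explicit header entry names the service.
    target = envelope.findtext(f"{{{ENV}}}Header/{{{ROUTING}}}service")
    if target:
        return target.strip()
    # Option 2: fall back to the local name of the first Body element.
    payload = envelope.find(f"{{{ENV}}}Body")[0]
    return payload.tag.rsplit("}", 1)[-1]


WITH_HEADER = f"""<env:Envelope xmlns:env="{ENV}">
  <env:Header><r:service xmlns:r="{ROUTING}">archive</r:service></env:Header>
  <env:Body><xyz:family xmlns:xyz="http://www.xyz.com/genealogy"/></env:Body>
</env:Envelope>"""

WITHOUT_HEADER = f"""<env:Envelope xmlns:env="{ENV}">
  <env:Body><xyz:family xmlns:xyz="http://www.xyz.com/genealogy"/></env:Body>
</env:Envelope>"""

print(identify_service(WITH_HEADER), identify_service(WITHOUT_HEADER))
```

Whichever convention is chosen, both parties have to agree on it out of band, which is exactly the weakness being described.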
Given these drawbacks, you may ask
whether document-style messaging really is
an alternative. The answer is yes. There are
two compelling reasons to use document-style
messaging. One is to gain the independence
it provides. Its strength lies in
decoupling interfaces between services to
the point that they can change completely
independently of one another. The other is
that document-style messaging puts the full
power of XML for structuring and encoding
information at your disposal. The latter is
one reason many consider document-style
superior to RPC-style messaging.
Summary
Given their relative strengths and weaknesses,
what guidelines should you use in choosing
between the two messaging styles? RPC-style
messaging's strength is as a bridging component
technology. It is a good option for creating
new components and for creating interfaces
between Web services and existing components
– you simply wrap existing components
with RPC-style Web services interfaces.
RPC-style messaging is also an excellent
component standard in situations where you
are using multiple technologies, such as J2EE
and .NET, and want to develop sharable
components. So, there is clear justification
for adopting an RPC style as a standard in
these roles.
Document-style messaging's strengths are in
situations where an XML document is part of the
data being passed across the interface, where you
want to leverage the full power of XML and XSL,
and in instances where you want to minimize
coupling between services forming an interface,
such as in application-to-application and system-to-system
interfaces. So, there is clear precedent
here as well.
Neither style is a panacea. You must consider
the relative strengths and weaknesses of each
against your requirements. With these guidelines
in mind, however, it is safe to adopt either based
on your specific needs.
About the Author
Rickland Hollar is a senior applications architect for the Central
Intelligence Agency with over 30 years of experience in the
industry. Prior to joining the CIA, he was president of a
Virginia-based software development firm.
rick_hollar@yahoo.com
SOAP's Two Messaging Styles by Rickland Hollar
WSJ Vol 03 Issue 11 - pg.12
Listing 1: Options for a skeletal RPC call
<?xml version="1.0"?>
<env:Envelope
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<env:Header>
...
</env:Header>
<env:Body>
<SomeMethod>
<param enc:arrayType="xsd:ur-type[2]">
<item xsi:type="xsd:int">100</item>
<item xsi:type="xsd:int">20</item>
</param>
</SomeMethod>
...
</env:Body>
</env:Envelope>
Listing 2: Options for a skeletal RPC call
<?xml version="1.0"?>
<env:Envelope
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
env:encodingStyle="http://www.xyz.com/sm">
<env:Header>
...
</env:Header>
<env:Body>
<sm:SomeMethod xmlns:sm="http://www.xyz.com/sm">
<param>
<item>100</item>
<item>20</item>
</param>
</sm:SomeMethod>
...
</env:Body>
</env:Envelope>
Listing 3: Sample XML namespace reference
<?xml version="1.0"?>
<env:Envelope
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Body>
<xyz:family xmlns:xyz="http://www.xyz.com/genealogy">
<parents>
<father age="35">Richard</father>
<mother>Kim</mother>
</parents>
<children>
...
</children>
<siblings>
...
</siblings>
</xyz:family>
</env:Body>
</env:Envelope>
All Rights Reserved
Copyright © 2004 SYS-CON Media, Inc.