With the widespread use of component technology, it has become increasingly important to employ components in distributed computing environments. Currently, a handful of distributed component platforms exist, including the Distributed Component Object Model (DCOM), Common Object Request Broker Architecture (CORBA), and Remote Method Invocation (RMI). A new addition to this list is Web services, a framework that has recently received considerable attention. "It's safe to say that every one of the software platform vendors out there will have to either support, or provide tools for, Web services at some point," said Simon Yates, an analyst with the industry research firm Forrester Research. Although the Web services market is still in its infancy, analysts say it will have a huge bearing on the way future applications are developed and deployed. This article consists of four major parts, the first two of which are presented below; the next two will be covered in the next issue.
The first part examines Web services details, including what Web services are, the Web Service Description Language, and the Universal Description, Discovery, and Integration service. The second part defines fundamental characteristics of distributed architectures, including their basic features, support provided, employment of the distributed service, and inherent issues in the distributed architecture. The third part uses those characteristics to compare Web services with other distributed environments. Finally, the fourth part discusses the major applications for Web services and assesses the readiness of Web services for those tasks.
Description of Web Services
What Are Web Services?
Web services were introduced as a result of a combined effort by Microsoft, IBM, and Ariba. Web services are components accessible through standard Internet protocols. They combine the best aspects of component-based development and the Web. Like components, Web services represent black-box functionality that can be reused without worrying about how the service is implemented. Unlike current component technologies, Web services are not accessed via object model-specific protocols, such as DCOM, RMI, or IIOP. Instead, Web services are accessed via ubiquitous Web protocols and data formats, such as Hypertext Transfer Protocol (HTTP) and Extensible Markup Language (XML). Furthermore, a Web service interface is defined strictly in terms of the messages the Web service accepts and generates. Consumers of Web services can be implemented on any platform in any programming language, as long as they can create and consume the messages defined for the Web service interface.
Several essential activities must occur in any environment that supports Web services:
- A Web service needs to be created, and its interfaces as well as
invocation methods must be defined.
- A Web service needs to be published to one or more intranet or
Internet repositories for potential users to locate.
- A Web service needs to be located in order to be invoked by
potential users.
- A Web service needs to be invoked to provide any benefit.
- A Web service may need to be unpublished when it is no longer
available or needed.
Web services architecture requires three fundamental operations: publish, find, and bind. Service providers publish services to a service broker. Service requesters find required services using a service broker and bind to them.
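The publish, find, and bind operations can be illustrated with a minimal in-memory sketch. It is not a real registry: the class names, the "category" search key, and the use of a plain Python callable in place of a network endpoint are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceRecord:
    name: str                        # name chosen by the provider
    category: str                    # hypothetical search key
    endpoint: Callable[[str], str]   # stands in for a network address


class ServiceBroker:
    """In-memory stand-in for a service broker."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def publish(self, record: ServiceRecord) -> None:
        self._records[record.name] = record

    def unpublish(self, name: str) -> None:
        self._records.pop(name, None)

    def find(self, category: str) -> List[ServiceRecord]:
        return [r for r in self._records.values() if r.category == category]


def bind(record: ServiceRecord) -> Callable[[str], str]:
    # Binding here just hands back the callable; a real requester would
    # connect to the advertised endpoint instead.
    return record.endpoint


# Provider side: publish a (made-up) quote service.
broker = ServiceBroker()
broker.publish(ServiceRecord("quote", "finance", lambda symbol: f"{symbol}: 42.00"))

# Requester side: find by category, bind, then invoke.
found = broker.find("finance")
get_quote = bind(found[0])
result = get_quote("ACME")
```

The requester never refers to the provider directly; everything it knows comes from the broker, which is the point of the three-operation model.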
To perform these three operations in an interoperable manner, there must be a Web services stack that embraces standards at each level. Figure 1 shows a conceptual Web services stack. The upper layers build upon the capabilities provided by the lower layers.
Figure 1
Two foundation technologies of Web services are:
- Web Service Description Language
- Universal Description, Discovery, and Integration
Web Service Description Language
Web Service Description Language (WSDL) is an XML-based Interface Definition Language (IDL) for network-based services, operating on messages containing either document-oriented or procedure-oriented information. It is used to specify operations, messages, implementation details, access protocol, and contact endpoints. The operations and messages are described abstractly and then bound to a concrete network protocol and message format to define an endpoint. WSDL is extensible to allow descriptions of endpoints and their messages regardless of what message formats or network protocols are used to communicate; however, the only bindings that exist today describe how to use WSDL in conjunction with SOAP 1.1, HTTP GET/POST, and MIME.
A WSDL document defines services as collections of network endpoints, or ports. In WSDL, the abstract definition of endpoints and messages is separated from their concrete network deployment or data format bindings. This allows the reuse of abstract definitions: messages, which are abstract descriptions of the data being exchanged, and port types, which are abstract collections of operations (see Figure 2).
Figure 2
The three most important sections in the service definition are:
- Message and Port Type: What operations the service provides.
- Binding: How the operations are invoked.
- Service: Where the service is located.
In addition, any complex data type that the service uses must be defined in a type section immediately before the message section. Let's take a look at each section in more detail.
- Types: A container for data type definitions using some type system.
- Message: Corresponds to a single piece of information moving between the invoker and the service. A regular round trip remote method call is modeled as two messages, one for the request and one for the response. Each message can have zero or more parts, and each part can have a name and an optional type. When WSDL describes an object, each part maps to an argument of a method call. If a method returns void, the response is an empty message.
- Operation: An abstract description of an action supported by the service defining a specific input/output message sequence. The message attribute of each input/output must correspond to the name of a message that was defined earlier. If an operation specifies just input, it is a one-way operation. An input followed by an output is a request/response operation, an output followed by an input is a solicit/response operation, and a single output is a notification.
- Port Type: Corresponds to a set of one or more operations. When WSDL describes an object, each operation maps to a method and each Port Type maps to a Java interface or class.
- Binding: Corresponds to a Port Type and is implemented using a particular protocol such as SOAP or CORBA. The type attribute of the binding must correspond to the name of a Port Type that was defined earlier. Because WSDL is protocol-neutral, you can specify bindings for SOAP, CORBA, DCOM, and other standard protocols. If a service supports more than one protocol, the WSDL should include a binding for each protocol that it supports.
- Port: Represents the availability of a particular binding at a specific endpoint. The binding attribute of a port must correspond to the name of a binding that was defined earlier. This allows a single component to support multiple communication protocols and effectively act as an inter-protocol bridge.
- Service: Modeled as a collection of ports.
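To make the sections above concrete, the following sketch embeds an abbreviated WSDL 1.1 document in a Python string and checks the name references the list describes: each operation's message attribute, each binding's type attribute, and each port's binding attribute. The stock quote service, its names, and the example.com URLs are invented for illustration, and the binding is shortened (a complete SOAP binding would also describe each operation's input and output bodies).

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical service; every name and URL below is made up.
WSDL_TEXT = """\
<definitions name="StockQuote"
    targetNamespace="http://example.com/stockquote"
    xmlns:tns="http://example.com/stockquote"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="GetPriceRequest">
    <part name="symbol" type="xsd:string"/>
  </message>
  <message name="GetPriceResponse">
    <part name="price" type="xsd:float"/>
  </message>
  <portType name="StockQuotePortType">
    <operation name="GetPrice">
      <input message="tns:GetPriceRequest"/>
      <output message="tns:GetPriceResponse"/>
    </operation>
  </portType>
  <binding name="StockQuoteSoapBinding" type="tns:StockQuotePortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="GetPrice"/>
  </binding>
  <service name="StockQuoteService">
    <port name="StockQuotePort" binding="tns:StockQuoteSoapBinding">
      <soap:address location="http://example.com/stockquote"/>
    </port>
  </service>
</definitions>
"""


def local(qname):
    """Strip a namespace prefix such as 'tns:' from an attribute value."""
    return qname.split(":", 1)[-1]


def names(root, tag):
    """Collect the name attributes of all WSDL elements with the given tag."""
    return {el.get("name") for el in root.iter(f"{{{WSDL_NS}}}{tag}")}


def dangling_references(root):
    """Return descriptions of references that point at undefined names."""
    problems = []
    messages = names(root, "message")
    port_types = names(root, "portType")
    bindings = names(root, "binding")
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        for operation in port_type.findall(f"{{{WSDL_NS}}}operation"):
            for io in operation:  # the input/output children
                if local(io.get("message", "")) not in messages:
                    problems.append(f"unknown message in {operation.get('name')}")
    for binding in root.findall(f"{{{WSDL_NS}}}binding"):
        if local(binding.get("type", "")) not in port_types:
            problems.append(f"unknown port type in {binding.get('name')}")
    for port in root.iter(f"{{{WSDL_NS}}}port"):
        if local(port.get("binding", "")) not in bindings:
            problems.append(f"unknown binding in {port.get('name')}")
    return problems


root = ET.fromstring(WSDL_TEXT)
problems = dangling_references(root)
```

Reading the document bottom-up follows the "where, how, what" order: the service's port names a binding, the binding names a port type, and the port type's operations name the messages.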
Although, as we mentioned above, WSDL is protocol-independent and can support any type of binding, SOAP 1.1 binding is becoming the most popular one today.
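As a rough illustration of what travels over a SOAP 1.1 binding, the sketch below builds a request envelope for a procedure-oriented call and reads it back. Only the envelope namespace is taken from SOAP 1.1; the service namespace, the operation name, and the helper functions are hypothetical, and no HTTP transport is shown.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/stockquote"            # hypothetical service namespace


def build_request(operation, params):
    """Wrap a call as Envelope/Body/<operation> with one child per parameter."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_request(data):
    """Recover the operation name and parameters from a request envelope."""
    envelope = ET.fromstring(data)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]                               # first element inside Body
    operation = call.tag.split("}", 1)[-1]       # drop the namespace part
    return operation, {child.tag: child.text for child in call}


wire_bytes = build_request("GetPrice", {"symbol": "ACME"})
operation, params = read_request(wire_bytes)
```

Because the message is plain XML text, any platform that can produce and parse it can take part in the exchange, whatever language sits on either side.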
Advertisement and Discovery of Web Services
As the number of Web services grows, locating a specific service may become a tricky and complex issue. How can a particular service on a particular system be advertised? How can information presented on a particular site be announced to search engines?
The Universal Description, Discovery, and Integration (UDDI) specification defines a way to publish and discover information about Web services. The UDDI specification describes a conceptual cloud of Web services and a programmatic interface that defines a simple framework for describing any kind of Web services. The specification consists of several related documents and an XML schema that defines a SOAP-based programming protocol for registering and discovering Web services. Using the UDDI discovery services, businesses individually register information about the Web services that they expose for use by other businesses.
This information can be added to the UDDI business registry either via a Web site or by using tools that make use of the programmatic service interfaces described in the UDDI Programmer's API Specification. The UDDI business registry is a logically centralized, physically distributed service with multiple root nodes that replicate data with each other on a regular basis. Once a business registers with a single instance of the business registry service, the data is automatically shared with other UDDI root nodes and becomes freely available to anyone who needs to discover what Web services are exposed by a given company.
The core component of the UDDI specification is the UDDI Business Registration, an XML file used to describe a business entity and its Web services. Conceptually, the information provided in a UDDI Business Registration consists of three components: "white pages" including address, contact, and known identifiers; "yellow pages" including industrial categorizations based on standard taxonomies; and "green pages," the technical information about services that are exposed by the business. Green pages include references to specifications for Web services as well as support for pointers to various file- and URL-based discovery mechanisms if required.
The UDDI specification consists of an XML schema for SOAP messages and a description of the UDDI API specification. Together, these form a base information model and interaction framework that provides the ability to publish information about a broad array of Web services. The UDDI XML schema defines four core types of information, which provide the sort of information that a technical person would need to know in order to use a business partner's Web services. These are:
Business information: Many partners will need to be able to locate information about your services and will have as starting information a small set of facts about your business, either your business name or perhaps your business name and some key identifiers, as well as optional categorization and contact information. The core XML elements for supporting, publishing, and discovering information about a business - the UDDI Business Registration - are contained in a structure named "businessEntity." This structure serves as the top-level information manager for all information about a particular set of data related to a business unit. The overall businessEntity information includes support for "yellow pages" taxonomies so that searches can be performed to locate businesses that service a particular industry or product category, or who are located within a specific geographic region.
Service information: Technical and business descriptions of Web services - the "green pages" data - live within substructures of the businessEntity information. Two structures are defined: businessService and bindingTemplate. The businessService structure is a descriptive container that is used to group a series of Web services related to either a business process or a category of services. Examples of business processes that would include related Web services information include purchasing services, shipping services, and other high-level business processes. These businessService information sets can each be further categorized, allowing Web services descriptions to be segmented along combinations of industry, product and service, or geographic category boundaries.
Binding information: Within each businessService live one or more technical Web services descriptions (bindingTemplates). These contain the relevant data needed by application programs to connect to and then communicate with a remote Web service. This information includes the address to make contact with a Web service, as well as support for optional information which can be used to describe both hosted services and services that require additional values to be discovered prior to invocation. Additional features are defined that allow for complex routing options such as load balancing.
Service specification: Often it is not enough to simply know where to contact a particular Web service. For instance, if it is known that a business partner has a Web service that allows me to send them a purchase order, knowing the URL for that service is not very useful unless a great deal is known about the format the purchase order should be sent in, what protocols are appropriate, what security is required, and what sort of response will result after sending the purchase order. Integrating all parts of two systems that interact via Web services can become quite complex.
An application program or programmer interested in specific Web services needs information about compatibility with a given specification to make sure that the right Web services are invoked for a particular need. For this reason, each bindingTemplate element contains a special element that is a list of references to information about specifications. Used as an opaque set of identifiers, these references form a technical fingerprint that can be used to recognize a Web service that implements a particular behavior or programming interface.
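The nesting of these information types, and the use of specification references as a fingerprint, can be sketched with plain data classes. This is only a model of the structure described above, not the UDDI schema itself: the field names (in particular spec_keys), the key strings, and the sample business are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BindingTemplate:
    access_point: str                                     # where to contact the service
    spec_keys: List[str] = field(default_factory=list)    # opaque specification references


@dataclass
class BusinessService:
    name: str                                             # e.g. a business process
    binding_templates: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    name: str
    categories: List[str] = field(default_factory=list)   # "yellow pages" data
    services: List[BusinessService] = field(default_factory=list)


def find_bindings(entities, required_keys):
    """Return bindings whose fingerprint contains every required key."""
    wanted = set(required_keys)
    return [
        binding
        for entity in entities
        for service in entity.services
        for binding in service.binding_templates
        if wanted <= set(binding.spec_keys)
    ]


# Hypothetical registry content.
registry = [
    BusinessEntity(
        name="Example Supplies",
        categories=["office-products"],
        services=[
            BusinessService(
                name="purchasing",
                binding_templates=[
                    BindingTemplate("http://example.com/po", ["spec:purchase-order-v1"]),
                    BindingTemplate("http://example.com/status", ["spec:order-status-v1"]),
                ],
            )
        ],
    )
]

matches = find_bindings(registry, ["spec:purchase-order-v1"])
```

The keys are compared only for equality, which is what "opaque" means here: a requester needs to recognize a known specification, not interpret the identifier.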
The UDDI API is divided into two logical parts, the Inquiry API and the Publishers' API. The Inquiry API is further divisible into two parts - one part used for constructing programs that let you search and browse information found in a UDDI registry, and another part that is useful in the event that Web service invocations experience failures. Programmers can use the Publishers' API to create rich interfaces for tools that interact directly with a UDDI registry, letting a technical person manage the information published inside a UDDI registry.
Basic Features of Distributed Systems
In order to compare Web services to other distributed systems implementations, we must first define the major characteristics of distributed systems in general.
The basic features of distributed systems are:
Request and response. Because a remote method call is inherently delivered over a network communication infrastructure, it is typically divided into a request (asking the service) and a response (returning results to the client). In principle, from the client's view, the request and response corresponding to a remote method call can be done as one atomic action (synchronous call), or they can be separated, where the client issues the request and, as a future action, issues a wait for the response (deferred-synchronous call). Sometimes the response part may be empty (no out parameters and no functional value). In this case, the corresponding method is usually termed a one-way method. A one-way method can be called asynchronously, where the client does not have to wait until the call is finished. In a distributed environment the "exactly-once" semantics of remote calls are practically impossible to achieve. Distributed platforms in practice today ensure the "at-most-once" semantics of a synchronous and deferred-synchronous call (exactly-once semantics in case of a successful call, at-most-once semantics otherwise). Best-effort semantics are ensured for a one-way method.
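The three call styles just described can be imitated locally with a thread pool, as in the sketch below. The pool stands in for the network and the remote server, and the function names are invented; nothing here reproduces the delivery guarantees of a real platform.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_add(a, b):
    """Stand-in for a remote method with a result."""
    time.sleep(0.01)          # pretend network and processing delay
    return a + b


log = []


def remote_log(message):
    """Stand-in for a one-way method: no out parameters, no return value."""
    log.append(message)


executor = ThreadPoolExecutor(max_workers=2)

# Synchronous call: request and response appear as one blocking action.
sync_result = executor.submit(remote_add, 1, 2).result()

# Deferred-synchronous call: issue the request, do other work,
# then wait for the response as a separate, later action.
future = executor.submit(remote_add, 3, 4)
local_work = sum(range(10))
deferred_result = future.result()

# One-way call issued asynchronously: the client never waits for a response.
executor.submit(remote_log, "started")

executor.shutdown(wait=True)  # only so the example finishes deterministically
```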
Remote Reference. One of the key issues of remote method calls lies in referencing remote objects. Classically, in a "local" case, a method call is made as a method invocation on the reference to the target object. However, in a distributed environment we face the issue that an object reference should identify a remote object over the network. It is obvious that classical addresses will not do as the references since they do not contain information about a server in which the target object resides. By convention, a reference that contains all information necessary to access a remote object is termed a remote reference. In addition, representation of a remote reference must span the differences in hardware architectures of the nodes where the objects involved in a particular remote method call reside.
IDL interface. In principle, a client's code and the server object that is subject to a remote call from the client can be implemented in different languages and run on heterogeneous architectures. To accommodate these differences, the interfaces of a server object are specified in an architecture-neutral Interface Definition Language (IDL). Typically, IDL provides constructs for specification of types, interfaces, modules, and (in some cases) object state. However, there is no means for specifying implementation. Usually a mapping from IDL to standard programming languages, such as C++, Java, and COBOL, is a part of an IDL definition.
Object proxy. To bridge the conceptual gap between the remote and local style of references, both in the client and server code, the actual manipulation of remote references is typically encapsulated in wrapper-like objects known as client-side and server-side proxies (Figure 3).
Figure 3
The client-side proxy and the corresponding server-side proxy communicate with each other to transmit requests and responses. Basically, the client-side proxy supports the same interface as the remote object. The key idea behind proxies is that the client calls a method m of the client-side proxy to achieve the effect of calling m of the remote object. Thus, the client-side proxy can be considered a local representative of the corresponding remote object. Similarly, the key task of a server-side proxy is to transform an incoming request into a local call, delegate it to the remote object, and transform the result of the call into a form suitable for transmitting to the client-side proxy. Thus, a server-side proxy can be considered the representative of all potential clients of the remote object.
Marshalling. Both the request and response of a call are to be converted into a form suitable for transmitting over the network communication infrastructure. Typically, serialization into a byte stream is the technical base of such conversion. By convention, this conversion is referred to as marshalling/unmarshalling.
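The interplay of the two proxies and marshalling can be shown in a few lines. In this sketch JSON serves as the byte-stream format and a direct function call replaces the network; the Calculator class and both proxy classes are hypothetical and far simpler than what a real platform generates.

```python
import json


class Calculator:
    """The remote object; it lives in the server's address space."""

    def add(self, a, b):
        return a + b


class ServerSideProxy:
    """Unmarshals a request, makes the local call, marshals the result."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class ClientSideProxy:
    """Offers the same add() method as the remote object."""

    def __init__(self, transport):
        self._transport = transport   # callable: request bytes -> response bytes

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        response = self._transport(request)
        return json.loads(response.decode("utf-8"))["result"]


server_proxy = ServerSideProxy(Calculator())
calculator = ClientSideProxy(server_proxy.handle)  # direct call replaces the network
total = calculator.add(2, 3)
```

From the client's point of view, calculator.add(2, 3) looks like a local call; only bytes cross the boundary between the two proxies.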
Providing and Employing Distributed Service
Providing and employing distributed service entails the following:
Registering services with a broker. In order to be remotely accessible, any service provided by a server needs to be registered with a broker. As a result of the registration operation, the broker creates a remote reference to the service. The remote reference is returned to the server and can be registered with a naming and/or trading service (see below).
Naming. Because brokers supply remote references in an internal broker format, a distributed object platform typically provides a naming utility to enable the use of ordinary names. A naming service defines a name space and tools for associating a name with a remote reference. Typical operations supported by a naming service are resolving a name into a remote reference and associating a name with a remote reference.
Trading. If a client does not know the name of the component, it can ask a trading utility (analogous to yellow pages) to provide a list of references to remote services that possess the properties indicated by the client as the search key.
Binding. As mentioned above, the client can receive a remote reference via naming or trading, or as a result of another remote method call. It is a general rule of almost all distributed object platforms that when a client receives a remote reference to a service, a proxy is created (if it does not already exist) in the client's address space, and the client is provided with reference to the proxy.
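A toy version of registration, naming, trading, and binding might look as follows. All three classes and their method names are invented for this sketch, the reference format is arbitrary, and binding returns the registered object itself where a real platform would create a client-side proxy.

```python
import itertools


class Broker:
    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def register(self, service):
        """Create a remote reference in the broker's internal format."""
        ref = f"broker-ref-{next(self._ids)}"
        self._objects[ref] = service
        return ref

    def bind(self, ref):
        # Simplification: hand back the object instead of building a proxy.
        return self._objects[ref]


class NamingService:
    def __init__(self):
        self._names = {}

    def associate(self, name, ref):
        self._names[name] = ref

    def resolve(self, name):
        return self._names[name]


class TradingService:
    def __init__(self):
        self._offers = []

    def export(self, ref, **properties):
        self._offers.append((ref, properties))

    def query(self, **wanted):
        """Return references whose properties match every search key."""
        return [
            ref for ref, props in self._offers
            if all(props.get(key) == value for key, value in wanted.items())
        ]


broker, naming, trading = Broker(), NamingService(), TradingService()

# Server side: register, then advertise by name and by properties.
ref = broker.register(str.upper)           # any callable serves as the service
naming.associate("text/upper", ref)
trading.export(ref, kind="text", cost="free")

# Client side: obtain the reference by name or by properties, then bind.
by_name = broker.bind(naming.resolve("text/upper"))
by_property = broker.bind(trading.query(kind="text")[0])
```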
Static invocation. When a client is compiled with the knowledge of the requested service interface (e.g., in the form of an IDL specification), calls to remote methods can be encoded statically within the client code as calls of the proxy's methods. Static invocation requires recompilation of both client and server in the case of IDL changes.
Dynamic invocation. In principle, the client and the server can be compiled separately; thus, they may not always be current with respect to the static knowledge of available interfaces. To overcome this obstacle, most implementations provide broker-hosted metadata about interfaces and mechanisms for building requests dynamically.
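Dynamic invocation can be approximated with runtime reflection. In the sketch below, describe() plays the role of the interface metadata and invoke() builds a call from a method name and named arguments chosen at run time; the Thermostat class and both helper functions are hypothetical, and the metadata is derived locally rather than hosted by a broker.

```python
import inspect


class Thermostat:
    """Example server object; the client was not compiled against it."""

    def __init__(self):
        self._target = 20.0

    def set_target(self, celsius):
        self._target = float(celsius)
        return self._target

    def get_target(self):
        return self._target


def describe(obj):
    """Interface metadata: public method name -> list of parameter names."""
    return {
        name: list(inspect.signature(member).parameters)
        for name, member in inspect.getmembers(obj, inspect.ismethod)
        if not name.startswith("_")
    }


def invoke(obj, metadata, method, **arguments):
    """Build and issue a request from names known only at run time."""
    if method not in metadata:
        raise AttributeError(f"no such operation: {method}")
    missing = [p for p in metadata[method] if p not in arguments]
    if missing:
        raise TypeError(f"missing arguments: {missing}")
    return getattr(obj, method)(**arguments)


thermostat = Thermostat()
metadata = describe(thermostat)
new_target = invoke(thermostat, metadata, "set_target", celsius=22)
```

With static invocation the equivalent line would simply be thermostat.set_target(22), fixed in the client's source when it was written.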
Inherent Issues in Distributed Architectures
The following are inherent issues in distributed architectures:
Garbage collection of server objects. Server objects not targeted by any remote reference can be disposed of and should be handled by a garbage collector. Garbage collection in a distributed environment is a more complex issue than in a single address space.
Transactions. Transactions are an important tool in making distributed object applications robust. The following consequences of working with distributed objects with respect to transactions should be emphasized. Employing multiple databases is inherent to a distributed environment. This implies that a two-phase commit must be applied. As objects themselves possess state, they can also be considered as resources taking part in transactions.
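The two-phase commit mentioned above can be outlined as follows. This is a bare sketch of the voting and completion phases only: the Participant class is hypothetical, everything runs in one process, and the failure handling, timeouts, and logging that a real coordinator needs are left out.

```python
class Participant:
    """A resource (a database, or an object with state) in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local work can be made permanent.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]      # phase 1: collect votes
    if all(votes):
        for p in participants:                       # phase 2: commit
            p.commit()
        return True
    for p in participants:                           # phase 2: roll back
        p.rollback()
    return False


orders_db = Participant("orders")
billing_db = Participant("billing")
committed = two_phase_commit([orders_db, billing_db])

stock_db = Participant("stock", can_commit=False)
shipping_db = Participant("shipping")
aborted = two_phase_commit([stock_db, shipping_db])
```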
Concurrency in server objects. Many server implementations employ multithreading; thus, a server object may be subject to invocation of several of its methods simultaneously. Naturally, synchronization tools have to be applied in the code of the server's objects in order to avoid race conditions, etc. In addition, several threads running on the server may call the broker at the same time, perhaps to register a newly created object.
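As a small illustration of the synchronization this calls for, the sketch below guards a server object's state with a lock while several threads invoke its method at once. The HitCounter class and the thread counts are arbitrary choices for the example.

```python
import threading


class HitCounter:
    """Server object whose method may be invoked by many threads at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = 0

    def record_hit(self):
        # The read-modify-write below is not atomic, so it is guarded.
        with self._lock:
            self._hits += 1

    def hits(self):
        with self._lock:
            return self._hits


counter = HitCounter()


def simulate_client():
    for _ in range(1000):
        counter.record_hit()


threads = [threading.Thread(target=simulate_client) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total_hits = counter.hits()
```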
Reactive programming. The advantage of easily passing remote references as parameters makes it relatively simple to employ event-based (reactive) programming style without introducing specific callback constructs. Most distributed object platforms define abstractions to support this style of programming.
In this first part of a two-part article, we have defined what Web services are and described their two foundational technologies - WSDL and UDDI. We then defined the fundamental characteristics of distributed architectures in general. Based on these characteristics, in part two of this article we will offer an in-depth comparison of Web services with existing distributed architectures. We will also discuss potential applications for Web services and assess their readiness for real applications.
Author Bios:
Boris Lublinsky is a Regional Director of Technology for the Central Region at Inventa, where he oversees engagements in EAI, B2B integration, and component-based development of large-scale Web applications. Previously, he was a Technical Architect at Platinum Technology and SSA, where he developed execution platforms for component-based systems. Boris has over 20 years of experience in software engineering and technical architecture. blublinsky@hotmail.com
Michael Farrell Jr. is a Senior Engineer at Inventa Technologies in Chicago, where he serves as technical leader on B2Bi, EAI, and component-based application development projects. Michael is also a certified trainer and delivers courses on a number of systems integration and enterprise applications. Prior to Inventa, he was a Systems Engineer at Intel Corporation, where he implemented distributed applications to support the semiconductor fabrication process. Michael has two degrees from the College of Engineering at The University of Iowa. mfarrell@iowa.dhs.org
All Rights Reserved
Copyright © 2004 SYS-CON Media, Inc.