Developing distributed components with Java and DCOM (distributed component object model) simplifies developing distributed applications. If you know CORBA or RMI, DCOM is easy to learn. Microsoft's Java Virtual Machine makes developing COM and DCOM components painless.
Overview of COM and DCOM
The Component Object Model provides a means to create extensible services called components. As components mature (evolve) and add new features and functions, the services they provide remain backward-compatible with older incarnations of the components they replace. This enables older applications (COM clients) to treat new components like older components. Thus, when you upgrade the component, older client applications continue working.
COM uses polymorphism to achieve its extensible component architecture. COM compares to a local procedure call (LPC) roughly the way C++ compares to C: one is procedural, the other object-oriented. Likewise, a remote procedure call (RPC) is to DCOM as C is to C++. DCOM groups data and methods into objects that you can use through various interfaces. Think of DCOM as COM with a longer wire. The terms DCOM and COM are thus used interchangeably throughout this text.
COM is designed from the ground up to support distributed computing. Just by changing a few Windows NT Registry settings, you can use a legacy COM client with a DCOM server, or the client can request a specific server.
The major difference between DCOM and COM is that DCOM uses RPC. You can transparently use DCOM with COM clients that predate the release of DCOM. You can also use most existing COM servers that predate the release of DCOM as DCOM servers, again just by changing a few registry settings.
At some point you'll probably have to deal with COM/DCOM. Knowing COM is a good skill if you have to interface with commercial off-the-shelf components and applications or existing in-house applications that use DCOM. Windows NT is prevalent in the client/server market, DCOM is heavily integrated with the NT operating system, and the COM component market is large, so it's important to understand DCOM.
DCOM currently ships with the Microsoft Windows NT 4.0 and Windows 98 operating systems. It's also available for download for the Windows 95 operating system. In addition, there are efforts to make DCOM available on a number of UNIX platforms. Many of the Java application server providers, such as BEA's WebLogic Java server and Bluestone's Sapphire, provide DCOM support.
Microsoft has made sure that their JVM integrates well with DCOM. Thus Java classes are treated like COM objects. Also, Sun provides an ActiveX bridge to expose JavaBeans as ActiveX controls. (An ActiveX control is a type of COM component.) In addition, Halcyon provides a Java DCOM server.
With Microsoft's JVM you can use the Java classes you create as scriptable COM components. Java classes can be scripted using Visual Basic, VBScript, JScript, Perl, Python and other scripting languages. These classes can be used inside a Web browser or an Excel spreadsheet, or as part of an Active Server Page. Essentially, you can use your Java classes anywhere you can use Automation (late binding). Automation enables users to take control of components and applications through easy-to-use scripting languages such as Visual Basic.
With Microsoft's JVM you can also use COM components and ActiveX controls (which are written in other languages) as Java classes and JavaBeans. You can use your COM components virtually anywhere you can use Java.
You may wonder why all this matters to Java. It's simple, really. There are a lot of COM components out there, and chances are you'll need to integrate them in one of your projects. For that matter, there are a lot of COM component developers in the market, and you may need to integrate their skills in your next project.
Saying there are a lot just doesn't cut it; I wanted numbers. So I did a little research on the demand for COM/DCOM and related technology skills versus CORBA and RMI skills. I looked up want ads under "Information Systems" in the employment sections for five major cities. I searched for keywords regarding DCOM, CORBA and RMI. What I found is shown in Table 1 and Figures 1 and 2.
Whatever our backgrounds, we all have one thing in common: at one time in our lives we were looking for work. I was looking when I found my current job. The demand for COM/DCOM skills is anywhere from 20 to 400% greater than for CORBA skills. With this in mind let's review what COM is.
The Component Revolution
Declaring one technology the winner and any of the others the loser is impossible. There's something else of greater importance that all these technologies supply: the plumbing for the component revolution.
These technologies enable the component revolution, which allows companies to assemble frameworks of components into working solutions. Most information technology shops have the option to buy commercial off-the-shelf components on the basis of what functionality they provide, not on the basis of what distributed object technology they were built with. They have this option because there are enough tools to form a bridge between any two technologies at least half a dozen ways.
The component revolution is based on the following precepts:
Every distributed object architecture must provide the following basic features:
- Interface definition/negotiation is important to distributed systems. It allows distributed objects the opportunity to communicate and evolve separately without breaking the existing contract.
- Directory services provide a means of finding, activating and connecting to remote objects.
- Marshaling is a way to make the object appear to be in a local process, yet communicate the invocation of methods along with their parameters over process and machine boundaries. It allows access to interfaces from remote sites and moves data to and from the client and server process. It's just a means of formatting data between clients and components so they can communicate clearly at the bit and byte level.
- Object persistence is saving an object state to a persistent storage, such as a flat file or database. It's also how to connect to a unique instance of an object, e.g., when the object is already running in another process.
- Security is needed to protect access to components at various levels.
Defining an interface between a client and a component is like defining a contract between two people. The interface exposes a collection of methods that define what behavior and functionality the component will provide.
In DCOM you don't deal with objects directly. Instead, you deal with interfaces to the objects. A DCOM interface is a collection of methods that define a service contract. Actually, what you get is an interface pointer that points to a vtable (a vtable is a collection of pointers to methods). Java doesn't support interface pointers or any pointers, for that matter. However, Microsoft allows Java developers to access COM objects in a natural way.
The interface defines the behavior of an object independent of any one implementation of an object. DCOM is a binary interoperability agreement for how clients interact with interfaces via pointers and local and remote proxies; proxies act as surrogate objects that are involved in marshaling the parameters to and from the components.
The Microsoft JVM is a precursor to COM+. How DCOM is handled in Java is a precursor to the way DCOM will be handled in other languages with the introduction of COM+. COM+ will make DCOM programming a lot easier. Let's compare getting a pointer to an interface in Java to doing the same thing in C++:
// The class context and interface ID shown here are assumed values.
CoCreateInstance(CLSID_HelloDCOM, NULL, CLSCTX_SERVER,
                 IID_IHelloDCOM, (void **) &pHelloDCOM);
Here's the equivalent Java code:
IHelloDCOM helloDCOM = (IHelloDCOM) new HelloDCOM();
As you know, there are no pointers in Java, so instead of making you deal with pointers, the JVM handles all the low-level complexity. It also allows you to cast an object to an interface instead of using the IUnknown interface negotiation, which in many cases makes programming COM in Java much easier than doing it in C++. Actually, Java's multiple-interface inheritance model maps nicely to working with IUnknown.
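To make the cast-versus-QueryInterface comparison concrete, here is a minimal plain-Java sketch. It doesn't touch COM at all; the names IGreeter, IFarewell, IPrinter, HelloComponent and the query helper are hypothetical, and the only point is that a runtime type test plus a cast plays roughly the role that interface negotiation plays in C++.

```java
// Hypothetical interfaces standing in for COM interfaces.
interface IGreeter { String greet(String name); }
interface IFarewell { String farewell(String name); }
interface IPrinter { String print(String text); }

// One class exposing several interfaces, as a COM class can.
class HelloComponent implements IGreeter, IFarewell {
    public String greet(String name) { return "Hello, " + name; }
    public String farewell(String name) { return "Goodbye, " + name; }
}

class CastDemo {
    // Rough analogue of QueryInterface: the requested view, or null if unsupported.
    static <T> T query(Object component, Class<T> iface) {
        return iface.isInstance(component) ? iface.cast(component) : null;
    }

    public static void main(String[] args) {
        IGreeter greeter = new HelloComponent();
        System.out.println(greeter.greet("DCOM"));

        // Ask the same object for a second interface.
        IFarewell farewell = query(greeter, IFarewell.class);
        System.out.println(farewell.farewell("DCOM"));

        // An interface the component doesn't implement yields null.
        System.out.println(query(greeter, IPrinter.class) == null);
    }
}
```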
DCOM provides standard interfaces for dealing with objects. One such interface is IUnknown. Every DCOM object must support IUnknown. Also, Java classes, via the JVM, support a lot of other standard interfaces. So what's a COM object? A COM object is a component that supports one or more interfaces; a COM interface refers to a collection of related methods.
There are standard interfaces and there are user-defined interfaces. COM objects are accessed only through interfaces. A COM class implements one or more interfaces, and COM objects are runtime instantiations of COM classes.
IDispatch is a standard interface that all COM objects that support automation must have. Java classes in the JVM, by default, support automation via the IDispatch interface.
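Late binding is easier to picture with an example. The sketch below uses Java reflection as a loose analogy for what an Automation client does through IDispatch: it finds a method by name at runtime and then invokes it. Calculator and invokeByName are hypothetical names, and this isn't how the Microsoft JVM implements IDispatch; it only illustrates the idea of name-based dispatch.

```java
import java.lang.reflect.Method;

// Hypothetical component with one public method.
class Calculator {
    public int add(int a, int b) { return a + b; }
}

class LateBindingDemo {
    // Looks up a public method by name and argument count at runtime, then calls it.
    static Object invokeByName(Object target, String name, Object... args) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Call failed: " + name, e);
                }
            }
        }
        throw new IllegalArgumentException("No such method: " + name);
    }

    public static void main(String[] args) {
        // The caller knows only the method name, not the compile-time type.
        System.out.println(invokeByName(new Calculator(), "add", 2, 3));
    }
}
```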
COM works with many computer programming languages. However, there's a special language for describing interfaces called the Interface Definition Language (IDL).
A DCOM stub equates to a CORBA or RMI skeleton. A DCOM proxy equates to an RMI or CORBA stub. In other words, a stub in CORBA-speak sits on the client side; a stub in DCOM-speak sits on the server side.
DCOM's IDL, unlike CORBA's, doesn't support multiple inheritance of interfaces, and COM has no implementation inheritance, which is a key ingredient of object-oriented design. Instead, DCOM supports containment, delegation and aggregation. It also uses interface negotiation (IUnknown), which provides the key benefit of inheritance, that is, polymorphism. Thus a DCOM object can support many interfaces.
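The terminology is easier to keep straight with the two halves side by side. This is a rough sketch using java.lang.reflect.Proxy, with everything in one process; IHello, HelloServer, ServerStub and createProxy are hypothetical names. A real proxy and stub would marshal the call across a process or machine boundary where this sketch simply forwards it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

interface IHello { String sayHello(String name); }

class HelloServer implements IHello {
    public String sayHello(String name) { return "Hello, " + name; }
}

// Server side: a "stub" in DCOM terms, a "skeleton" in CORBA/RMI terms.
class ServerStub {
    private final Object target;
    ServerStub(Object target) { this.target = target; }

    Object dispatch(String methodName, Object[] args) {
        int argc = (args == null) ? 0 : args.length;
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == argc) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Dispatch failed: " + methodName, e);
                }
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }
}

class ProxyStubDemo {
    // Client side: a "proxy" in DCOM terms, a "stub" in CORBA/RMI terms.
    static IHello createProxy(final ServerStub stub) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                // A real proxy would serialize the call and send it over the wire here.
                return stub.dispatch(method.getName(), args);
            }
        };
        return (IHello) Proxy.newProxyInstance(
                IHello.class.getClassLoader(), new Class<?>[] { IHello.class }, handler);
    }

    public static void main(String[] args) {
        IHello hello = createProxy(new ServerStub(new HelloServer()));
        System.out.println(hello.sayHello("DCOM"));
    }
}
```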
The good news is that you don't have to know COM IDL to do Java DCOM programming. One of the keys to COM's success is ease of use. Java DCOM isn't tied to IDL the way CORBA is. In fact, there's nothing special about IDL. It's just a C-like language for creating proxies and stubs. Even if you use the Microsoft Java SDK with no fancy IDE, you don't have to write any IDL. And with the release of COM+, COM IDL, like Latin, may be a dead language in a few years.
COM uses the registry and the COM library to perform an object lookup. When a COM client tries to create a COM object, the COM libraries look up the associated COM class implementation in the registry. (This is somewhat analogous to the way RMI uses the RMI Registry or CORBA uses COSNaming.) The COM class implementation is executable code called the server. The executable code that the COM class is associated with could be a dynamic link library, an executable file or a Java class. The COM libraries load the COM server and work with the server to create the object (the instance of the COM class) and then return an interface pointer to the COM client. With DCOM, the COM libraries are updated to create COM objects on remote machines.
To create remote objects, the COM libraries read the network name of the remote server machine from the registry to create remote COM objects. Alternatively, the name can be passed to the COM libraries' CoCreateInstanceEx function call. We'll cover a code example that uses this call with the name of the server passed as a parameter.
For remote components (i.e., DCOM components) the COM libraries use the service control manager (SCM, pronounced "scum") to perform object activation. In this scenario, when a COM client attempts to create a COM component, the COM library looks up the COM object in the Windows NT Registry as usual. What it finds in the registry is information on how to instantiate the COM object just as before. However, if the COM class configuration in the registry specifies a remote server, the COM library will collaborate with SCM. SCM's job is to contact the SCM on the remote server. The remote SCM then works with the COM library on the remote machine to instantiate the object and return an instance to the client application. Unlike CORBA, DCOM has no object ID. Instead, if you want to connect to the same unique instance of an object, you use a moniker.
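The activation sequence can be sketched as a toy model in plain Java. Nothing here calls the real COM libraries or the registry: Machine, classFactories and remoteServerNames are hypothetical stand-ins for a host, its registered COM classes and its remote-server registry setting, and the "network" is just a map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ActivationDemo {
    // Hypothetical model of one machine: its registered classes plus an SCM-like activator.
    static class Machine {
        final String name;
        final Map<String, Supplier<Object>> classFactories = new HashMap<>();
        final Map<String, String> remoteServerNames = new HashMap<>();
        private final Map<String, Machine> network;

        Machine(String name, Map<String, Machine> network) {
            this.name = name;
            this.network = network;
            network.put(name, this);
        }

        Object createInstance(String clsid) {
            String remote = remoteServerNames.get(clsid);
            if (remote != null) {
                // The local entry names a remote server: hand off to that machine's activator.
                Machine server = network.get(remote);
                if (server == null) throw new IllegalStateException("Unknown server: " + remote);
                return server.createInstance(clsid);
            }
            Supplier<Object> factory = classFactories.get(clsid);
            if (factory == null) throw new IllegalStateException("Class not registered: " + clsid);
            return factory.get();
        }
    }

    static String demo() {
        Map<String, Machine> network = new HashMap<>();
        Machine server = new Machine("server1", network);
        Machine client = new Machine("client1", network);
        server.classFactories.put("HelloDCOM", () -> "HelloDCOM instance created on server1");
        client.remoteServerNames.put("HelloDCOM", "server1");
        // The client asks for the class as usual; the redirection is a configuration detail.
        return (String) client.createInstance("HelloDCOM");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the client code never mentions the server name; only the configuration entry does, which is the point of the registry-driven approach.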
With the release of Windows NT 5.0, COM adds a central store for COM classes. All activation-related information about a component can be stored in the Active Directory of the domain controller. The COM libraries will get activation information such as the remote server name transparently from the Active Directory. Reconfiguring the component will be a simple matter of changing the setting for the component in the Active Directory. The Active Directory then propagates these changes to all the clients connected to the portion of the Active Directory that contains the component's information. This further closes the gaps between CORBA's activation model and DCOM's.
Interface negotiation is the ability to ask a COM object at runtime which other interfaces it supports. Because all COM objects must implement the IUnknown interface, all COM objects support interface negotiation. Thus COM clients can access any COM object and use QueryInterface to determine which interfaces the COM object supports. The ability to query the interface supported allows COM clients to decide at runtime which interface to use.
QueryInterface allows the COM object to pass an interface pointer to other COM objects that don't even have to be on the same machine. COM uses QueryInterface to aggregate many COM objects. It allows components to evolve over time and yet still be backward-compatible with older clients, while new clients are allowed to access new features through new interfaces.
This interface negotiation feature gives COM architectural appeal. COM objects describe their features at a high level of abstraction. This permits COM clients the ability to query the COM object to see whether it supports a particular interface (a feature set). Compare this to a CORBA object's single interface model. The ability of a COM client to request the feature set of a COM object allows for the flexibility you'd expect from a component object model. In other words, COM objects should be allowed to mature and develop new features without breaking old clients, yet allow new clients access to those features.
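Here is a small plain-Java sketch of how that evolution plays out. ISpell, ISpellSuggest, SpellerV1 and SpellerV2 are hypothetical names, and Java's instanceof stands in for QueryInterface. The old client keeps working against the upgraded component, while the new client probes for the newer interface and degrades gracefully when it's missing.

```java
// Original contract, shipped with version 1 of a hypothetical component.
interface ISpell { boolean check(String word); }

// Interface added later; version 1 components know nothing about it.
interface ISpellSuggest { String suggest(String word); }

class SpellerV1 implements ISpell {
    public boolean check(String word) { return "word".equals(word); }
}

class SpellerV2 implements ISpell, ISpellSuggest {
    public boolean check(String word) { return "word".equals(word); }
    public String suggest(String word) { return "word"; }
}

class NegotiationDemo {
    // Written against version 1; only ever uses ISpell.
    static String oldClient(Object component) {
        ISpell speller = (ISpell) component;
        return "ok=" + speller.check("word");
    }

    // Written later; asks at runtime whether the newer feature set is there.
    static String newClient(Object component) {
        if (component instanceof ISpellSuggest) {
            return ((ISpellSuggest) component).suggest("wrod");
        }
        return "no suggestions available";
    }

    public static void main(String[] args) {
        System.out.println(oldClient(new SpellerV1()));
        System.out.println(oldClient(new SpellerV2()));
        System.out.println(newClient(new SpellerV1()));
        System.out.println(newClient(new SpellerV2()));
    }
}
```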
RMI currently lacks a solid default directory service. However, third-party tools that implement the Java Naming and Directory Interface (JNDI) give RMI a robust directory service. CORBA has an advanced directory service, COSNaming, that provides transparent location of objects, depending on your CORBA vendor's COSNaming implementation. DCOM's current directory service lacks a truly distributed, transparent nature like CORBA's COSNaming. This lack of support seems to be the result of different approaches to solving similar problems rather than a missing feature or an architectural advantage.
In Windows NT 5.0, however, DCOM can be used in connection with the Active Directory. Activation-related information about a component is stored in the Active Directory of the domain controller. The COM libraries then get activation information such as the remote server name transparently from the Active Directory. The Active Directory will propagate configuration changes to all the clients that are registered to receive a component's information.
When a client makes a method call on a COM interface, the COM objects in the other process can be down the hall or on the other side of the globe. The differences between local and remote access are abstracted from the COM clients. Marshaling involves taking an interface pointer in a server's process, making that interface pointer available to the client process and setting up interprocess communication (either RPC or LPC). Next, marshaling must take the arguments to an interface method call as passed from the client and serialize those arguments to the remote object's process.
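The argument-serialization step can be sketched in a few lines of Java. This isn't DCOM's wire format; the packet layout (method name followed by two ints) and the names marshalCall and unmarshalAndInvoke are made up for illustration. The client side flattens a call into bytes, and the server side rebuilds the arguments and performs the call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class MarshalDemo {
    // Client side: flatten the method name and arguments into a byte packet.
    static byte[] marshalCall(String method, int a, int b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(method);
            out.writeInt(a);
            out.writeInt(b);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server side: rebuild the arguments from the packet and make the real call.
    static int unmarshalAndInvoke(byte[] packet) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(packet));
            String method = in.readUTF();
            int a = in.readInt();
            int b = in.readInt();
            if ("add".equals(method)) {
                return a + b;
            }
            throw new IllegalArgumentException("Unknown method: " + method);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] packet = marshalCall("add", 2, 3);
        System.out.println(unmarshalAndInvoke(packet));
    }
}
```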
Custom marshaling is fundamental for certain applications. COM offers standard marshaling for the built-in standard COM interfaces. With standard marshaling COM furnishes a generic proxy and stub that communicate through standard RPC for each standard COM interface. Custom marshaling is not a trivial matter with Java and DCOM.
By default, all objects are passed by reference, which means that when the client calls a method of a remote interface, the call is marshaled over the wire. If you want your objects to be passed by value, you need to do custom marshaling.
You probably won't ever need to write your own custom marshaler because DCOM/Java integration centers around IDispatch. IDispatch is a built-in interface, and COM provides a marshaler for it. In addition, Microsoft provides a special optimized marshaler for Java COM objects.
By comparison, RMI provides good support for marshaling in both ease of use and the overall feature set. With RMI, if an object defines a remote interface it's passed by reference. However, RMI can pass objects by value.
Imagine defining a remote hashtable type of class that contains results to a query. Every time your client accesses the remote hashtable object the call goes over the wire, which can really slow things down because of the latency of the network. RMI gives you another option. If you pass a parameter to a remote method and the parameter (1) doesn't implement a remote interface and (2) is an instantiation of a class that implements Serializable, then the parameter will be marshaled over the network. If the code for the parameter isn't available on the client machine, RMI will load the class from the remote machine. Not only are the values moved across the network, but the code that accesses those values is moved across the network as well. In essence, you've moved code and data so that the object has been relocated to the client's process.
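The pass-by-value half of this can be shown without setting up a full RMI server. The sketch below pushes a Serializable object through Java object serialization, the mechanism RMI relies on for Serializable parameters, and shows that the receiver ends up with an independent copy. QueryResult and copyOverWire are hypothetical names, the "wire" is an in-memory byte array, and RMI's dynamic class loading isn't shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical query-result holder; Serializable, and not a remote interface.
class QueryResult implements Serializable {
    private static final long serialVersionUID = 1L;
    final HashMap<String, String> rows = new HashMap<>();
}

class PassByValueDemo {
    // Simulates sending the object across the wire and rebuilding it on the other side.
    static QueryResult copyOverWire(QueryResult original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (QueryResult) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        QueryResult serverSide = new QueryResult();
        serverSide.rows.put("id", "42");

        QueryResult clientSide = copyOverWire(serverSide);
        // Lookups on the copy are local; no further network round trips are needed.
        System.out.println(clientSide.rows.get("id"));
        System.out.println(clientSide != serverSide);
    }
}
```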
RMI has an architectural advantage with reference to marshaling. Neither CORBA nor DCOM approaches this technique of moving the code from one JVM to another, but both allow you to pass by value. By default, DCOM, like CORBA, uses pass by reference, whereas RMI allows both pass by reference and pass by value. In addition, RMI allows you to pass code.
Future versions of CORBA will have support for pass by value. It's possible to create your own pass-by-value support with DCOM, but it isn't as straightforward as the RMI approach. To perform pass by value in DCOM, you need to define your own custom vtable interface and write your own custom marshaler for the custom vtable, which involves using C programming and Raw Native Interface (RNI). There are ways around the DCOM marshaling issue. For example, you could pack all class data in a string and then write your own unpacker, but it isn't an elegant solution.
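For what it's worth, the string-packing workaround might look something like the sketch below. Customer and its pack/unpack methods are hypothetical, and the "id|name" format is just one arbitrary choice. It works, but every class needs its own hand-written format, which is why it isn't an elegant solution.

```java
// Hypothetical value class that travels as a plain string.
class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // Pack all class data into one string: "<id>|<name>".
    String pack() {
        return id + "|" + name;
    }

    // Hand-written unpacker; splits on the first '|' only, so names may contain '|'.
    static Customer unpack(String packed) {
        String[] parts = packed.split("\\|", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Malformed packet: " + packed);
        }
        return new Customer(Integer.parseInt(parts[0]), parts[1]);
    }
}

class PackDemo {
    public static void main(String[] args) {
        String wire = new Customer(7, "Ada").pack();   // the string is what crosses the wire
        Customer copy = Customer.unpack(wire);
        System.out.println(copy.id + " " + copy.name);
    }
}
```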
CORBA has a fairly straightforward mechanism for reconnecting to a unique instance of an object, which Java and DCOM don't seem to have. DCOM does provide a flexible way to manage persistence, yet it's not as implicit as the CORBA technique, so it's more complex to implement.
As mentioned earlier, CORBA provides an objectId (called an object reference) to connect to specific instances of an object. Conversely, COM objects, by default, are stateless objects. COM objects don't have object identifiers. Instead they use monikers to connect to a particular instance.
COM's instance-naming mechanism is extremely flexible but at the price of complexity. An IMoniker specifies an instance for COM. Monikers, also referred to as instance names for COM objects, are themselves COM objects. This explains their flexibility and their complexity (compared to the CORBA approach). The standard COM interface for these naming objects is IMoniker.
If the COM object the moniker is referring to isn't already running in a server process, IMoniker can create and initialize a COM object instance to the state it had before. On the other hand, if the COM object that IMoniker is referring to is running in an existing COM server process, IMoniker can connect to the running instance of the COM object via the COM server.
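The two cases can be sketched as a toy model in plain Java. It has nothing to do with the real IMoniker interface: MonikerDemo, Counter, running and savedState are hypothetical stand-ins for a binding routine, a stateful object, the set of running instances and a persistent store.

```java
import java.util.HashMap;
import java.util.Map;

class MonikerDemo {
    // Hypothetical stateful object.
    static class Counter {
        int value;
        Counter(int value) { this.value = value; }
    }

    // Stand-ins for the running instances and for persistent storage.
    final Map<String, Counter> running = new HashMap<>();
    final Map<String, Integer> savedState = new HashMap<>();

    // Loosely like binding a moniker: reuse the running instance, else restore it from saved state.
    Counter bind(String instanceName) {
        Counter counter = running.get(instanceName);
        if (counter == null) {
            Integer saved = savedState.get(instanceName);
            counter = new Counter(saved == null ? 0 : saved);
            running.put(instanceName, counter);
        }
        return counter;
    }

    public static void main(String[] args) {
        MonikerDemo demo = new MonikerDemo();
        demo.savedState.put("orders", 41);

        Counter first = demo.bind("orders");    // not running yet: restored from saved state
        first.value++;
        Counter second = demo.bind("orders");   // already running: same instance comes back
        System.out.println(first == second);
        System.out.println(second.value);
    }
}
```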
Security and Administration
As far as security goes, DCOM has some clear architectural advantages with its tight integration with the NT security model. This gives DCOM an edge in administration and ease of development. The same or similar tools that are included with the OS can manage DCOM security. In other words, if you know how to administer Windows NT, you can easily learn to administer DCOM.
Interoperability and Bridging
It seems RMI is moving closer to interoperating with CORBA, a big plus for both RMI and CORBA. Of course, RMI interoperating with CORBA will degrade some of its functionality (you'd have to give up its most innovative feature: its ability to transfer code and data in a pass-by-value call).
There are already many bridging technologies from one distributed object architecture to another. For example, IONA has a CORBA/COM bridge that takes a CORBA object and makes it appear as an ActiveX control, which can then be embedded easily in a Visual Basic program (or a Visual J++ or Delphi program, for that matter). Here's another example: the forthcoming CORBBeans implementation will allow CORBA distributed objects to look like JavaBeans on the client. In effect, this gives CORBA a local component model and will make CORBA "toolable" on the client. Making CORBA toolable makes it easier to use in applications like Visual Basic, because Sun's ActiveX bridge can make the CORBA bean look like an ActiveX control.
Comparing DCOM to RMI and CORBA
I don't think it's fair to advocate any one distributed object framework (DCOM, RMI or CORBA) over another; each one has advantages that give it an edge for certain types of applications. Also, using one distributed object framework doesn't preclude using another.
Ease of Development and IDL
Java's transparent DCOM support (in Microsoft's JVM) clearly gives it an architectural advantage: namely, you don't have to learn another language to create DCOM/Java components. Conversely, when you develop a CORBA component you typically start by creating an IDL file and then deriving your client and server from another class. It should be noted, however, that there are tools such as Inprise's Caffeine that help reduce CORBA complexity by allowing you to define your interfaces in Java.
Typically, you don't need IDL to create Java DCOM components. But there are times when you do need to create IDL files; for example, when you want to provide custom marshaling or create vtable components. Concerning comparisons of the IDL languages (Microsoft's IDL to CORBA's IDL), it's been stated that CORBA's IDL seems more thought out and easier to use.
CORBA may have a cleaner IDL syntax because it doesn't extend an existing IDL as Microsoft extends RPC IDL for DCOM (see Figure 3).
Conversely, Java's RMI has no IDL; it doesn't need one because it provides only Java-to-Java communication. You define your remote interfaces in Java, then create an implementation class in Java that implements the remote interface you defined. Although it doesn't have an IDL to deal with, as CORBA does, the inheritance model of defining a remote object is a bit more complicated than the DCOM approach. RMI is a bit less complicated than the CORBA approach (unless you use something like Inprise's Caffeine). Again, I've seen demonstrations of IDEs that make RMI development fairly trivial.
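For readers who haven't seen it, the RMI pattern looks roughly like this. Hello and HelloImpl are hypothetical names. To keep the sketch self-contained it calls the implementation locally; exporting the object (for example with UnicastRemoteObject) and binding it in the RMI registry are left out.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// The remote interface is ordinary Java: it extends Remote, and each method
// declares RemoteException. No separate IDL file is involved.
interface Hello extends Remote {
    String sayHello(String name) throws RemoteException;
}

// The implementation class implements the remote interface.
class HelloImpl implements Hello {
    public String sayHello(String name) {
        return "Hello, " + name;
    }
}

class RmiSketch {
    public static void main(String[] args) throws RemoteException {
        // Local call through the remote interface type; a real client would
        // obtain this reference from a registry lookup instead.
        Hello hello = new HelloImpl();
        System.out.println(hello.sayHello("RMI"));
    }
}
```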
Various companies seem to be working hard to make CORBA and RMI development easier, so any advantage DCOM has in ease of development may be short-lived.
Using a Model That Works
Splitting hairs over architecture issues may be the wrong way to pick a distributed object framework. Instead, the component model you use may depend heavily on the talent pool at your company. If you have a department full of Visual Basic programmers, you should consider using mostly DCOM, and use RMI and CORBA only if you have to connect to third-party components and frameworks. Conversely, if you use Java a lot on both the middle tier and the client, you might consider RMI, and use COM only when you want to capitalize on a huge install base of applications that have ActiveX Automation support. CORBA is the obvious choice if you need to connect to a lot of legacy applications that support it. Since COM custom marshaling is nontrivial and it's easy to pass objects by value with RMI, use RMI if you want to move a lot of objects around the network.
DCOM is an excellent tool for creating distributed applications as well as for enabling the next revolution in history: the component revolution. Using the Microsoft Java SDK, you can easily write both DCOM clients and servers, and you can integrate with existing applications and in-house components developed by Visual Basic, Delphi and Visual C++ developers. You can still use CORBA, DCOM and RMI from the JVM, so you don't have to select just one distributed object technology.
In this article we covered:
- How DCOM compares to RMI and CORBA
- Why DCOM may be important to you
- What DCOM architecture looks like
- How to use the Microsoft Java SDK to create DCOM objects
In Part 2 we'll cover using DCOM from the Microsoft JVM with hands-on examples, and details on just how easy it is to create COM/DCOM servers in Java.
The book Java Distributed Objects by Bill McCarty and Luke Cassady-Dorion (Sams Publishing) covers DCOM in more depth. The book also covers RMI and CORBA in detail (with an emphasis on CORBA). I wrote Chapter 20 on DCOM, which covers Java and DCOM in more detail and relates how to create callbacks in DCOM and how to use JActiveX to create Java wrappers around existing COM components. I also have an example that makes late-bound calls using IDispatch.
About the Author
Rick Hightower, a senior software engineer at LookSmart, a category-based Web directory, has been writing software for a decade, from embedded systems to factory automation solutions. Rick recently worked at Intel's Enterprise Architecture Lab, where he researched emerging middleware and component technologies. Rick can be reached at