
Over the past several years EJB technology has entered the software development mainstream. This new level of recognition and greater popularity brings an increase in design activities in the EJB space, such as best practices and design patterns.

Most of the EJB design practices created so far are aimed at improving the overall performance of EJB-based applications. It turns out that the majority of these practices were taken directly from object-oriented development (OO) and moved to the realm of EJB design, without consideration for the specifics of EJBs. This article emphasizes these specifics and how they impact the design of EJBs and EJB-based applications.

What's So Special About EJBs?
EJB technology was introduced as a distributed components technology. The key to understanding it lies in the meaning of the words distributed and components. Let's start with distributed, then examine components.

Distributed Aspects
EJBs are accessed through the Java Remote Method Invocation (RMI), regardless of whether they're local or remote to the client. Although some of the application server implementations (e.g., WebSphere) optimize local communications to make them faster, most EJB communications are still network-based. Although the distributed aspect of communications is transparent to the user in actual method invocations, it has a profound effect on execution performance. The situation is further complicated by the fact that actual communication with the bean is based on interception (see Figure 1) and is implemented in two steps:

  1. A request to execute the bean's method is first sent to the container in which the bean resides.
  2. The container fulfills the required intermediate steps (security, transactions, etc.) and then forwards the request to the bean.
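
The two steps above can be sketched in plain Java with a dynamic proxy standing in for the container. This is only an illustration of interception, not how any application server is implemented; AccountService, AccountBean, and ContainerSketch are hypothetical names, and the "security" and "transaction" steps are just log entries.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface; not part of any EJB API.
interface AccountService {
    double balance(String accountId);
}

// The "bean": plain business logic with no knowledge of the container.
class AccountBean implements AccountService {
    public double balance(String accountId) {
        return 100.0; // fixed value, for illustration only
    }
}

// Stand-in for the container: records the intermediate steps, then delegates.
class ContainerSketch {
    static final List<String> log = new ArrayList<>();

    static AccountService wrap(AccountService bean) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Step 1: the request reaches the container, not the bean.
            log.add("security check");
            log.add("begin transaction");
            // Step 2: the container forwards the request to the bean.
            Object result = method.invoke(bean, args);
            log.add("commit transaction");
            return result;
        };
        return (AccountService) Proxy.newProxyInstance(
                AccountService.class.getClassLoader(),
                new Class<?>[] {AccountService.class},
                handler);
    }
}

class InterceptionDemo {
    public static void main(String[] args) {
        AccountService client = ContainerSketch.wrap(new AccountBean());
        System.out.println(client.balance("A-1"));
        System.out.println(ContainerSketch.log);
    }
}
```

The client only ever holds the proxy, so every call pays for the container's work before the bean's own code runs.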

Figure 1

For the method on the EJB to be invoked, the remote reference to the home interface must be obtained. This is usually done through an additional network call to the Java Naming and Directory Interface (JNDI). The home interface can then be used to get the actual EJB reference. These operations introduce additional network calls (see Figure 2).
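
The lookup sequence just described can be approximated with a small simulation that counts round trips. It assumes, for illustration only, that the naming lookup, the home's create call, and the business call each cost one network hop; NamingSketch, OrderHome, Order, and the name "ejb/OrderHome" are hypothetical stand-ins, not real JNDI or EJB types.

```java
import java.util.HashMap;
import java.util.Map;

// Counts simulated network round trips; a stand-in for real RMI/JNDI traffic.
class Network {
    static int roundTrips = 0;

    static void hop(String what) {
        roundTrips++;
        System.out.println("network call: " + what);
    }
}

interface Order { double total(); }       // hypothetical remote interface
interface OrderHome { Order create(); }   // hypothetical home interface

// Hypothetical naming service: every lookup costs one round trip.
class NamingSketch {
    private final Map<String, Object> bindings = new HashMap<>();

    void bind(String name, Object value) { bindings.put(name, value); }

    Object lookup(String name) {
        Network.hop("naming lookup of " + name);
        return bindings.get(name);
    }
}

class LookupDemo {
    static double run() {
        NamingSketch naming = new NamingSketch();
        naming.bind("ejb/OrderHome", (OrderHome) () -> {
            Network.hop("home.create()");
            return () -> {
                Network.hop("order.total()");
                return 42.0; // arbitrary value, for illustration only
            };
        });

        OrderHome home = (OrderHome) naming.lookup("ejb/OrderHome"); // hop 1
        Order order = home.create();                                 // hop 2
        return order.total();                                        // hop 3
    }

    public static void main(String[] args) {
        double total = run();
        System.out.println(total + " after " + Network.roundTrips + " round trips");
    }
}
```

Under these assumptions a single business call costs three round trips, two of which are spent only on obtaining the reference.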

Figure 2

To summarize, executing a method on an EJB is an expensive network operation. Thus, having low-granularity (fine-grained) methods on the EJB typically leads to poor performance of the overall system.

The introduction of local interfaces in EJB 2.0 is one attempt to improve overall performance. Local interfaces provide a way for beans in the same container to interact more efficiently - calls to methods in the local interface don't involve RMI. Although the local interfaces represent EJBs in the same address space and don't use distributed communications (e.g., no RMI between colocated beans), the container is still involved in every interaction to provide the required intermediary steps. In addition, even in the case of a local interface, a networking call to the JNDI is required for the client to obtain a reference to the local home interface, through which a reference to the local interface can be resolved. In reality, the specification doesn't define how vendors must implement local interfaces since they're only logical constructs and may not have the equivalent software counterparts. Additional delays can still be present in local communications.

The only effective way to improve the overall performance of EJB-based applications is to minimize the number of method invocations, making the communications overhead negligible compared with the execution time. This can be achieved only by implementing coarse-grained methods.

Component Aspects
To define the component characteristics of EJBs we first have to define what components are. Although component-based development (CBD) has been around for at least 10 years, components are still not clearly defined. In general, components are for composition. Composition enables the reuse of prefabricated "things" (components) by rearranging them into ever-new and changing composites. Beyond this observation there's a lack of consensus on the definition of a component within the software industry. Microsoft has even invented the Component Object Model (COM), thus implying in the name some relationship between components and objects.

The Object Management Group (OMG) has defined distributed objects and built distributed components on top of them, leading people to think that components are tightly linked with objects. Many people assume that components are nothing more than super objects - a huge misconception. Components are a software implementation of business artifacts, intended to simplify the creation of business applications. Objects are software constructs, intended to simplify code creation; they're not necessarily related to the business content of an application.

Instead of trying to come up with a precise definition for components, we'll define the core concepts the industry is using as a "unified" description of a software component:

  • A software implementation of a well-defined application (business) aspect.
  • Should implement a collection of related functions or services; a relationship is determined by the analysis done from the perspective of intended usage. The component should provide a complete but not necessarily exhaustive set of functions.
  • Must be identifiable, meaning it can be addressed by another component, possibly via a network.
  • Should be treated as a whole so that it's not necessary to worry about all its pieces. This requires that components can be individually designed, developed, and deployed.
  • Should separate its interface from the implementation used to support it. A component might be thought of as a "black box" implementation of the business construct with a well-defined interface.
  • Component-based development (CBD) is not object-oriented development. This means that CBD does not necessarily require OO development. CBD can be implemented with equal success in both OO and procedural languages. CBD is merely a way of decomposing systems. It's a way to manage complexity better.

Most people consider the potential for reuse to be the main driving force for using a CBD approach. To be independently deployable a component has to be self-contained - separated from its environment and other components. Coupled with the requirement to implement well-defined application aspects, this provides the widest possibility for reuse.

Managing complexity is another major advantage of CBD. Components allow for the natural decomposition of a complex system into smaller chunks, which are usually much simpler and easier to manage. In addition to horizontal partitioning, introduced by layered architecture, the adoption of components introduces vertical partitioning.

The description of a component provided earlier does not specify the internal implementation of the component. This means that in principle, components can be implemented using lower granularity components (e.g., IBM's advanced components for WebSphere). This is similar to the system-analysis paradigm in which large systems are believed to consist of smaller systems, recursively, until the size of the system becomes manageable.

This recursive definition lets you think about components as a unifying concept for the software system as a whole as well as individually. The introduction of components also forces a multilevel design: the components and their internals. A compound component is made up of several components.

The following is a summary of the benefits of CBD:

  • Containment of complexity: Using CBD allows for the natural decomposition of a system. First, create a high-level design of the components and their interfaces. Then focus your development project on one or a small number of components. This effectively allows for the reduction of scope and better risk management of every project. Besides, smaller and better-focused development teams are usually more productive.
  • Opportunity for massive parallel development: Project boundaries defined around stable component definitions encourage parallel development in-house and via outsourcing. The outsourcing of maintenance may occur as well, since component providers may supply maintenance for their components.
  • "Black box" component implementation encourages flexibility: A component that supports a well-defined interface can be substituted with another one that supports either the same interface or one derived from the original interface. This simplifies modifications to current behavior and enhances functionality.
  • Incremental testing: Components facilitate unit testing and support progressive build testing.
  • Encapsulated components act as firewalls to change: The ripple effect from change is much smaller, simplifying system maintenance.
  • Greater consistency in usage: Components impose a standard architecture for applications.

What Does This Mean for EJB Development?
It's now apparent from our discussion of distribution and components that good EJB design is very different from OO design. The problem is that this point was never fully conveyed to developers, many of whom still consider an EJB to be a Java class that adheres to the EJB interface specification. The individual deployment of EJBs is the only component characteristic supported and emphasized by the EJB environment.

Simply because of its name, Enterprise JavaBeans, EJB connotes a relationship with another popular technology from Sun Microsystems - JavaBeans. To make things worse and confuse people even more, many popular Java IDEs (e.g., JBuilder) use a single workspace or "bean tab" for both JavaBean and EJB development, thus suggesting a strong correlation between the two distinct technologies.

One example of such a correlation is the use of setter and getter methods, which are required by the JavaBean specification to access internal variables. Setter and getter methods were introduced by OO practitioners in order to provide access to encapsulated object variables and eliminate coupling between internal representation and external access. This practice was blindly moved into EJB development, after which many additional patterns - most notably the Façade and Value Object patterns - were introduced to improve the performance of a design that was less than optimal to start with.

Experience has proven that using setter and getter methods in distributed systems is a bad habit. Further, one of the rules for distributed computing is the introduction of self-contained method signatures to minimize network traffic and improve overall performance, which setter and getter methods rarely embody. The main characteristic of a self-contained method signature is that it accepts all the variables required for the method execution and returns all the results of the execution. In other words, self-contained methods don't require additional methods for either setting required data or retrieving results. Furthermore, because components are an implementation of application (business) artifacts, the methods that they support are supposed to be meaningful business methods, which setters and getters rarely are.
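
The difference between the two styles can be made concrete with a rough sketch. Both classes below are hypothetical plain Java objects, and the static counter merely stands in for the number of remote invocations each style would cost if the object were an EJB; the interest calculation is an arbitrary example of a business method.

```java
// Counts simulated remote invocations; in a real deployment each would be a network call.
class RemoteCalls {
    static int count = 0;
}

// Setter/getter style: inputs are pushed in, and the result pulled out, one call at a time.
class FineGrainedLoan {
    private double principal;
    private double ratePercent;
    private int years;

    void setPrincipal(double value) { RemoteCalls.count++; principal = value; }
    void setRatePercent(double value) { RemoteCalls.count++; ratePercent = value; }
    void setYears(int value) { RemoteCalls.count++; years = value; }

    double getSimpleInterest() {
        RemoteCalls.count++;
        return principal * ratePercent / 100.0 * years;
    }
}

// Self-contained signature: all inputs go in and the result comes back in one call.
class SelfContainedLoan {
    double simpleInterest(double principal, double ratePercent, int years) {
        RemoteCalls.count++;
        return principal * ratePercent / 100.0 * years;
    }
}

class GranularityDemo {
    static int fineGrainedCalls() {
        RemoteCalls.count = 0;
        FineGrainedLoan loan = new FineGrainedLoan();
        loan.setPrincipal(1000.0);
        loan.setRatePercent(5.0);
        loan.setYears(2);
        loan.getSimpleInterest();
        return RemoteCalls.count;
    }

    static int selfContainedCalls() {
        RemoteCalls.count = 0;
        new SelfContainedLoan().simpleInterest(1000.0, 5.0, 2);
        return RemoteCalls.count;
    }

    public static void main(String[] args) {
        System.out.println("setter/getter style: " + fineGrainedCalls() + " calls");
        System.out.println("self-contained style: " + selfContainedCalls() + " call");
    }
}
```

In this sketch the same result costs four invocations in the setter/getter style and one in the self-contained style, and the gap widens with every additional input.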

Our point is that a single EJB must be a coarse-grained piece of software that's internally composed of a potentially large number of Java classes. It has to represent meaningful business artifacts and support meaningful business methods. This is the only feasible way of creating high-performance EJB applications with reusable beans.

Impact on Systems Design
The implementation of EJB-based components dictates a new approach to the design of EJB-based systems. It impacts the separation of responsibilities between session and entity beans as well as the design of the beans.

Entity beans are often introduced as persistent data components (enterprise beans) that know how to persist their own internal data to a durable storage area such as a database or legacy system. This definition reduces entity beans mostly to object/relational mapping and often leads to a design in which entity beans are used purely as a data access layer (we've even seen a comparison of entity beans with serializable Java objects, which serialize themselves into a database). In this approach entity beans become fairly small, with a one-to-one correspondence between an entity bean and a database table that leads to a very low granularity implementation.

This not only increases network communications but also degrades database communications because of the increased use of finder methods. The standard implementation of a finder method is a database query for the key value. As the number of entity beans of the same type grows, this lookup, which is a separate operation from the actual population of the entity bean, becomes more and more expensive.

Some implementations, for example, WebLogic, allow for the optimization of finder methods by combining them with the load. This alleviates the problem somewhat, but is not part of the standard. Also, as the variety of entity beans grows, the number of finder method invocations also grows, making the overall application's performance even worse.
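
The cost of separating the finder from the load can be approximated with a query counter. The sketch below uses an in-memory map as a stand-in for a database table; TableSketch and its methods are hypothetical and do not reflect how any particular application server issues its queries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database table; counts the queries issued against it.
class TableSketch {
    final Map<Integer, String> rows = new LinkedHashMap<>();
    int queries = 0;

    // Finder query: returns primary keys only.
    List<Integer> findKeys() {
        queries++;
        return new ArrayList<>(rows.keySet());
    }

    // Load query: populates one bean from its key.
    String loadRow(int key) {
        queries++;
        return rows.get(key);
    }

    // Combined query: keys and data fetched together (the vendor-specific optimization).
    Map<Integer, String> findAndLoad() {
        queries++;
        return new LinkedHashMap<>(rows);
    }
}

class FinderDemo {
    static TableSketch sampleTable() {
        TableSketch table = new TableSketch();
        table.rows.put(1, "first");
        table.rows.put(2, "second");
        table.rows.put(3, "third");
        return table;
    }

    // One finder query plus one load query per bean found.
    static int separateFinderAndLoad(TableSketch table) {
        for (int key : table.findKeys()) {
            table.loadRow(key);
        }
        return table.queries;
    }

    // A single query that returns keys and data together.
    static int combinedFinderAndLoad(TableSketch table) {
        table.findAndLoad();
        return table.queries;
    }

    public static void main(String[] args) {
        System.out.println("separate: " + separateFinderAndLoad(sampleTable()) + " queries");
        System.out.println("combined: " + combinedFinderAndLoad(sampleTable()) + " query");
    }
}
```

For N beans the separate approach issues N + 1 queries in this model, which is why fine-grained entity beans amplify the problem.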

In addition, the granularity of entity beans has a profound effect on database design. Prior to the introduction of entity beans (and the componentization of software in general), database design was performed for the application as a whole. This usually led to a database design with a strong emphasis on enforcing data relationships by supporting entity relationships and multiple constraints. With the introduction of entity beans (i.e., components) the situation has to change. Because entity beans are reusable, individually deployable components, the only relationships a database can enforce are those within the data supported by an individual component (bean). Introducing relationships across data that's supported by multiple entity beans would break the beans' autonomy, so it doesn't seem to be a feasible solution.

The relationship between the data of multiple entity beans must be implemented on a higher level by the session beans as part of the internal business-process definition. The lower the entity beans' granularity, the fewer relationships can be enforced in the database and the greater the programming effort that's required to support them.
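
As a rough illustration of moving a relationship out of the database and into the process layer, the sketch below uses two independent plain Java classes in place of entity beans and a third in place of a session bean. All names are hypothetical; the rule being enforced (an order must refer to an existing customer) is the kind a foreign key would otherwise cover.

```java
import java.util.HashSet;
import java.util.Set;

// Two independent "entity" components; neither knows about the other.
class CustomerComponent {
    private final Set<String> ids = new HashSet<>();

    void register(String customerId) { ids.add(customerId); }

    boolean exists(String customerId) { return ids.contains(customerId); }
}

class OrderComponent {
    private final Set<String> orders = new HashSet<>();

    void create(String orderId, String customerId) {
        orders.add(orderId + ":" + customerId);
    }

    int count() { return orders.size(); }
}

// "Session" component: owns the cross-component rule as part of the business process.
class OrderProcess {
    private final CustomerComponent customers;
    private final OrderComponent orders;

    OrderProcess(CustomerComponent customers, OrderComponent orders) {
        this.customers = customers;
        this.orders = orders;
    }

    // Returns false, and creates nothing, when the customer is unknown.
    boolean placeOrder(String orderId, String customerId) {
        if (!customers.exists(customerId)) {
            return false;
        }
        orders.create(orderId, customerId);
        return true;
    }
}

class RelationshipDemo {
    public static void main(String[] args) {
        CustomerComponent customers = new CustomerComponent();
        OrderComponent orders = new OrderComponent();
        OrderProcess process = new OrderProcess(customers, orders);

        customers.register("C-1");
        System.out.println(process.placeOrder("O-1", "C-1")); // known customer
        System.out.println(process.placeOrder("O-2", "C-9")); // unknown customer
        System.out.println(orders.count());
    }
}
```

Every such rule is code that has to be written and maintained, which is the programming effort the paragraph above refers to.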

The last thing to consider here is the fact that business rules that govern enterprise processing can be divided into two broad categories:

  • Accessing data: These rules govern how data has to be stored in the database, the operations that can be performed on this data, and possible constraints. These rules are usually part of the business artifact; they tend to be very stable, apply to multiple implementations both within and between enterprises, and provide a high potential for reuse.
  • Processing data: These rules govern business processes within the enterprise. They define both the conditions and the sequence of the components' execution. They tend to change fairly frequently and are rarely reusable.

Entity beans must incorporate two major things: persistent (enterprise) data and business rules that are associated with the processing of this data. Ideally, entity beans should be viewed as an implementation of reusable business artifacts and adhere to the following rules:

  • Have large granularity, which usually means they should contain multiple Java classes and support multiple database tables.
  • Be associated with a certain amount of persistent data, typically multiple database tables, one of which should define the primary key for the whole bean.
  • Support meaningful business methods and encapsulate business rules to access the data.

A session bean should represent the work being performed for the client code that's calling it. Session beans are business-process components that implement business rules for processing data.

Business processes implemented by session beans within the EJB environment should define business transactions and the corresponding database transactions. It's not advisable to use a client's transactions in the EJB environment because long-running transactions can cause database lockup. Entity beans that participate in the transaction are effectively transactional resources due to their stateful nature. In reality, however, application server vendors don't treat them as such and basically "clone" entity beans when more than one user wants to access the same information. They rely on the underlying database to lock and resolve access appropriately. Although this approach greatly improves performance, it creates the potential for database lockup.

At the beginning of a transaction the container invokes a load method on the entity bean, which performs the database read and thus acquires a read lock on the set of tables. At this point another clone of the same bean can read the same data and obtain another read lock. When the first transaction ends, the container invokes a save method on the first bean, which tries to write the data back to the database. The database attempts to promote the read lock to a write lock but cannot, because another read lock is held on the same data. If the second transaction then tries to write as well, each waits for the other to release its read lock, and a database deadlock occurs.
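
The lock-promotion scenario can be modeled in a few lines. The sketch below is a deliberately simplified, single-threaded model of shared locks on one set of data; LockSketch is hypothetical and is not a real database lock manager. It assumes that a read lock can be promoted to a write lock only when no other transaction holds a read lock on the same data.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal shared-lock model for one set of data; not a real database lock manager.
class LockSketch {
    private final Set<String> readers = new HashSet<>();

    void acquireRead(String tx) { readers.add(tx); }

    // Promotion succeeds only when the requester is the sole holder of a read lock.
    boolean canPromote(String tx) {
        return readers.contains(tx) && readers.size() == 1;
    }

    void release(String tx) { readers.remove(tx); }
}

class PromotionDemo {
    public static void main(String[] args) {
        LockSketch lock = new LockSketch();
        lock.acquireRead("tx1"); // container loads the first bean clone
        lock.acquireRead("tx2"); // container loads a second clone of the same bean

        boolean tx1Blocked = !lock.canPromote("tx1"); // tx1 save: must wait for tx2
        boolean tx2Blocked = !lock.canPromote("tx2"); // tx2 save: must wait for tx1
        System.out.println("deadlock: " + (tx1Blocked && tx2Blocked));

        lock.release("tx2"); // e.g., tx2 is rolled back as the deadlock victim
        System.out.println("tx1 can now write: " + lock.canPromote("tx1"));
    }
}
```

Neither transaction can proceed until one of them gives up its read lock, which in practice means a rollback or a timeout.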

The severity of this situation can vary, depending on the locking mechanism of the database in use and the duration of the transaction. Either way, it's not a desirable response.

Summary
Our main stipulation in this article is that EJB design is very different from OO design and it's impossible to blindly apply OO design principles to EJBs.

A simple example is designing for reuse. In OO systems the main driver is to reuse code constructs, and the best results can be achieved by creating objects of very low granularity. In component-based development, and thus EJB development, the main driver is to create reusable business artifacts, so components must be of fairly large granularity.

The creation of coarse EJB components that consist of multiple Java classes will eliminate much of the network traffic occurring in today's EJB implementations. It will also allow for two levels of reuse: traditional OO reuse on the Java classes level that provides a component's internal functionality, and the component's reuse on the EJB level.

Acknowledgment
Special thanks to Michael Farrell Jr. and Tung Mansfield for their contributions to this article.

Author Bio
Boris Lublinsky, regional director of technology at Inventa Technologies, oversees engagements in EAI and B2B integration and component-based development of large-scale Web applications. He has over 20 years of experience in software engineering and technical architecture. blublinsky@hotmail.com

All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.
  E-mail: info@sys-con.com

Java and Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. SYS-CON Publications, Inc. is independent of Sun Microsystems, Inc.