
This series of articles will walk you through the details and some of the decisions that must be made when implementing container-managed persistence in Enterprise JavaBeans.

Of course, there is the usual disclaimer. These articles are not based primarily on the EJB specification and what you can and cannot do with EJBs; instead, they concentrate on hard-earned experience that you'll find useful when dealing with EJBs.

What's a Primary Key?
The primary key of an entity bean is the subset of its attributes that is guaranteed to be unique. Telling the container what makes up an entity's primary key allows it to store the entity and later use the primary key (PK) to retrieve the same entity. A primary key always provides a handle to the entity, regardless of whether the entity is in memory or in storage.

There, that sounded abstract enough. On to reality. Persistence mechanisms in EJB containers, at least those that are efficient and widely accepted, are closely tied to databases. Furthermore, although there are a few object-oriented databases on the market, their acceptance is limited compared to their relational ancestors.

In essence, relational databases manage reads, writes, and searches on tables that are made up of columns and rows. Entity beans map cleanly to tables; each column maps to an attribute; each row maps to an entity. This is not true in coarse-grained approaches where one entity may be responsible for multiple rows in multiple tables, but that approach is no longer a necessity due to performance gains made in EJB 2.0's handling of a finely grained object model.

What's a Good Primary Key?
Choosing which subset of an entity's attributes should make up the primary key is not an easy task, and uniqueness gets much harder to guarantee amid changing requirements. For example, the first and last name attributes in an entity modeling employees might be considered unique at design time, but this might not hold true in the long run. Primary keys that are subsets of the entity's attributes are troublesome because uniqueness tends to fade as data accumulates and new attributes are added to the entity.

At times, adding new attributes can mean adding a new differentiator, which must be factored into the logical primary key of the entity. As a result, earlier guarantees of uniqueness are no longer valid. If we were to add a middle initial attribute to the example entity we used earlier, it would have to be added to the primary key, resulting in a lot of refactoring. Figure 1 shows the effect that adding an attribute to an entity can have on both a single and a compound primary key.

Figure 1
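
To make the refactoring cost concrete, here is a minimal sketch of what a compound primary key class for the Employee example might look like after the middle initial is added. The class name EmployeePK and its field names are assumptions made for illustration, not code from any real project. As far as I recall, EJB 2.x expects a compound key class to be serializable, to have a public no-argument constructor and public fields named after the container-managed fields, and to implement equals() and hashCode(); a real key class would also be declared public in its own source file.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical compound primary key for an Employee entity (illustration only).
final class EmployeePK implements Serializable {
    private static final long serialVersionUID = 1L;

    // Public fields, assumed to match the entity's container-managed field names.
    public String firstName;
    public String middleInitial; // the attribute added after the original design
    public String lastName;

    // Public no-argument constructor, as expected of a key class.
    public EmployeePK() {
    }

    public EmployeePK(String firstName, String middleInitial, String lastName) {
        this.firstName = firstName;
        this.middleInitial = middleInitial;
        this.lastName = lastName;
    }

    // Every field that takes part in the key must also take part in equals().
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof EmployeePK)) {
            return false;
        }
        EmployeePK that = (EmployeePK) other;
        return Objects.equals(firstName, that.firstName)
                && Objects.equals(middleInitial, that.middleInitial)
                && Objects.equals(lastName, that.lastName);
    }

    // hashCode() must stay consistent with equals(), so it changes too.
    @Override
    public int hashCode() {
        return Objects.hash(firstName, middleInitial, lastName);
    }
}
```

Adding the one attribute touches the fields, the constructor, equals(), and hashCode(), and that is before counting every finder, deployment descriptor entry, and caller that builds a key.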

Although it's possible to use a single string primary key instead of an integer, this approach has several problems: lookups based on strings are slightly slower, string concatenation is not an effective way to extend the primary key, and containers that autoincrement primary keys support only integral types.

Another issue with multiattribute primary keys arises when working with some container-managed relationships. When dealing with a many-to-many relationship between two entities, the underlying database table that's modeling this relationship consists of columns that match the primary keys from both entities. If the primary keys of both entities are compound and a common attribute name is shared between them, the database layer cannot differentiate between them at the column level.

Because these hard lessons have been learned multiple times over, I highly recommend using an automatically generated integer as the primary key. Since the container autoincrements it for each new entity, it's not only the easiest option to implement but also the most flexible over time. The only caveat is that the container cannot enforce uniqueness of the logical primary key, that is, the business attributes that are supposed to identify the entity. If we were to create three different entities with the same attributes and used an autoincremented integer as the primary key, the container would not complain about duplication, since each autoincremented primary key would still be unique. In some cases this is valid and the business logic supports duplicates of the logical primary key; in other scenarios duplication must not be allowed.
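
The caveat is easy to reproduce outside a container. The sketch below is a plain in-memory stand-in, not EJB code: SurrogateKeyStore and its methods are hypothetical names, and the counter only imitates what a container or database sequence would do.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for a table keyed by an autoincremented integer.
final class SurrogateKeyStore {
    private final AtomicInteger nextId = new AtomicInteger(1);
    private final Map<Integer, String[]> rows = new LinkedHashMap<>();

    // Stores a row and returns its generated key. Nothing here inspects the
    // attributes, so identical rows are accepted without complaint.
    int create(String firstName, String lastName) {
        int id = nextId.getAndIncrement();
        rows.put(id, new String[] {firstName, lastName});
        return id;
    }

    int size() {
        return rows.size();
    }
}
```

Calling create("Ann", "Lee") three times yields three rows with three distinct keys, which is exactly the silent duplication described above.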

Enforcing Logical Primary Keys in the Database
One way to avoid this pitfall is to add constraints to the database that disallow duplicates. Even though this makes the database underneath the persistence layer visible and breaks encapsulation, it leverages what databases do much better than EJB containers: keeping track of data. A quick detour through database constraints from an EJB perspective might be helpful.

Although most EJB containers are able to generate the underlying persistence schema, very few people use the generated schema directly in production, mostly because the tables created by the container typically contain no constraints. Constraints are important not only for keeping bad data out; they also provide important performance hints to the database. All relational databases let us define which columns in a table make up the primary key. They also let us define unique indexes. In case you're wondering, a primary key is essentially a specialized unique index.

If logical uniqueness is not enforced in the entity layer via a multiattribute primary key, enforcing it in the database layer is a useful and effective way to guarantee it. Both the logical primary key and the autoincremented attribute need to be unique, independently of each other. Defining the database-level primary key on the logical key columns and defining a unique index on the actual autoincremented primary key achieves both goals: a constraint that disallows duplicate entries, and an index that allows for fast lookups when searching by the actual primary key.
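
As a rough sketch of that arrangement, the schema for the Employee example could be applied through plain JDBC as shown here. The table name, column names, column sizes, and index name are all assumptions made for illustration, and the exact DDL syntax varies between databases, so treat the statements as a starting point rather than something to paste into production.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical schema holder for the Employee example (illustration only).
final class EmployeeSchema {
    // The database-level primary key is the logical key: first and last name.
    static final String CREATE_TABLE =
            "CREATE TABLE employee ("
            + " id INTEGER NOT NULL,"
            + " first_name VARCHAR(64) NOT NULL,"
            + " last_name VARCHAR(64) NOT NULL,"
            + " PRIMARY KEY (first_name, last_name))";

    // The autoincremented key the entity bean uses gets its own unique index.
    static final String CREATE_ID_INDEX =
            "CREATE UNIQUE INDEX employee_id_idx ON employee (id)";

    // Runs both statements on a connection supplied by the caller.
    static void apply(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.executeUpdate(CREATE_TABLE);
            statement.executeUpdate(CREATE_ID_INDEX);
        }
    }
}
```

With this in place, an attempt to insert a second row with the same first and last name fails in the database regardless of the id the container generated for it.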

The programmatic alternative is to create an entity finder based on the logical primary key, then check for the nonexistence of an entity every time before creating it, thus guaranteeing that no duplicates are generated.
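
The check-then-create pattern can be sketched without a container as follows. EmployeeRegistry, findByName, and create are hypothetical names standing in for an entity home, its finder, and its create method; in a real bean the finder would be declared on the home interface and backed by a query.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for an entity home with a logical-key finder.
final class EmployeeRegistry {
    private final Map<List<String>, Integer> idByLogicalKey = new HashMap<>();
    private int nextId = 1;

    // Stand-in for a finder on the logical primary key.
    synchronized Optional<Integer> findByName(String firstName, String lastName) {
        return Optional.ofNullable(idByLogicalKey.get(Arrays.asList(firstName, lastName)));
    }

    // Looks for an existing entity first and only creates one if none is found.
    synchronized int create(String firstName, String lastName) {
        if (findByName(firstName, lastName).isPresent()) {
            throw new IllegalStateException(
                    "Duplicate logical key: " + firstName + " " + lastName);
        }
        int id = nextId++;
        idByLogicalKey.put(Arrays.asList(firstName, lastName), id);
        return id;
    }
}
```

The sketch gets away with synchronized because everything lives in one object. In a container, two concurrent transactions can both pass the finder check before either one commits, so the programmatic check is safest when backed by a database constraint as well.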

Summary
In an evolving marketplace, business requirements keep changing. As promises of uniqueness weaken to assurances and less, and modeled entities take on more and more properties, something as important as the primary key of an entity should be as constant as possible. This is best achieved with a single autogenerated integer primary key that doesn't depend on the portion of the entity that's sure to change over time.

Author Bio
Saad Rehmani is a senior software engineer at a small startup that does big things. His current responsibilities include extensive work with J2EE in general and EJB 2.0 in particular. Before realizing how awesome Java was, Saad was heavily involved in various projects ranging from kernel modules to pseudo-realtime state propagation between clusters.

All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.

Java and Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. SYS-CON Publications, Inc. is independent of Sun Microsystems, Inc.