These paradigm changes have greatly increased my power to express program logic: my programs have gotten smaller, simpler, and much easier to understand, while supporting ever-increasing user capabilities. When I started programming, I worked with simple command-line interfaces and text-based “green screens.” Next I produced “fat-client” graphical user interfaces, and now I work on Web-enabled user interfaces. Each switch has greatly increased user power, flexibility, and ease of use, while the code required to produce the interfaces has shrunk and become much simpler to understand.

Data Storage and Retrieval Problems
Unfortunately, I haven’t seen the same kinds of advances in data storage and retrieval. In fact, I think we’ve declined in that area as an increasing number of data source/data sink technologies, such as XML, guaranteed messaging, and directory services, have come into mainstream development. Relational databases (besides the user interface, of course) used to be the sole data source/sink technology I dealt with, and programming environments of the recent past provided first-class support for them. My PowerBuilder and Oracle Forms developer friends have extolled the virtues of those environments over the somewhat primitive JDBC support in Java. My only defense has been the promise of reusable logic in my Java objects, logic that transcends the hard-coded data mappings between PowerBuilder or Oracle Forms screens and the database. Unfortunately, it takes a great deal of JDBC code to map the data in those Java objects to the database. Add XML documents, queued messages, and LDAP directories to the mix, and things get even worse: each of these technologies requires a different Java API, a new learning curve, and a great deal of code to implement. In a recent code survey at my workplace, I found that over 50% of a major application was devoted to nothing but data retrieval, translation, and storage. That left less than half of the system to do the real work, namely providing a user interface and logic to do something useful with the data our users give us.

Another problem I encountered as I tried to modify and extend the systems at work was the hard-coded data mapping that proliferated throughout. I couldn’t add inheritance hierarchies or new classes and relationships. Moving from one class to a related class required inefficient secondary database queries. And it was practically impossible to get the existing data mapping code to instantiate the correct subclasses of a class hierarchy as instances were read from a data source.

My most discouraging finding of all was the large number of critical defects in the data storage, manipulation, and retrieval code. Transaction boundaries were placed carelessly, allowing all kinds of data integrity problems under less-than-ideal operating conditions. Resources like database connections, statements, and result sets were not being freed correctly, causing problems as the application ran for extended periods. When processing message queues, the code committed database transactions without a synchronization strategy, such as a two-phase commit, to remove the messages in the same unit of work. XML documents were not being parsed or generated in an “extensible” way, eliminating the crucial X in XML.

To solve the problem, I started looking into new Java technologies and APIs like XML data binding, Java Data Objects, and message-driven EJBs. Each of these technologies had limitations as I tried to hook it up to the logic in my application. Where should I put the logic for objects that crossed data source/data sink technologies? For example, information for my Customer class came in from both the user interface and a message queue, was created or updated in the database, and was output as XML documents to the user interface or other enterprise systems. Nearly every data mapping technology I tried, including the more traditional commercial object/relational mapping frameworks on the market, had either a heavy or an exclusive bias toward a particular data source/sink technology. I was forced to create multiple Customer classes, one per data source/sink technology (for example, DatabaseCustomer, XMLCustomer, MessageCustomer). Then I’d either have to duplicate the application logic for processing a customer, or I’d need one Customer class holding the logic plus the transformations to and from the other Customer classes. None of these designs is object-oriented. In responsibility-driven design, a Customer class shouldn’t contain any logic for communicating with a data storage or retrieval mechanism. Instead it should perform the responsibilities of a Customer as abstracted from the problem domain; other classes in the system should be responsible for the data mapping.
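
To make the separation concrete, here is a minimal sketch of what responsibility-driven design asks for. The class and method names are illustrative only, not taken from any framework:

// Illustrative only: a Customer with purely domain responsibilities.
public class Customer {
    private final String name;
    private final int yearsAsCustomer;

    public Customer(String name, int yearsAsCustomer) {
        this.name = name;
        this.yearsAsCustomer = yearsAsCustomer;
    }

    // A domain responsibility -- no JDBC, XML, or JMS code in sight.
    public boolean isEligibleForDiscount() {
        return yearsAsCustomer >= 5;
    }

    public String getName() {
        return name;
    }
}

// Data mapping is a separate responsibility, owned by separate classes.
interface CustomerMapper {
    Customer read(long customerId);
    void store(Customer customer);
}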

JLF Prototype Data Mapping Framework
Being somewhat of a framework buff, I wondered if I could design a framework that abstracted away the dirty details of data source/sink technologies while providing much of the power and flexibility of the native JDBC, XML, JMS, and JNDI APIs. I came up with the data mapping portion of an open source framework called Java Layered Frameworks (JLF), located at http://jlf.sourceforge.net. This framework minimizes the amount of code your application needs to map its Java objects to any number of different data sources/sinks. It also helps you execute complex mappings relatively efficiently. For example, when using a JDBC data source/sink, JLF can reduce the number of SQL statements sent to the database, and it can cache relatively static data so you don’t have to read the same data every time you use it.

JLF Data Mapping Overview
JLF is a set of layered frameworks designed to help Java application developers create their applications more quickly and with less code. These frameworks provide the following capabilities:

1. Configuration framework

2. Logging framework

3. Utility library

4. Data mapping framework

5. HTTP request processing framework

The configuration framework initializes JLF by identifying where property files are located. Java property files configure the operation of the other frameworks in JLF, and the configuration framework helps those frameworks find their property files.

The logging framework is an evolution of my JLog logging framework. It instruments events and logs errors in your application so you can detect and correct defects more quickly.

The utility library portion of JLF contains code for common coding tasks in Java, such as properly creating hash values for complex objects and working with the Reflection API.
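
For example, computing a correct hash value for a class with several significant fields is easy to get wrong. The sketch below shows the kind of boilerplate such a library saves you from, using the classic 17/31 hashing idiom; the class name is invented for illustration:

// The classic 17/31 hashing idiom -- the kind of boilerplate
// JLF's utility library is meant to help with.
public final class OrderKey {
    private final String customerId;
    private final long orderNumber;

    public OrderKey(String customerId, long orderNumber) {
        this.customerId = customerId;
        this.orderNumber = orderNumber;
    }

    public boolean equals(Object other) {
        if (!(other instanceof OrderKey)) return false;
        OrderKey that = (OrderKey) other;
        return customerId.equals(that.customerId)
               && orderNumber == that.orderNumber;
    }

    public int hashCode() {
        int result = 17;
        result = 31 * result + customerId.hashCode();
        result = 31 * result + (int) (orderNumber ^ (orderNumber >>> 32));
        return result;
    }
}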

The data mapping framework is the main framework in JLF and the focus of this article. It’s designed to help you map data in your Java objects to any number of different data source/sink technologies. Most of the capabilities of the current version of the framework deal with the JDBC API, but JLF accommodates other types of data sources and sinks as well (for example, output to XML documents or input from servlets). It’s also extensible to fit any number of other transactional or nontransactional data source/sink technologies.

The framework layers described above are shown graphically in Figure 1. Each layer lists, in parentheses, the Java package that implements it, so you know which package to import in your code.

Figure 1

To use the data mapping framework in JLF, you must understand three core concepts:

1. Data mapped objects: These are the Java classes you create for your application. They hold the data you want to map to your data source/sink.

2. Data mappers: The JLF framework provides these objects for you to map your data to and from the data source/sink.

3. Data location property files: These are the Java property files you create. They tell the data mappers how to map data between the data mapped objects and the data source/sink.

All three concepts go hand-in-hand to accomplish data mapping. We’ll now go through each concept in further detail.

Data Mapped Objects
Any Java classes that you want JLF to map to a data source/sink must be subclasses of JLF’s DataMappedObject class. This class contains all the core code to help you define and access variables, relationships, and inheritance hierarchies, so the framework can map these for you. Instead of defining instance variables in your object, define DataAttributeDescriptors. When you want to create relationships between DataMappedObjects in your design, create RelationshipDescriptors. If you have an inheritance hierarchy in your DataMappedObject subclasses, create a hierarchy table so JLF can instantiate the proper types of objects automatically. Figure 2 shows the primary classes in the JLF framework you use to define your DataMappedObjects.

Figure 2
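
As a rough sketch, a Customer defined this way might look something like the following. Treat every call here as a stand-in for the concepts in Figure 2 rather than JLF’s actual method signatures, which may differ:

// Hypothetical sketch -- the descriptor calls approximate the
// concepts described above, not JLF's real API.
public class Customer extends DataMappedObject {
    public Customer() {
        // Describe attributes instead of declaring instance variables.
        addAttributeDescriptor(new DataAttributeDescriptor("name"));
        addAttributeDescriptor(new DataAttributeDescriptor("status"));
        // Describe a relationship to another DataMappedObject subclass.
        addRelationshipDescriptor(new RelationshipDescriptor("orders", Order.class));
    }
}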

Once you’ve defined your DataMappedObject subclasses with the proper attributes, relationships, and an optional hierarchy table, the data mapping framework goes to work. It creates DataAttribute and Relationship objects as it maps data back and forth between your Java objects and the database, and these two classes of objects help it coordinate that data flow.

DataAttributes replace instance variables in your classes. You may wonder why you can’t simply use instance variables, as any other JavaBean class would. The answer is twofold: DataAttributes help the data mapping framework map data to a database efficiently, and they help it perform optimistic locking. In the first case, if you don’t change a value in your object after it’s read from the database, there’s no need to send an SQL update statement when you store the object back. Sending one anyway would change a row to the same values it already contains, consuming precious database resources and delaying response time for the application user.

The data mapping framework, when executing an update() method, first checks whether anything has really changed in the object before it executes the SQL update statement. If you used simple instance variables in your design, the framework would have a much harder time discovering whether you’d updated your object. Second, the most efficient way to use a database in a very high-volume transactional system is almost always optimistic locking. Here you execute a locking query before you update or delete an object in the database; the locking query makes sure another process hasn’t modified the object since you originally read it. One common way to do this is to check the values of the object in the database and make sure they haven’t changed since the original query. With simple instance variables, there’s no record of the initial values with which to perform the locking query before you update the row. DataAttributes keep both the original value read from the database and the new value you wish to change the object to.
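
The locking query itself has nothing framework-specific about it. Here is a minimal sketch in plain JDBC, with made-up table and column names, showing why the original value matters:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class OptimisticUpdate {
    // The update succeeds only if the row still holds the value we read earlier.
    public void updateStatus(Connection connection, long customerId,
                             String originalStatus, String newStatus)
            throws SQLException {
        String sql = "UPDATE customer SET status = ? "
                   + "WHERE customer_id = ? AND status = ?";
        PreparedStatement stmt = connection.prepareStatement(sql);
        try {
            stmt.setString(1, newStatus);       // the value we want to write
            stmt.setLong(2, customerId);
            stmt.setString(3, originalStatus);  // the value originally read
            if (stmt.executeUpdate() == 0) {
                // Zero rows updated: another process changed or deleted
                // the row since our original read.
                throw new SQLException("Optimistic lock failure: customer " + customerId);
            }
        } finally {
            stmt.close();
        }
    }
}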

DataAttributes have different subclasses to help overcome the limitations of Java’s native types. For example, Java String variables place no practical limit on the number of characters they hold, while a relational database almost always defines a maximum length for each character column. The StringAttribute subclass of DataAttribute lets you define and enforce a maximum string length. Use LongAttribute for int and long variables, DoubleAttribute for float and double variables, DateAttribute for Dates, and, of course, StringAttribute for Strings.
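
Conceptually, such an attribute needs little more than the original value, the current value, and a length check. A toy version, not JLF’s actual class, might look like this:

// Toy illustration of the idea behind StringAttribute -- not JLF code.
public class StringAttribute {
    private final int maxLength;
    private String originalValue;  // as read from the data source
    private String currentValue;   // as changed by the application

    public StringAttribute(int maxLength) {
        this.maxLength = maxLength;
    }

    // Called by the mapper when the value is read from the data source.
    public void loadValue(String value) {
        this.originalValue = value;
        this.currentValue = value;
    }

    // Called by application code; enforces the column's maximum length.
    public void setValue(String value) {
        if (value != null && value.length() > maxLength) {
            throw new IllegalArgumentException(
                "Value exceeds maximum length of " + maxLength);
        }
        this.currentValue = value;
    }

    // Lets the mapper skip the SQL update when nothing changed, and supply
    // the original value for an optimistic locking query.
    public boolean isDirty() {
        return originalValue == null ? currentValue != null
                                     : !originalValue.equals(currentValue);
    }
}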

Relationship objects help you map related DataMappedObjects to a database efficiently, introducing different database mapping optimizations. For example, you can use them when you deem it more efficient to use one query to populate any number of related Java objects. On the other hand, when you rarely traverse a relationship, you don’t want to populate the objects on the other side until you know you need them; otherwise you’d pull large quantities of unused data back from the database. The data mapping framework uses Relationship objects to “lazy read,” or read on demand, such objects when you deem that approach more efficient.
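
The lazy read idea reduces to holding the foreign key and running the secondary query only on first traversal. A bare-bones illustration, with names invented rather than taken from JLF, follows:

// Bare-bones lazy read: the secondary query runs only on first traversal.
public class LazyRelationship {
    public interface Loader {
        Object load(long foreignKey);  // e.g., runs the secondary query
    }

    private final long foreignKey;
    private final Loader loader;
    private Object target;
    private boolean loaded;

    public LazyRelationship(long foreignKey, Loader loader) {
        this.foreignKey = foreignKey;
        this.loader = loader;
    }

    public Object get() {
        if (!loaded) {
            target = loader.load(foreignKey);  // hits the database once
            loaded = true;
        }
        return target;
    }
}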

Figure 3 shows how the DataAttribute and Relationship objects described earlier work with DataMappedObjects.

Figure 3

Data Mappers
The data mapping framework uses a data mapping “plug-in” called a DataMapper. DataMappers map objects to and from a particular data source/sink technology, and the goal behind the plug-in design is to hide the complexity of mapping data to and from that technology. For example, say your Java application needs to map data in its objects to a relational database using the JDBC API, to XML documents using an XML parsing API, from HTML input forms via the Servlet API, and to message queues using the JMS API. You’d have to learn four different and complex APIs to get your work done. You’d also need to write a lot of code, as each API requires completely different code to execute the mapping.

The data mapping framework hides this complexity from you. The code to map your objects to a relational database looks almost identical to the code that maps them to an XML document or from the input parameters of a servlet. The DataMapper plug-in deals with the appropriate Java API, so under ideal circumstances your code contains no technology-specific API code at all. There will always be cases where the framework doesn’t do what you need; in those cases you write a little bit of, say, JDBC code, and the JDBCDataMapper does the rest of the work for you. The data mappers in the JLF framework, including the JDBCDataMapper, are shown in Figure 4.

Figure 4
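
The payoff is application code where only the mapper changes between targets. The fragment below is a usage sketch only; apart from JDBCDataMapper, the class and method names are stand-ins for whatever the real JLF API provides:

// Usage sketch: one object, several data sources/sinks, near-identical code.
long customerId = 42L;  // example key
DataMapper databaseMapper = new JDBCDataMapper("customerDatabase");
DataMapper documentMapper = new XMLDataMapper("customerDocument");

Customer customer = new Customer();
databaseMapper.read(customer, customerId);   // populate from the database
customer.setStatus("PREFERRED");
databaseMapper.update(customer);             // write back via SQL
documentMapper.write(customer);              // emit the same object as XML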

Data Location Property Files
Each DataMapper looks to a property file for information on how to map objects to the data source/sink technology it supports. These property files are called data locations: they describe how to get to a particular data location and how to map data between Java objects and that location. To open a connection to a JDBC data location, the data mapper needs information such as the database URL, the appropriate JDBC driver, and perhaps a user ID and password. Once the connection is established, the data mapper needs to know which SQL statements to send to CRUD (create, read, update, delete) the data. You also tell the data mapper how you want to map your relationships efficiently: reading them in the same query as the original object, or perhaps lazy reading them on demand.
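
As a concrete illustration, a JDBC data location property file might contain entries along these lines; every key name here is invented for illustration, since the real JLF keys may differ:

# Hypothetical data location for a JDBC data source/sink (key names invented).
customerDatabase.driver=oracle.jdbc.driver.OracleDriver
customerDatabase.url=jdbc:oracle:thin:@dbhost:1521:prod
customerDatabase.user=appuser
customerDatabase.password=secret
# SQL the mapper sends to CRUD a Customer
customerDatabase.customer.read=SELECT customer_id, name, status FROM customer WHERE customer_id = ?
customerDatabase.customer.update=UPDATE customer SET name = ?, status = ? WHERE customer_id = ?
# Relationship strategy: same-query read, or lazy read on demand
customerDatabase.customer.orders=lazy

In a future article, I hope to explain how each of the data mappers works to rid you of the burden of data mapping API code.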

Conclusion
Enterprise Java software developers exercising due diligence in their object design have a difficult task at hand. Java’s APIs for dealing with data sources and data sinks differ greatly from technology to technology; the JDBC, XML parsing, JNDI, and JMS APIs have only the Java programming language in common. As a result, object designers typically hard-code the data mapping between their Java classes and whatever data source/sink technology they currently deal with. In most cases this hard-coding is tedious, error-prone, and takes quite a bit of code to carry out.

Inheritance hierarchies, present in almost any nontrivial object design, are typically abandoned because of data mapping difficulties. In addition, changes to the data source/sink design have a direct impact on the Java code (the Java code is tightly coupled to the design of the database, for example). When the same Java class needs to communicate with another data source/sink technology, it’s often easier to start from scratch than to incorporate a second data source/sink mapping into the existing class.

The JLF data mapping framework tries to address all these problems by separating the design of your Java classes from their mapping to and from a data source/sink. JLF abstracts the details of executing the different technology mappings using data mappers. It provides default implementations of JDBC, XML (currently write-only), and servlet data mappers, and it’s designed to be extensible so you can add your own. This should leave you free to concentrate on good object design instead of wrestling with all of Java’s data mapping APIs.

Author Bio
Todd Lauinger is a freelance author, conference speaker, teacher, mentor, and consultant. He is currently employed as a software construction fellow at Best Buy Co., Inc., in Minneapolis, MN.

[email protected]
