Since this issue of JDJ is devoted to Linux, it's only appropriate to focus on architectural issues. In our case this means examining the various software architectures that use databases in Java applications as well as some details of the low-level operations that occur between a database, a JDBC driver, and a Java application.
At a fundamental level, Java-based database applications can come in two flavors: two-tiered or three-tiered. The tiers don't indicate how many servers or components are involved; instead, they represent conceptual levels into which the different parts of the overall
application are categorized. In fact, a given level may have a single component (e.g., the corporate database), several components (e.g., replicated Web servers), or a million components (e.g., a Java applet running in a Web browser). Each of the two alternative architectures has specific advantages and disadvantages and hence specific application areas in which they excel.
Two-tier architectures (sometimes referred to as client/server frameworks) became somewhat prevalent with the growth of corporate LANs and WANs. In this architecture a server (such as the main database server) interacts directly with multiple clients (see Figure 1). The user interface can be an applet, a graphical application, or a command-line application. This approach has several advantages, in particular a very tight coupling between the client (which is running in its own JVM) and the server. As a result, applications can be prototyped rapidly since everything, including a complete understanding of the hardware and software system, is "in- house." In addition, these systems can use specific performance tweaks in the user interface since the client has explicit knowledge of the server (e.g., the UI knows whether it's talking to an Oracle or SQL Server database).
On the other hand, this tight coupling results in disadvantages that actually limit the applicability of the architecture, especially with the growth of Internet- and Web-enabled technologies. To see why, consider the implications of providing access to your corporate database via the Web - with everything from Denial of Service (DOS) attacks to crackers altering payroll records suddenly becoming possible. The simplicity of reengineering code, or even monitoring outgoing network traffic, makes these scenarios very real possibilities as soon as client software with intimate knowledge of secure servers is released.
Another important disadvantage of the tight coupling is system maintenance. Since the user interface is tightly coupled to the server, changes in the server (e.g., changing the database engine) break the entire application. Finally, this model is verboten when it comes to applet-database systems, since the applet can only communicate directly with the server from whence it came (i.e., the Web server). Even if they're running on the same hardware system, the database will by necessity be running on a different port than the Web server, and the Security Manager prevents the applet from communicating directly with the database server. This can be circumvented by explicitly providing this permission for the applet code, which, while possible in a corporate LAN where the operating systems are "ghosted," is impossible over the Internet. (Would you give an anonymous user full control of your computer?)
The alternative architecture is the three-tiered approach. Essentially, this architecture breaks the tight coupling by introducing a middle-layer between the user interface and the server. As a result, the client no longer has intimate knowledge of the server and instead communicates directly with a middle-layer server that handles the communication with the server (see Figure 2). As in the two-tier case, the user interface can be an applet, a graphical application, or a command-line application. Notice the flexibility of this approach, which mimics the famous Model-View-Controller architecture used in GUI applications. The application server hides the details of the database's server and provides caching performance boosts.
At this point, anyone who has considerable experience in C/C++ (i.e., experience using pointers) will surely question the performance of this approach, which has essentially added an extra level of indirection. By
using resource caching, however, this approach can actually provide performance improvements as well as simplify security restrictions (e.g., the applet can now communicate with a servlet running in the original Web server). Finally, this approach can also easily provide fault tolerance and replication capabilities through the addition of more components in the middle layer.
When integrating databases into Java applications, the insulation that Java provides often obscures important database concepts familiar to database developers. Primary among these is the concept of a transaction. Simply put, a transaction is a logical unit of work used by databases to record a user's interactions with a database. If everything completes successfully, a user will commit any resulting changes so that they're persisted. On the other hand, if a problem arises, a user can roll back the database to the state prior to the start of the transaction. While this is easily visualized in a single-user mode, real systems need to support multiple users concurrently. As expected, this complicates the database interactions since different users can be accessing or modifying the same data simultaneously, which can produce dirty, nonrepeatable, or phantom reads (see Table 1). Preventing these conditions requires controlling or limiting access to the data, formally known as locking. Different database systems provide different levels, or granularity, of locking, which can significantly impact the performance of an application. (Database locking is remarkably similar to thread-level locking.)
Fortunately for us, JDBC allows a developer to enable or disable autocommit mode, commit or roll back a transaction, and use the ability of a database system to control the isolation (i.e., locking) level of a transaction. These abilities are all provided as methods in the connection interface. Remember that when using JDBC, by default, a database connection operates in autocommit mode; that is, every statement sent to the database is in its own transaction. If you don't want this feature enabled, you need to explicitly disable it.
// Disable autocommit mode
// Do some SQL processing
// If things worked we can commit
// Otherwise we need to abort
// Process results.
Another database concept, batch processing, has been enabled with the release of the JDBC 2.0 API. This feature allows a developer to instruct a database to process multiple SQL data manipulation language statements (i.e., execute a data update) in a single batch process, which provides an additional performance boost. To utilize this feature, you need to turn off autocommit mode:
// Disable autocommit mode
stmt = con.createStatement() ;
build your batch SQL command:
stmt.addBatch("INSERT INTO Contacts
after which we can execute our batch statement.
Values('Bill', 'A','Gates')") ;
stmt.addBatch("INSERT INTO Contacts
Values('Scott', 'A','McNealy')") ;
int counts = stmt.executeBatch() ;
To handle batch statement processing, a new exception was added: the BatchUpdateException, which will be thrown if either one of the SQL statements in the batch returns a result set (instead of an update count) or if one of the SQL statements in the batch doesn't successfully execute. Of course, you still need to catch the SQLException in case the other JDBC statements encounter an error condition.
// Process results.
// Process results.
Grasping Your Results
In addition to the transaction nuances mentioned earlier, with the introduction of JDBC 2.0 API a Java developer can also control the type of result sets that a database will return as well as the level of concurrency they must support. To understand this more clearly, picture a Java application communicating with a database via an appropriate JDBC driver. The application executes a query that the driver sends to the database. The database creates a cursor to iterate over the query results (which, depending on the locking, can be sensitive to other transactions) and communicates this to the application via the driver. The complete details of this process are transparent to the application and clearly a lot of performance operations can be implemented, such as caching of results. For drivers that support it the JDBC API actually provides the ability to specify an a priori fetch size as well as a fetch direction (i.e., top-down or bottom-up). For example, the following line tells the current JDBC driver to fetch 20 rows at a time from the database starting from the last row.
Furthermore, the details of the result types are now controlled during the statement creation.
stmt = con.createStatement(
rsType, concurrencyLevel) ;
The first argument, rsType, can take three values:
- TYPE_FORWARD_ONLY, which is the default value, indicates that a result set will utilize a forward-only cursor.
- TYPE_SCROLL_INSENSITIVE indicates that the result set will use a bidirectional cursor that's insensitive to changes made in the database while it's open.
- TYPE_SCROLL_SENSITIVE indicates that the result set will use a bidirectional cursor that's sensitive to changes made in the database while it's open. To explicitly refresh the contents of a TYPE_SCROLL_SENSITIVE result set, you can call the refreshRow method:
The second argument, concurrencyLevel, controls the concurrent access to the results of a query and can take two values: CONCUR_READ_ONLY, which is the default, indicates that the result set cannot be updated, and CONCUR_ UPDATEABLE indicates that the result set can be updated. This point might seem confusing, but another new feature introduced in the JDBC 2.0 API is the ability to update a result set directly using Java commands as opposed to dynamically creating and executing a new SQL UPDATE statement. To accomplish this, result sets maintain two cursors: one for the current row, which you can update directly, and the second for a special, "phantom" row that's used to insert new rows.
For example, the following code changes the last name column for the current row to Gates and inserts a new row for the author:
rs.updateString("Last_Name", "GATES") ;
rs.updateRow() ; // Save the change
rs.updateString("Last_Name", "Brunner") ;
rs.updateString("First_Name", "Robert") ;
rs.insertRow() ; // Save the change
Hopefully this article has provided some more insight into the JDBC architecture and how you can use it to build Java database applications. Now you can start writing code and, as always, if you have any questions, check the JDBC documentation.
Robert Brunner is a member of the research staff at California Institute of Technology, where he
focuses on very large (multiterabyte) databases, particularly on KDD (Knowledge Discovery in Databases) and advanced indexing techniques. He has used Java and databases for more than three years and has been the Java database instructor for the Java Programming Certificate at California State University, Pomona, for the past two years. He can be reached at [email protected]