I define a stand-alone database application as one that is installed and maintained primarily by the end user. Deployment may be on an isolated computer or small network for shared database access. Examples of stand-alone database applications are numerous in shareware, consumer software and general-purpose business programs.
When it comes to finding a stand-alone database solution for Java, there's good news and bad news. The good news is that solutions are available. The bad news is that you may have difficulty choosing the best one. No single solution may meet all your needs, and many products are just now emerging. Several Java-related issues such as application tiers, JDBC, portability and RMI further complicate the situation. This article will help you choose a stand-alone database solution for Java.
Multiple tiers provide a way to advantageously partition and distribute the tasks in a database application to reduce client maintenance costs and improve performance. These improvements come at the cost of a more complex operating environment, however, since each tier may require special software or additional administration. Figure 1 illustrates typical tiered architectures.
- Type 1 - JDBC to ODBC Bridge: This driver translates JDBC method calls to ODBC function calls and provides access to any ODBC data source via JDBC. Multiple levels of translation can slow performance, and deployment is complicated because of the ODBC driver required on each client computer. Multiuser access is complicated since many ODBC drivers are not networked.
- Type 2 - Native API, Partly Java Driver: This driver translates JDBC into calls to a native database API. Performance is generally better than with Type 1 drivers because of one less translation layer. Deployment problems remain since Type 2 drivers require the database vendor's proprietary library on each client computer. Classic two-tier applications often use this type of driver.
- Type 3 - Network Protocol, All Java Driver: This driver translates JDBC calls into a DBMS-independent network protocol that a middle-tier server translates into a DBMS-specific protocol. This flexible driver is well suited for distributed three-tier architectures. Performance can be slow because the middle tier may itself use a Type 1 or Type 2 driver to access the database. The additional tiers make deployment complex and the network protocol may be proprietary even though it's database independent.
- Type 4 - Native Protocol, All Java Driver: This driver converts JDBC calls directly into the network protocol used by a specific DBMS. Native database calls are made directly over the network so performance is usually good. The database vendor usually supplies this type of driver since native DBMS network protocols are generally proprietary.
Note that not all drivers fit into one of these four types. Consider, for instance, a native API driver that is 100% Java and directly accesses a local database. This type of driver is well suited for access to desktop databases but doesn't fit into any of JavaSoft's defined types. Some vendors do supply unique JDBC drivers that provide advantages in certain scenarios.
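The deployment trade-offs of the four driver types can be summarized in a small lookup table. The sketch below is only an illustration: the enum, its constants and its field names are hypothetical, and the two flags simply restate the descriptions given for each type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical summary of the four JDBC driver types described in this article.
enum JdbcDriverType {
    TYPE_1_ODBC_BRIDGE(true, false),
    TYPE_2_NATIVE_API(true, false),
    TYPE_3_NETWORK_PROTOCOL(false, true),
    TYPE_4_NATIVE_PROTOCOL(false, false);

    final boolean needsClientLibrary; // ODBC driver or vendor library on each client
    final boolean needsMiddleTier;    // relies on a separate middle-tier server

    JdbcDriverType(boolean needsClientLibrary, boolean needsMiddleTier) {
        this.needsClientLibrary = needsClientLibrary;
        this.needsMiddleTier = needsMiddleTier;
    }

    // Driver types that require no extra library installed on each client.
    static List<JdbcDriverType> withoutClientInstall() {
        List<JdbcDriverType> result = new ArrayList<>();
        for (JdbcDriverType type : values()) {
            if (!type.needsClientLibrary) {
                result.add(type);
            }
        }
        return result;
    }
}
```

Calling `JdbcDriverType.withoutClientInstall()` leaves only the two all-Java types, and of those only Type 4 also avoids a middle tier.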
Two Java applications can communicate over a network using a standard technique called Remote Method Invocation or RMI, which is defined by JavaSoft and is available free. As its name implies, RMI lets a client application invoke the methods of objects running on a remote server.
Shared database access is often implemented with RMI, so an overview is in order. To use it, first create a public interface (called the remote interface) to define the remote methods. Then create classes that implement the remote interface, called (not surprisingly) implementation classes. The remote objects are instances of these classes running on the server. Next, use JavaSoft's RMIC compiler to create client-side object stubs and server-side object skeletons based on the implementation classes. The stubs forward RMI calls from the client over the network to the server skeletons, which in turn forward them to the remote objects. The remote objects execute their remotely invoked methods and work gets done.
To get it all running, start the RMI Registry on the server. This basic naming service lets clients obtain references to remote objects. Each remote object must register itself with the RMI Registry when instantiated. Client applications connect to the registry to look up references to the registered remote objects. Once obtained, the reference is used to invoke the methods of the remote object. By default, RMI uses TCP/IP sockets for communication, although other transport methods can be implemented.
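The RMI steps described above can be sketched in a single JVM. This is a minimal illustration under several assumptions: all class and method names are hypothetical, `findName` stands in for a real database read, the registry is created in-process rather than started as a separate program, and the code relies on current JDKs generating stubs at run time, so the RMIC step is skipped.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical names throughout; a single-JVM sketch of the RMI steps.
final class RmiSketch {

    // Step 1: the remote interface declares the methods clients may invoke.
    public interface CustomerLookup extends Remote {
        String findName(int id) throws RemoteException;
    }

    // Step 2: the implementation class; its instances are the remote objects.
    static final class CustomerLookupImpl implements CustomerLookup {
        @Override
        public String findName(int id) {
            return "customer-" + id; // stand-in for a real database read
        }
    }

    // Exports an object, registers it, looks it up and invokes it through the stub.
    static String roundTrip(int id) {
        // Keep the sketch on the loopback interface.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        try {
            Registry registry = LocateRegistry.createRegistry(0); // 0 = any free port
            CustomerLookupImpl impl = new CustomerLookupImpl();
            try {
                // Export on an anonymous port; the JDK builds the stub at run time.
                Remote stub = UnicastRemoteObject.exportObject(impl, 0);
                registry.rebind("CustomerLookup", stub);

                // Client side: obtain a reference by name, then call through it.
                CustomerLookup remote = (CustomerLookup) registry.lookup("CustomerLookup");
                return remote.findName(id);
            } finally {
                // Unexport both objects so the JVM can exit.
                UnicastRemoteObject.unexportObject(impl, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (Exception e) {
            throw new IllegalStateException("RMI round trip failed", e);
        }
    }
}
```

In a real deployment the lookup would run in a separate client process that reaches the server's registry over the network, for example through `LocateRegistry.getRegistry(host, port)`.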
JDBC-compliant solutions can be networked by using the RMI to JDBC bridge. This free product comes with a Type 3 JDBC driver for use in your application. This driver connects to the included RmiJdbc server, which in turn uses any JDBC driver to connect to the database of your choice. Figure 3 shows the RMI to JDBC bridge.
There are many things to consider when evaluating database solutions. The criteria that are important for a specific application depend heavily on the requirements and constraints unique to that problem. In this article, the requirement that the database is for stand-alone use with Java affects which criteria are important. The criteria most affected by this requirement are discussed below.
The type of database used impacts development effort and application complexity. Those with no object orientation require you to map your objects to their structure, which takes time and requires code. You may also lose the ability to preserve and exploit relationships between objects. A DBMS is beneficial if it provides functionality that doesn't have to be built into the application.
To fully leverage the object-oriented nature of Java, some type of object orientation in the database is desirable. This generally means using an OODBMS or ORDBMS of some sort. Some solutions provide tools and frameworks to help map objects to relational structures and to put a more object-oriented face on relational databases. The capabilities of a DBMS of any type can be valuable as well.
The database can be in a proprietary format or what I call a standard format. A proprietary format is just that, and is usually unique to a solution. By "standard" format I mean a common or established format that is accessible by a number of DBMSs and tools. Which option is best depends on your application.
Standard databases provide access to legacy data, use familiar file formats and have mature administration tools. They also provide a way of transitioning to Java without learning a new database. Two drawbacks include a lack of power in the native database, and having to rely on an outdated format. Another disadvantage is that your only options for accessing some common desktop databases may be to use a JDBC-ODBC driver or to roll your own solution.
Many newer solutions designed for Java use proprietary database formats. This is not necessarily a bad thing. After all, in an evolving environment today's proprietary format might be tomorrow's standard. Proprietary formats often provide significant added value, such as transaction processing, disaster recovery, object orientation and replication. One drawback of proprietary formats is the potential lack of development and administration tools. In addition, proprietary formats may still be immature and subject to the risks associated with early adoption.
JDBC compliance provides many benefits but may not be necessary for all applications. A major advantage is the ability to make applications modular by decoupling them from a specific database solution, allowing you to plug in a different database implementation without changing the application. Another benefit is database scalability to address performance or functionality issues. JDBC also supports adding tiers for distributed deployment.
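The decoupling that JDBC offers can be sketched as follows. This is an illustration under stated assumptions, not a recommended design: the class name, the `customer` table and the example URL are hypothetical. The only standard pieces are the `java.sql` interfaces and `DriverManager`. The point is that the JDBC URL is the single database-specific value the application code sees.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical application code written only against java.sql interfaces.
final class PluggableDatabase {

    // Works with any JDBC-compliant source that has a "customer" table
    // (the table name is an assumption made for this sketch).
    static int countCustomers(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM customer")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    // Reports whether some registered driver accepts the URL and connects.
    // Swapping databases means changing this URL and the driver on the class path.
    static boolean isReachable(String jdbcUrl) {
        try (Connection con = DriverManager.getConnection(jdbcUrl)) {
            return true;
        } catch (SQLException e) {
            return false; // no suitable driver, or the connection attempt failed
        }
    }
}
```

With no matching driver installed, `PluggableDatabase.isReachable("jdbc:hypothetical:standalone")` returns false. Once a driver for the chosen database is on the class path, the same application code runs unchanged against its URL.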
One drawback of JDBC-compliant solutions can be performance. Type 1 drivers have multiple levels of translation that inhibit performance. Likewise, Type 3 drivers use middle tiers that in turn may use Type 1 or Type 2 drivers. Another problem is deployment complexity, since there may be various libraries and executables required on the client machines or on middle-tier servers. This is true for Type 1, Type 2 and Type 3 drivers. And finally, if you need to access a common desktop database, you may not be able to find any drivers except the Type 1 JDBC to ODBC bridge.
Shared database access can be implemented in a number of ways with Java. Some solutions provide for multiple transactions, multithreading or both. Other solutions implement multiuser control through classic table and record locking, or feature more object-oriented methods of controlling concurrent object access. Still others require that the application implement and enforce all multiuser control.
However shared access is implemented, there's one thing most multiuser solutions for Java have in common: they require some type of server to provide network access to the shared database. This means that a multiuser Java database application will most likely have a two-tier architecture with a server component. Many solutions come complete with a server that supports shared database access, while others require that you create your own server. A two-tier architecture also requires communication between the clients and server. Sharing databases over a network requires a transport protocol. Java uses TCP/IP sockets by default with other methods supported.
How shared access is implemented significantly impacts development effort and deployment complexity. One of the biggest issues to consider is the amount of work needed to share a database. This may include writing a database or object server, developing application-level concurrency control or extending the solution using RMI. Another issue is how much work the user must do to install and administer the database. Components such as the database server, RMI registry and ODBC drivers need to be installed, configured and maintained.
Also important to look at is the difference between the single-user and multiuser versions of the solution. Some solutions offer an easy migration path while others require supporting two different versions of the application. There may also be cost differences since some solutions offer an entry-level version that supports single-user access, with a more expensive enterprise-level version required for shared access.
One risk is that a multiuser solution may not really be multiuser. Ideally, the solution should support multithreading and multiple transactions. Some solutions may emulate concurrent access but be serialized at a low level. This becomes an increasing problem as transaction rates rise.
Robustness encompasses both resistance to failure and ease of recovery once a failure occurs. Applications typically provide some robustness but the database must be robust as well. This is especially important for stand-alone databases that need to run with little or no user administration. Ideally, failures shouldn't happen in the first place. However, given that some failures are inevitable, a robust solution should make recovery as easy as possible.
One type of common failure is data inconsistency, which results in incorrect information even though the database itself is still functional. Common causes are partial updates, improper validations and communication failures. Guarding against inconsistency may be the responsibility of the DBMS, the application or both. Solutions that support transaction processing and enforce database structure offer better protection against this type of failure.
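For solutions that expose transactions through JDBC, guarding against a partial update might look like the sketch below. The class name, the `account` table and its columns are hypothetical. The transaction calls (`setAutoCommit`, `commit`, `rollback`) are the standard `java.sql.Connection` methods, but whether a given stand-alone database honors them is something to verify for each product.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical example: two updates that must succeed or fail together.
final class TransferSketch {

    static void transfer(Connection con, int fromId, int toId, long cents) throws SQLException {
        boolean previous = con.getAutoCommit();
        con.setAutoCommit(false); // group the statements into one transaction
        try (PreparedStatement debit = con.prepareStatement(
                     "UPDATE account SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = con.prepareStatement(
                     "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            debit.setLong(1, cents);
            debit.setInt(2, fromId);
            debit.executeUpdate();

            credit.setLong(1, cents);
            credit.setInt(2, toId);
            credit.executeUpdate();

            con.commit(); // both updates become visible together
        } catch (SQLException e) {
            con.rollback(); // undo any partial work before reporting the failure
            throw e;
        } finally {
            con.setAutoCommit(previous); // restore the caller's setting
        }
    }
}
```

If either statement fails, the rollback leaves both balances as they were, which is the protection against partial updates described above.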
Another kind of failure is corruption of the database files or their structure, which renders them unreadable or inaccessible. Some solutions provide low-level recovery functions for repairing corrupted databases. If the database is in an open format, other repair tools may exist. Some solutions even sport transaction logging with full automatic recovery. Depending on the solution, you may have to build some error recovery into the application to ensure adequate reliability. DBMSs generally provide increased robustness through their functionality. Be wary of solutions that are easy to corrupt through simple mistakes.
Since portability is one of the major advantages of Java, you must consider it for all of the components of the solution. This includes the database server, tools, code generators, data files and any special libraries or drivers. The main requirement for portability is that the solution be 100% Java. Most solutions are 100% Java on the client side but not necessarily on the server side. Depending on the application's architecture and target platforms, the portability of any server-side components may be less important. If libraries or drivers are required on the client, they must be portable or available for all target platforms.
The risk associated with not choosing a portable solution is that your application may not run on all platforms or be able to be developed on specific platforms. The importance of this depends on the target platforms for the particular application.
A solution's modularity can be gauged by how easy it is to implement in an existing application and, once implemented, how easy it is to remove it or substitute another solution. The easier these things are, the more modular the solution. Modularity provides flexibility, which allows unexpected issues to be addressed quickly and easily. JDBC-compliant solutions are inherently modular because of the modularity of JDBC. Solutions that are not JDBC-compliant or that present abstraction layers on top of JDBC tend to be less modular since the application code is coupled with the solution. A solution's modularity also depends on how you implement it in your application.
While modularity is desirable, other factors may be more important - for instance, using an application generator or persistence framework to implement object persistence. The solution may not be modular since the application is tightly coupled to the solution, but the productivity benefits may outweigh the potential drawbacks.
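How you implement a solution in your application affects modularity as much as the solution itself. One common approach, sketched below with hypothetical names, is to hide the database behind an interface that the application owns. The in-memory class is only a placeholder for a JDBC-backed or vendor-specific implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical application-level boundary: the rest of the application
// depends only on this interface, never on a vendor's classes.
interface CustomerStore {
    void save(int id, String name);
    Optional<String> findName(int id);
}

// One interchangeable implementation; a JDBC-backed or vendor-specific
// class could implement the same interface.
final class InMemoryCustomerStore implements CustomerStore {
    private final Map<Integer, String> names = new HashMap<>();

    @Override
    public void save(int id, String name) {
        names.put(id, name);
    }

    @Override
    public Optional<String> findName(int id) {
        return Optional.ofNullable(names.get(id));
    }
}

// Application code written against the interface only.
final class Greeter {
    private final CustomerStore store;

    Greeter(CustomerStore store) {
        this.store = store;
    }

    String greet(int id) {
        return store.findName(id).map(name -> "Hello, " + name).orElse("Unknown customer");
    }
}
```

Substituting another database solution then means writing one new `CustomerStore` implementation, while `Greeter` and the rest of the application stay untouched.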
When selecting a solution, be alert for limitations that have implications for the application. Many Java database solutions suitable for stand-alone use are relatively new and not fully implemented. If limitations are found, there may be upgrade paths that provide relief. A JDBC-compliant solution, for example, lets you easily change the database solution to address limitations.
The database implementation may have limitations such as missing data types, inadequate indexing capabilities, partial SQL implementation or lack of tools. Any solution worth considering probably provides enough database functionality for most applications, but look more closely if there are special requirements. Also, look for performance limitations. Factors like application architecture, multiuser implementation and JDBC driver type all affect performance. Often performance limitations aren't evident until you subject the application to real operating conditions.
Significant Value Added
Some solutions provide significant value added for both the developer and the end user. Features like error recovery, object orientation, replication, tools, code generators and even report writers are examples of added value that come with some solutions. Since many solutions are relatively new, you should ensure that important features and additions work as advertised. The portability of add-ons and tools may also be an issue.
Look beyond the solution vendor for other sources of added value. Standard databases often have value added from existing tools and support for the format. JDBC-compliant solutions allow use of generic JDBC-based tools.
Moving to Distributed Deployment
The application may be stand-alone now, but it might not always be so. It's worth considering how easy it is to migrate a solution to distributed deployment. Some solutions easily scale from one tier to three or more with little if any change to the application. Others may provide a migration path through a related set of products. Some solutions aren't easy to migrate beyond a one- or two-tier architecture. JDBC-compliant solutions can support multiple tiers by employing a Type 3 driver to communicate with a middle tier.
To develop and support any database application, you need administration tools to create, modify, delete, query and otherwise maintain the database. The quality of these tools will affect your development and maintenance efforts. The database format (standard or proprietary) and JDBC compliance largely determine the choice of tools.
If the database uses a proprietary format, you may be dependent on the vendor for maintenance tools. An exception is if the solution is JDBC-compliant. There are an increasing number of JDBC-based tools available to maintain compliant data sources. If the database is in a standard format, tools are probably already available. Some solutions come with their own tools but functionality varies.
User administration refers to the work the user must do to install and run the application. By definition, the user will do most of this for stand-alone applications, and it's important to make things as easy as possible. A Java application is not inherently more difficult to install and run than any other executable file. The potential for difficulty arises when it comes to the database and related components. If a solution has multiple components, any one of them may require administration by the user. Typical components requiring administration include database servers, the RMI registry, ODBC drivers, native DBMS drivers and communication protocols like TCP/IP. It's important to consider the demands that the entire solution places on the user when making your choice.
Some solutions are specifically designed for zero administration. Newer solutions are more likely to require less administration, but probably entail a proprietary database format. A true zero administration database with full automatic recovery is ideal.
Cost is always an issue when choosing a solution. This is especially true for stand-alone database applications since typically their scale is small. The total cost of the database solution must not be so high that an enterprise-scale application is required to justify it.
Most solutions require a development license, usually on a per-developer basis. Typical developer licenses cost anywhere from nothing to a few hundred dollars per license. Many solutions also require you to pay for runtime or deployment licenses. You want this cost to be low since a license is usually required for each copy of your application. Deployment licenses range from nothing or less than a dollar all the way up to several hundred dollars per client.
Another typical cost is for source code licenses. You may want the source code to modify or enhance a solution. Source code licenses are not always available, but when they are they can range from several hundred to several thousand dollars. Other possible costs are for any administration or support tools required to develop with the solution or to support the application.
One final cost to watch out for is some type of minimum initial investment required before you can use a product. An example would be a required purchase of a certain number of deployment licenses along with the development licenses. Although uncommon, some vendors do use this approach and the cost can be thousands of dollars.
Now that you know what to look for, here's a look at the various types of database solutions available for stand-alone use. Solution types are first classified by whether or not they are JDBC-compliant. Then they're roughly subdivided based on the type of database they access. Notable strengths and weaknesses of each type of solution are described below, with actual products used as examples where possible.
JDBC to ODBC Bridge
This solution is simply a Type 1 JDBC driver that converts JDBC calls to ODBC. It's an attractive solution because ODBC drivers are widely available for many standard databases and the JDBC driver is available free from JavaSoft. JDBC compliance is the other main advantage to this solution.
Drawbacks include possible slow performance due to multiple translation layers in a Type 1 driver. Deployment is complicated because ODBC is required on each client. Since most ODBC drivers aren't networked, the RMI to JDBC bridge may be required for shared access. Portability can be an issue depending on the choice of database and ODBC drivers. There are many products available in this category. Companies like Intersolv and Openlink Software sell ODBC drivers and the JDBC to ODBC bridge is available free from JavaSoft.
JDBC to Standard Database
Solutions of this type use JDBC to access a standard database format directly using a Type 2 or Type 4 JDBC driver. A solution accessing a standard database with a native API, all Java driver (which is not one of the four defined types) would also fall into this category. Type 3 drivers don't fit in this category since they interface with a middle-tier server and not with the database. Advantages of this type of solution include JDBC compliance and the benefits of a standard database format. Performance should also be better than with the JDBC to ODBC bridge.
One major disadvantage of this solution is that Type 2 and Type 4 drivers aren't commonly available for databases suitable for stand-alone use. Deployment may be complicated for a Type 2 driver if an external library is required to access the database. Products suitable for stand-alone use aren't readily available in this category. JDBC access to traditional desktop databases is usually via the JDBC to ODBC bridge.
JDBC to Proprietary Database
This type of solution may also use a Type 2 or Type 4 JDBC driver to directly access a proprietary format database. As you might expect, the database vendor usually supplies the driver. Some solutions come with a JDBC-compliant driver that doesn't fit into one of the four defined types.
In addition to the advantages of JDBC compliance, solutions with proprietary database formats often offer added value. Typical features include DBMS functionality, object orientation, low administration design, transaction processing and replication. Since many proprietary formats for Java are new, potential drawbacks include early adopter risks and a lack of mature support and administration tools. The flexibility of JDBC mitigates the risks associated with a proprietary solution. One example of this type of solution is InstantDB by Instant Software Solutions, Ltd. This proprietary database comes with an unclassified JDBC driver that is 100% Java and directly accesses a local database. This solution provides added value through some SQL implementation, triggers and administration tools. Another example is JDBMS by Cloudscape, Inc. This is a full-featured ORDBMS that provides significant value added with enhanced SQL, automatic disaster recovery, transparent migration from one to n tiers, low administration design and advanced replication features.
Abstraction Layer to JDBC
This type of solution provides a layer of abstraction on top of JDBC in an attempt to make life easier for the developer. In some solutions this abstraction layer is designed to help bridge the gap between the object-oriented nature of Java and the relational nature of JDBC. In other solutions the abstraction layer simply provides a higher level programming interface than JDBC. Direct JDBC access is usually available if needed. These solutions often provide significant added value in the form of assistance in mapping objects to relational tables, Java and SQL code generation, and transparent migration to multiple tiers.
A potential drawback of these solutions is a lack of modularity at the application level due to the proprietary abstraction layer. JDBC compliance provides flexibility on the database end that mitigates this effect. Figure 4 shows a diagram of this type of solution.
One product in this category is CocoBase Lite from Thought, Inc., which provides the CocoBase API abstraction layer. This API implements persistence by mapping each class to one relational table. If you write your code in terms of the CocoBase API, you can upgrade transparently to CocoBase Enterprise to migrate to a multitier architecture.
Another product in this class is JDBCStore from LPC Consulting Services, Inc. This product comes with a workbench application that helps you build a model that maps objects to a relational database schema. The workbench automatically subclasses objects and generates Java and SQL to implement transparent persistence using any JDBC data source.
A final example is DBTools.J by Rogue Wave Software, which is an abstraction layer that provides database replication and synchronization functionality. It also includes wrappers for JDBC classes that provide enhanced exception handling.
Non-JDBC to Standard Database
These types of solutions let you access a standard format database without using JDBC. Generally, this type of solution provides access only to a single database format. There aren't many of these solutions currently available since the JDBC to ODBC bridge accesses many standard databases. The lack of JDBC compliance is a drawback with this type of solution because it limits flexibility. Some potential advantages include compatibility with a standard database format, better performance than using the JDBC to ODBC bridge and a familiar data access API. Non-JDBC solutions may also provide lower level database access than JDBC, which is desirable and even required for some applications.
Non-JDBC database access
One example of this type of solution is XBaseJ from American Coders, Ltd., which provides classes to directly access and manipulate XBase files and indexes. Miscellaneous utilities and tools are available, as is a free multiuser server component. CodeBase also offers an ODBC driver and the JDBC-ODBC bridge to access the CodeBase server.
Non-JDBC to Proprietary Database
Solutions of this type access a proprietary database without using JDBC. These solutions are usually implemented as a set of classes and/or interfaces for storing and retrieving objects in the database. Classes for manipulating, indexing and querying the database may be provided as well. Since proprietary databases are usually designed for use with Java, they often have some object orientation.
This type of solution should have good performance due to a native driver. As stated above, proprietary databases often provide more object orientation than JDBC-compliant solutions. They also can provide lower level database access than JDBC if needed. Drawbacks include a lack of JDBC compliance and the disadvantages associated with a proprietary solution.
An example of this solution is Streamstore from Bluestream Database Software Corp. This object-persistence engine provides a simple interface that classes implement so they can be saved, retrieved and indexed. Classes for manipulating and querying the object store are also provided.
One other type of solution is what I call a data management framework. This class of solution provides a database application framework that you can customize to create your application. Its advantage is that you can create a database application very quickly. The disadvantage is that you're completely tied to a solution that may not provide the functionality you need.
An example of this kind of solution is MaxBase from Max Marsiglietti. This solution uses indexed ASCII files to store data. Data access and presentation are controlled by the MaxBase application, which can be customized somewhat to meet your particular needs.
By now you should be ready to go out and find the ideal stand-alone database solution to use for your Java application. But what exactly should you look for? Here are a couple of recommendations to get you started.
Shed a Tier or Two
Perhaps the most important requirement for a stand-alone database application is that it be simple to install and run. This must be true for the whole system including the application, database and supporting software. Multiple tiers make an application more complex, which can make it difficult to install and run. This is a good reason to try to limit the number of tiers in your application.
One-tier applications are nice because everything comes in one neat package. If you don't need to share your database, you may be able to use a one-tier solution. If you choose one, be sure to plan for migrating to more tiers in the future. Shared database access will most likely require a two-tier solution. Traditional two-tier RDBMSs such as Oracle and Sybase are not suitable for stand-alone use since they're complex to install and configure and require professional administration.
A two-tier solution destined for stand-alone use should be as simple as possible. There should be minimal software to install and configure on the client and server, and the database should need little or no administration. When selecting a two-tier solution, the simpler the better.
To JDBC or Not to JDBC?
JDBC is good because it provides modularity, portability and a standardized SQL-based API for accessing different data sources. Using JDBC mitigates some risks since you can change databases simply by changing the driver. Because of its many benefits, you should use a JDBC-compliant solution unless there are compelling reasons not to.
One reason not to use JDBC is to leverage experience with a particular database or database API. You may also want to consider other solutions if you have to use a Type 1 JDBC driver since the JDBC-ODBC bridge can be slow and requires special software on each client. Non-JDBC solutions often have some significant added value that may be important to your application. And finally, a JDBC-compliant solution may not provide enough low-level database access and control for some applications.
Storing objects in a database that has no object orientation can create additional overhead during development and in the finished application. If objects and their relationships can't be stored directly in the database, then you have to map them to a form that can be stored. Object navigation and database queries will also be affected. Overcoming the mismatch between objects and your database structure can add significantly to your development tasks.
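A small example shows where the mapping effort comes from. In the sketch below, all class, method and column names are hypothetical, and plain maps stand in for table rows. An invoice holds its line items directly as objects, but a relational table can only record that relationship as a foreign-key column. That column must be written when saving and re-joined when loading.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain objects: an invoice owns its line items directly.
final class LineItem {
    final String product;
    final int quantity;

    LineItem(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

final class Invoice {
    final int id;
    final String customer;
    final List<LineItem> items = new ArrayList<>();

    Invoice(int id, String customer) {
        this.id = id;
        this.customer = customer;
    }
}

final class InvoiceMapper {

    // Saving: the object reference Invoice -> LineItem becomes a foreign-key column.
    static List<Map<String, Object>> toLineItemRows(Invoice invoice) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (LineItem item : invoice.items) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("invoice_id", invoice.id);
            row.put("product", item.product);
            row.put("quantity", item.quantity);
            rows.add(row);
        }
        return rows;
    }

    // Loading: rebuilding the object graph means re-joining rows by that key.
    static Invoice fromRows(int id, String customer, List<Map<String, Object>> rows) {
        Invoice invoice = new Invoice(id, customer);
        for (Map<String, Object> row : rows) {
            if (Integer.valueOf(id).equals(row.get("invoice_id"))) {
                invoice.items.add(new LineItem(
                        (String) row.get("product"), (Integer) row.get("quantity")));
            }
        }
        return invoice;
    }
}
```

Every persistent class needs code of this kind, and it has to be kept in step with the database schema. That is the overhead an object-oriented database or a mapping tool aims to remove.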
One way to reduce the effort is to use a database solution that has some object orientation. Object orientation can come in the form of an OODBMS or an ORDBMS. Other solutions allow you to store and index objects but don't provide DBMS features. Another approach is to use a tool that helps you create and maintain the object-to-database mappings. Several solutions offer frameworks or workbenches that help store objects in non-object databases.
Two final recommendations are to pay attention to the extras and keep an eye toward the future. By "extras" I mean several things, including goodies that come with a solution, tools required to implement the solution and everything needed to deploy the finished application. There can be a lot of these extras, and all of them can affect the development and maintainability of the application. Make sure you consider the whole picture when deciding on the best solution.
"Keeping an eye toward the future" means you should consider how your application might need to change. You may be developing a stand-alone database application now, but what if you need to change to more distributed deployment? How easy will it be to handle increased loads or changing data requirements? Thinking about these issues now will help you later because the solution you choose determines in part how easily you can adapt your application to future requirements.
The State of the Art
Now you know some things to look for in a stand-alone database solution for Java and how to pick among the products you find. But what exactly are you likely to find? Well, the available solutions are less mature than those aimed at enterprise-level databases. New products are still under development and alternatives are limited for some solution types. Current products span the range when it comes to design, functionality and quality. Product cost is usually reasonable, with full-featured solutions tending to be more expensive than less functional ones.
One area that is noticeably thin when it comes to current products is the ability to access popular desktop databases without using the JDBC to ODBC bridge. Some non-JDBC products are available for accessing XBase, but if you want to use JDBC to access another standard database, you'll probably have to use JDBC with a Type 1 driver. There's a lack of Type 2 and Type 4 drivers for desktop databases, which is unfortunate since they can provide faster access and require less configuration than Type 1 drivers. A native API, all Java JDBC driver that directly accesses desktop databases would also be nice. This type of driver would be faster, simpler and more portable than a Type 1 driver.
As I said at the start, there is both good news and bad news. The good news is that you can create real stand-alone Java database applications with the solutions available today. The bad news is that you'll face some extra difficulties and risks due to the immaturity of the product offerings. The bright side is that your choices will improve as more products emerge and existing products are refined. No single solution may meet all your needs, so weigh the benefits against the risks to pick the best one for you.
About the Author
Tim Callahan is a software developer and consultant living in Oakland, California. You can find him at his company's Web site at www.palocolorado.com.