
The performance of J2EE-based applications sometimes doesn't live up to users' expectations. Usually it's impossible to quantify exactly where the bottlenecks are. Many developers spend time searching for articles on the Internet only to find the same old tips about using the synchronized keyword and string concatenation without ever finding information that's useful. This article will help you find the holy grail of Java performance.

In my previous article (JDJ, Vol. 6, issue 9) we focused on tips that are common to most applications. The tips presented here focus on common problems found within applications that utilize JSP, EJB, JNDI, and JDBC. Future articles will cover some powerful techniques to help your applications perform better.

Session Invalidation
Many application servers have a default timeout of 30 minutes for cleaning up inactive sessions. When an application server can't hold any more sessions in memory, it may force the operating system to page out portions of memory, swap the least recently used sessions to disk, or even throw an OutOfMemoryError. In a high-volume system, serializing sessions this way is expensive. Calling HttpSession.invalidate() is the recommended way to clean up a session as soon as you no longer need it; this method is usually called from the application's logout page.
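For example, a logout servlet or JSP would typically do something like the following sketch (the logout flow itself is just an illustration):

// Inside a logout servlet's doGet()/doPost()
HttpSession session = request.getSession(false); // don't create a session just to destroy it
if (session != null) {
    session.invalidate(); // releases the session and everything stored in it
}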

Sessions and JSP
The JSP specification requires that the implicit objects referenced in JSP source and tags be usable without explicit declaration, so by default every JSP page creates or joins a session. For Web pages that don't require session tracking, you can save resources by turning off automatic session creation with the following page directive:

<%@ page session="false"%>
Servlet and Memory Usage
Many application developers are guilty of storing far too much information in the user's session. Sometimes these objects aren't garbage collected in a timely manner. The typical symptom is periodic slowdowns that are felt by users but can't be traced to any particular component. If you're monitoring the JVM's heap size, this shows up as sharp peaks and drops in memory usage rather than a steady pattern.

There are a couple of ways to even out memory usage in this scenario. The first recommendation is to have all beans that are scoped as session implement the HttpSessionBindingListener interface. This allows you to explicitly release resources that are used within the bean by implementing the method valueUnbound().
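A minimal sketch, assuming a hypothetical session-scoped CartBean that holds an expensive resource:

import javax.servlet.http.HttpSessionBindingEvent;
import javax.servlet.http.HttpSessionBindingListener;

public class CartBean implements HttpSessionBindingListener {
    private java.sql.Connection connection; // example of an expensive resource held by the bean

    public void valueBound(HttpSessionBindingEvent event) {
        // acquire resources here if needed
    }

    // Called when the bean is removed from the session or the session expires
    public void valueUnbound(HttpSessionBindingEvent event) {
        try {
            if (connection != null) connection.close();
        } catch (java.sql.SQLException ignored) { }
        connection = null;
    }
}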

The other approach is simply to expire sessions more quickly. Most application servers have a setting for this interval. You can also set it programmatically by calling session.setMaxInactiveInterval(), which specifies the time in seconds between client requests before the servlet container invalidates the session.
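For example (the 10-minute value is arbitrary):

// Expire this user's session after 10 minutes of inactivity
HttpSession session = request.getSession();
session.setMaxInactiveInterval(600); // value is in seconds; a negative value means never expire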

HTTP Keep-Alive
The majority of Web servers on the market, including iPlanet, IIS, and Apache, support HTTP Keep-Alive. This feature keeps the connection between client and server open so that subsequent requests don't have to reestablish it. That's usually a win for sites serving static content. For sites under heavier load, however, the benefit of keeping a connection open for a single client carries a performance penalty: it ties up resources while the connection sits idle. This matters even more if your Web server and application server are one and the same.

Choose the Right Include Mechanism

A typical JSP architecture may break out headers, footers, and navigation into their own resources, included in each JSP page as appropriate. Currently there are two methods to include a resource: the include directive and the include action.

  • Include directive <%@ include file="copyleft.html" %> : Includes the content of the resource at compile time. The page with the directive and the resource are merged into one file before final compilation. When resources are resolved at compile time, they'll always be faster than resources resolved at runtime.
  • Include action <jsp:include page="copyleft.jsp" /> : Includes the response generated by executing the specified page. Since this is done at runtime, we can vary the output produced. Use this action only for content that changes often and for scenarios in which pages to include can't be decided until the main page has been requested.
Use Cache Tagging Features
Several vendors have added cache tagging features to their application servers for use with JSP. BEA's WebLogic Server introduced this feature in the 6.0 version of its product, and the OpenSymphony project supports it as well (OSCache). JSP cache tagging allows both fragments and page-level output to be cached. When a JSP page executes and a tagged fragment is found in the cache, the code that creates the fragment is skipped. Page-level caching intercepts requests for specific URLs and caches the resulting output. This is extremely useful for shopping cart/catalog and portal home pages, where a page-level cache can store the resulting content to satisfy future requests. Here's an example of a URL that would be a good candidate: http://quote.yahoo.com/q?s=hig&d=t.

Cache tagging provides the biggest performance gains for pages that contain significant logic; it has less effect on well-architected sites that already use an MVC (Model-View-Controller) framework.
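Purely as an illustration, a fragment cache with the OpenSymphony OSCache tag library might look like the following; the taglib URI and attribute names are assumptions and differ between OSCache versions and vendor tags such as WebLogic's wl:cache:

<%@ taglib uri="oscache" prefix="cache" %>

<%-- Cache this fragment for 5 minutes, keyed by the stock symbol in the request --%>
<cache:cache time="300" key='<%= request.getParameter("s") %>'>
    <%-- expensive fragment: quote lookup, formatting, etc. --%>
</cache:cache>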

Always Access Entity Beans from Session Beans
Accessing entity beans directly is bad for performance. When a client application accesses an entity bean directly, each get method is a remote call. A session bean accessing the entity bean locally can collect all the data in a structure and return it by value. You can read more about this technique in the Value Object (Transfer Object) pattern from Sun's Core J2EE Patterns catalog.
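A minimal sketch of the idea, using a hypothetical EmployeeValue object; the session bean populates it with one local pass over the entity bean and returns it to the client in a single remote call:

import java.io.Serializable;

// Value (transfer) object: all fields travel back to the client in one remote call
public class EmployeeValue implements Serializable {
    private final String name;
    private final String department;
    private final double salary;

    public EmployeeValue(String name, String department, double salary) {
        this.name = name;
        this.department = department;
        this.salary = salary;
    }

    public String getName()       { return name; }
    public String getDepartment() { return department; }
    public double getSalary()     { return salary; }
}

A session bean method such as getEmployee(id) would find the entity bean, copy its fields into an EmployeeValue, and return it by value, so the client pays for one remote call instead of one per field.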

Using a session bean to wrap access to an entity bean also allows for better transaction management, because the session bean commits only when it reaches a transaction boundary. Each direct call to a get method runs in its own transaction, and the container performs a load and a store around every transaction on an entity bean.

At times using an entity bean will result in bad performance. If the only purpose for an entity bean is to retrieve and update values, you'll gain better performance using JDBC within session beans.

Use Read-Only in the Deployment Descriptor
The deployment descriptor for an entity bean allows its get methods to be marked read-only. This improves performance when the unit of work in a transaction contains only read-only methods, because the container won't invoke the store.

Cache Access to EJB Homes
EJB home interfaces are obtained through a JNDI naming lookup, an operation that requires significant resources. A good place to put lookup code is a servlet's init() method. If your application requires EJB access from multiple servlets, it's worth creating an EJBHomeCache class, typically implemented as a singleton, as sketched below.
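A minimal sketch of such a singleton (the class and method names are just placeholders):

import java.util.HashMap;
import java.util.Map;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.rmi.PortableRemoteObject;

// Caches EJB home interfaces by JNDI name so each lookup happens only once
public class EJBHomeCache {
    private static final EJBHomeCache instance = new EJBHomeCache();
    private final Map homes = new HashMap();
    private Context ctx;

    private EJBHomeCache() { }

    public static EJBHomeCache getInstance() { return instance; }

    public synchronized Object getHome(String jndiName, Class homeClass) throws NamingException {
        Object home = homes.get(jndiName);
        if (home == null) {
            if (ctx == null) {
                ctx = new InitialContext();
            }
            home = PortableRemoteObject.narrow(ctx.lookup(jndiName), homeClass);
            homes.put(jndiName, home);
        }
        return home;
    }
}

A servlet would then obtain a home with something like (PayrollBeanHome) EJBHomeCache.getInstance().getHome("PayrollBeanHome", PayrollBeanHome.class), using the home interface from the example in the next section.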

Consider Local Interfaces for EJBs
Local interfaces are an addition to the EJB 2.0 specification, allowing a bean to avoid the overhead of a remote invocation. Consider the following code:

// JNDI lookup of the home interface: a remote (RMI) call
PayrollBeanHome home = (PayrollBeanHome) javax.rmi.PortableRemoteObject.narrow(
    ctx.lookup("PayrollBeanHome"), PayrollBeanHome.class);

// Creating the bean instance: the stub sends an IIOP request over the network, another remote call
PayrollBean bean = (PayrollBean) javax.rmi.PortableRemoteObject.narrow(
    home.create(), PayrollBean.class);

The first statement looks up the bean's home interface through JNDI, which is a remote (RMI) call, and narrows the result to a usable proxy reference. The second statement creates an instance: the call goes through a stub that builds an IIOP request and transmits it over the network, another remote call.

To switch to local interfaces, you extend EJBLocalObject and EJBLocalHome instead of EJBObject and EJBHome. On my Pentium 700MHz machine, changing from remote to local interfaces sped up method calls by about 20% overall. A sketch of the resulting interfaces follows.
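A minimal sketch of what the payroll bean's local interfaces might look like (the getSalary method is an assumption for illustration):

import javax.ejb.CreateException;
import javax.ejb.EJBLocalHome;
import javax.ejb.EJBLocalObject;

// Local home interface: create() no longer throws RemoteException
public interface PayrollBeanLocalHome extends EJBLocalHome {
    PayrollBeanLocal create() throws CreateException;
}

// Local component interface: calls are plain in-JVM method calls and arguments pass by reference
public interface PayrollBeanLocal extends EJBLocalObject {
    double getSalary(int employeeId);
}

The client-side lookup also gets simpler, since local home references don't need PortableRemoteObject.narrow(): PayrollBeanLocalHome home = (PayrollBeanLocalHome) ctx.lookup("PayrollBeanLocalHome");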

To implement local interfaces, you may have to make the following changes:

  1. Methods can no longer throw java.rmi.RemoteException. The same rule applies to exceptions derived from RemoteException, such as TransactionRequiredException, TransactionRolledbackException, and NoSuchObjectException. The EJB API provides equivalent local exceptions (TransactionRequiredLocalException, TransactionRolledbackLocalException, and NoSuchObjectLocalException).
  2. All data and return values are passed by reference, not by value.
  3. The local interface can be used only on the machine where the EJB is deployed; in other words, caller and bean must run in the same JVM. This restricts how the application can be distributed across a cluster.
  4. References for beans that implement local interfaces aren't serializable.
Consider Writing Your Own Stubs
Stubs are responsible for forwarding method invocations to remote beans and are typically generated by your deployment tools. If you haven't already implemented the value pattern, you can get a similar effect by modifying the stub to collect data before anything is transmitted over the wire. In a modified stub you can also implement your own caching routine or even compress data before it's transmitted.

Normally this technique isn't recommended, and it goes against earlier advice about being too clever. You'll have to worry about the deployment tool overwriting your changes, and your stub will need to figure out when to reload data if another client changes the underlying information. A rough sketch of the idea follows.
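Purely as a sketch (not generated-stub code), a hand-written caching wrapper around the PayrollBean remote interface from the earlier example might look like this; the getSalary method is hypothetical:

import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.Map;

// Caching wrapper that delegates to the real (generated) stub only on a cache miss
public class CachingPayrollStub implements PayrollBean {
    private final PayrollBean delegate;             // the container-generated stub
    private final Map salaryCache = new HashMap();  // employeeId -> Double

    public CachingPayrollStub(PayrollBean delegate) { this.delegate = delegate; }

    public synchronized double getSalary(int employeeId) throws RemoteException {
        Integer key = new Integer(employeeId);
        Double cached = (Double) salaryCache.get(key);
        if (cached == null) {                       // only go over the wire when the value isn't cached
            cached = new Double(delegate.getSalary(employeeId));
            salaryCache.put(key, cached);
        }
        return cached.doubleValue();
    }
}

The hard part, as noted above, is deciding when cached data is stale; a real implementation needs an expiry or notification scheme.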

Generating Primary Keys
There are many clever ways of generating primary keys within an EJB. I'll list several common techniques and then explain why they're all bad.

You can use the database's built-in identity (SQL Server IDENTITY or Oracle's SEQUENCE). This makes the implementation of your EJB nonportable (bad).

You could have an entity bean increment its own value, but this is bad as it requires serializable transactions, which are also slow.

You could use a time service such as NTP, but this requires native code and ties your bean to a particular OS. On a multi-CPU server it also leaves open the possibility of generating two identical keys in the same millisecond.

You could borrow an idea from Microsoft and generate a GUID, but you'll run into the fact that Java can't determine the MAC address of your network card without resorting to JNI, which again makes your bean OS-dependent.

You'll run into the same issues if you try to build keys from System.currentTimeMillis(), and it's hard to come up with a formula that avoids collisions. You could try static fields, but writable statics aren't allowed by the EJB specification.

There are several other approaches, but they all have limitations. There's one approach that sidesteps these problems: use RMI and JNDI together. Start by binding an RMI remote object into the JNDI tree via the RMI registry using the JNDI service provider interface; clients then look up the singleton through JNDI. Here's an example:

// Remote interface required by RMI: declares the key-generation method
public interface KeyGenerator extends java.rmi.Remote {
    long getKey() throws java.rmi.RemoteException;
}

public class KeyGen extends java.rmi.server.UnicastRemoteObject implements KeyGenerator {
    private long keyVal = System.currentTimeMillis();
    public KeyGen() throws java.rmi.RemoteException { }
    public synchronized long getKey() { return keyVal++; }
}
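Binding and using the generator might then look like the following sketch; the JNDI name and the rmi: provider URL are assumptions that depend on how your server's JNDI tree is configured:

// Server side: bind the singleton into JNDI via the RMI registry service provider
// (javax.naming.Context / InitialContext)
Context ctx = new InitialContext();   // e.g., java.naming.provider.url=rmi://localhost:1099
ctx.rebind("KeyGenerator", new KeyGen());

// Client side: look up the singleton and draw a unique value for a new primary key
KeyGenerator generator = (KeyGenerator) ctx.lookup("KeyGenerator");
long primaryKey = generator.getKey();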

JDBC and Unicode
Hopefully you've read the typical industry recommendations about using JDBC such as using connection pools, preferring stored procedures or direct SQL, using type 4 drivers, removing extra columns from the result set, using prepared statements when practical, having your DBA tune the query, and choosing the appropriate transaction levels.

Beyond these more obvious choices, the single best thing you can do for performance is to store all character data in Unicode (code page 13488). Java processes all character data in Unicode, so the database driver doesn't have to perform a conversion. But beware: this will cause your database size to grow, since Unicode requires 2 bytes per character, and any non-Unicode applications that access the same data will now pay the conversion cost instead.

JDBC and I/O
When your application requires access to a large result set, consider implementing block fetches. By default, JDBC fetches 32 rows at a time. As an example, if you needed to iterate through a result set of 5,000 rows, this would cause JDBC to make 157 calls to the database to fetch data. If you changed the block size to 512, this would require only 10 round-trips.
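With JDBC 2.0 this is a one-line hint on the statement (or result set); the driver is free to ignore it, and the table and column names below are made up:

// Ask the driver to fetch 512 rows per round-trip instead of its default
PreparedStatement stmt = connection.prepareStatement("SELECT id, name FROM employee");
stmt.setFetchSize(512);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    // process each row; the driver refills its internal buffer roughly every 512 rows
}
rs.close();
stmt.close();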

This tip may not work in several scenarios. If you use scrollable result sets or specify FOR UPDATE as part of the query, blocking isn't used. Another technique to use is the Page-by-Page Iterator pattern.

Consider Using an In-Memory Database
Many applications need to store a significant amount of per-user data in the session, typically a shopping cart or catalog. Since this kind of data is row/column in nature, many applications build large Vectors or HashMaps to hold it. Keeping such data in the session severely limits scalability: you need at least the memory per session multiplied by the maximum number of concurrent users, which either means a very expensive server or garbage collection pauses that stretch to unbearable lengths.

To get slightly better scalability, some teams offload the shopping cart/catalog functionality to the database tier. The fundamental problem there lies in the architecture of most relational databases: they make every write durable, so performance is ultimately tied to how fast data can be physically written to disk. Relational databases do try to reduce I/O, especially for reads, but only through complex caching algorithms, which is the main reason the database tier's number-one bottleneck is usually CPU.

There is an alternative: an in-memory database. Several vendors have addressed this market; I'm a fan of TimesTen, but you're welcome to choose your own. These products keep all operations in memory and allow data to be written without necessarily being persisted to disk. As a result, they don't need complex algorithms to reduce I/O and can rely on simpler, faster locking mechanisms.

Develop a Smarter Caching Mechanism
Many application developers have implemented caching mechanisms for frequently requested data, typically as hashmaps. The main problem with this approach is that there's no constraint on how large the hashmap can grow. Applications should consider constraining cache growth with an eviction policy that keeps only the most recently used objects, that is, a least recently used (LRU) cache. This is best accomplished by combining a hashmap with a linked list.

The code should implement the following logic; a minimal sketch follows the list:

  1. If the cache is full, remove the last object from the tail of the list and insert the new object at the head of the list.
  2. If the cache isn't full, and you want to insert a new object, put it at the head of the list.
  3. If the object is already in cache, move it to the head of the list.
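Since J2SE 1.4, java.util.LinkedHashMap combines the hashmap and the linked list in a single class, so a minimal sketch can lean on it (a hand-rolled HashMap-plus-LinkedList works the same way, just with more code):

import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache: entries are kept in access order, so the least recently used one is evicted first
public class LRUCache extends LinkedHashMap {
    private final int maxEntries;

    public LRUCache(int maxEntries) {
        super(16, 0.75f, true);   // true = order entries by access rather than insertion
        this.maxEntries = maxEntries;
    }

    // Called by put(); returning true evicts the eldest (least recently used) entry
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > maxEntries;
    }
}

Both get() and put() move the touched entry to the most recently used end of the internal list, which is exactly the behavior described in steps 1 through 3 above.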
Conclusion
Optimizing code is one of the last things developers should consider; a slow application that works is preferable to a fast one that doesn't. Keep in mind that performance is partly perception. It's often possible to improve a user's perception of performance without optimizing the application itself. Consider providing immediate feedback: users are happier with a screen that paints immediately and takes 10 seconds to finish processing than with one that takes seven seconds before it paints at all.

Author Bio
James McGovern is an enterprise architect with Hartford Technology Services Company LLC, an information technology services firm dedicated to helping businesses gain competitive advantage through the use of technology. His focus is on developing high-availability Internet applications. [email protected]
