In the November JDJ (Vol. 3, Issue 11) we peered into the Cosmic Cup to look at some of the Java Virtual Machines on the market. We also discussed how a VM enables Java to promote its "write once, run anywhere" (WORA) cause. To recapitulate, the Java programming environment may be categorized into two computing environments. The compile-time environment provides the translation of Java source code to bytecodes (.class files). The runtime environment provides the interpretation of the bytecodes into native, platform-specific, executable instructions. A Java Virtual Machine's purpose is to load class files and execute the bytecodes they contain.
The speed of execution in the runtime environment is crucial for the success of the Java platform. After all, it targets several facets of the computing industry, but the crux of the platform is still the Java programming language itself. Several IT managers are putting off a wholehearted commitment to pure Java solutions because they still aren't convinced that it will meet the performance requirements for their applications. Happy programmers and cool languages don't put bread on the table.
This month we'll take a closer look at the two stages of compilation that lead to executable Java code. We'll also examine the available Java code compilation alternatives. Java is an interpreted language. An executed Java program typically consists of a virtual machine that interprets bytecodes. Interpreted code can't match the speed of compiled code because interpretation converts program instructions into machine instructions one at a time, as the program runs. Natively compiled code, on the other hand, is machine code - the program is already in the form of machine instructions when it's ready to be executed. Obviously, the latter will execute faster.
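The two stages can be seen with nothing more than the JDK itself. The class below is a minimal sketch (the class name and message are invented for illustration): javac performs the compile-time translation to bytecode, and java performs the runtime interpretation.

```java
// Greeting.java
// Compile time:  javac Greeting.java  ->  produces Greeting.class (bytecode)
// Runtime:       java Greeting        ->  the JVM loads Greeting.class and
//                                         executes its bytecodes
public class Greeting {
    static String message() {
        return "compiled once, interpreted anywhere";
    }

    public static void main(String[] args) {
        System.out.println(message());
    }
}
```

The same Greeting.class file runs unchanged on any platform with a conforming JVM - that is the WORA half of the story.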
WORA or WOCA?
Do we really need a "write once, run anywhere" solution? WORA essentially means that the deployed version of code is bytecode. The application is developed in the Java programming language and then compiled down to bytecode. This bytecode is shipped to the machine where the application actually needs to run. "Anywhere" requires that the same code should be capable of running on different hardware and operating systems.
What's the alternative to code that runs in a JVM? To achieve true native speeds for a particular platform, the application should execute native code (which runs as a process spawned by the operating system as opposed to code that's interpreted by a VM). This should be a no-brainer. Platform-specific optimizations can be performed most efficiently on native code. So how do we get native code from Java source code that runs on every platform? This is a bit of a dilemma.
The answer boils down to the responsibilities of the players in the application development process - the developer and the compiler vendor. If the developers end up using platform-specific compilers for individual platforms, they'll run into the very porting headache that Java has tried to eliminate. The responsibility of providing platform-independent executable code falls on the guys who write the compiler. If the same code could be compiled to different platform executables, we'd get the best of both worlds. This would be the "WOCA" - "write once, compile anywhere" - solution. The approaches taken by WORA and WOCA are illustrated in Figure 1.
Note: Neither WORA nor WOCA is feasible in a language like C or C++. The reason is that there are many platform-specific extensions of the programming language itself, which means that the source code written by developers for one platform will differ from the source code written for another platform.
One of the greatest things Sun Microsystems has done for the computing world is to keep a tight rein on what goes into the Java programming language and to make sure that it has no platform-specific extensions. The result is that source code written in the Java programming language looks the same on any platform.
Compiled Just in Time!
Just-in-time (JIT) compilers translate Java bytecode into instructions that can be sent directly to the processor. A JIT compiler provides a second stage of compilation on the platform where the code actually runs: the JVM passes .class files to the JIT compiler, which compiles their bytecodes into native code for the machine the application runs on. Once recompiled by the JIT compiler, the code will usually run more quickly. Typically, JIT compilers come with the VM and their use is optional. The JIT is an integral part of the Java Virtual Machine implementation. Since Java is a dynamic language, the bytecodes are "dynamically" compiled into machine code only when they're loaded.
JITs make executable code faster only after it's been called for the first time. In fact, the first time a class is used, it may actually run slower, since it goes through an extra compilation step. Therefore, JITs are effective only when code is used repeatedly. Going by the 80-20 rule (a program spends 80% of its time in 20% of its code), this is true most of the time, so JIT compilers speed up execution considerably. When Java JIT compilers first came out, traditional Java interpreters were delivering execution speeds 20 to 30 times slower than comparable programs written in C. JIT compilers brought Java code to 40-60% of C/C++ execution speeds. Figure 2 illustrates the approach taken by JIT compilers.
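The warm-up effect is easy to sketch. The micro-benchmark below is illustrative only - the class name is invented, the timings depend entirely on the VM and hardware, and nothing about them is guaranteed - but on a JIT-enabled VM the first call to the hot method may include compilation overhead, while later calls run the compiled version.

```java
// WarmUp.java - a rough illustration of JIT warm-up. The printed timings
// are not asserted anywhere; they vary by VM and machine.
public class WarmUp {
    // A simple hot method: sums the integers 1..n.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        // First call: may include time spent JIT-compiling sumTo.
        long start = System.nanoTime();
        sumTo(1000000);
        long firstCall = System.nanoTime() - start;

        // Repeated calls: the loop now (typically) runs as native code.
        start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            sumTo(1000000);
        }
        long laterCalls = (System.nanoTime() - start) / 100;

        System.out.println("first call:        " + firstCall + " ns");
        System.out.println("later calls (avg): " + laterCalls + " ns");
    }
}
```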
JIT is currently the predominant technology for speeding up Java applets and applications. Almost all the compiler and IDE vendors - including Symantec (Visual Cafe), Borland (JBuilder), IBM (VisualAge), Asymetrix (SuperCede) and Microsoft (Visual J++) - provide Java compilers that have the JIT compilation option.
Find the "Hot Spots"!
While JIT compilers are a step up, they're still unable to reach the raw speeds of native code. Sophisticated optimization on JIT-compiled code is hard to achieve. Also, applying these optimizations slows the process of JIT compilation. For applications that require a larger performance gain, another viable approach is "adaptive optimization."
Adaptive optimization further leverages the 80-20 rule. The logic that drives adaptive compilers is that since the majority of a program's executable time is spent in a small fraction of the code, it makes sense to concentrate on making that fraction as fast as possible. Adaptive compilation uses a more "dynamic" and "intelligent" form of compilation. The runtime system identifies the sections of code that are performance bottlenecks by continuously monitoring the executing code. It then applies sophisticated optimization techniques to speed up execution in these critical sections of code.
This technique is used by Sun's much-awaited HotSpot Virtual Machine. Fundamentally, HotSpot is based on compiler technology that extends JIT compiler technology. The HotSpot VM constantly monitors the performance of the executing bytecode and "inlines" (remember the C++ "inline" keyword?) methods in the critical regions for maximum performance. When a method is inlined, the compiled code replaces the method call with the body of the method itself, eliminating the overhead of the call. Further optimization in the HotSpot VM comes from a very advanced garbage collector and a new thread synchronization mechanism.
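The kind of method an adaptive compiler targets is easy to picture. In the sketch below (the class is invented for illustration), the trivial accessor radius() is called inside a hot loop; an inlining compiler such as HotSpot can replace each call with the field access itself, so the compiled loop contains no method calls at all.

```java
// Circle.java - a trivial accessor of the kind an adaptive compiler inlines.
public class Circle {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    // A one-line accessor: a prime candidate for inlining.
    double radius() {
        return radius;
    }

    // A hot loop. After inlining, each iteration effectively becomes
    //   total += Math.PI * c.radius * c.radius;
    // with no method-call overhead.
    static double totalArea(Circle[] circles) {
        double total = 0.0;
        for (Circle c : circles) {
            total += Math.PI * c.radius() * c.radius();
        }
        return total;
    }

    public static void main(String[] args) {
        Circle[] circles = { new Circle(1.0), new Circle(2.0) };
        System.out.println("total area: " + totalArea(circles));
    }
}
```

The programmer writes the accessor for encapsulation; the adaptive compiler removes its cost where it matters, which is exactly the division of labor the 80-20 rule suggests.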
Sun claims that HotSpot will make Java execution speeds comparable to C/C++ speeds. The compiler, already more than a year late, is expected to be released early this year.
JIT compilers and adaptive compilers are very important for improving performance on the client side, where the program may be deployed on various platforms. Applets execute by downloading bytecodes (.class files) from specific URLs. The Java source code is compiled into bytecodes at the machine corresponding to the URL, and those bytecodes may be downloaded to a multitude of clients - that is, to several platforms. Hence, native optimizations can't be performed before the code is deployed. JIT compilers apply optimizations that aren't really platform specific (as they target several platforms). They're also limited by the amount of time that can be sacrificed in performing the compilations on the fly. An additional disadvantage of shipping bytecodes to the deployment machine is that decompiling them is easy. Therefore, companies can't prevent reverse engineering and piracy of their code.
The same restriction doesn't apply to code deployed at a single machine. In a client/server architecture this would be the server. Hence, code deployed at the server can be compiled down to native code, and global native optimizations can be applied to it. The trade-off is that this code won't run on other platforms. The reality is that it won't need to.
Native compilers compile Java source code down to platform-specific executables. In that sense they're no different from traditional C/C++ compilers. As mentioned earlier, the unique feature Java brings to the table is that the source code is always the same. Most IDEs provide the option of compiling Java source to native code. However, most IDEs support only one platform. For example, JBuilder, SuperCede, Visual Cafe, Visual J++ and so on run only on NT. As a result, although you develop platform-neutral source code in Java, you end up deploying on a single platform when you use a traditional native compiler.
This points toward the need for a cross-platform Java compiler, as illustrated earlier in Figure 1. In other words, we need a compiler from a single vendor that can compile Java source code down to native code, and that can take advantage of the specific platform it runs on. Such a compiler should also support compilation of bytecodes to support Java's dynamic nature. Currently, TowerJ from Tower Technologies provides these features. It provides support for several platforms, including NT and Solaris. Another cross-platform native compiler under development is JOVE from Instantiations.
What About Legacy Code?
Contrary to popular belief, the programming world will never be all Java. C/C++, Pascal, BASIC, Fortran, COBOL, Smalltalk and others will continue doing what they do best. Sun Microsystems recognizes this and that's why the Java Native Interface (JNI) is a part of the Java VM specification. Any vendor that provides a Java VM needs to support JNI (Microsoft thought they were the exception, but were recently proved wrong). JNI allows Java applications to interact with legacy applications written in other languages. Using JNI, the application or module is linked with Java code as a kind of shared library. If your Java application needs to interact with legacy code, you need to make sure that the compiler you choose supports JNI.
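On the Java side, a JNI binding amounts to declaring a method native and loading the shared library that implements it. The sketch below shows only the Java half; the library name "legacy" and the function it exports are hypothetical, and the class guards against the library being absent so that it still compiles and runs on its own.

```java
// LegacyBridge.java - the Java half of a hypothetical JNI binding.
public class LegacyBridge {
    // Implemented in C/C++; the generated header for this method would be:
    //   JNIEXPORT jint JNICALL Java_LegacyBridge_legacySum(JNIEnv *, jobject, jint, jint);
    public native int legacySum(int a, int b);

    static boolean nativeLoaded;

    static {
        try {
            // Looks for liblegacy.so (UNIX) or legacy.dll (Windows).
            System.loadLibrary("legacy");
            nativeLoaded = true;
        } catch (UnsatisfiedLinkError e) {
            // The hypothetical library isn't present; note it and carry on.
            nativeLoaded = false;
        }
    }

    public static void main(String[] args) {
        System.out.println("native library loaded: " + nativeLoaded);
    }
}
```

Calling legacySum() without the library would throw UnsatisfiedLinkError at the call site, which is why a real application checks that its native libraries loaded before invoking them.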
Application development in Java is eventually going to require a mix of the various compiler technologies available. If speed of execution isn't a major concern, a traditional Java compiler may be used. If the performance boost provided by JIT compilers suffices, that option can always be turned on. If HotSpot lives up to its expectations, developers and managers for realtime applications can breathe easy. And if it doesn't, native compilers will grab a larger market share. On the server side, native compilers will always find a home. Whatever compiler technology is chosen for development, one significant service that Java has performed for the developers is that it has shifted a large part of the burden of supporting cross-platform applications to the compiler vendors.
About the Author
Ajit Sagar is a member of the technical staff at i2 Technologies in Dallas, Texas. He holds a BS in electrical engineering from BITS Pilani, India, and an MS in computer science from Mississippi State University. He is a Java-certified programmer with eight years of programming experience, including two in Java. Ajit can be reached at [email protected]