
I've been working with Java for almost two and a half years now. I can say with confidence that I know pretty much what's going on in the Java core classes. Through the Java training I do, I try to pass an understanding of Java to my students (and hopefully through the students the same information can be passed on to other people who want to learn about Java). What I'd like to pass on in this month's Tips and Techniques column is the technique I've found has taught me more about how Java works than any other.

You'd think that the distribution of the source code for the Java core APIs (the classes in the java.* packages) would be sufficient to tell you almost anything you would want to know. For example, if you really want to know how the Abstract Window Toolkit is structured and works, there's nothing that's going to tell you as much as analyzing the classes in the java.awt package yourself. Books and on-line documentation will tell you how to use the AWT, but only a good understanding of the construction of the AWT classes will allow you to have strong confidence in your expertise.

Unfortunately, the full implementation of the AWT, and most of the other java.* packages, is not included in the source code release of the Java Development Kit (the JDK). For example, the details of the HTTP and FTP client classes that are distributed with the JDK are just not available. Another example: many Java interfaces that are well documented in the JDK do not have any examples for interface implementations. Take the java.applet.AppletContext and java.applet.AppletStub interfaces, for instance. No actual implementation of these Java interfaces is included with the JDK.

I'll admit something else here that I think will probably get me a little bit of hate e-mail. Sometimes I like to know how other Java developers do some of the really cool things I run across on the Web. Sometimes I run across an applet or a demo application, and I say to myself, "Wow! How did they do that?" And sometimes I employ the tools I'm about to describe to discover exactly how the developers did what I thought was so cool.

So for both of the situations, when I want to know more about the implementation of the core Java classes and when I just have to know how a particular trick was implemented, I employ Java decompiling tools to find out what I want to know. Maybe by admitting this fact publicly I'll be forgiven by the patron god of programmers. (Are you there, Bill? It's me, Brian.)

I'm neither the first nor the only developer to consider employing decompilers to dive into Java classes. In fact, one of the decompiling tools I'm going to describe here, called javap, was distributed with the Beta version of Java. Clearly, the Java designers thought this was a tool a significant number of Java engineers would want to have and would employ.

The javap tool is a program, distributed with the Java Development Kit, that can convert a compiled .CLASS file into a readable version of the Java class. The readable file is not a reproduction of the Java source code; the javap tool is only capable of making the bytecode in the .CLASS file readable. I find myself using this tool less and less as time goes on, especially since much more capable tools have been developed. Still, reading the reconstructed bytecode is sometimes necessary to answer questions about the Java language and VM.

For example, the first time I used javap (I remember it well) was when I couldn't find adequate Java documentation to answer this question: "When exactly are in-line instance variable initializers performed?" That is, suppose I have this simple class definition:

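class MyFoo {
    // In-line instance variable initializer
    public int m_MyFooNum = MyFooStaticMethod(10);

    public MyFoo() {
        if (0 != m_MyFooNum)
            m_MyFooNum = 100;
    }

    public static int MyFooStaticMethod(int i) {
        // returns something other than 0
        // (placeholder body so that the class compiles)
        return i + 1;
    }
}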

Then, what will the instance variable m_MyFooNum be equal to after the constructor is run? There are two options. If the in-line initialization is run before the code in the constructor, then the value of the variable will be 100 when the constructor is completed, no matter what the (non-zero) return value of MyFooStaticMethod is. On the other hand, if the constructor's code is run before the in-line initializer, then the variable will have the return value of MyFooStaticMethod. Granted, this example is a little contrived, but the underlying question, whether the Java compiler's behavior is deterministic, can be quite crucial.

To figure out the answer to this question, I might have gone to the Java language spec, but that's just not very exciting. Instead, I compiled a class that had this problem and then used the javap tool to decompile the resulting .CLASS file so that I could see the exact bytecode produced. As it turns out, the javap tool showed that the in-line initializer is always run before any code in a constructor body. In fact, the compiler copies the in-line initializer code into every class constructor, right after the call to the superclass constructor and before the rest of the constructor body (the exception is a constructor that begins by calling another constructor of the same class with this(...), which leaves the job to that other constructor). This includes any default class constructor provided by the compiler. That picayune detail about the Java language is not important in itself, but being able to answer such questions with the JDK's javap tool has turned out to be very valuable for me.
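The initializer ordering described above can also be checked at run time with a few lines of code. The following is only a minimal sketch: the class name InitOrderDemo and the method nonZero are made up for illustration and stand in for MyFoo and MyFooStaticMethod.

```java
public class InitOrderDemo {
    // In-line instance variable initializer, as in the MyFoo example.
    public int value = nonZero(10);

    public InitOrderDemo() {
        // If the initializer has already run, value is non-zero at this point.
        if (0 != value) {
            value = 100;
        }
    }

    // Hypothetical stand-in for MyFooStaticMethod: returns something other than 0.
    public static int nonZero(int i) {
        return i + 1;
    }

    public static void main(String[] args) {
        // The initializer runs before the constructor body, so this prints 100.
        System.out.println(new InitOrderDemo().value);
    }
}
```

After compiling this class, running `javap -c InitOrderDemo` should show the call to nonZero inside the constructor's bytecode, ahead of the comparison against zero.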

Other Java questions I had later could not be answered by looking at any documentation because no documentation existed. Some sample questions:

  • How are RMI proxy and stub classes constructed, exactly?
  • What are Microsoft's Visual J++ COM/Java shim classes made of?

The javap tool is included in the bin directory of the JDK distribution. If the JDK's java or javac tools are available through your system's search path, then the javap tool probably is, also. That is, if you can invoke the java interpreter or the javac compiler simply by typing "java" or "javac" on your system's command line, then you can probably just type "javap" to invoke the javap tool. I'm not going to go into the details of how you use this tool here so I can use the space to discuss other related tools and issues. Instead, I'll just encourage you to read the on-line documentation about the JDK tools available at the JavaSoft Web site (www.javasoft.com).

The decompilation of any of the Java core classes is a relatively simple task using the javap tool. These classes are distributed in the JDK in a file named CLASSES.ZIP. Using a zip file tool, such as WinZip (a shareware tool available at www.winzip.com) or PKZip, you can extract the .CLASS file for any of the Java core classes from the zip archive. Then, apply the javap tool to the .CLASS file to get a text version of the bytecode in the .CLASS file.
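If you would rather not reach for a separate zip tool, the extraction step can be done from Java itself with the standard java.util.zip classes. This is just a sketch under assumptions: the class name ClassExtractor, the archive location and the entry name passed on the command line are placeholders, not anything prescribed by the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ClassExtractor {
    // Copies one entry out of a zip archive into targetDir and returns the file written.
    public static Path extract(Path archive, String entryName, Path targetDir)
            throws IOException {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No entry " + entryName + " in " + archive);
            }
            // Keep only the simple file name so the result lands directly in targetDir.
            String simpleName = entryName.substring(entryName.lastIndexOf('/') + 1);
            Path target = targetDir.resolve(simpleName);
            try (InputStream in = zip.getInputStream(entry)) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return target;
        }
    }

    public static void main(String[] args) throws IOException {
        // Example arguments (assumed): classes.zip java/applet/Applet.class
        Path out = extract(Paths.get(args[0]), args[1], Paths.get("."));
        System.out.println("Wrote " + out);
    }
}
```

The extracted file can then be handed to javap in the usual way.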

For Java core classes that have the source code distributed in the JDK, this is a useless technique. The source code is available and should be used. Some classes in the CLASSES.ZIP file, however, are not strictly part of the Java core API and so do not have the source code available. For example, the URL protocol handler and content handler classes do not have associated source code available. When you need to know how these classes are implemented (or you just want to know) you can use the javap tool to at least take a look at the bytecode. From the javap tool output, it's just a matter of mental gymnastics to approximate the original class source code.

Decompiling .CLASS files distributed on the Web is also a relatively simple affair using the javap tool. Let's say you see a really cool Applet with some great animation whose algorithm you would like to know. Just viewing the Applet embedded in an HTML doc doesn't readily allow you to do this. The trick is to generate the URL for the Applet's .CLASS file yourself and use your browser to view that .CLASS file directly. Most browsers have a "Save the File" feature which you can then use to save the .CLASS file to your local disk. Now that you have a .CLASS file, you apply the javap tool to decompile the downloaded .CLASS file and you've achieved your goal.

For example, you see an HTML page with the URL http://www.myserver.com/index.html. This page has a cool applet in it. Step one is to view the source for the HTML file and find the <APPLET> tag for the embedded applet. The fields of the <APPLET> tag should give you sufficient information to generate the URL for the Applet's .CLASS file. The .CLASS file's URL is a combination of the page's URL, the applet's CODEBASE (which may be an empty string or a URI) and the Applet class's name with the extension ".class". Step two is to generate the applet's URL (e.g., http://www.myserver.com/appletcodebase/TheApplet.class). Step three is to type this URL into the browser and view the file. What you'll see at this point is a lot of garbage characters, which is pretty useless by itself. Step four is to save the .CLASS file to a file on your local file system (e.g., to a file in your home directory called "TheApplet.class"). The final step is to apply the javap tool to the .CLASS file.
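Step two, combining the page URL, the CODEBASE and the class name, can be sketched with java.net.URI. The class AppletUrl and its method name are invented for this illustration, and the sketch assumes that an empty CODEBASE means "the directory that contains the page" and that a non-empty CODEBASE names a directory.

```java
import java.net.URI;

public class AppletUrl {
    // Builds the URL of an applet's class file from the values found in the page.
    public static URI classFileUrl(String pageUrl, String codebase, String className) {
        URI page = URI.create(pageUrl);
        // An empty or missing CODEBASE is taken to mean the page's own directory.
        String dir = (codebase == null || codebase.isEmpty()) ? "." : codebase;
        // Treat the codebase as a directory so the class file is appended to it.
        if (!dir.endsWith("/")) {
            dir += "/";
        }
        URI base = page.resolve(dir);
        // The tag may give the name with or without the ".class" extension.
        String suffix = ".class";
        String name = className.endsWith(suffix)
                ? className.substring(0, className.length() - suffix.length())
                : className;
        // Package separators become path separators.
        return base.resolve(name.replace('.', '/') + suffix);
    }

    public static void main(String[] args) {
        // prints http://www.myserver.com/appletcodebase/TheApplet.class
        System.out.println(classFileUrl(
                "http://www.myserver.com/index.html", "appletcodebase", "TheApplet"));
    }
}
```

Because URI.resolve leaves an absolute reference untouched, a CODEBASE that is itself a full URL on another server is handled by the same code path.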

Several people have also made .CLASS file to .JAVA source code file generators. My favorite is the WingDis decompiler available from WingSoft software (www.wingsoft.com) for a nominal fee. Using such decompiler tools you can generate straight Java source code from a compiled .CLASS file. If you don't feel like slogging through readable bytecode, using such a tool is probably the best bet for you.

Now that I've published how anyone in the world can grab a copy of your Applet and steal your code, I feel that I must comfort some of my readers with advice about how you can effectively prevent other users from decompiling your .CLASS files. I mention two techniques briefly.

The first technique is very simple: Don't publish .CLASS files you don't want others to decompile on the Web. If you've implemented some really cool framework or communications scheme, protect it by not distributing it. Generally you're going to do this by running the class only on the server-side. Even if this means your Applet or push channel is going to run less efficiently, in many instances it's worth the extra hassle to protect your code.

The second protective technique, if you can't restrict the distribution of your classes, is a very old technique called "obfuscation". This essentially involves renaming all the public, protected and private members of your classes, as well as the classes themselves, using strings of random characters. So, if you have a class named "CreditCardProcessor" with a public method named "PerformTransaction", I can probably understand what's going on in your class and might be tempted to steal and use it. I won't be able to make much sense of the same class when it is named "ksdf98whlwf98wf" and its public method is named "ks9wkf8". It's a simple and crude technique, but it has been used effectively for years.

A couple of code obfuscation tools have been created to generate obfuscated classes from your Java source code or compiled .CLASS files. Search with your favorite Web search engine for the string "Java and obfuscat*". You should get a few hits of interest.

Using these two techniques for protecting your code, you should be able to effectively protect your Java classes from theft or casual hacking.

About the Author
Brian Maso is a programming consultant working out of Portland, OR. He is the co-author of The Waite Group Press's upcoming release, "The Java API SuperBible." Before Java, he spent five years corralled in the MS Windows branch of programming, working for such notables as the Hearst Corp., First DataBank and Intel. Readers are encouraged to contact Brian via e-mail with any comments or questions at [email protected]

	

Listing 1
 
// This method can be called to compile the .java file with the given filename.
// The reference to method "getCompilerClassPath()" is a placeholder for whatever
// mechanism you would use to build the classpath that the compiler should use.
// For example, you might give a list of JAR files, ZIP files or local file system directories.
public static void compile(String fileName) {
    // Run the javac compiler inside the current address space.
    String astrArgs[] = {
            "-classpath",
            getCompilerClassPath(),
            "-nowarn",
            fileName
    };

    // The sun.tools.javac.Main constructor takes two args:
    // the error output stream to use (which may be a file and pipe or whatever)
    // and the string "javac".
    sun.tools.javac.Main compiler =
        new sun.tools.javac.Main(System.err, "javac");
    boolean fOk = compiler.compile(astrArgs);

    if (!fOk)
        System.err.println("Compilation of file " + fileName +
                " failed!");
}

Listing 2
 
// This static method compresses multiple source files into a single JAR file.
// The manifest file is an optional parameter which is ignored if it is null.
public static void jar(String jarFile, String[] aFileNames, String manifestFile) {
    String[] astrArgs = null;
    if (null != manifestFile) {
        // Options, JAR file name and manifest file name, then the input files.
        astrArgs = new String[3 + aFileNames.length];
        astrArgs[0] = "cfm";
    } else {
        // Options and JAR file name, then the input files.
        astrArgs = new String[2 + aFileNames.length];
        astrArgs[0] = "cf";
    }

    astrArgs[1] = jarFile;

    int i = 2;
    if (null != manifestFile)
        astrArgs[i++] = manifestFile;

    for (int j = 0; j < aFileNames.length; j++)
        astrArgs[i + j] = aFileNames[j];

    sun.tools.jar.Main jartool =
        new sun.tools.jar.Main(System.out, System.err, "jar");

    boolean ok = jartool.run(astrArgs);
    if (!ok)
        System.err.println("Jar tool invocation failed");
}


 

All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.
  E-mail: [email protected]

Java and Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. SYS-CON Publications, Inc. is independent of Sun Microsystems, Inc.