
Journeyman's HTTP Driver, by Marc Connolly

If you were to consider all the surface clutter in my life, you might find it strange to learn that, deep down inside, I like things simple. I mean really simple, because I'm not very bright and have trouble with basic concepts, like getting out of bed and getting the car pointed in the right direction on Monday mornings.

I'm also cheap. Folks who know me will tell you they've seen me pick up pennies from the street, but this is simply untrue. I stoop for nickels and dimes when people are nearby. Pennies I always walk past and fetch afterwards when no one is looking. By the way, this is a secret I'd like kept between you and me, okay?

There's something else. The past 10 years of travel have left me feeling a bit like a journeyman craftsman, not too unlike a coppersmith of old who traveled from town to town plying his trade and wares. Given this trait, I tend to gravitate toward things that are lightweight and portable - the more, the better.

These particular attributes came to the surface recently when I had to perform a volume test against a servlet-based application I'd written. I wanted to get a feeling for the behavior of several key pooling components and how they fared under stress without going through a lot of trouble or expense.

So I went looking for an existing solution in that great ether we call the Internet and which I have occasionally, in a less charitable frame of mind, referred to as the great flotsam.

HTTP Driver Criteria
My criteria were simple. What I needed had to meet three basic requirements. First, because I'm a simpleton, the solution had to be easy to set up and use.

Second, it had to be economical because, remember, I'm cheap and didn't want to spend a lot of time or money.

Finally, I preferred a solution that was reasonably portable. I didn't want something I would have to make or go through a lot of trouble setting up and reconfiguring every time I moved to a different environment.

So with those requirements in mind, I spent some time looking around for anything that appeared promising; I felt pretty sure, within an hour or so, that I'd run across an acceptable solution. However, and a bit to my annoyance, I came up empty-handed.

Well, that's not exactly true, because I did find other folks who were interested in the same thing, and I found references to commercial products, a few of which I'd worked with in the past.

But given my criteria (simple, cheap, portable), I decided I'd looked around enough. So, sitting back in a huff, I considered my options:

  1. Give the app to the end users as is and hope it works
  2. Hire and train some chimpanzees to do the volume tests
  3. Quit, move to another country, and hope no one comes looking for me

Option 1 had seldom worked in the past and, given the number of times I'd tried this approach, odds weren't good that it would be any different this time. On the surface, option 2 seemed to have some merit until I recalled the energy expended training another primate more closely related to me, namely, my son. Option 3 was out of the question because they always come looking for you, wherever you run.

Since I was out of alternatives and the need to do a volume test wasn't going away, I decided to take a more pragmatic approach and try to come up with something on my own.

After a bit of doodling I decided that with a little effort, a viable solution was possible in Java that would meet two of my requirements. First, developing the solution in Java would give me the portability I was looking for. Second, the richness of Java would provide me a level of abstraction that would remove me from the grittier aspects of, say, socket programming details. Java's richness would also lend itself to the economies of time and expense I was looking for. However, the "simple, easy to use" requirement would be up to me.

HTTP Driver Solution
The end result was a trivial Java-based application employing sockets and threads to issue multiple HTTP requests - and process responses - repetitively, against one or more HTTP servers. Roughly speaking, the application can simulate the activity of many users.

The structure of the application is quite simple (see Figure 1).

Figure 1

The application is composed of three classes - HTTPDriver, HTTPSubDriver (which extends Thread), and URLProcessor. HTTPDriver creates one or more instances of HTTPSubDriver, each of which, in turn, creates a single instance of URLProcessor. An instance of HTTPSubDriver is analogous to a single user, while the URLProcessor could be thought of as the repetitive serial actions ("click stream," if you prefer) taken by that user.

Initially, HTTPDriver's main purpose is to read at least two parameters. The first indicates how long HTTPDriver is to run. The second specifies the location of another file that contains information on the number of "users" (HTTPSubDriver instances) to create and the runtime details for each. For example, HTTPDriver could be invoked as follows:

java HTTPDriver 20 subDriverDefs.txt

The first parameter, 20, indicates the maximum amount of time, in minutes, the entire process is to run, so this invocation of HTTPDriver would run for a maximum of 20 minutes. The second parameter is the name of a file that contains a list of "user" definitions or, more precisely, HTTPSubDriver definitions.

HTTPDriver always produces a log that, by default, is called HTTPDriver.log and is created in the current directory. You could, however, override that by supplying an alternative log file. For instance:

java HTTPDriver 20 subDriverDefs.txt \tmp\HTTPDriver.log
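
The command-line handling described here could look roughly like the following. This is only a sketch and not the code from the download: the class name DriverArgs and its field names are hypothetical. Only the argument order (runtime in minutes, definitions file, optional log file) and the HTTPDriver.log default come from the description above.

```java
// Hypothetical holder for HTTPDriver's command-line arguments.
class DriverArgs {
    final long runtimeMillis;   // maximum runtime, converted from minutes
    final String subDriverFile; // file containing the HTTPSubDriver definitions
    final String logFile;       // falls back to HTTPDriver.log when omitted

    DriverArgs(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                "usage: java HTTPDriver <minutes> <subDriverDefs> [logFile]");
        }
        runtimeMillis = Long.parseLong(args[0]) * 60L * 1000L;
        subDriverFile = args[1];
        logFile = (args.length > 2) ? args[2] : "HTTPDriver.log";
    }
}
```

Converting the runtime to milliseconds once, up front, makes it easy to compare against System.currentTimeMillis() later on.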

Again, each user is represented by an instance of HTTPSubDriver with its own set of parameters. Using the preceding command line invocation as an example, the parameters for the HTTPSubDriver instances would have been in the file called subDriverDefs.txt.

These parameters convey, for example, the subDriver's own individual runtime (which could be less than HTTPDriver itself), its sleep time, the file containing the list of URLs to be processed, and the maximum number of passes URLProcessor is permitted to make upon that file within the time HTTPSubDriver has been allotted to run. Using the preceding command line invocation as an example, subDriverDefs.txt could contain something like this:

TEST_1 1 10 www.dev.com:80 EchoServlet.txt 3 2000
TEST_2 10 15 www.acp.com:80 empSelect.txt 9 4000
DML_1 5 3 www.dev.com:80 EmpDML1.txt 8 2000
DML_2 1 10 www.dev.com:80 EmpDML2.txt 7 2000

Each line represents the parameters for a single instance of HTTPSubDriver. The first parameter is an arbitrary tag that HTTPSubDriver uses whenever it writes to the log. The second parameter is the maximum runtime, in minutes, the HTTPSubDriver instance is to execute, while the third parameter specifies the HTTPSubDriver's sleep interval in seconds.

As you can probably deduce, the fourth parameter is the host and port of the HTTP server to which the requests will be directed and from which the responses will be read. Note that in the example I have www.dev.com and www.acp.com specified, presumably referencing two different hosts. There's nothing to stop you from driving traffic to different hosts within the same HTTPDriver instance.

The fifth parameter is the name of a file containing the list of one or more URLs (more on these later) to process. The sixth parameter indicates the maximum number of passes the URLProcessor can make on that file. The seventh and final parameter is the size of a character buffer into which segments of the response stream are read.
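
To make the seven fields concrete, here is one way a definition line could be pulled apart. It is a sketch under assumptions: the class SubDriverDef and its field names are hypothetical, the fields are assumed to be separated by whitespace as in the sample, and falling back to port 80 when no port is given is my own choice rather than anything stated above.

```java
// Hypothetical value object for one line of the sub-driver definitions file.
class SubDriverDef {
    final String id;          // 1: tag written to the log
    final int runtimeMinutes; // 2: maximum runtime in minutes
    final int sleepSeconds;   // 3: sleep interval in seconds
    final String host;        // 4: host part of host:port
    final int port;           // 4: port part of host:port
    final String urlFile;     // 5: file containing the list of URLs
    final int maxPasses;      // 6: maximum passes over that file
    final int bufferSize;     // 7: size of the response character buffer

    SubDriverDef(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 7) {
            throw new IllegalArgumentException(
                "expected 7 fields but found " + f.length + ": " + line);
        }
        id = f[0];
        runtimeMinutes = Integer.parseInt(f[1]);
        sleepSeconds = Integer.parseInt(f[2]);
        int colon = f[3].lastIndexOf(':');
        host = (colon < 0) ? f[3] : f[3].substring(0, colon);
        // Assumption: default to port 80 if the definition omits the port.
        port = (colon < 0) ? 80 : Integer.parseInt(f[3].substring(colon + 1));
        urlFile = f[4];
        maxPasses = Integer.parseInt(f[5]);
        bufferSize = Integer.parseInt(f[6]);
    }
}
```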

Now that was a mouthful but, again, the implementation is quite simple: HTTPDriver creates one or more HTTPSubDrivers, each of which creates a URLProcessor to do the work (see Figure 2).

Figure 2

Internal Structure
Let's look at some of the code at a high level. As part of its initialization, HTTPDriver creates a Vector of its HTTPSubDriver instances to monitor and control the instances collectively:

private Vector threadPool = new Vector();

As each HTTPSubDriver instance is created, its handle is added to the Vector:

threadPool.add(
    new HTTPSubDriver(
        subDriverId,
        subDriverRuntime,
        subDriverSleepTime,
        hostAndPort,
        subDriverInputFile,
        maxInputFileIterations,
        httpResponseBufferSize,
        logTimestampFormat));

During its instantiation, HTTPSubDriver creates an instance of URLProcessor:

urlProcessor =
    new URLProcessor(
        subDriverId,
        hostAndPort,
        subDriverInputFile,
        maxInputFileIterations,
        httpResponseBufferSize,
        logTimestampFormat);

After HTTPDriver has created all the HTTPSubDriver instances, it marks them as daemons and starts them:

for (int i = 0; i < threadPool.size(); i++)
{
    HTTPSubDriver httpSubDriver =
        (HTTPSubDriver) threadPool.elementAt(i);
    httpSubDriver.setDaemon(true);
    httpSubDriver.start();
}

Remember, as each HTTPSubDriver was created, an instance of URLProcessor was also created. Generally speaking, URLProcessor reads the file it was given that contains a list of URLs. For each URL encountered, URLProcessor opens a socket to the server, writes the request, waits for and then reads the response, and finally closes the socket. URLProcessor does this one request after another. The requests can use any valid HTTP method but were typically GET and POST in my case. There's nothing to stop you from including other methods, such as HEAD, DELETE, OPTIONS, and so forth, in the list of URLs. PUT would require a few extra lines of code, however, similar to the FORM data handling described below.

URLProcessor recognizes two formats:

{Method} {URL} HTTP-version

or

{Method} {URL} HTTP-version FORMDATA={}

where FORMDATA has the following format:

keyWord1=value1&keyWord2=value2...

A typical file containing a list of URLs might contain many entries and will look something like this:

GET /dev/servlets/Login HTTP/1.0
GET /TFGreenOnBlack.jpg HTTP/1.0
GET /stylesheet.css HTTP/1.0
GET /applets/MenuManager.jar HTTP/1.0
POST /dev/servlets/Login HTTP/1.0 FORMDATA=userId=Frank&userPswd=Zappa

Note the FORMDATA literal in the last sample line. This is a keyword that URLProcessor looks for to determine whether there's any FORM data to be appended to the request as its entity body.
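
As an illustration of how an entry with the FORMDATA keyword could be turned into a request, consider the sketch below. The class RequestBuilder is hypothetical, and the article does not say which headers URLProcessor actually sends. Content-Type and Content-Length are assumed here because a server generally needs them to read a form-encoded entity body.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that turns one entry of the URL list into request text.
class RequestBuilder {
    private static final String KEYWORD = " FORMDATA=";
    private static final String CRLF = "\r\n";

    static String build(String entry) {
        int at = entry.indexOf(KEYWORD);
        if (at < 0) {
            // No form data: request line followed by the blank line ending the headers.
            return entry.trim() + CRLF + CRLF;
        }
        String requestLine = entry.substring(0, at).trim();
        String body = entry.substring(at + KEYWORD.length());
        // Assumption: the form data is plain ASCII, so bytes equal characters.
        int length = body.getBytes(StandardCharsets.US_ASCII).length;
        return requestLine + CRLF
            + "Content-Type: application/x-www-form-urlencoded" + CRLF
            + "Content-Length: " + length + CRLF
            + CRLF
            + body;
    }
}
```

Applied to the last sample line above, build() yields a POST request whose entity body is userId=Frank&userPswd=Zappa.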

With the exception of the last line, this sample looks very much like a typical HTTP server's access log. Also note that if you wanted to gather the "click stream" of a particular application, the HTTP server's access log would be an ideal source to begin with. Or you could do something similar to what I did. Within my application I embedded two fragments in the doGet and doPost methods, respectively, which enabled me to quickly generate "click streams" for playing back through HTTPDriver. I've since folded those fragments into an existing general utility class, reducing the code to a single line in each method.

URLProcessor's next() method is where all the activity happens: it is where the requests to the HTTP server are issued and the responses handled. For each URL in the list, it forms the request, opens a socket, writes the request stream to it, and then reads the response stream from it. After each request is written and its response completely read, the socket is closed - indirectly by a .close() on the response stream - and a relative response time duration is calculated and logged.
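
The response-reading half of that cycle might be sketched as follows. ResponseReader and drain() are hypothetical names, and the loop is an assumption about how a fixed-size character buffer (the seventh HTTPSubDriver parameter) would typically be used, not the author's actual code.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: reads a response to end-of-stream in fixed-size chunks.
class ResponseReader {
    // Returns the number of characters read; always closes the reader.
    static long drain(Reader in, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        long total = 0;
        try {
            int n;
            // read() returns -1 once the server has closed its end of the connection.
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                total += n;
            }
        } finally {
            in.close();
        }
        return total;
    }
}
```

With a real connection, the reader would wrap socket.getInputStream(), and closing it also closes the socket. Taking System.currentTimeMillis() just before the request is written and again after drain() returns gives the relative response time.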

This repetitive process is governed by dispatching each instance of HTTPSubDriver. As each instance is dispatched, it invokes its URLProcessor's next() method and continues to do so until interrupted.
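
A run() loop along these lines would produce that behavior. Everything here is a sketch: SubDriverSketch and the Work interface are hypothetical stand-ins for HTTPSubDriver and URLProcessor, and having next() return false once its passes are used up is an assumption made for illustration.

```java
// Hypothetical stand-in for HTTPSubDriver's thread loop.
class SubDriverSketch extends Thread {
    // Hypothetical stand-in for URLProcessor; false means "nothing left to do".
    interface Work {
        boolean next();
    }

    private final Work work;
    private final long runtimeMillis;
    private final long sleepMillis;

    SubDriverSketch(Work work, long runtimeMillis, long sleepMillis) {
        this.work = work;
        this.runtimeMillis = runtimeMillis;
        this.sleepMillis = sleepMillis;
    }

    public void run() {
        long stopTime = System.currentTimeMillis() + runtimeMillis;
        try {
            // Keep issuing requests until interrupted, out of time, or out of work.
            while (!isInterrupted()
                    && System.currentTimeMillis() < stopTime
                    && work.next()) {
                Thread.sleep(sleepMillis);
            }
        } catch (InterruptedException e) {
            // Interrupted while sleeping: fall through and let the thread end.
        }
    }
}
```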

As it runs, HTTPDriver periodically examines the state of all the HTTPSubDrivers it created:

while (System.currentTimeMillis() < driverStopTime)
{
    logMessage("Current Elapsed Runtime: "
        + (System.currentTimeMillis() - driverStartTime) / 1000
        + " seconds");
    if (backgroundThreadsActive())
    {
        Thread.sleep(10000); // check again in 10 seconds
    }
    else
    {
        logMessage("All background threads are inactive");
        break;
    }
}

If all of the HTTPSubDriver instances are no longer active, HTTPDriver closes its log file and terminates. That's it. As you can see by this description, the entire process is fairly straightforward.
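
The body of backgroundThreadsActive() isn't shown above. One plausible way to write such a check, assuming it simply asks each thread in the pool whether it is still alive, is sketched below. ThreadPoolCheck is a hypothetical class name, and the pool is passed in as a parameter only to keep the example self-contained.

```java
import java.util.List;

// Hypothetical sketch of the liveness check over the pool of sub-driver threads.
class ThreadPoolCheck {
    // Returns true if at least one thread has been started and has not yet finished.
    static boolean backgroundThreadsActive(List<? extends Thread> threadPool) {
        for (Thread t : threadPool) {
            if (t.isAlive()) {
                return true;
            }
        }
        return false;
    }
}
```

Because Vector implements List, the threadPool Vector could be passed to this method directly.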

Runtime Considerations
There are a few things you should consider.

Remember when I said that this solution can simulate, roughly speaking, the activity of many users? Well, I qualified the statement for a reason. If you think about the nature of this application's structure and implementation, you'll recognize some limitations. For instance, we're trying to approximate the behavior of multiple users, but what happens if we run a single instance of HTTPDriver with 20 HTTPSubDriver instances - that is, 20 users?

First off, the runnable queue could get quite long, potentially elongating the duration between dispatch for some threads. This problem could be pronounced if your environment is not conducive to parallelism - for example, running in a single CPU configuration or using green threads, which is the default behavior. The implications of this problem are such that some of the HTTPSubDriver instances might not get dispatched frequently or long enough to do anything substantive.

Compounding the issue of a long runnable queue could be the interwoven effects of normal thread blocking, for instance, on socket read waits. Making things further problematic, I've set each HTTPSubDriver thread to Thread.NORM_PRIORITY and haven't considered systems where time slicing may not be supported. For example, I didn't implement yield() in the construction of run() in HTTPSubDriver.

Something else to think about would be network congestion along the routes taken by this application. But more important would be the congestion at the point of origin - in other words, where HTTPDriver is running. For instance, in my tests, when I ran a single instance of HTTPDriver from one machine using 10 HTTPSubDriver definitions, the behavior was adequate. However, when I started multiple instances - three in my case - of HTTPDriver on the same machine, each with 10 HTTPSubDriver definitions, things got a bit slow. Aside from the fact that I was running on a single CPU, I was choking my NIC.

By reworking the test configuration (two HTTPDrivers, each with six HTTPSubDriver instances per machine) across several "client" machines, I got a reasonable amount of traffic resembling the transit/response times of a single instance of HTTPDriver on a single machine.

These observations about runnable queue length and network congestion yield a key practical consideration in the application of this solution: having an environment where the test workload can be distributed effectively is ideal.

Summary
It should be evident by now that what I've described is a trivial solution to a common requirement and is not intended to be a commercial-grade substitute. This solution does, however, have benefits for those of us seeking a portable, economical, and effective means for generating HTTP traffic without having to employ and train chimpanzees - or leave the country in a hurry.

Author Bio
Marc Connolly is by trade a programmer. Currently working for Oracle Corporation, he has worked for various companies, large and small, over the past 20 years. His focus has been primarily in product development for and with relational databases with occasional forays into stranger venues.

Download Associated Source Files (Zip format - 9.82 KB)

 

All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.

Java and Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. SYS-CON Publications, Inc. is independent of Sun Microsystems, Inc.