If you were to consider all the surface clutter in my life, you might
find it strange to know, deep down inside, I like things simple. I
mean really simple, because I'm not very bright and have trouble with
basic concepts, like getting out of bed and getting the car pointed
in the right direction on Monday mornings.
I'm also cheap. Folks who know me will tell you they've seen
me pick up pennies from the street, but this is simply untrue. I
stoop for nickels and dimes when people are nearby. Pennies I always
walk past and fetch afterwards when no one is looking. By the way,
this is a secret I'd like kept between you and me, okay?
There's something else. The past 10 years of travel have left
me feeling a bit like a journeyman craftsman, not too unlike a
coppersmith of old who traveled from town to town plying his trade
and wares. Given this trait, I tend to gravitate toward things that
are lightweight and portable - the more, the better.
These particular attributes came to the surface recently when
I had to perform a volume test against a servlet-based application
I'd written. I wanted to get a feeling for the behavior of several
key pooling components and how they fared under stress without going
through a lot of trouble or expense.
So I went looking for an existing solution in that great
ether we call the Internet and which I have occasionally, in a less
charitable frame of mind, referred to as the great flotsam.
HTTP Driver Criteria
My criteria were simple. What I needed had to meet three
basic requirements. First, because I'm a simpleton, the solution had
to be easy to set up and use.
Second, it had to be economical because, remember, I'm cheap
and didn't want to spend a lot of time or money.
Finally, I preferred a solution that was reasonably portable.
I didn't want something I would have to make or go through a lot of
trouble setting up and reconfiguring every time I moved to a
different environment.
So with those requirements in mind, I spent some time looking
around for anything that appeared promising; I felt pretty sure,
within an hour or so, that I'd run across an acceptable solution.
However, and a bit to my annoyance, I came up empty-handed.
Well, that's not exactly true, because I did find other folks
who were interested in the same thing, and I found references to
commercial products, a few of which I'd worked with in the past.
But given my criteria (simple, cheap, portable), I decided
I'd looked around enough. So, sitting back in a huff, I considered my
options:
- Give the app to the end users as is and hope it works
- Hire and train some chimpanzees to do the volume tests
- Quit, move to another country, and hope no one comes looking for me
Option 1 had seldom worked in the past and, given the number
of times I'd tried this approach, odds weren't good that it would be
any different this time. On the surface, option 2 seemed to have some
merit until I recalled the energy expended training another primate
more closely related to me, namely, my son. Option 3 was out of the
question because they always come looking for you, wherever you run.
Since I was out of alternatives and the need to do a volume test
wasn't going away, I decided to take a more pragmatic approach and
try to come up with something on my own.
After a bit of doodling I decided that with a little effort,
a viable solution was possible in Java that would meet two of my
requirements. First, developing the solution in Java would give me
the portability I was looking for. Second, the richness of Java would
provide me a level of abstraction that would remove me from the
grittier aspects of, say, socket programming details. Java's richness
would also lend itself to the economies of time and expense I was
looking for. However, the "simple, easy to use" requirement would be
up to me.
HTTP Driver Solution
The end result was a trivial Java-based application employing
sockets and threads to issue multiple HTTP requests - and process
responses - repetitively, against one or more HTTP servers. Roughly
speaking, the application can simulate the activity of many users.
The structure of the application is quite simple (see Figure 1).
The application is composed of three classes - HTTPDriver,
HTTPSubDriver (which extends Thread), and URLProcessor. HTTPDriver
creates one or more instances of HTTPSubDriver, each of which, in
turn, creates a single instance of URLProcessor. An instance of
HTTPSubDriver is analogous to a single user, while the URLProcessor
could be thought of as the repetitive serial actions ("click stream,"
if you prefer) taken by that user.
Initially, HTTPDriver's main purpose is to read at least two
parameters. The first indicates how long HTTPDriver is to run. The
second specifies the location of another file that contains
information on the number of "users" (HTTPSubDriver instances) to
create and the runtime details for each.
For example, HTTPDriver could be invoked as follows:
java HTTPDriver 20 subDriverDefs.txt
The first parameter, 20, is the maximum amount of time, in minutes,
that the entire process is to run. The second parameter is the name
of a file that contains a list of "user" definitions or, more
precisely, HTTPSubDriver definitions.
HTTPDriver always produces a log that, by default, is called
HTTPDriver.log and is created in the current directory. You could,
however, override that by supplying an alternative log file. For
instance:
java HTTPDriver 20 subDriverDefs.txt \tmp\HTTPDriver.log
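The argument handling described above might be sketched as follows. This is only an illustration, not the article's actual code: the class name DriverArgs and its field names are hypothetical, and only the argument order, the minutes unit, and the default log name come from the text.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical holder for HTTPDriver's command-line arguments. */
final class DriverArgs {
    final long runtimeMillis;        // total runtime, converted from minutes
    final String subDriverDefsFile;  // file listing the HTTPSubDriver definitions
    final String logFile;            // defaults to HTTPDriver.log

    private DriverArgs(long runtimeMillis, String defsFile, String logFile) {
        this.runtimeMillis = runtimeMillis;
        this.subDriverDefsFile = defsFile;
        this.logFile = logFile;
    }

    /** Expects: minutes, definitions file, and an optional log file. */
    static DriverArgs parse(String[] args) {
        if (args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "usage: java HTTPDriver <minutes> <subDriverDefs> [logFile]");
        }
        long minutes = Long.parseLong(args[0]);
        if (minutes <= 0) {
            throw new IllegalArgumentException("runtime must be positive");
        }
        String log = (args.length == 3) ? args[2] : "HTTPDriver.log";
        return new DriverArgs(TimeUnit.MINUTES.toMillis(minutes), args[1], log);
    }
}
```

Converting the runtime to milliseconds up front makes it easy to compare against System.currentTimeMillis() later on.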
Again, each user is represented by an instance of
HTTPSubDriver with its own set of parameters. Using the preceding
command line invocation as an example, the parameters for the
HTTPSubDriver instances would have been in the file called
subDriverDefs.txt.
These parameters convey, for example, the subDriver's own
individual runtime (which could be less than HTTPDriver itself), its
sleep time, the file containing the list of URLs to be processed, and
the maximum number of passes URLProcessor is permitted to make upon
that file within the time HTTPSubDriver has been allotted to run.
In that case, subDriverDefs.txt could contain something like this:
TEST_1 1 10 www.dev.com:80 EchoServlet.txt 3 2000
TEST_2 10 15 www.acp.com:80 empSelect.txt 9 4000
DML_1 5 3 www.dev.com:80 EmpDML1.txt 8 2000
DML_2 1 10 www.dev.com:80 EmpDML2.txt 7 2000
Each line represents the parameters for a single instance of
HTTPSubDriver. The first parameter is an arbitrary tag that
HTTPSubDriver uses whenever it writes to the log. The second
parameter is the maximum runtime, in minutes, the HTTPSubDriver
instance is to execute, while the third parameter specifies the
HTTPSubDriver's sleep interval in seconds.
As you can probably deduce, the fourth parameter is the host
and port of the HTTP server against which the requests will be
directed and the responses read from. Note in the example I have
www.dev.com and www.acp.com specified, presumably referencing two
different hosts. There's nothing to stop you from driving traffic to
different hosts within the same HTTPDriver instance.
The fifth parameter is the name of a file containing the list
of one or more URLs (more on these later) to process. The sixth
parameter indicates the maximum number of passes the URLProcessor can
make on that file. The seventh and final parameter is the size of a
character buffer into which segments of the response stream are read.
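To make the seven fields concrete, here is a minimal sketch of how one definition line could be parsed. The class SubDriverDef and its field names are hypothetical, and it assumes the fields are separated by whitespace, as in the sample above; the article's own parsing code may differ.

```java
/** Hypothetical value class for one line of subDriverDefs.txt. */
final class SubDriverDef {
    final String tag;          // 1: arbitrary tag written to the log
    final int runtimeMinutes;  // 2: maximum runtime in minutes
    final int sleepSeconds;    // 3: sleep interval in seconds
    final String host;         // 4: host part of host:port
    final int port;            // 4: port part of host:port
    final String urlFile;      // 5: file containing the list of URLs
    final int maxPasses;       // 6: maximum passes over that file
    final int bufferSize;      // 7: response buffer size in characters

    private SubDriverDef(String[] f, String host, int port) {
        this.tag = f[0];
        this.runtimeMinutes = Integer.parseInt(f[1]);
        this.sleepSeconds = Integer.parseInt(f[2]);
        this.host = host;
        this.port = port;
        this.urlFile = f[4];
        this.maxPasses = Integer.parseInt(f[5]);
        this.bufferSize = Integer.parseInt(f[6]);
    }

    /** Splits a definition line on whitespace and validates the field count. */
    static SubDriverDef parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 7) {
            throw new IllegalArgumentException("expected 7 fields: " + line);
        }
        int colon = f[3].lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port: " + f[3]);
        }
        String host = f[3].substring(0, colon);
        int port = Integer.parseInt(f[3].substring(colon + 1));
        return new SubDriverDef(f, host, port);
    }
}
```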
Now that was a mouthful but, again, the implementation is
quite simple: HTTPDriver creates one or more HTTPSubDrivers, each of
which creates a URLProcessor to do the work (see Figure 2).
Internal Structure
Let's look at some of the code at a high level. As part of
its initialization, HTTPDriver creates a Vector of its HTTPSubDriver
instances to monitor and control the instances collectively:
private Vector threadPool = new Vector();
As each HTTPSubDriver instance is created, its handle is added to the
Vector:
threadPool.add(new HTTPSubDriver(
    subDriverId,
    subDriverRuntime,
    subDriverSleepTime,
    hostAndPort,
    subDriverInputFile,
    maxInputFileIterations,
    httpResponseBufferSize,
    logTimestampFormat));
During its instantiation, HTTPSubDriver creates an instance
of URLProcessor:
urlProcessor = new URLProcessor(
    subDriverId,
    hostAndPort,
    subDriverInputFile,
    maxInputFileIterations,
    httpResponseBufferSize,
    logTimestampFormat);
After HTTPDriver has created all the HTTPSubDriver instances,
it marks them as daemons and starts them:
for (int i = 0; i < threadPool.size(); i++)
{
    HTTPSubDriver httpSubDriver =
        (HTTPSubDriver) threadPool.elementAt(i);
    httpSubDriver.setDaemon(true);
    httpSubDriver.start();
}
Remember, as each HTTPSubDriver was created, an instance of
URLProcessor was also created. Generally speaking, URLProcessor reads
the file it was given that contains a list of URLs. For each URL
encountered, URLProcessor opens a socket to the server, writes the
request, waits for and then reads the response, and finally closes
the socket. URLProcessor does this one request after another. The
requests can be any valid HTTP method but were typically GET and POST
in my case. There's nothing to stop you from including other methods,
such as HEAD, DELETE, OPTIONS, or so forth, in the list of URLs. PUT
would require a few extra lines of code, however, similar to the way
FORM data handling is described below.
URLProcessor recognizes two formats:
{Method} {URL} HTTP-version
or
{Method} {URL} HTTP-version FORMDATA={}
where FORMDATA has the following format:
keyWord1=value1&keyWord2=value2...
A typical file containing a list of URLs might contain many
entries and will look something like this:
GET /dev/servlets/Login HTTP/1.0
GET /TFGreenOnBlack.jpg HTTP/1.0
GET /stylesheet.css HTTP/1.0
GET /applets/MenuManager.jar HTTP/1.0
POST /dev/servlets/Login HTTP/1.0 FORMDATA=userId=Frank&userPswd=Zappa
Note the FORMDATA literal in the last sample line. This is a
keyword that URLProcessor looks for to determine whether there's any
FORM data to be appended to the request as its entity body.
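As a rough sketch of that step, the class below splits an entry at the FORMDATA keyword and builds the request text. The class UrlEntry is hypothetical, and the Content-Type and Content-Length headers are an assumption about what a server needs in order to read the entity body; the article does not say which headers URLProcessor actually sends.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical parsed form of one line from a URL list file. */
final class UrlEntry {
    private static final String FORM_KEY = " FORMDATA=";

    final String requestLine; // e.g. "GET /stylesheet.css HTTP/1.0"
    final String formData;    // null when the entry carries no FORM data

    private UrlEntry(String requestLine, String formData) {
        this.requestLine = requestLine;
        this.formData = formData;
    }

    /** Splits the entry at the FORMDATA keyword, if present. */
    static UrlEntry parse(String line) {
        String trimmed = line.trim();
        int i = trimmed.indexOf(FORM_KEY);
        if (i < 0) {
            return new UrlEntry(trimmed, null);
        }
        return new UrlEntry(trimmed.substring(0, i).trim(),
                            trimmed.substring(i + FORM_KEY.length()));
    }

    /** Builds the request text, with the FORM data as the entity body. */
    String toRequest() {
        StringBuilder sb = new StringBuilder(requestLine).append("\r\n");
        if (formData != null) {
            int length = formData.getBytes(StandardCharsets.ISO_8859_1).length;
            sb.append("Content-Type: application/x-www-form-urlencoded\r\n");
            sb.append("Content-Length: ").append(length).append("\r\n");
        }
        sb.append("\r\n"); // blank line ends the header section
        if (formData != null) {
            sb.append(formData);
        }
        return sb.toString();
    }
}
```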
With the exception of the last line, this sample looks very
much like a typical HTTP server's access log. Also note that if you
wanted to gather the "click stream" of a particular application, the
HTTP server's access log would be an ideal source to begin with. Or
you could do something similar to what I did. Within my application I
embedded two fragments in the doGet and doPost methods, respectively,
which enabled me to quickly generate "click streams" for playing back
through HTTPDriver. I've since folded those fragments into an
existing general utility class reducing the code to a single line in
each method.
URLProcessor's next() method is where all the activity
happens and where the requests to the HTTP server are issued and the
responses handled. URLProcessor is responsible for reading each URL
from the list, forming the request, creating a socket to write the
request stream to, and subsequently reading the response stream from.
After each request is written and its response completely read, the
socket is closed - indirectly by a .close() on the response stream -
and a relative response time duration is calculated and logged.
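The request and response cycle could look something like the sketch below. The class TimedExchange and its method names are hypothetical, and the ISO-8859-1 character set is an assumption. The sketch reads the response in buffer-sized segments and measures elapsed wall-clock time around the whole exchange.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of one timed request/response exchange. */
final class TimedExchange {

    /** Reads the response to end of stream in bufferSize-character segments. */
    static long drain(Reader response, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        long total = 0;
        int n;
        while ((n = response.read(buffer, 0, buffer.length)) != -1) {
            total += n;
        }
        return total;
    }

    /** Sends one request and returns the elapsed time in milliseconds. */
    static long timeRequest(String host, int port, String request,
                            int bufferSize) throws IOException {
        long start = System.currentTimeMillis();
        // Closing the reader on the socket's input stream also closes the
        // socket; the socket is listed as a resource too, as a safety net.
        try (Socket socket = new Socket(host, port);
             Reader in = new InputStreamReader(
                 socket.getInputStream(), StandardCharsets.ISO_8859_1)) {
            Writer out = new OutputStreamWriter(
                socket.getOutputStream(), StandardCharsets.ISO_8859_1);
            out.write(request);
            out.flush();
            drain(in, bufferSize);
        }
        return System.currentTimeMillis() - start;
    }
}
```

With HTTP/1.0 requests like those in the sample list, the server closes the connection after responding, so reading until end of stream is enough to know the response is complete.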
This repetitive process is governed by dispatching each
instance of HTTPSubDriver. As each instance is dispatched, it invokes
its URLProcessor's next() method and continues to do so until
interrupted.
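A sub-driver's loop might be sketched as below. This is not the article's HTTPSubDriver: the class name SubDriverSketch is hypothetical, a BooleanSupplier stands in for URLProcessor's next() method (returning false when there is no work left), and sleeping after every call is an assumption about where the sleep interval is applied.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a sub-driver thread; not the article's class. */
final class SubDriverSketch extends Thread {
    private final long runtimeMillis;   // this sub-driver's own runtime
    private final long sleepMillis;     // pause between calls to next
    private final BooleanSupplier next; // stand-in for URLProcessor.next()

    SubDriverSketch(long runtimeMillis, long sleepMillis, BooleanSupplier next) {
        this.runtimeMillis = runtimeMillis;
        this.sleepMillis = sleepMillis;
        this.next = next;
    }

    @Override
    public void run() {
        long stopTime = System.currentTimeMillis() + runtimeMillis;
        try {
            // Keep working until interrupted, out of time, or out of work.
            while (!isInterrupted()
                    && System.currentTimeMillis() < stopTime
                    && next.getAsBoolean()) {
                Thread.sleep(sleepMillis);
            }
        } catch (InterruptedException e) {
            // Interrupted while sleeping: fall through and let the thread end.
        }
    }
}
```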
As it runs, HTTPDriver periodically examines the state of all
the HTTPSubDrivers it created:
while (System.currentTimeMillis() < driverStopTime)
{
    logMessage("Current Elapsed Runtime: "
        + (System.currentTimeMillis() - driverStartTime) / 1000
        + " seconds");
    if (backgroundThreadsActive())
    {
        Thread.sleep(10000); // 10s
    }
    else
    {
        logMessage("All background threads are inactive");
        break;
    }
}
If all of the HTTPSubDriver instances are no longer active,
HTTPDriver closes its log file and terminates. That's it. As you can
see by this description, the entire process is fairly straightforward.
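The article does not show backgroundThreadsActive(), but a check of that kind could be as simple as the sketch below, which assumes "active" means the thread has been started and has not yet finished. The class and method names here are hypothetical.

```java
import java.util.List;

/** Hypothetical helper for checking a pool of worker threads. */
final class ThreadPoolMonitor {

    /** Returns true if at least one thread has started and not yet died. */
    static boolean anyAlive(List<? extends Thread> threads) {
        for (Thread t : threads) {
            if (t.isAlive()) {
                return true;
            }
        }
        return false;
    }
}
```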
Runtime Considerations
There are a few things you should consider.
Remember when I said that this solution can simulate, roughly
speaking, the activity of many users? Well, I qualified the statement
for a reason. If you think about the nature of this application's
structure and implementation, you'll recognize some limitations. For
instance, we're trying to approximate the behavior of multiple users,
but what happens if we run a single instance of HTTPDriver with 20
HTTPSubDriver instances - that is, 20 users?
First off, the runnable queue could get quite long,
potentially lengthening the interval between dispatches for some
threads. This problem could be pronounced if your environment is not
conducive to parallelism - for example, running in a single CPU
configuration or using green threads, which is the default behavior.
The implications of this problem are such that some of the
HTTPSubDriver instances might not get dispatched frequently or long
enough to do anything substantive.
Compounding the issue of a long runnable queue could be the
interwoven effects of normal thread blocking, for instance, on socket
read waits. Making things further problematic, I've set each
HTTPSubDriver thread to Thread.NORM_PRIORITY and haven't considered
systems where time slicing may not be supported. For example, I
didn't implement yield() in the construction of run() in
HTTPSubDriver.
Something else to think about would be network congestion
along the routes taken by this application. But more important would
be the congestion at the point of origin - in other words, where
HTTPDriver is running. For instance, in my tests, when I ran a single
instance of HTTPDriver from one machine using 10 HTTPSubDriver
definitions, the behavior was adequate. However, when I started
multiple instances - three in my case - of HTTPDriver on the same
machine, each with 10 HTTPSubDriver definitions, things got a bit
slow. Aside from the fact that I was running on a single CPU, I was
choking my NIC.
By reworking the test configuration (two HTTPDrivers, each
with six HTTPSubDriver instances per machine) across several "client"
machines, I got a reasonable amount of traffic resembling the
transit/response times of a single instance of HTTPDriver on a single
machine.
These observations about runnable queue length and network
congestion yield a key practical consideration in the application of
this solution: having an environment where the test workload can be
distributed effectively is ideal.
Summary
It should be evident by now that what I've described is a
trivial solution to a common requirement and is not intended to be a
commercial-grade substitute. This solution does, however, have
benefits for those of us seeking a portable, economical, and
effective means for generating HTTP traffic without having to employ
and train chimpanzees - or leave the country in a hurry.
Author Bio
Marc Connolly is by trade a programmer. Currently working for Oracle
Corporation, he has worked for various companies, large and small,
over the past 20 years. His focus has been primarily in product
development for and with relational databases with occasional forays
into stranger venues. marc.connolly@oracle.com