There are several textbooks and Internet articles that dwell on the
performance and scalability benefits of using a thread pool versus creating
new threads in a multithreaded Java application.
While some of them overstate the benefits, most fail to emphasize some
of the caveats of Java thread pooling. Due to space constraints, this article
provides only a brief summary of the benefits and emphasizes the drawbacks.
A list of references that covers the benefits in more detail is provided at
the end.
What Is Thread Pooling?
Thread pooling refers to a technique where a pool of worker threads is
created and managed by the application. When a new job arrives, instead of
creating a new thread to service it, it's queued by the thread-pool manager
and dispatched later to one of the available worker threads. The thread-pool
manager manages the number of active worker threads based on available
resources as well as load considerations, adding new threads to the pool or
freeing some worker threads in response to the number of outstanding
requests. The primary goals of thread pooling are managing the number of
active threads in the system and reducing the overhead of creating new
threads by reusing threads from a pool.
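The mechanism described above can be sketched in a few dozen lines. This is a minimal illustration only, not a robust implementation: the class TinyThreadPool and its methods are hypothetical names, the pool size is fixed rather than adjusted to load, and the code assumes a current JDK (generics and lambdas postdate the J2SE releases discussed in this article).

```java
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

/** Hypothetical minimal pool: fixed number of workers, unbounded FIFO job queue. */
class TinyThreadPool {
    private final LinkedList<Runnable> queue = new LinkedList<>();
    private final Thread[] workers;
    private boolean shutdown = false; // guarded by the queue's monitor

    TinyThreadPool(int size) {
        workers = new Thread[size];
        for (int i = 0; i < size; i++) {
            workers[i] = new Thread(this::workerLoop, "worker-" + i);
            workers[i].start();
        }
    }

    /** Queues a job; an idle worker picks it up later. */
    void submit(Runnable job) {
        synchronized (queue) {
            if (shutdown) throw new IllegalStateException("pool is shut down");
            queue.addLast(job);
            queue.notify(); // wake one waiting worker
        }
    }

    /** Lets the workers drain the queue, then waits for them to exit. */
    void shutdownAndWait() throws InterruptedException {
        synchronized (queue) {
            shutdown = true;
            queue.notifyAll();
        }
        for (Thread w : workers) w.join();
    }

    private void workerLoop() {
        while (true) {
            Runnable job;
            synchronized (queue) {
                while (queue.isEmpty() && !shutdown) {
                    try {
                        queue.wait();
                    } catch (InterruptedException e) {
                        return; // treat interruption as a request to stop
                    }
                }
                if (queue.isEmpty()) return; // shut down and nothing left to run
                job = queue.removeFirst();
            }
            try {
                job.run(); // run outside the lock so other workers can dequeue
            } catch (RuntimeException e) {
                // a failing job must not kill the reusable worker
            }
        }
    }

    /** Runs `jobs` trivial jobs on `size` workers; returns {jobsRun, distinctThreadsUsed}. */
    static int[] demo(int size, int jobs) {
        Set<String> threadNames = new HashSet<>();
        int[] ran = {0};
        TinyThreadPool pool = new TinyThreadPool(size);
        for (int i = 0; i < jobs; i++) {
            pool.submit(() -> {
                synchronized (threadNames) {
                    threadNames.add(Thread.currentThread().getName());
                    ran[0]++;
                }
            });
        }
        try {
            pool.shutdownAndWait();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        synchronized (threadNames) {
            return new int[] { ran[0], threadNames.size() };
        }
    }

    public static void main(String[] args) {
        int[] r = demo(3, 20);
        System.out.println(r[0] + " jobs ran on " + r[1] + " thread(s)");
    }
}
```

Even this toy version has to make decisions about shutdown, interruption, and failing jobs, and it leaves out sizing, throttling, and scheduling policy entirely.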
Why Pool Threads?
The primary argument in favor of managing the number of active threads
in the system is that threads carry a memory overhead, since each one needs
a certain amount of memory for its stack. Threads also add scheduling
overhead, since the scheduler's work increases as the number of threads
increases. Depending on the implementation of the Java Virtual Machine, each
Java thread on certain operating systems may correspond to an OS thread,
making Java threads extremely heavyweight, and may limit the total number of
active threads that the JVM is allowed to create.
To be clear, I'm not saying you don't need to manage the number of
active threads in a system. After all, the benefits of multithreading do
have diminishing returns once the number of threads contending for the
available CPUs increases. If a server can process only about 1,000
simultaneous requests, it doesn't help to dispatch each incoming request as
it's made. Often the requests must be queued and processed at a controlled
rate to maintain the number of active requests below the server threshold. A
common mistake, however, is to assume that dispatching queued requests
automatically calls for the reuse of threads from a thread pool. Dispatching
a request to a new thread and letting the thread die once the request is
serviced achieves the same effect on managing the number of active threads
in the system.
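To make the point concrete, here is a rough sketch of throttling without pooling. The ThrottledDispatcher class is a hypothetical name, and holding the caller back inside dispatch() is just one simple way to queue requests; the code assumes a current JDK. Each request gets a brand-new thread that dies when the request completes, yet the number of active threads never exceeds the limit.

```java
/** Hypothetical dispatcher: a new thread per request, at most maxActive alive at once. */
class ThrottledDispatcher {
    private final int maxActive;
    private int active = 0; // guarded by this
    private int peak = 0;   // guarded by this

    ThrottledDispatcher(int maxActive) {
        this.maxActive = maxActive;
    }

    /** Blocks the caller until a slot is free, then services the request on a new thread. */
    Thread dispatch(Runnable request) throws InterruptedException {
        synchronized (this) {
            while (active >= maxActive) wait(); // the request is held back here
            active++;
            peak = Math.max(peak, active);
        }
        Thread t = new Thread(() -> {
            try {
                request.run();
            } finally {
                release(); // the thread dies after this; nothing is reused
            }
        });
        t.start();
        return t;
    }

    private synchronized void release() {
        active--;
        notify(); // let one waiting dispatch() call proceed
    }

    synchronized int peakActive() {
        return peak;
    }

    /** Dispatches `requests` short jobs; returns {requestsCompleted, peakActiveThreads}. */
    static int[] demo(int maxActive, int requests) {
        ThrottledDispatcher dispatcher = new ThrottledDispatcher(maxActive);
        int[] done = {0};
        try {
            Thread[] threads = new Thread[requests];
            for (int i = 0; i < requests; i++) {
                threads[i] = dispatcher.dispatch(() -> {
                    try {
                        Thread.sleep(20); // stand-in for real request processing
                    } catch (InterruptedException e) {
                        return;
                    }
                    synchronized (done) {
                        done[0]++;
                    }
                });
            }
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { done[0], dispatcher.peakActive() };
    }

    public static void main(String[] args) {
        int[] r = demo(3, 10);
        System.out.println(r[0] + " requests done, peak active threads: " + r[1]);
    }
}
```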
The second argument is that thread creation has an overhead that can, in
many cases, exceed the overhead of managing a thread pool. While this
argument still applies, the relative performance impact has changed
significantly over the
years. The newer JVM implementations are optimized for creating threads;
most use a combination of user-level threads (known as green threads) as
well as system-level threads (or OS threads) to make creating threads much
less expensive than in earlier implementations.
The Dichotomy of Pooling Threads
The reasons for pooling threads seem to make perfect sense, just as
connection pooling makes perfect sense in a server-side application. Used
inappropriately, thread pooling in Java can introduce serious programming
flaws, ranging from logic errors to potential deadlocks and even performance
bottlenecks.
I distinguish thread pooling in general from thread pooling in Java
simply because many of the arguments that apply to thread pooling in Java do
not apply to other programming environments. Perhaps a common source of
misconception about the benefits of thread pooling in Java stems from our
experiences in other environments where the cost-benefit equation tilts
strongly in favor of thread pooling. In the following discussion, "thread
pooling" implies "thread pooling in Java," unless stated otherwise.
Thread Pooling Breaks Usage of Thread-Local Variables
Thread pooling is not friendly to the java.lang.ThreadLocal and
java.lang.InheritableThreadLocal classes that were introduced in JDK 1.2 to provide
thread-local variables. These variables differ from other variables in that
each thread has its own independently initialized copy of the variable. The
typical usage of a thread-local variable in a multithreaded application is
to keep track of some application context associated with the request, such
as the identity of the user making the request. The get() and set() methods
in the ThreadLocal class return and set the value that corresponds to the
executing thread. Thus, each thread executing a get() on a given ThreadLocal
variable can potentially get a different object. The set() similarly allows
each executing thread to set a different value for the same ThreadLocal
variable.
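A short sketch of that behavior follows. The class name and the user names are made up for illustration, and the code assumes a current JDK. Two threads set and read the same declared variable, and each sees only its own value; the calling thread, which never called set(), sees null.

```java
import java.util.Arrays;

class ThreadLocalDemo {
    // One declared variable, one independent value per thread.
    private static final ThreadLocal<String> USER_ID = new ThreadLocal<>();

    /** Returns what thread A, thread B, and the calling thread each see. */
    static String[] demo() {
        String[] seen = new String[3];
        Thread a = new Thread(() -> {
            USER_ID.set("alice");
            seen[0] = USER_ID.get();
        });
        Thread b = new Thread(() -> {
            USER_ID.set("bob");
            seen[1] = USER_ID.get();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        seen[2] = USER_ID.get(); // the calling thread never called set()
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [alice, bob, null]
    }
}
```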
Think of a ThreadLocal variable as a hashmap that stores one value per
thread by using the thread as a key into the hashmap; however, these values
are "associated" with the thread in a stronger and more intrusive way. Each
thread maintains a reference to a private version of a hashmap (implemented
as a package accessible class, ThreadLocalMap) that contains all the
thread-local variables associated with that thread. Each thread uses the
declared ThreadLocal variable as the key into the hashmap to store one value
per ThreadLocal variable. When a thread dies and is garbage collected, all
thread-local values referenced by it are subject to garbage collection
(unless they're referenced elsewhere).
InheritableThreadLocal extends ThreadLocal to allow thread-local
variables associated with a parent thread to be inherited by any new child
thread created by the parent thread. This class is designed to replace the
ThreadLocal in those cases where a per-thread attribute being maintained by
the variable, such as UserId, TransactionId, etc.,
must be automatically transmitted to any child threads that are created. To
achieve the inheritance, the Thread class maintains a separate private
hashmap (ThreadLocalMap) for inheritable thread-local variables. The Thread
constructor ensures that the inheritable thread-local variables of the
executing thread (the parent thread) are copied onto itself (the child
thread).
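The difference between the two classes can be seen in a small sketch. The class name and the "txn-42" value are invented for illustration, and the code assumes a current JDK. A parent thread sets both kinds of variable and then creates a child; only the inheritable value is visible in the child.

```java
import java.util.Arrays;

class InheritDemo {
    private static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    private static final InheritableThreadLocal<String> INHERITED =
            new InheritableThreadLocal<>();

    /** Returns what a child thread sees for the plain and the inheritable variable. */
    static String[] demo() {
        String[] seen = new String[2];
        Thread parent = new Thread(() -> {
            PLAIN.set("txn-42");
            INHERITED.set("txn-42");
            // Inheritable values are copied when the child Thread is constructed.
            Thread child = new Thread(() -> {
                seen[0] = PLAIN.get();     // not inherited
                seen[1] = INHERITED.get(); // copied from the parent
            });
            child.start();
            try {
                child.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        parent.start();
        try {
            parent.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [null, txn-42]
    }
}
```

Note that the copy happens once, at construction. A pooled worker created long before the request arrived inherits whatever its creator had at that moment, not the context of the request it later services.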
Thus, each Thread object has explicit references to all the thread-local
variables, which in turn are only accessible via the ThreadLocal or
InheritableThreadLocal object. Like normal variables, private ThreadLocal or
InheritableThreadLocal variables are only accessible to the declaring class and the
threads associated with them. While it's possible to expose a method in the
Thread class to "purge" all (inheritable) thread-local variables associated
with the thread, it would require additional security checks to ensure that
only privileged code can do so, the privilege being ascertained using the
Java permission mechanism. Given the lack of such a construct even in the
latest versions of the J2SE/J2EE APIs, there's no way for a thread-pool
manager to purge or reset all the thread-local variables associated with a
given thread when reusing the thread in a different request context without
the explicit cooperation of all code that uses any thread-local variables.
Unless the declaring code "removes" a value assignment by explicitly
setting the value to null, thread-local variables remain assigned and hence
"associated" with the thread. As a result, any code that uses thread locals
risks using stale/incorrect values of the variables that were created in an
earlier request context when running in a pooled thread. Given that
ThreadLocal and InheritableThreadLocal are standard J2SE/J2EE classes,
they're quite likely being used in various pieces of library code, none of
which is safe to be executed by a pooled thread without an explicit
understanding of the usage details.
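The stale-value hazard is easy to reproduce. In the sketch below, which assumes a current JDK, a single thread that services two requests back to back stands in for a pooled worker; the class, the currentUser() helper, and the user names are hypothetical. The first request sets the user and never clears it, and the second request, which should be anonymous, picks up the leftover value. With a fresh thread per request, the second request starts clean.

```java
import java.util.Arrays;

class StaleThreadLocalDemo {
    private static final ThreadLocal<String> USER_ID = new ThreadLocal<>();

    /** Library-style code that trusts whatever the thread-local currently holds. */
    static String currentUser() {
        String user = USER_ID.get();
        return user == null ? "anonymous" : user;
    }

    /** Returns the user seen by an unauthenticated second request: {reused thread, fresh thread}. */
    static String[] demo() {
        String[] seen = new String[2];

        // Reused thread: request 1 and request 2 run on the same worker.
        Thread reused = new Thread(() -> {
            USER_ID.set("alice");    // request 1 sets its context and never clears it
            seen[0] = currentUser(); // request 2 inherits the stale value
        });

        // Thread per request: each request gets its own, initially empty, thread-local map.
        Thread first = new Thread(() -> USER_ID.set("alice"));
        Thread second = new Thread(() -> {
            seen[1] = currentUser();
        });

        try {
            reused.start();
            reused.join();
            first.start();
            first.join();
            second.start();
            second.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [alice, anonymous]
    }
}
```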
The only way to get around this is to avoid using a pooled thread to
execute code whose implementation details you neither know nor control. An
application that uses a thread pool to dispatch requests made in different
contexts is likely to hit "inconsistent" logic errors whenever a request
executes a piece of code that relies on a thread-local variable.
Lack of a Standard Thread-Pooling Library
There are several reference/example implementations of a thread-pool
manager in various texts that describe and prescribe them, but most
developers will choose to implement their own. These reference
implementations are meant only for illustration; they are not product
quality, are copyrighted and nonstandard, and often won't meet your
specific requirements. Implementing a robust thread-pool library is a
complex task that requires extensive tests in a variety of situations,
including different operating systems, multiprocessor machines, extensive
load testing, various application usage scenarios, and thread-pool
management policies. While it seems simple on the surface, a robust
implementation must address such issues as pool-size determination based on
execution environment and application usage, request throttle, job
scheduling, and perhaps even priority scheduling.
When using a new thread per request, the JVM's scheduler ensures that
every runnable thread gets a fair share of the CPU, even if the share
happens to be really small, as in the case where there are simply too many
threads for the given execution environment. Using a size-bounded thread
pool can cause queued requests to be starved. If one of the queued requests
happens to be a producer (in a typical producer-consumer paradigm), it can
lead to a deadlock if all the dispatched requests happen to be consumers
waiting for the producer. Such application dependencies may necessitate
knowledge of the application logic in the thread-pool dispatching decision,
requiring some kind of priority dispatching construct. Priority-based
dispatching opens up another can of worms, exemplified by the Mars
Pathfinder "reset" problem caused by overlooking the classic
priority-inversion problem.
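The producer-consumer case can be sketched as follows, assuming a current JDK. A single thread that runs its queued jobs in order stands in for a pool bounded at one worker, and the consumer waits with a timeout so the demonstration terminates; without the timeout the bounded case would hang forever. The class and method names are hypothetical.

```java
class StarvationDemo {
    private final Object lock = new Object();
    private boolean produced = false; // guarded by lock

    /** Consumer: waits up to timeoutMs for the producer; returns true if the item arrived. */
    boolean consume(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (lock) {
            while (!produced) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) return false; // gave up: the producer never ran
                lock.wait(remaining);
            }
            return true;
        }
    }

    /** Producer: publishes the item and wakes any waiting consumer. */
    void produce() {
        synchronized (lock) {
            produced = true;
            lock.notifyAll();
        }
    }

    /** Returns whether the consumer got its item, on one bounded worker or on a thread per request. */
    static boolean consumerSatisfied(boolean boundedToOneWorker, long timeoutMs) {
        StarvationDemo demo = new StarvationDemo();
        boolean[] got = {false};
        Runnable consumer = () -> {
            try {
                got[0] = demo.consume(timeoutMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Runnable producer = demo::produce;
        try {
            if (boundedToOneWorker) {
                // The consumer was dispatched first; the producer sits in the queue behind it.
                Thread worker = new Thread(() -> {
                    consumer.run();
                    producer.run();
                });
                worker.start();
                worker.join();
            } else {
                Thread c = new Thread(consumer);
                Thread p = new Thread(producer);
                c.start();
                p.start();
                c.join();
                p.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return got[0];
    }

    public static void main(String[] args) {
        System.out.println("bounded pool of one: " + consumerSatisfied(true, 200));
        System.out.println("thread per request:  " + consumerSatisfied(false, 5000));
    }
}
```

The same shape appears with larger pools whenever every worker is occupied by a job that waits on one still sitting in the queue.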
Addressing all the design issues that a robust thread-pool library must
implement is a nontrivial task. This happens to be one area of the system
that can have systemic effects and bring your application to a grinding
halt, unless tested for all potential race conditions and deadlocks,
especially since the memory model in multiprocessor systems is often
nonintuitive. This is no reflection on your abilities as a programmer, but
rather a statement about the inherent complexity of the problem and the
effort involved in getting a robust implementation.
Performance Benefit Myths of Thread Pooling
While the lack of a standard implementation of a thread-pool library
seems like a lame excuse not to use one, it's worth asking why even the
latest versions of J2SE and J2EE don't provide one if using a thread pool is
so critical to performance on server-side applications. The answer lies in
understanding the details of the Java threads implementation. As mentioned
earlier, newer JVMs are optimized for thread creation and destruction and
use a combination of user- and system-level threads to minimize the
overhead. Thread pools do offer potential benefits, but these are
insignificant unless the jobs run by pooled threads are so short that their
running time is comparable to the overhead of thread creation and
destruction. Determining the relative
overhead of thread creation for the job in question and comparing it with
the overhead of thread-pool management must be backed up with real tests in
load conditions. As with many performance-related exercises, the results
often defy common sense.
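As a starting point for such a measurement, the following naive probe compares the average cost of running a job on a freshly created thread with the cost of running the same job inline. It is only a rough sketch, assuming a current JDK: the class name and the sample job are hypothetical, threads are created one at a time rather than under load, and JIT warm-up and timer resolution can distort the numbers. It is no substitute for tests in real load conditions.

```java
class ThreadCostProbe {
    static volatile long sink; // keeps the JIT from discarding the job's work

    /** Hypothetical stand-in for a short request; replace with the real job. */
    static void sampleJob() {
        long sum = 0;
        for (int i = 0; i < 10_000; i++) sum += i;
        sink = sum;
    }

    /** Average nanoseconds to create, start, run, and join one new thread per job. */
    static long avgNanosOnNewThreads(int n, Runnable job) {
        long start = System.nanoTime();
        try {
            for (int i = 0; i < n; i++) {
                Thread t = new Thread(job);
                t.start();
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / n;
    }

    /** Average nanoseconds to run the job directly on the calling thread. */
    static long avgNanosInline(int n, Runnable job) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) job.run();
        return (System.nanoTime() - start) / n;
    }

    public static void main(String[] args) {
        Runnable job = ThreadCostProbe::sampleJob;
        // Warm up both paths before measuring.
        avgNanosInline(2_000, job);
        avgNanosOnNewThreads(200, job);

        long inline = avgNanosInline(2_000, job);
        long threaded = avgNanosOnNewThreads(2_000, job);
        System.out.println("job alone:         " + inline + " ns");
        System.out.println("job on new thread: " + threaded + " ns");
        System.out.println("approx. per-thread overhead: " + (threaded - inline) + " ns");
    }
}
```

If the per-thread overhead is small next to the running time of the real job, pooling has little left to save.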
To Pool or Not to Pool
Before deciding that you need a thread pool for your application because
that little timer thread you need to start for every request seems too much
of an overhead, or deciding that you can churn out a thread-pool library for
your particular usage in a day or so, here are a few things to consider.
How critical is the performance of that portion of the application and
would you make the same decision if it turned out that you needed over a
month to write a robust thread-pool library? Is it acceptable to risk an
application deadlock due to a less-than-robust thread pool implemented in a
few days? Do you have the time to validate and perhaps quantify the savings
achieved when using a pooled thread versus creating a new thread? Do you
have the time to validate correct behavior under heavy load on a
multiprocessor machine, particularly when the boundary conditions on pool
size are exercised? If you're not sure about the implementation details of
some code, such as usage of thread-local variables, will the pooled thread
run it?
In my own experience, a quick and dirty thread-pool implementation for
the job at hand often comes back to bite you. A small perceived performance
gain is probably not worth the risks introduced by a less-than-robust
thread-pool implementation. Not that these concerns don't apply to other
design decisions, but thread pooling falls in the category in which the
risks are much higher and the benefits are often much lower than perceived.
Summary
Top reasons for pooling threads:
- Limiting the number of active threads in the system
- Performance benefits of reducing thread creation overhead
Top reasons for not using thread pools:
- Breaks usage of java.lang.ThreadLocal and
java.lang.InheritableThreadLocal objects
- Lack of a standard and time-tested thread-pool library
- The myths of thread pooling performance benefits
References
Bloch, J. (2001). Effective Java Programming Language Guide.
Addison-Wesley.
Pugh, W., Ed. (2001). "The Java Memory Model." University of Maryland.
March: www.cs.umd.edu/~pugh/java/memoryModel
Hyde, P. (1999). Java Thread Programming. SAMS.
Oaks, S., Wong, H., and Loukides, M. (1999). Java Threads, Second
Edition. O'Reilly.
Shirazi, J. (2000). Java Performance Tuning. O'Reilly.
Sha, L., Rajkumar, R., and Lehoczky, J.P. (1990). "Priority Inheritance
Protocols: An Approach to Real-Time Synchronization." IEEE Transactions on
Computers. September.
Kalinsky, D., and Barr, M. (2002). "Priority Inversion." Embedded
Systems Programming, April, pp. 55-56.
Author Bio
Vishal Goenka is a system architect for the core platform components at
Campus Pipeline. He holds a BS in
computer science from the Indian Institute of Technology, Kanpur (India).
vgoenka@campuspipeline.com