Last month in JDJ (Vol. 6, issue 1) we looked at the advantages of
downloading servlets and JavaServer Pages (JSP) from a repository,
much the same way a browser downloads applets. We described a
simple implementation of this concept based on a service servlet and
a custom class loader. This tool, named JSPservlet, handled servlets
and JSP packaged in JAR archives to minimize the number of
connections and transfers required.
This month I'll show you how to publish an archive, update it
or force its download through a JSPupdate servlet, and extend the
solution to handle resources, HTTP caching, request forwarding, page
inclusion, and JSP beans. The code listings for this article can be
found on the JDJ Web site,
www.JavaDevelopersJournal.com.
Archive Update
This simple JSP, JSPupdate, handles archive publishing and
updates (see Figure 1).
JSPservlet and JSPupdate are packaged in a Web application,
typically in a WAR archive that's described by a web.xml deployment
descriptor (see Listing 1). This archive is a general-purpose agent
responsible for downloading the target presentation archives and
routing requests to their servlets and JSPs.
To publish a new archive you must query the proper agent and
provide the archive name and the remote location where it should be
downloaded from. Simply identify the agent in your browser by its
URL, in this example http://localhost:8080/jdj/JSPupdate, where
http://localhost:8080 identifies the Java server and jdj the agent's
Web application. This displays the form in Figure 1. Fill it in and
click the button to start publishing. Use the same form to change
the archive location or to force a new download. In the latter case
you don't even need to fill in the remote location.
Let's go back to the tool design and walk through the
implementation of JSPupdate (see Figure 2).
Tool Design
The JSPservlet is a special servlet that handles HTTP
requests for a Web application and forwards them to target servlets
and JSPs with the help of the following objects:
- JSPhandler: Manages Web applications and maintains a ClassEntry map.
- ClassEntry: Manages archives and maintains a cache of target objects.
- JSPloader: Maintains a cache of target classes.
Therefore JSPupdate handling implies the following steps:
- Identify the relevant JSPhandler and create
it on the fly if it doesn't exist.
- Find the ClassEntry responsible for the
archive.
- If it doesn't yet exist, create a ClassEntry. In
this case the tool acts as an archive publisher.
- Otherwise, unreference the JSPloader, clear
the target object cache, and instantiate a new
JSPloader.
The first step is implemented in JSPupdate and JSPhandler.
Listing 2 shows the JSPupdate code. I prefer to use the GET mode to
simplify updates by programs or scripts. I'll return to this point in
the next section. In the script, starting on line 27 in Listing 2, I
first get the JAR name you entered in the form and the application
name, contextPath, from the URL. Then I look for the corresponding
JSPhandler in the JSPhandler's HashMap, and finally I invoke the
JSPhandler's update() method. I postpone the explanation of the case
in which the appropriate JSPhandler doesn't exist to the
RequestDispatcher section.
Listing 3 shows the JSPhandler.update() implementation.
Remember the tool minimizes downloads from a central repository and
handles its unavailability thanks to a local archive copy.
JSPhandler.update() first removes this archive cache with
File.delete(). Then, if you filled in the Remote Location field, it
updates the remoteLocProp property and persists it in remoteLocFile.
Finally JSPhandler.update() looks for the appropriate ClassEntry
in the classEntries HashMap and invokes its update() method. If it
doesn't find it, it creates a new ClassEntry and adds it to
classEntries.
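The three steps just described can be sketched roughly as follows. This is not the code of Listing 3: the class name HandlerUpdateSketch, the nested Entry stand-in for ClassEntry, the ".jar" suffix of the cached file, and the use of java.util.Properties keyed by archive name are all assumptions made for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for JSPhandler; not the code of Listing 3.
final class HandlerUpdateSketch {
    // Minimal stand-in for ClassEntry: it only counts update() calls.
    static final class Entry {
        int updates;
        void update() { updates++; }
    }

    final Map<String, Entry> classEntries = new HashMap<>();
    final Properties remoteLocProp = new Properties();
    final File cacheDir;       // assumed: directory holding local archive copies
    final File remoteLocFile;  // assumed: file that persists the remote locations

    HandlerUpdateSketch(File cacheDir, File remoteLocFile) {
        this.cacheDir = cacheDir;
        this.remoteLocFile = remoteLocFile;
    }

    /** Returns true if the archive was newly published, false if it was refreshed. */
    synchronized boolean update(String jarName, String remoteLocation) {
        // 1. Drop the local archive copy so the next load goes to the repository.
        new File(cacheDir, jarName + ".jar").delete();

        // 2. Record and persist the remote location only when one was supplied.
        if (remoteLocation != null && !remoteLocation.isEmpty()) {
            remoteLocProp.setProperty(jarName, remoteLocation);
            try (OutputStream out = new FileOutputStream(remoteLocFile)) {
                remoteLocProp.store(out, "archive remote locations");
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // 3. Refresh the existing entry, or create one (the publishing case).
        Entry entry = classEntries.get(jarName);
        if (entry != null) {
            entry.update();
            return false;
        }
        classEntries.put(jarName, new Entry());
        return true;
    }
}
```

The method is synchronized in this sketch because several requests may reach the handler at once; whether the original code locks at this level is not stated in the article.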
Listing 4 shows the ClassEntry.update() implementation. It
first invokes the destroy() method of all cached target objects. Then
it clears servletObjects, the target object's HashMap, to unreference
them and then unreferences the corresponding JSPloader instance. Next
ClassEntry.update() invokes the garbage collector, which can free the
target and JSPloader objects and also the target classes and static
data. Though the garbage collection can take time, it reduces the
Java server footprint and improves its behavior. I considered the
garbage collection duration to be a minor drawback as I designed
JSPupdate to be invoked outside peak hours. Once the Java Virtual
Machine (JVM) has reclaimed the memory occupied on behalf of the
archive, ClassEntry.update() creates a new JSPloader and a new target
object cache.
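A minimal sketch of that refresh sequence follows, assuming stand-in types: Target plays the role of a cached servlet or JSP, and Loader the role of JSPloader. The names are hypothetical and the real logic is in Listing 4.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ClassEntry; not the code of Listing 4.
final class EntryRefreshSketch {
    // Stand-in for a cached servlet or JSP; only destroy() matters here.
    interface Target { void destroy(); }

    // Stand-in for JSPloader: a fresh instance means freshly loaded classes.
    static final class Loader { }

    Map<String, Target> servletObjects = new HashMap<>();
    Loader loader = new Loader();

    synchronized void update() {
        // Let every cached target release its own resources first.
        List<Target> targets = new ArrayList<>(servletObjects.values());
        for (Target t : targets) {
            t.destroy();
        }
        // Drop all references so targets, loader, and classes become collectable.
        servletObjects.clear();
        loader = null;
        // Only a hint: the JVM decides whether and when to collect.
        System.gc();
        // Start again with a new loader and an empty target cache.
        loader = new Loader();
        servletObjects = new HashMap<>();
    }
}
```

Classes can only be unloaded once nothing references their class loader or any of their instances, which is why the cache is cleared and the loader reference dropped before the collector is invoked.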
Scripted Update
JSPupdate uses the GET mode. To request the update of an
application whose URL is www.iamakishirofan.com/gunnm for a JAR named
gally stored in a repository located at http://myserver, you simply
need to use the URL
www.iamakishirofan.com/gunnm/JSPupdate?jarName=gally&
remoteLocation=http%3A%2F%2Fmyserver&Submit=JSPupdate.
Listing 5 shows a Java class, UpdateClient, you can use to
update an archive from the command line or from a script. To update
the application above, invoke UpdateClient either with java
JSPservletPkg/UpdateClient http://www.iamakishirofan.com/gunnm gally
http://myserver if you want to publish or update the remote location
or with java JSPservletPkg/UpdateClient
http://www.iamakishirofan.com/gunnm gally if you simply need to force
a download.
UpdateClient first builds a URL string from its
parameters. To convert the remote location to the
application/x-www-form-urlencoded MIME format required in a URL query
string, UpdateClient simply uses URLEncoder.encode(). Then it creates
a URL and opens and reads a
stream, which it parses to check the Java server response. Use the
exit code in your script to handle error cases; the most common one
is server unavailability.
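The URL-building step might look like the sketch below. The parameter names jarName, remoteLocation, and Submit come from the example URL above; the class name UpdateUrlSketch is hypothetical, and leaving out remoteLocation entirely when only forcing a download is an assumption (Listing 5 may send it empty instead).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the article's UpdateClient is in Listing 5.
final class UpdateUrlSketch {
    /** Builds the JSPupdate query URL; pass null as remoteLocation to force a download. */
    static String build(String appUrl, String jarName, String remoteLocation) {
        StringBuilder sb = new StringBuilder(appUrl);
        sb.append("/JSPupdate?jarName=").append(encode(jarName));
        if (remoteLocation != null) {
            sb.append("&remoteLocation=").append(encode(remoteLocation));
        }
        return sb.append("&Submit=JSPupdate").toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            build("http://www.iamakishirofan.com/gunnm", "gally", "http://myserver"));
    }
}
```

Run with the arguments shown in main, this prints the same query string as the example URL given earlier, with the colon and slashes of the remote location escaped as %3A and %2F.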
Resources
Consider the common case in which a JSP or servlet refers to
an image with a URL that's relative to the current path. Since
JSPservlet is configured in the Web application deployment descriptor
to handle all its requests, it also has to process image requests.
This case raises three issues:
- How to detect an image request
- Where to download the image from
- Where to handle the request
I chose to delegate images and other resource handling to
JSPloader because, as we'll see later in the beans section, it has to
address other resource needs anyway.
Where should we download the resources from? Should we cache
them? These aren't trivial issues, as an image is much larger than a
JSP or servlet. My choice is to support resources that are included
in the archive file or stored in the same remote location as the
archive, and to cache resources in memory.
Listing 6 shows the resource handling in JSPservlet.
JSPservlet detects images and other resources such as HTML files
by their URL extension. It also sets the content type according to
the URL extension. Then it uses JSPhandler.getResourceAsStream() to
get an input stream on the resource from JSPloader. JSPhandler simply
forwards the request to the appropriate ClassEntry, which invokes
JSPloader.getResourceAsStream(). If getResourceAsStream() doesn't
find the resource, JSPservlet invokes
HttpServletResponse.sendError(SC_NOT_FOUND), which builds an HTTP
response with a 404 status, which indicates that the requested
resource is not available. Otherwise JSPservlet reads the input
stream and rewrites it on the response output stream.
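To make the detection step concrete, here is a small sketch of extension-based content typing and of the stream copy. The extension table is illustrative only; the extensions actually handled in Listing 6 may differ, and the class name ResourceTypeSketch is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ResourceTypeSketch {
    // Illustrative table; not the list used by the article's JSPservlet.
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("gif", "image/gif");
        TYPES.put("jpg", "image/jpeg");
        TYPES.put("jpeg", "image/jpeg");
        TYPES.put("png", "image/png");
        TYPES.put("html", "text/html");
        TYPES.put("css", "text/css");
    }

    /** Returns the content type for a resource path, or null if it is not a known resource. */
    static String contentTypeFor(String path) {
        int dot = path.lastIndexOf('.');
        // A dot that belongs to a directory name is not an extension.
        if (dot < 0 || dot < path.lastIndexOf('/')) {
            return null;
        }
        return TYPES.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    /** Copies the resource stream to the response stream and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

A null result from contentTypeFor() would mean the request is routed to a servlet or JSP as usual rather than treated as a resource.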
To support resources embedded in an archive, I modified last
month's JSPloader.parseStream() method. Remember this method is
invoked during JSPloader construction to parse the archive content
that's read from the local archive cache or from its remote location.
In the latter case it's also responsible for storing the archive in
the local cache. However, the modification is minimal, as you can see
in Listing 7. JSPloader maintains a resources HashMap that acts as a
resource memory cache the way classes act as a class memory cache.
parseStream stores a resource in a byte array in resources, instead
of storing a class in classes.
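As an illustration of that split, the sketch below reads an archive with java.util.jar.JarInputStream and files each entry under classes or resources. It leaves out the local-cache write that the real parseStream() also performs, and the ".class" test and key formats are assumptions rather than the code of Listing 7.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

final class ArchiveParseSketch {
    final Map<String, byte[]> classes = new HashMap<>();    // keyed by class name
    final Map<String, byte[]> resources = new HashMap<>();  // keyed by entry path

    void parseStream(InputStream archive) throws IOException {
        try (JarInputStream jar = new JarInputStream(archive)) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                byte[] bytes = readEntry(jar);
                String name = entry.getName();
                if (name.endsWith(".class")) {
                    // "a/b/C.class" becomes "a.b.C"
                    String className =
                        name.substring(0, name.length() - ".class".length()).replace('/', '.');
                    classes.put(className, bytes);
                } else {
                    resources.put(name, bytes);
                }
            }
        }
    }

    // Reads the current entry to its end; JarInputStream returns -1 at the entry boundary.
    private static byte[] readEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```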
Listing 8 shows the JSPloader.getResourceAsStream()
implementation. It first tries to retrieve the resource from the
resources memory cache. If it doesn't find it, it tries to download
the resource from the same location as the archive with
URL(remoteLoc).openStream(). Then it stores it in resources. If a
resource is stored in the archive, it's always served from resources.
A resource that must be downloaded is downloaded only once, then
served from resources. If the resource is neither in the archive nor
remotely available, getResourceAsStream() delegates the request to
getResourceForward() in order to support local Java server resources.
getResourceForward() first tries to find the resource in
JSPservlet's Web application using the getResourceAsStream() method
of JSPservlet's class loader, then tries to find it elsewhere using
the getResourceAsStream() method of the JSPloader parent class loader.
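The lookup order can be summarized in a short sketch. To keep it runnable without a network, the remote download is represented by a hypothetical remoteFetch function that returns null when the resource is not available; in the article this step uses URL(remoteLoc).openStream(). The class and field names are assumptions, not those of Listing 8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class ResourceLookupSketch {
    private final Map<String, byte[]> resources = new ConcurrentHashMap<>();
    private final Function<String, byte[]> remoteFetch; // hypothetical: null if not found remotely
    private final ClassLoader webAppLoader;             // loader of the agent's Web application
    private final ClassLoader parentLoader;             // parent of the custom loader

    ResourceLookupSketch(Function<String, byte[]> remoteFetch,
                         ClassLoader webAppLoader, ClassLoader parentLoader) {
        this.remoteFetch = remoteFetch;
        this.webAppLoader = webAppLoader;
        this.parentLoader = parentLoader;
    }

    /** Simulates a resource that was found in the archive at parse time. */
    void preload(String name, byte[] bytes) {
        resources.put(name, bytes);
    }

    InputStream getResourceAsStream(String name) {
        // 1. Memory cache: archive entries and earlier downloads.
        byte[] bytes = resources.get(name);
        if (bytes == null) {
            // 2. Same remote location as the archive; cache the result on success.
            bytes = remoteFetch.apply(name);
            if (bytes != null) {
                resources.put(name, bytes);
            }
        }
        if (bytes != null) {
            return new ByteArrayInputStream(bytes);
        }
        // 3. Local Java server resources.
        return getResourceForward(name);
    }

    private InputStream getResourceForward(String name) {
        InputStream in = webAppLoader.getResourceAsStream(name);
        return in != null ? in : parentLoader.getResourceAsStream(name);
    }
}
```

Note that in this sketch a resource that is missing remotely is looked up remotely again on every request; caching negative results would be a possible refinement that the article does not discuss.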
Proxy and Browser Caching
Figure 3 displays a typical HTTP caching scenario with three
actor types: browsers, a proxy, and an HTTP server. The first browser
requests a URL that the proxy can't find in its cache, so the proxy
requests the URL from the HTTP server with an HTTP GET. The HTTP
server returns a response, which also includes Expires, ETag, and
Last-Modified header fields. The Expires field gives the date/time
after which the response is considered stale, the ETag field provides
an entity tag value that can be used for comparisons, and the
Last-Modified field indicates the date and time the server believes
the data was last modified.
The proxy stores the response in a cache and returns it to
the browser. Then a second browser requests the same URL. The proxy
finds the response isn't stale so it returns it to the browser
without contacting the HTTP server. Next, a third browser requests
the same URL and this time the response is stale but still in
cache, so the proxy asks the HTTP server if the response is still
valid with a conditional HTTP GET (a GET with an If-None-Match or
If-Modified-Since field). The HTTP server checks if the entity tag is
the same in the case of If-None-Match, or if the Last-Modified value
hasn't changed in the case of If-Modified-Since. If so, it sends a
304 Not Modified response without a body. If the browser requested an
image, the image is not transferred. If it requested a dynamic page
that required RDBMS access or heavy computation, this processing is
not needed.
The HTTP server sends an updated Expires value in its
Not Modified response, so if a fourth browser requests the same URL
before the updated date/time, the proxy will again serve the response
without involving the HTTP server.
I described a proxy's behavior for clarity, but a browser
also caches the responses it receives and behaves exactly the same
way regarding the header fields presented in Figure 3. That's why
I had to involve four different browsers in the scenario. As a
consequence, an HTTP server can drive both proxy and browser caching
with the same code.
To implement that mechanism for resource requests, I had to
make two decisions:
- Where should I take the Last-Modified date/time from?
- Should I implement Expires?
Remember the JSPloader.getResourceAsStream() implementation?
It first tries to retrieve the resource from the archive, then from
the same location as the archive, and finally by asking the Java
server class loader.
When the resource is stored in the archive, it
picks up the Last-Modified date/time from the archive entry with
JarEntry.getTime(). When the resource is stored in the same location
as the archive, it uses a URLConnection object to download it.
URLConnection acts as a browser, so it has access to HTTP headers. It
even provides helper methods for the most common headers, such as
URLConnection.getLastModified() for Last-Modified, which is invoked by
JSPloader.getResourceAsStream(). In the last case, where
JSPloader.getResourceAsStream() asks the Java server for the
resource, I use the archive cache creation time. The rationale is
this sort of resource is typically stored on a Java server and
therefore cheap to retrieve.
The bottom line is:
- If an archive or downloaded resource hasn't changed,
JSPservlet will return Not Modified even after an
archive update.
- For a resource retrieved by the Java server class
loader, JSPservlet will return a full response at the
first request after an archive update.
When the proxy receives a response containing ETag or
Last-Modified, it can set an internal expiration value. However, this
behavior isn't mandatory and can vary, so I prefer to implement
Expires and let you set it in an additional initialization parameter,
expiration, with a five-second default. The HTTP 1.1 specification
allows you to go up to one year, but its optimal value depends on
your configuration. The higher it is, the more time it will take to
refresh caches after an update. If your browsers are on the same LAN
as the Java server, don't worry about round-trip delays.
Let's go back to Listing 6 to look at the implementation details.
JSPservlet checks if the HTTP request was conditional. More
precisely, it retrieves the value of the If-None-Match and
If-Modified-Since header fields. If they're set, it checks,
respectively, if the client's entity tag and last-modified date/time
are still the same as the server's. If they are, JSPservlet returns an
HTTP response with a status Not Modified (304), using
HttpServletResponse.sendError(SC_NOT_MODIFIED). This response
includes an Expires field set with:
HttpServletResponse.setDateHeader("Expires",
System.currentTimeMillis() + jh.expiration * 1000)
setDateHeader is another convenient helper method that
simplifies setting a date header field. It takes two parameters, the
name of the field and the time in milliseconds since the epoch
(January 1, 1970). The value is computed from JSPhandler's
expiration, which contains the expiration initialization parameter
in seconds.
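The decision itself needs no servlet classes, so it can be sketched as plain Java. The class name ConditionalGetSketch is hypothetical. Two choices here are assumptions rather than facts about Listing 6: the entity tag is checked first when both fields are present (as the HTTP/1.1 caching rules recommend), and dates are compared at one-second resolution because HTTP dates carry no milliseconds. The value -1 for an absent If-Modified-Since mirrors what HttpServletRequest.getDateHeader() returns.

```java
final class ConditionalGetSketch {
    static final int SC_OK = 200;
    static final int SC_NOT_MODIFIED = 304;

    /**
     * @param ifNoneMatch     value of If-None-Match, or null when absent
     * @param ifModifiedSince value of If-Modified-Since in epoch millis, or -1 when absent
     * @param etag            entity tag the server would send now
     * @param lastModified    last-modified time known to the server, in epoch millis
     */
    static int status(String ifNoneMatch, long ifModifiedSince, String etag, long lastModified) {
        if (ifNoneMatch != null) {
            return ifNoneMatch.equals(etag) ? SC_NOT_MODIFIED : SC_OK;
        }
        if (ifModifiedSince >= 0) {
            // Compare whole seconds: the client echoes a date without milliseconds.
            return lastModified / 1000 == ifModifiedSince / 1000 ? SC_NOT_MODIFIED : SC_OK;
        }
        return SC_OK; // unconditional request: send the full resource
    }

    /** Expires value in epoch millis for an expiration given in seconds. */
    static long expires(long nowMillis, int expirationSeconds) {
        // 1000L forces long arithmetic; int arithmetic would overflow near one year.
        return nowMillis + expirationSeconds * 1000L;
    }
}
```

If expiration is held in an int, multiplying it by the int literal 1000 overflows for values above roughly 24 days, so a long multiplier is the safer choice when approaching the one-year limit mentioned earlier.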
If the HTTP request is not conditional or if cache entries
are stale, JSPservlet sends the resource. Before that, it sets the
Date, Cache-Control, Last-Modified, ETag, and Expires header fields.
Date represents the date and time the message originated. JSPservlet
builds this Date the same way as Expires. Cache-Control: public
indicates that the response may be cached by any cache. I already
covered Last-Modified and ETag. Both contain the date and time
extracted by JSPloader. Last-Modified handling is slightly more
complex, as JSPservlet formats it in the RFC 1123 format - the HTTP
preferred date format - using a java.text.SimpleDateFormat.
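For reference, RFC 1123 formatting with SimpleDateFormat can be done as in the sketch below. The pattern, the US locale, and the GMT time zone are what the format requires; the class name HttpDateSketch and the etag() helper, which simply quotes the timestamp, are assumptions, since the article does not show the exact ETag layout.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class HttpDateSketch {
    /** Formats epoch millis as an RFC 1123 date, for example for Last-Modified. */
    static String format(long epochMillis) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat fmt =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(new Date(epochMillis));
    }

    /** Assumed ETag layout: the timestamp as a quoted string. */
    static String etag(long epochMillis) {
        return "\"" + epochMillis + "\"";
    }

    public static void main(String[] args) {
        System.out.println(format(0L)); // prints "Thu, 01 Jan 1970 00:00:00 GMT"
    }
}
```

Fixing the locale matters: with the server's default locale, day and month names could come out in another language and caches would fail to parse the header.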
RequestDispatcher
When building a Web application, it's common to forward a
request to another servlet or to include the output of another
servlet in the response. The servlet specification defines the
RequestDispatcher interface to accomplish this. The support of this
feature implies some modifications in the JSPservlet code.
First let's look at why and how the JSPservlet is involved.
RequestDispatcher lets you forward to another servlet or include the
output of a servlet, the forwarded or included servlet being in the
same Web application. Since JSPservlet handles all requests toward a
Web application, it's invoked.
The first issue is related to the include specification. The
included servlet has access to the including servlet's request
object. So when JSPservlet is invoked on behalf of an included
servlet, the request path doesn't contain its path but the path of
the servlet that included it. This is annoying since JSPservlet uses
this path to identify the archive and the class to forward the
request to.
Fortunately it's possible to know the path by which a
servlet was invoked thanks to special request attributes described by
the Java Servlet Specification, v2.2. For instance, I can get the
included servlet pathInfo and extract its archive and servlet names
with:
String pathInfo =
(String)request.getAttribute("javax.servlet.include.path_info")
If the attribute is not defined, it means the servlet wasn't
included, so I can safely retrieve pathInfo with
request.getPathInfo().
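The fallback can be isolated in a few lines. To stay runnable without the servlet API, this sketch takes the attribute lookup as a function; in a servlet you would pass request::getAttribute and request.getPathInfo(). The class name IncludePathSketch is hypothetical.

```java
import java.util.function.Function;

final class IncludePathSketch {
    // Attribute name defined by the Java Servlet Specification for included requests.
    static final String INCLUDE_PATH_INFO = "javax.servlet.include.path_info";

    /**
     * @param getAttribute    attribute lookup, e.g. request::getAttribute
     * @param requestPathInfo result of request.getPathInfo()
     * @return the path info of the included servlet if there is one,
     *         otherwise the path info of the request itself
     */
    static String effectivePathInfo(Function<String, Object> getAttribute,
                                    String requestPathInfo) {
        Object included = getAttribute.apply(INCLUDE_PATH_INFO);
        return included != null ? (String) included : requestPathInfo;
    }
}
```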
A larger issue is related to the context root. You can get a
RequestDispatcher with
ServletContext.getRequestDispatcher("/garden/header.html"). The
"/garden/header.html" path is relative to the root of the Web
application, which doesn't contain the archive name. So JSPservlet
won't be able to handle the request. There are two solutions to this
problem. The fully standard one is to use relative paths with
ServletRequest.getRequestDispatcher(). Since we're using
ServletRequest, the path can be relative to the current request. It
addresses the common case where the included servlet is located at
the same place as the including one. If it's not the case, you must
add the archive name, for instance:
ServletContext.getRequestDispatcher(jarName + "/garden/header.html")
The problem with this solution is it breaks the independence
between development and deployment (where archive names are chosen).
In the complete implementation I provide a
JSPhandler.getJAR(ServletRequest) static method to return the current
archive name.
You can use it without breaking your servlet portability if you use
reflection as shown in Listing 9.
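The idea behind the reflective call is that the servlet never links against the tool's classes, so it still loads when deployed without the tool. The sketch below is not Listing 9: the class name ArchiveNameSketch is hypothetical, the handler class name is passed in as a string (JSPservletPkg.JSPhandler would be a guess at the real one), and FakeHandler exists only so the example runs on its own.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class ArchiveNameSketch {
    /** Demonstration stand-in for the tool's handler class. */
    static final class FakeHandler {
        public static String getJAR(Object request) {
            return "/gally";
        }
    }

    /**
     * Calls a public static getJAR(request) on the named class if it is available,
     * and returns the fallback otherwise.
     */
    static String currentJar(String handlerClassName, Object request, String fallback) {
        try {
            Class<?> handler = Class.forName(handlerClassName);
            for (Method m : handler.getMethods()) {
                if (m.getName().equals("getJAR")
                        && Modifier.isStatic(m.getModifiers())
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isInstance(request)) {
                    Object jar = m.invoke(null, request);
                    return jar != null ? jar.toString() : fallback;
                }
            }
        } catch (ReflectiveOperationException | LinkageError e) {
            // Tool absent or unusable: behave like a plain servlet.
        }
        return fallback;
    }

    public static void main(String[] args) {
        String present = currentJar(FakeHandler.class.getName(), new Object(), "");
        String absent = currentJar("no.such.Handler", new Object(), "");
        System.out.println(present + "/garden/header.html");
        System.out.println(absent + "/garden/header.html");
    }
}
```

With an empty fallback, the same dispatcher path works both under the tool (archive name prepended) and in a conventional deployment (path left relative to the Web application root).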
I considered and rejected a fully transparent method.
Remember the including servlet is invoked through JSPservlet. I could
implement a special ServletContext that delegates all calls to the
JSPservlet ServletContext except for getRequestDispatcher(), in which
it would transparently add the current archive name. I rejected this
solution since it would prevent invoking a servlet hosted in a
different archive. However, if your requirements are different from
mine, you
can implement this solution.
Now we can come back to the pending JSPupdate issue: update
handling when the appropriate JSPhandler doesn't exist yet.
The problem's origin lies in the JSPupdate and JSPservlet deployment
descriptor (see Listing 1). JSPupdate can't be included in an archive
because it would be unable to download an initial archive, and
JSPhandler relies on JSPservlet init-params to initialize. JSPupdate
calls JSPservlet when it needs to create a JSPhandler and uses a
RequestDispatcher to achieve this.
Let's revisit the JSPupdate code (see Listing 2). On line 39
you see that when the appropriate JSPhandler doesn't exist, JSPupdate
creates a RequestDispatcher with
getServletContext().getRequestDispatcher("/JSPservlet") and uses it
to include JSPservlet.
JSPservlet.service() must be modified to include the code in
Listing 10 in order to identify and process updates. This code first
retrieves the JSPservlet context path using the
javax.servlet.include.context_path attribute, since the JSPservlet is
included. Then it invokes getHandler(), which will create the
appropriate JSPhandler. Next, the implementation detects that it's
called through a JSPupdate include by checking the included servlet
name returned by request.getServletPath(). Finally it retrieves
the archive name and remote location from the request and invokes
JSPhandler.update(), which calls ClassEntry.update().
Beans
JSP Specification 1.1 provides two mechanisms to integrate
Java logic: beans and the more recent tag extensions. A tag extension
is arguably more sophisticated and complex. However, it requires no
special handling because tag extensions are converted to Java code at
JSP compilation time. Therefore we don't need to distribute tag
library descriptors and can retrieve tag handler classes from the
archive as usual. Though beans are simpler, they can raise an issue
in the tool context.
The compiled JSP should create a bean with
Beans.instantiate(getClassLoader(), beanName). This static method
lets you specify which class loader to use, and its beanName can
indicate either a serialized object or a class. For example, given a
beanName of "x.y", Beans.instantiate would first try to read a
serialized object from the resource "x/y.ser"; if that failed, it
would try to load the class "x.y" and create an instance of that
class. To fully support beans, I need to allow deserializing from
archives.
My code supports the Beans.instantiate(getClassLoader(),
beanName) way to create the bean (Tomcat code) because target JSPs
are loaded by JSPloader. Therefore getClassLoader() returns the
relevant JSPloader instance, whose getResourceAsStream() is invoked
to get the serialized bean.
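The interaction between Beans.instantiate() and a custom class loader can be tried in isolation. In the sketch below, BeanLoaderSketch is a hypothetical stand-in for JSPloader that serves ".ser" resources from memory; the bean name "x.y" follows the example above, and an ArrayList is used as the serialized object purely for convenience.

```java
import java.beans.Beans;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JSPloader: it only serves resources from memory.
final class BeanLoaderSketch extends ClassLoader {
    private final Map<String, byte[]> resources = new HashMap<>();

    BeanLoaderSketch(ClassLoader parent) {
        super(parent);
    }

    /** Stores a serialized bean under the resource name Beans.instantiate() will ask for. */
    void addSerialized(String beanName, Object bean) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bean);
        }
        resources.put(beanName.replace('.', '/') + ".ser", bytes.toByteArray());
    }

    @Override
    public InputStream getResourceAsStream(String name) {
        byte[] data = resources.get(name);
        return data != null ? new ByteArrayInputStream(data) : super.getResourceAsStream(name);
    }

    public static void main(String[] args) throws Exception {
        BeanLoaderSketch loader = new BeanLoaderSketch(BeanLoaderSketch.class.getClassLoader());
        ArrayList<String> saved = new ArrayList<>();
        saved.add("restored from x/y.ser");
        loader.addSerialized("x.y", saved);

        // Found as a serialized resource through the loader's getResourceAsStream().
        System.out.println(Beans.instantiate(loader, "x.y"));
        // No java/util/ArrayList.ser exists, so the class is loaded and instantiated.
        System.out.println(Beans.instantiate(loader, "java.util.ArrayList"));
    }
}
```

The first call returns the deserialized list and the second a new empty one, which shows why the loader's getResourceAsStream() has to cover archive content for serialized beans to work.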
Summary
It's not difficult to develop a solution that's portable
across Java application servers and:
- Dynamically downloads Web applications from one
or many repositories.
- Commands downloads or updates from anywhere
using a browser or a command that can be started
from a scheduling tool.
- Supports Web applications' JSPs and servlets according
to JSP Specification v1.1 and Java Servlet
Specification v2.2.
A Java server can act as a browser that downloads applets on
demand and therefore is administration-free. However, there's a
difference. There's nothing to prevent a large number of browsers
from downloading the same applet at the same time, collapsing the
network. As Java servers download only when commanded, they avoid
this problem.
Author Bio
Alexis Grandemange is an architect and system designer. A Java
programmer since 1996 with a background in C++ and COM, his main
interest is J2EE with a focus on design, optimization, and
performance issues.
agrandemange@amadeus.net.