Sir Clive Sinclair had a dream: everyone should own a computer. In
the early '80s, this was quite an ambitious, almost foolhardy thing
to say, given that the cost of computing machinery was well beyond
the grasp of individuals. Despite the hurdles, Sinclair Research Ltd.
produced one of the most popular personal computers in Great Britain
and, later on, in Europe: the Sinclair ZX Spectrum.
As a child I owned one of these machines, and because of it I
am, happily, a computer scientist. Although the Spectrum is no longer
produced, my nostalgia for this machine never quite disappeared.
Recently, I embarked on a project to write an "emulator" (in
Java) that could run some favorites from the prolific library of
Spectrum software. This article is about the challenges encountered
and my accomplishments during this adventure. The source code can be
downloaded from below.
What Is an Emulator?
According to Merriam-Webster, an emulator is "hardware or
software that permits programs written for one computer to be run on
another usually newer computer." Emulators are in relatively common
use today, although they are invisible, which is a powerful testimonial
to the fact that they do their job well. The Java programming language
is popular in large part because it can run on many different
computers. This is achieved via an emulator (a "virtual machine")
that allows Java programs to be executed on different platforms.
Emulators are particularly satisfying to write. Building an
emulator is challenging, since it requires an intimate understanding
of both the emulated machine and the host machine in order to bridge
them together. Emulators are hard to get right since they require
extensive attention to detail: every single facility of the emulated
environment, whether it's a CPU instruction or interactions between
two components, must work as per spec, or the program running on the
emulator will likely fail. Once an emulator is completed, it's
possible to revive old programs in such a way that they're completely
oblivious to their new surroundings. It's almost like traveling back
in time.
Emulation in Java
Java has matured tremendously in the years since I first came
into contact with it. Since writing emulators has been an on-and-off
hobby of mine, I thought it would be an interesting experiment to see
what it would take to implement one in Java.
The first two challenges I had to think about before starting
down this path were performance and timing. Performance is critical
to the success of an emulator since no matter how accurate, if an
emulator runs programs significantly slower than the original
hardware, no one would want to use it. The Java HotSpot engine has
recently advanced the performance of Java programs by leaps and
bounds, yet the risk still remains: for every instruction of an
original program, the emulator may have to execute tens or hundreds
of instructions. Timing is also critical since Java does not (yet)
have facilities for running programs in real time. Most notably,
garbage collection may well interfere with the proper execution of
the emulated program and cause significant pauses, making the
programs appear jerky onscreen.
On the flip side, Java provides an excellent environment for
writing code since it has a powerful and expressive API. The emulator
I'll be describing is loosely based on an open-source emulator
(implemented in C) that I worked on with a few other people a number
of years ago. The Java source is an order of magnitude more compact
and more elegant than the original C source. The Java AWT API
significantly reduced the burden of implementing the screen emulation
(one of the harder aspects of the original emulator). Last, but not
least, the emulator can run on any Java Virtual Machine.
The Sinclair ZX Spectrum
The machine I wrote the emulator for is the Sinclair ZX
Spectrum: one of the first personal computers. For a bit of history
on the machines and the man behind them, visit
www.nvg.ntnu.no/sinclair/.
The Java emulator, called "JZX", is loosely based on a Linux
native emulator called "XZX" that I worked on a number of years ago
(www.zx-spectrum.net/xzx/).
The Spectrum is a remarkably simple machine by today's standards.
- CPU (Z80 @ 3.5MHz), I/O controller (ULA)
- 16K ROM (BASIC), 48K RAM
- Integrated keyboard, TV decoder, loudspeaker
- IN/OUT ports (microphone and headphone), expansion slot
Despite its simplicity, the Spectrum was fully capable of
running sophisticated software such as games, Pascal and C compilers,
databases, and word processors.
Architecture Overview
The Spectrum is assembled as in Figure 1 and operates as follows:
- The CPU fetches instructions and executes them.
  - It passes I/O instructions to the ULA.
  - It handles interrupts from the ULA via interrupt routines.
- The ULA handles the interaction with the outside world.
  - It scans the video RAM to produce the TV frame.
  - It interrupts the CPU at the end of a TV frame.
  - It decodes the I/O ports to read/write the peripherals.
The software architecture resembles Figure 1, but is
different in a few key aspects (see Figure 2).
The emulator emulates two distinct machines: the original 48K
Spectrum model and the subsequent 128K one. Since the machines are
similar, 95% of the emulator code base is shared. Not all peripherals
from the original Spectrum are emulated: the I/O ports and the
speaker are missing. The I/O ports are not present since the original
hardware needed them for loading and saving software onto magnetic
tape, while the emulator uses files instead. The speaker is not
present since it would be almost impossible to emulate it correctly
in Java: the sound in the original Spectrum was produced by turning
the speaker on and off rapidly to create the appropriate frequency.
Every Java class that represents a Spectrum component is
derived from BaseComponent (see Listing 1). BaseComponent and
BaseSpectrum form an (extended) Composite pattern, where BaseSpectrum
is the Composite and every BaseComponent has a reference to its
parent. Any BaseComponent can access its parent and from there, any
other sibling. This simulates the "bus" of the original machine. For
example, whenever a byte is written into the video RAM, the
BaseMemory object can retrieve its BaseSpectrum parent from which it
can retrieve the appropriate BaseScreen object and subsequently
update the current screen frame.
The BaseComponent class also imposes the contract for the
major "lifetime events" of the emulator: startup, reset, shutdown,
and load.
- init(): Initializes the component with the parent and the logger object (used for logging error and debug messages)
- terminate(): Terminates the component and indicates that all its state should be discarded
- reset(): Notifies the component to reset all of its state and be ready to start fresh
- load(): Participates with BaseLoader in a simplified Visitor pattern used for loading relevant state into the object (the Accept() and Operation() methods are merged into load() since the Visitor is merely a passive data container)
When the top-level BaseSpectrum object is created, it calls
init() on itself and all its children, followed by reset(). When the
emulator is shut down, it calls terminate() in the same way.
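The parent/child wiring and the lifetime calls described above can be sketched roughly as follows. This is a minimal illustration, not the JZX source: the SketchComponent, SketchMachine, SketchMemory, and SketchScreen names are hypothetical, the logger argument is left out, and memory is a flat 64K array with no ROM protection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BaseComponent: every part knows its parent.
abstract class SketchComponent {
    protected SketchMachine parent;

    void init(SketchMachine parent) { this.parent = parent; }
    void terminate() { this.parent = null; }
    abstract void reset();
}

// Hypothetical stand-in for BaseSpectrum: the composite that owns the parts.
class SketchMachine {
    final SketchMemory memory = new SketchMemory();
    final SketchScreen screen = new SketchScreen();
    private final List<SketchComponent> children = new ArrayList<>();

    SketchMachine() {
        children.add(memory);
        children.add(screen);
        // Startup: init() every child, then reset() every child.
        for (SketchComponent c : children) c.init(this);
        for (SketchComponent c : children) c.reset();
    }

    void shutdown() {
        for (SketchComponent c : children) c.terminate();
    }
}

class SketchScreen extends SketchComponent {
    int notifications;   // how many video RAM writes were reported

    @Override
    void reset() { notifications = 0; }

    void videoByteChanged(int addr16) { notifications++; }
}

class SketchMemory extends SketchComponent {
    private final byte[] ram = new byte[0x10000];

    @Override
    void reset() { java.util.Arrays.fill(ram, (byte) 0); }

    void write8(int addr16, int val8) {
        ram[addr16] = (byte) val8;
        // 0x4000-0x5AFF holds the 48K Spectrum pixel and colour bytes.
        if (addr16 >= 0x4000 && addr16 < 0x5B00) {
            // Reach the sibling through the parent, as if over a "bus".
            parent.screen.videoByteChanged(addr16);
        }
    }
}
```

In this sketch, a write to 0x4000 reaches the screen object through the parent, while a write to 0x8000 does not involve the screen at all.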
Emulation Challenges
Main Loop
The main loop of the emulator resides in the BaseSpectrum
class (see Listing 2).
Every instruction executed by the Z80 CPU takes a certain
amount of time, which is a multiple of one "state" (also known as
"T-state"). The ULA renders one line of the screen every 224 CPU
states (STATES-PER-LINE), so it's essential to keep track of how many
states pass every time the CPU decodes and executes one instruction.
For efficiency purposes, the screen is not updated every time a new
line becomes available, but rather once every TV-LINES lines, that is,
once per complete TV frame (312 lines on the 48K model). At 50 frames
per second, this means one frame every 20 milliseconds.
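The state and line bookkeeping can be checked with a little arithmetic. The sketch below is not taken from the emulator: FrameTimingSketch is a hypothetical name, the "CPU" is a stand-in that spends a fixed number of states per instruction, and TV_LINES is set to 312, the number of scan lines in a 48K Spectrum frame, which is my assumption about the value the main loop intends.

```java
// Hypothetical line/frame bookkeeping; the constants assume the 48K model.
class FrameTimingSketch {
    static final int STATES_PER_LINE = 224;
    static final int TV_LINES = 312;
    static final double CPU_HZ = 3_500_000.0;

    // Runs 'instructions' fake instructions of 'statesEach' T-states each
    // and returns how many complete frames elapsed.
    static int countFrames(long instructions, int statesEach) {
        int states = 0, lines = 0, frames = 0;
        for (long i = 0; i < instructions; i++) {
            states += statesEach;              // "execute next CPU instruction"
            if (states >= STATES_PER_LINE) {
                states -= STATES_PER_LINE;     // carry the remainder over
                if (++lines == TV_LINES) {
                    lines = 0;
                    frames++;                  // refresh screen, await interrupt
                }
            }
        }
        return frames;
    }

    // Duration of one frame in milliseconds at the nominal clock speed.
    static double frameMillis() {
        return 1000.0 * STATES_PER_LINE * TV_LINES / CPU_HZ;
    }
}
```

With these constants a frame is 224 * 312 = 69,888 states, or just under 20ms at 3.5MHz, which matches the 50Hz interrupt rate.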
The CPU interrupts are simulated by means of wait()ing on an
external Thread object, which simply sets a public field to "true"
and calls notifyAll() every 20ms, at which point the main emulator
loop notifies the CPU of the interrupt. The reason for wait()ing on
the interrupt is that the CPU emulation runs at the speed of the host
machine; no attempt is made to slow it down in order to run at the
original Spectrum speed, except whenever a screen frame is rendered.
This turns out to be entirely appropriate and makes the emulation
both fast and believable. Note that it's possible for the main
emulation loop to "skip" one interrupt (if, for example, refreshing
the screen takes too long). This is a small risk that arises only on
slower machines, and the effect would not be readily visible.
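The clock-thread handshake might look roughly like this. It is a sketch under assumptions: InterruptClock and its method names are hypothetical, and where the emulator is described as setting a public field, this version keeps the flag private behind synchronized methods.

```java
// Hypothetical 50Hz "frame interrupt" source built on wait()/notifyAll().
class InterruptClock implements Runnable {
    private final long periodMillis;
    private boolean pending;                  // set by the clock, cleared by the emulator
    private volatile boolean running = true;

    InterruptClock(long periodMillis) { this.periodMillis = periodMillis; }

    @Override
    public void run() {
        try {
            while (running) {
                Thread.sleep(periodMillis);
                synchronized (this) {
                    pending = true;           // a tick nobody collected is simply lost
                    notifyAll();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Called by the main loop after refreshing the screen. Blocks until the
    // next tick, which throttles the emulation to one frame per period.
    // Returns false if the waiting thread was interrupted.
    synchronized boolean awaitInterrupt() {
        try {
            while (!pending) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        pending = false;
        return true;
    }

    void stop() { running = false; }
}
```

The main loop would start the clock on a daemon thread with a 20ms period and call awaitInterrupt() once per frame. If a frame takes longer than the period, the flag is already set when the loop arrives and at most one tick is collapsed, which is the "skipped" interrupt mentioned above.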
Memory Emulation
The Z80 is an 8-bit CPU with a 16-bit address bus, meaning it can
address up to 64K of memory. This works naturally for the 48K Spectrum
(16K ROM and 48K RAM). The 128K Spectrum, on the other hand, has
12x16K pages (4xROM +
8xRAM); the software can select any four to be "seen" by the CPU. The
BaseMemory implementation uses pages for the emulation in both models
to achieve maximum reuse.
The BaseMemory object keeps track of two arrays to represent
the memory:
byte[][] m_page;
byte[][] m_frame;
The "m_page" array represents the full set of memory pages.
There are 12 pages, each 16K long.
The "m_frame" array represents the four pages currently being "seen" by the CPU. For example, the following code makes the CPU
"see" pages 0, 4, 6, and 9:
m_frame[0] = m_page[0];
m_frame[1] = m_page[4];
m_frame[2] = m_page[6];
m_frame[3] = m_page[9];
The emulated CPU reads and writes data by indexing into the
"m_frame" array.
The BaseMemory object allows the CPU to select the "visible"
pages via the method "public void pageIn(int frame, int page)". It
also allows direct access to the page data via "public byte[]
getBytes(int page)" (useful for the screen emulation, for instance).
All memory operations must convert "virtual" addresses to
physical ones. The frame number is simply the top 2 bits of the
16-bit "virtual" address; the frame offset is the remaining (low) 14
bits (see Listing 3).
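To make the paging concrete, here is a small self-contained sketch of the frame/page scheme. It is an approximation rather than the BaseMemory source: PagedMemorySketch is a hypothetical class, the initial page mapping is arbitrary, and writes to ROM pages are not blocked.

```java
// Hypothetical paged memory: 12 pages of 16K, four of them visible at a time.
class PagedMemorySketch {
    static final int PAGE_SIZE = 0x4000;                  // 16K
    private final byte[][] page = new byte[12][PAGE_SIZE];
    private final byte[][] frame = new byte[4][];

    PagedMemorySketch() {
        for (int f = 0; f < 4; f++) frame[f] = page[f];   // arbitrary default mapping
    }

    // Makes 'pageNo' visible to the CPU in slot 'frameNo' (0..3).
    void pageIn(int frameNo, int pageNo) { frame[frameNo] = page[pageNo]; }

    int read8(int addr16) {
        // Top 2 bits pick the frame, low 14 bits are the offset within it.
        return frame[addr16 >> 14][addr16 & 0x3FFF] & 0xFF;
    }

    void write8(int addr16, int val8) {
        frame[addr16 >> 14][addr16 & 0x3FFF] = (byte) val8;
    }

    int read16(int addr16) {
        // Little-endian: low byte first; the +1 wraps around at 64K.
        return read8(addr16) | (read8((addr16 + 1) & 0xFFFF) << 8);
    }
}
```

Because a frame is only a reference to a page, switching pages costs one array assignment, and data written through one mapping is still there when the page is mapped back in.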
Signed and Unsigned Data Types
You may be wondering about the return types of the methods
in Listing 3; they both return an integer (32 bits) despite the fact
that they should perhaps return a "byte" (8 bits) and a "char" (16
bits), respectively.
The Z80 CPU can operate on 8-bit or 16-bit values, either
directly or via its registers. Although Java natively supports data
types that are 8- and 16-bits wide, the emulator is implemented
almost exclusively in terms of integer types. The reason for this is
that the Java byte is a signed type, while the Java char is an
unsigned type.
Consider the following (fictitious) Z80 instruction that adds
the contents of the A register (8 bits) to the contents of the HL
register (16 bits) and stores the results in the HL register:
Input:
A = 10000000 (=0x80), HL = 00000000 00000001 (=0x0001)
Result HL = HL + A:
HL = 00000000 10000001 (=0x0081)
The same thing in Java would look like this:
Input:
byte a = (byte) 0x80; char b = (char) 0x0001;
Result b = b + a:
b = (char) (b + a); (=0xFF81)
Surprised? The Java language specification (paragraph 5.6.2)
mandates that in a binary operation, operands of type int or
narrower are promoted to int before the operation is applied. In our
case, the (byte) value 0x80 and the (char) value 0x0001 are first
promoted to integer before they're added. Since the byte is a signed
type, the integer promotion yields the value 0xFFFFFF80. Since the
char is an unsigned type, the integer promotion yields the value
0x00000001. When the two integer values are added, the end result is
0xFFFFFF81, which is then truncated to a char, yielding the value
0xFF81. The only way to avoid this behavior is to explicitly prevent
the sign extension in the widening conversion. The new code would
look something like this:
b = (char) (b + (a & 0xFF));
The "&" will mask all bits but the lowest 8, yielding (the
integer) 0x80, which is then added to "b", producing the correct
result.
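The two variants can be put side by side in a runnable form. SignExtensionSketch is a hypothetical helper class written for this comparison; the values are the ones used in the example above.

```java
// Demonstrates the promotion pitfall with the values used in the text.
class SignExtensionSketch {
    // The byte is sign-extended to int before the addition.
    static int addNaive(byte a, char hl) {
        return (char) (hl + a);
    }

    // Masking with 0xFF keeps only the low 8 bits, so no sign extension.
    static int addMasked(byte a, char hl) {
        return (char) (hl + (a & 0xFF));
    }
}
```

Calling addNaive((byte) 0x80, (char) 0x0001) gives 0xFF81, while addMasked() with the same arguments gives 0x0081.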
Although this solution solves the problem, it's only a
partial solution for the CPU emulation as a whole. The reason has to
do with the CPU flags that indicate whether overflow occurred during
a particular operation. Java has no mechanism for indicating
overflow, so I must always use a Java data type that's larger than
the resulting value. The emulator would explicitly test for overflow
and truncate appropriately. In the end, the only feasible solution is
to use integer types everywhere, and explicitly deal with issues of
truncation and overflow.
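As an illustration of "use a wider type, then test and truncate", here is how an 8-bit addition with flags might be written. This is a sketch, not the emulator's code: Add8Sketch and the packed return value are my own devices, the flag masks follow the bit positions of the Z80 F register, and the parity/overflow and add-subtract flags are left out for brevity.

```java
// Hypothetical 8-bit add helper; flag masks follow the Z80 F register layout.
class Add8Sketch {
    static final int FLAG_C = 0x01;   // carry out of bit 7
    static final int FLAG_H = 0x10;   // half-carry (carry out of bit 3)
    static final int FLAG_Z = 0x40;   // result is zero
    static final int FLAG_S = 0x80;   // result has bit 7 set

    // Both inputs are assumed to be valid 8-bit values (0..255).
    // Returns (flags << 8) | result8 so both fit in one int.
    static int add8(int a8, int b8) {
        int wide = a8 + b8;                        // up to 9 significant bits
        int result8 = wide & 0xFF;                 // explicit truncation
        int flags = 0;
        if ((wide & 0x100) != 0) flags |= FLAG_C;  // the 9th bit is the carry
        if ((((a8 & 0x0F) + (b8 & 0x0F)) & 0x10) != 0) flags |= FLAG_H;
        if (result8 == 0) flags |= FLAG_Z;
        if ((result8 & 0x80) != 0) flags |= FLAG_S;
        return (flags << 8) | result8;
    }
}
```

The carry is visible only because the sum is computed in an int; had the operands been Java bytes, the ninth bit would have been lost before it could be tested.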
To keep the code readable, I adopted a naming convention that
exposes the size of the data types involved. The size in bits of the
return value and each argument of a function is appended to its name.
For example, "int read8(int val16)" means that the function returns 8
bits of data, and receives as an argument 16 bits of data, all
embedded in an integer as the least significant bits. Furthermore, the
convention is that all input arguments are correct and need no
further modifications, while all return values need to be correctly
truncated before being returned.
CPU Emulation
In addition to the signed/unsigned challenges described
earlier, the CPU poses additional problems in the area of instruction
decoding. The decoder is implemented as a large "switch()" statement,
which switches on the first byte of the current instruction.
Naturally, the code is very large and rather unwieldy to read and
modify. One possible solution for dealing with such a large piece of
contiguous code would be to have an array of IRunnable objects (that
can decode a particular instruction in the "run()" method) indexed on
the instruction code.
This approach would allow the code to be structured more
elegantly, but it would proliferate the number of classes and impose
significant runtime overhead. The "switch()" approach, while
difficult to write and maintain, is extremely fast since the JVM
implements it internally as a jump table, thereby exhibiting the same
architectural approach as the object array, without the performance
penalty of invoking an interface method for every instruction.
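The two decoding styles can be compared on a tiny scale. The sketch below handles only three Z80 opcodes (NOP, INC A, and LD A,n) and ignores flags; DecoderSketch and its Op interface are hypothetical names, and the handler table uses lambdas for brevity, a language feature that postdates this article.

```java
// Hypothetical mini-decoder handling three Z80 opcodes in two different ways.
class DecoderSketch {
    interface Op { int run(DecoderSketch cpu); }   // returns T-states used

    int a, pc;
    final int[] mem;

    DecoderSketch(int[] program) { mem = program; }

    int fetch8() { return mem[pc++] & 0xFF; }

    // Style 1: one big switch on the opcode byte.
    int stepSwitch() {
        switch (fetch8()) {
            case 0x00: return 4;                            // NOP
            case 0x3C: a = (a + 1) & 0xFF; return 4;        // INC A (flags ignored)
            case 0x3E: a = fetch8(); return 7;              // LD A,n
            default: throw new IllegalStateException("opcode not in sketch");
        }
    }

    // Style 2: a table of handler objects indexed by opcode.
    static final Op[] TABLE = new Op[256];
    static {
        TABLE[0x00] = cpu -> 4;
        TABLE[0x3C] = cpu -> { cpu.a = (cpu.a + 1) & 0xFF; return 4; };
        TABLE[0x3E] = cpu -> { cpu.a = cpu.fetch8(); return 7; };
    }

    int stepTable() {
        Op op = TABLE[fetch8()];
        if (op == null) throw new IllegalStateException("opcode not in sketch");
        return op.run(this);
    }
}
```

Both styles produce the same register state and state counts; the difference is that the table version pays for an interface call on every instruction and, written without lambdas, needs one class per opcode.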
Screen Emulation
Each pixel on the Spectrum screen can be either on or off.
This is represented by the appropriate bit value in a byte (the state
of 8 adjacent pixels is governed by the byte value at a particular
memory address in video RAM). Color information is represented by
another byte.
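For readers unfamiliar with the encoding, the sketch below decodes a "color" byte using the standard Spectrum attribute layout (bits 0-2 ink, bits 3-5 paper, bit 6 bright, bit 7 flash) and picks the colour of one pixel. AttributeSketch is a hypothetical class, the 0xCD intensity for non-bright colours is an assumption chosen for illustration, and flashing is not handled. On the real hardware one attribute byte colours a whole 8x8 character cell.

```java
// Hypothetical decoder for a Spectrum attribute ("color") byte.
class AttributeSketch {
    static int ink(int attr8)        { return attr8 & 0x07; }
    static int paper(int attr8)      { return (attr8 >> 3) & 0x07; }
    static boolean bright(int attr8) { return (attr8 & 0x40) != 0; }
    static boolean flash(int attr8)  { return (attr8 & 0x80) != 0; }

    // Colour bits: bit 0 = blue, bit 1 = red, bit 2 = green.
    // The 0xCD "normal" intensity is an assumption, not a hardware constant.
    static int toRgb(int colour3, boolean bright) {
        int level = bright ? 0xFF : 0xCD;
        int r = (colour3 & 2) != 0 ? level : 0;
        int g = (colour3 & 4) != 0 ? level : 0;
        int b = (colour3 & 1) != 0 ? level : 0;
        return (r << 16) | (g << 8) | b;
    }

    // RGB colour of one pixel; 'bit' 7 is the leftmost pixel of the byte.
    // A set bit is drawn in the ink colour, a clear bit in the paper colour.
    static int pixelRgb(int pixels8, int attr8, int bit) {
        boolean on = (pixels8 & (1 << bit)) != 0;
        return toRgb(on ? ink(attr8) : paper(attr8), bright(attr8));
    }
}
```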
To draw pixels on the screen, the BaseScreen object extends
java.awt.Canvas and implements the drawing logic in the paint()
method. A nice side effect of this is that the emulator can be
"embedded" into any AWT or Swing container that can render Canvas
objects. This allows the emulator to run seamlessly as a standard
application or an applet. The BaseScreen object uses an offscreen
image to render the screen contents, after which the image is drawn
directly onto the screen via java.awt.Graphics.drawImage() (this
common technique prevents flickering).
Every time the CPU writes into the screen memory area, the
BaseScreen object is notified. For maximum efficiency, the only
action taken at this time is to toggle a Boolean value in an array
that indicates that the particular screen byte has changed. When
paint() is called, a for loop iterates through the Boolean array and,
for every "true" value, it draws the corresponding byte into the
offscreen image.
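The change-tracking idea is simple enough to show in a few lines. DirtyScreenSketch is a hypothetical class; it assumes the 6,144 pixel bytes of the Spectrum display and counts redrawn bytes instead of actually drawing them.

```java
// Hypothetical dirty-flag tracker for the 6144 pixel bytes of the display.
class DirtyScreenSketch {
    static final int SCREEN_BYTES = 6144;
    private final boolean[] dirty = new boolean[SCREEN_BYTES];
    int bytesDrawn;                       // stand-in for real drawing work

    // Called on every CPU write into screen memory: cheap, no drawing.
    void byteChanged(int offset) { dirty[offset] = true; }

    // Called from paint(): redraw only what changed, then clear the flags.
    void repaint() {
        for (int i = 0; i < SCREEN_BYTES; i++) {
            if (dirty[i]) {
                dirty[i] = false;
                bytesDrawn++;             // real code would draw byte i here
            }
        }
    }
}
```

A byte written many times between two frames is still drawn only once, which is where the saving comes from.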
The mechanism for fast updates to the offscreen image is the
challenging part. The naïve technique is simply to use
java.awt.Graphics.fillRect() to render every pixel into the image.
While this works, it's very slow due to the overhead of calling the
fillRect() method and running it many times for 1x1 rectangles.
A better technique is to create the offscreen image as a
decorator for a java.awt.image.MemoryImageSource object. The
MemoryImageSource is created, in turn, around an int array that holds
the pixel data in RGB format. The rendering code updates the array
and then calls MemoryImageSource.newPixels() to notify the object
that the data has been updated (see Listing 4).
Table 1 provides the timing results, in milliseconds, for
rendering 200 consecutive frames of the (same) Spectrum game (the
hardware/software configuration is Windows 2000 Professional, Pentium
III 650, 192MB RAM).
These timings are barely adequate: as you recall, each
Spectrum frame is refreshed every 20ms. If rendering the frame takes
longer than 20ms, the emulation will look choppy (it will skip frames
and slow down the machine overall).
To improve performance, I made a key observation about the
way color is encoded in the Spectrum. As described earlier, every
byte in video RAM is paired with another byte that describes its
color: the first byte simply shows whether the pixels are "on" or
"off" (the "pixel" byte) and the second byte shows what color the
pixels are (the "color" byte). This means there are a total of 256 *
256 different ways that a location in video RAM could appear on the
emulated screen (256 values for the "pixel" byte and 256 values for
the "color" byte). I can prerender some (fixed) number of these
"pixel/color" byte combinations as Java image objects and then simply
use java.awt.Graphics.drawImage() to render that piece of the video
RAM on the screen (see Figure 3).
The new performance numbers are in Table 2.
As you can see, the performance improvements are dramatic for
Java1 (>90%); for Java2, however, the performance is far worse (a
slowdown of more than 1,000%!). The reason for the bad Java2
performance lies in the performance of java.awt.Graphics.drawImage().
This discussion is beyond the scope of this article, but you can read
more about it on the Java Developer Connection Web site
(http://developer.java.sun.com/developer/) in the BugParade section
(http://developer.java.sun.com/developer/bugParade/bugs/4276423.html).
To resolve the performance problems in Java2, I use a
different (and more conventional) technique that's similar to the
MemoryImageSource technique described earlier. In Java2, the
offscreen image object is a subclass of java.awt.Image, namely a
java.awt.image.BufferedImage. This class has a method called setRGB()
that allows you to set an RGB pixel array directly into the Image
object, without the performance penalties of
MemoryImageSource.newPixels().
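A minimal version of the BufferedImage approach might look like this. BufferedScreenSketch is a hypothetical class; it assumes the 256x192 Spectrum display area with no border and uses the array form of setRGB().

```java
import java.awt.image.BufferedImage;

// Hypothetical offscreen buffer backed by a BufferedImage.
class BufferedScreenSketch {
    static final int WIDTH = 256, HEIGHT = 192;    // Spectrum display area
    final int[] data = new int[WIDTH * HEIGHT];    // one RGB int per pixel
    final BufferedImage image =
        new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_RGB);

    // Cheap per-pixel update: touches only the int array.
    void plot(int x, int y, int rgb) { data[y * WIDTH + x] = rgb; }

    // Push the whole pixel array into the image in a single call.
    void flush() {
        image.setRGB(0, 0, WIDTH, HEIGHT, data, 0, WIDTH);
    }
}
```

A paint() method would call flush() and then hand the image to Graphics.drawImage(), in the same way the MemoryImageSource version does after newPixels().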
The final performance numbers are in Table 3.
Note that prerendering all possible 256 * 256 (= 65536) Java
image objects will take a toll on the memory footprint of the
emulator. If I want a fixed-size cache of these images, I run the
risk of "thrashing" in the cache. Discarding entries when the cache
is full means the garbage collection will have more work, slowing
down the emulation. It's possible to reuse entries in the cache
(instead of discarding them), but this will bring us back to the
original performance problems with Graphics.fillRect() or
MemoryImageSource.newPixels(). A better idea is to use only half the
"pixel" byte (a nibble) to prerender Java images. This means that any
"pixel" byte will be drawn by concatenating two prerendered Java
images. The total number of prerendered nibbles is far more
manageable: 16 * 256 (= 4096). The tradeoff is that I now need to make
twice as many calls to java.awt.Graphics.drawImage(), but that turns
out to be inconsequential.
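The nibble cache can be sketched as follows. This is an illustration under assumptions rather than the emulator's code: NibbleCacheSketch is a hypothetical class, images are rendered lazily on first use instead of all up front, the palette is supplied by the caller, and the bright and flash bits of the "color" byte are ignored.

```java
import java.awt.Graphics;
import java.awt.image.BufferedImage;

// Hypothetical cache of up to 16 * 256 4x1 images, one per
// (pixel nibble, colour byte) pair.
class NibbleCacheSketch {
    private final BufferedImage[] cache = new BufferedImage[16 * 256];
    private final int[] palette;    // 8 RGB values, supplied by the caller

    NibbleCacheSketch(int[] palette8) { this.palette = palette8; }

    static int index(int nibble4, int attr8) { return (nibble4 << 8) | attr8; }

    BufferedImage get(int nibble4, int attr8) {
        int i = index(nibble4, attr8);
        if (cache[i] == null) cache[i] = render(nibble4, attr8);
        return cache[i];
    }

    private BufferedImage render(int nibble4, int attr8) {
        int ink = palette[attr8 & 0x07];
        int paper = palette[(attr8 >> 3) & 0x07];
        BufferedImage img = new BufferedImage(4, 1, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < 4; x++) {
            boolean on = (nibble4 & (8 >> x)) != 0;   // bit 3 is the leftmost pixel
            img.setRGB(x, 0, on ? ink : paper);
        }
        return img;
    }

    // A full pixel byte is drawn as two cached halves, side by side.
    void drawByte(Graphics g, int x, int y, int pixels8, int attr8) {
        g.drawImage(get(pixels8 >> 4, attr8), x, y, null);
        g.drawImage(get(pixels8 & 0x0F, attr8), x + 4, y, null);
    }
}
```

The cache never holds more than 4,096 small images, so nothing ever needs to be evicted and the garbage collector is left alone.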
Debugging Techniques
Debugging the emulator is very challenging. The hardest part
to debug is the CPU emulation, primarily due to its sheer size and
complexity. The CPU emulation code is bigger and more complicated
than the rest of the emulator. Although the Z80 CPU is simple by
today's standards, it has a great many flags, registers, and
instructions that manipulate these flags. For example, any addition
will modify the sign flag, zero flag, parity/overflow flag, carry
flag, half-carry flag, and add-subtract flag. The Spectrum software
uses all these flags,
and any mistake most often translates into a "hard reset" of the
emulated Spectrum.
Furthermore, determining exactly where the emulated software
crashed is not as easy as watching for the equivalent of an illegal
memory access or page fault. (There's no such thing on the Spectrum.)
An incorrectly decoded instruction will most often translate to a
Spectrum "hard reset" or "hang" thousands of instructions down the
stream from it.
The easiest way to debug the emulator is to use another
emulator (that's known to be correct) and compare the CPU traces for
executing the same program. In this case, I modified the original
Linux native emulator to output CPU traces (a CPU trace is the state
of all registers and flags after executing every instruction). This
has the advantage of pinpointing the precise spot where the Java
emulator diverges from the native emulator and thus dramatically
reduces the time required for debugging.
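Finding the point of divergence is easy to automate once both emulators write their traces in the same format. The sketch below assumes one line of text per executed instruction; TraceDiffSketch and the trace line format are hypothetical, not the format used by either emulator.

```java
import java.util.List;

// Hypothetical trace comparator: each trace line is assumed to hold the full
// register/flag state after one instruction, in the same text format.
class TraceDiffSketch {
    // Returns the index of the first differing line, or -1 if the traces agree.
    static int firstDivergence(List<String> reference, List<String> candidate) {
        int common = Math.min(reference.size(), candidate.size());
        for (int i = 0; i < common; i++) {
            if (!reference.get(i).equals(candidate.get(i))) return i;
        }
        // One trace ending early also counts as a divergence.
        return reference.size() == candidate.size() ? -1 : common;
    }
}
```

The returned index identifies the first instruction after which the two emulators disagree, which is usually the instruction whose emulation is wrong.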
Performance Considerations
I was pleasantly surprised to see that it was not only
possible to implement a Java emulator for the Spectrum, but that it
also ran fast. The CPU emulation is on par with the native emulation;
the screen emulation, while slightly slower and a bit more awkward,
is well within the limits of realistic emulation for what I would
consider "average" hardware. Surprisingly, and most probably due to
screen emulation, Java2 didn't fare dramatically better than Java1,
which puts Java1 on the map as a reasonable contender for this type
of work.
Conclusion
As a platform for emulation, Java is a very strong player.
Although the Sinclair Spectrum is not a terribly complex machine by
today's standards, it poses significant challenges in its
implementation, and requires strong support both in terms of language
features and overall performance. Java's elegant and expressive
language rises to the challenge and overcomes it easily with code
that's more readable, more modular, and far more concise. Java's
performance, the wild card in the equation, also meets expectations.
Author Bio
Razvan Surdulescu is a software developer at Trilogy in Austin, TX,
where he writes e-business software in Java.
razvan.surdulescu@post.harvard.edu
Listing 1
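public abstract class BaseComponent
{
    // The parent machine; siblings are reached through it.
    protected BaseSpectrum m_spectrum;

    public BaseSpectrum getSpectrum() {
        return m_spectrum;
    }

    // Startup: remember the parent.
    public void init(BaseSpectrum spectrum) {
        m_spectrum = spectrum;
    }

    // Shutdown: drop the parent reference so state can be discarded.
    public void terminate() {
        m_spectrum = null;
    }

    // Return to a clean state, ready to start fresh.
    public abstract void reset();

    // Load the relevant state from the given loader.
    public abstract void load(BaseLoader loader);
}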
Listing 2
while (true) {
    states = current CPU state count
    if (states >= STATES-PER-LINE) {
        states = (states - STATES-PER-LINE)
        increment line count
        if (lines == TV-LINES) {
            lines = 0
            refresh screen
            wait for the next clock interrupt
            interrupt the CPU
        }
    }
    execute next CPU instruction
}
Listing 3
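public int read8(int addr16) {
    // Top 2 bits select the frame; low 14 bits are the offset within it.
    int data = m_frame[(addr16 >> 14)][(addr16 & 0x3FFF)];
    // Mask to undo the sign extension of the Java byte.
    return (data & 0xff);
}

public int read16(int addr16) {
    // The Z80 is little-endian: the high byte lives at addr16 + 1,
    // wrapping around at the top of the 64K address space.
    int high = (read8((addr16 + 1) & 0xFFFF) << 8);
    int low = read8(addr16);
    return (high | low);
}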
Listing 4
private int[] m_data;
private MemoryImageSource m_memoryImageSource;

// RGB data, one int per pixel
m_data = new int[SCREEN_WIDTH * SCREEN_HEIGHT];
m_memoryImageSource = new MemoryImageSource
    (SCREEN_WIDTH, SCREEN_HEIGHT, m_data, 0, SCREEN_WIDTH);
m_memoryImageSource.setAnimated(true);

// m_offscreenImage is not declared in this excerpt; it is presumably
// an Image created from m_memoryImageSource.
public void paint(Graphics g) {
    for every 'address' in Video RAM {
        let 'pixels' be the byte at that address
        for every 'bit' in 'pixels' {
            let '(x, y)' be the Cartesian coordinates of 'bit'
            let 'color' be the RGB color of 'bit'
            m_data['(x, y)'] = 'color'
        }
    }
    m_memoryImageSource.newPixels();
    g.drawImage(m_offscreenImage, 0, 0, this);
}
Additional Source Code for this Article
The online source for the emulator is in jzx_source.zip.
The online compiled classes for the emulator are in jzx.jar.
To run the emulator, put jzx.jar in a folder on your hard drive, then unzip roms.zip in the same location as jzx.jar.
Launch the emulator as follows:
+ Sun JDK: java -jar jzx.jar [parameters]
+ Sun JDK: java -classpath jzx.jar org.razvan.jzx.JZXFrame [parameters]
+ Microsoft JDK: jview /cp:p jzx.jar org.razvan.jzx.JZXFrame [parameters]
The command line [parameters] you can specify are as follows:
-scale x: Scale the window size by 'x', where 'x' can be 1, 2 or 3.
-mode m: Start the emulator in 48k ('m' = 48) or 128k ('m' = 128) mode.
-snapshot s: Load a given .Z80 snapshot file into the emulator.