Any individual piece of computer hardware or software can fail. That's why we back up our hard drives. When the hard drive on my laptop failed last year, the tape backup got me up and running in a few days - the time it took to get a replacement drive and reload my files.
But some systems can't afford to be down for a few days...or even a few hours...or sometimes even a few minutes. For example:

- Medical monitoring systems deal with health-critical information constantly.
- Fly-by-wire systems must act in real time and must not fail (as you'll attest if you've ever flown in an airplane).
- When widely used e-commerce systems fail, companies lose thousands of sales, and possibly thousands of customers.
- Financial trading applications may lose staggering amounts of money for both customers and the responsible company if they go down at a critical moment - usually when the load is highest and the potential for failure is greatest.

Online redundancy provides the timely robustness that meets these systems' needs. And to enable users of these systems to take advantage of CORBA, OMG members have recently standardized Fault Tolerant (FT) CORBA, which provides entity redundancy to CORBA systems.
The difference between running an application under normal conditions and under fault-tolerant conditions is simple: under normal conditions an application will fail occasionally. Under fault-tolerant conditions it should never fail. Every invocation by a client should produce a response with the correct result (subject to the limitations listed in the last section of this article).
OMG's FT CORBA specification enables products that support a wide range of fault tolerance from simple one-copy redundancy of key objects to highly reliable, geographically dispersed, multiply connected servers. FT CORBA applications are transparent to the user, to the operation of the client, and, to some extent, to the application programmer.
It's perhaps a little more surprising that the transparency extends partially to the application programmer, who codes his or her usual CORBA program with the same set of objects that would have been coded for a non-fault-tolerant system, adding to each application object a few interfaces that let the FT infrastructure synchronize the state of each set of object replicas. Running this same code on reliable, redundant hardware under the control of a fault-tolerant infrastructure makes it fault tolerant.
The transparency doesn't extend to the system administrator, who must install and configure the fault-tolerance product, provide redundant hardware and software, and configure the application to run redundant copies of its objects. As you'll discover, fault tolerance results from the way the application uses the redundant hardware to run redundant copies of its software.
FT CORBA supports applications that require a high level of reliability. Under FT CORBA, applications must not have a single point of failure.
FT systems provide this degree of assurance through

- Redundancy (replication)
- Fault detection and notification
- Logging and recovery

The FT CORBA infrastructure allows individual object instances to be replicated on multiple processors at different locations on the network - even in different cities or on different continents - in a very flexible way. Even when calls are cascaded, with replicated objects calling other replicated objects, propagation is controlled so that no copy executes the resulting call more than once.
Faults are detected by a number of mechanisms, including both push (heartbeat) and pull monitoring. To avoid scalability problems, a Fault Detector on each host monitors the individual objects on it, and a global (replicated, of course) Fault Detector monitors the individual host Fault Detectors.
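This two-level monitoring hierarchy can be sketched in a few lines. The sketch below is an illustrative model, not the FT CORBA API: a host-local detector pulls liveness from each object it monitors, and a global detector monitors the host detectors themselves. All class and method names here are hypothetical.

```python
# Hypothetical model of hierarchical fault detection: host-local
# detectors poll objects; a global detector polls the host detectors.

class MonitoredObject:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def is_alive(self):                 # the "pull" (ping) check
        return self.alive

class HostFaultDetector:
    def __init__(self, host, objects):
        self.host = host
        self.objects = objects
        self.alive = True

    def poll(self):
        """Return names of local objects that failed to respond."""
        return [o.name for o in self.objects if not o.is_alive()]

class GlobalFaultDetector:
    def __init__(self, host_detectors):
        self.host_detectors = host_detectors

    def poll(self):
        """Report dead hosts, plus dead objects on live hosts."""
        faults = []
        for hd in self.host_detectors:
            if not hd.alive:
                faults.append(("host", hd.host))
            else:
                faults.extend(("object", n) for n in hd.poll())
        return faults

# One object crashes, then the whole host detector goes silent.
a, b = MonitoredObject("A/1"), MonitoredObject("B/1")
hd1 = GlobalFaultDetector.__new__(GlobalFaultDetector)  # placeholder, replaced below
hd1 = HostFaultDetector("host1", [a, b])
gfd = GlobalFaultDetector([hd1])
a.alive = False
print(gfd.poll())    # [('object', 'A/1')]
hd1.alive = False
print(gfd.poll())    # [('host', 'host1')]
```

Because the global detector only talks to one detector per host, adding objects to a host adds no load at the global level, which is the scalability point the specification is making.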
Replication and Object Groups
The replicas of an object are created and managed as an object group. To render this replication transparent to the client, the copies are created and managed by a Replication Manager and addressed by a single Interoperable Object Group Reference (IOGR). I won't present details of the IOGR here - they're hidden from the application and serve only to allow the ORB and FT infrastructure to work together to deliver FT support.
Fault Tolerance Domains make it practical to create and manage large FT systems. An FT Domain may contain a number of hosts and many object groups, although a single host may support multiple domains. There is a single (replicated) Replication Manager for each domain.
In Figure 1 lightly shaded ovals denote the extent of FT domains, darker ones denote hosts, and circles denote object instance replicas. Capital letters within an object's circle denote its object group; the set of circles with the same letter is thus the set of replicas of an individual instance. Hosts may participate in one (Host 2) or more (e.g., Hosts 3 and 4) domains, but all objects in a group must be in the same domain because they are all created and managed by that domain's Replication Manager.
Every object group has a set of FT properties - ReplicationStyle, MembershipStyle, ConsistencyStyle, FaultMonitoringStyle, and six others - that may be set either domain-wide, per type, or per group. Unlike, for example, POA properties, some of these may be modified after the group is created.
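The three scopes at which properties may be set (domain-wide, per type, per group) form a simple precedence chain. Here is a minimal sketch of that resolution; the operation names echo the spec's PropertyManager, but the data model and everything else is illustrative, not the real interface.

```python
# Hypothetical sketch of FT property resolution: the most specific
# setting wins (per-group over per-type over domain-wide).

class PropertyManager:
    def __init__(self):
        self.domain = {}
        self.by_type = {}
        self.by_group = {}

    def set_default_properties(self, props):        # domain-wide
        self.domain.update(props)

    def set_type_properties(self, type_id, props):  # per IDL type
        self.by_type.setdefault(type_id, {}).update(props)

    def set_properties(self, group_id, props):      # per object group
        self.by_group.setdefault(group_id, {}).update(props)

    def effective(self, type_id, group_id):
        """Merge the three scopes, most specific last."""
        props = dict(self.domain)
        props.update(self.by_type.get(type_id, {}))
        props.update(self.by_group.get(group_id, {}))
        return props

pm = PropertyManager()
pm.set_default_properties({"ReplicationStyle": "WARM_PASSIVE"})
pm.set_type_properties("IDL:Bank/Account:1.0", {"ReplicationStyle": "ACTIVE"})

print(pm.effective("IDL:Bank/Account:1.0", "group-7")["ReplicationStyle"])  # ACTIVE
print(pm.effective("IDL:Bank/Teller:1.0", "group-9")["ReplicationStyle"])   # WARM_PASSIVE
```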
Active vs Passive Replication Styles
There are two major styles of replication: active and passive. Passive then subdivides into two styles of its own, but let's look at the difference between active and passive first.

- Every active replica executes every invocation independently, in the same order as every other replica. The replication mechanism suppresses duplicate invocations when one replicated object invokes another. Active replication is faster and cheaper when the cost of computation is less than the cost of saving and restoring state, and it is necessary when recovery has to be instantaneous.
- Only one passive replica of a replicated object, the primary member, executes invoked methods, saving its state periodically. When a fault occurs, a backup member is promoted to primary, and its state is restored from a log and the replaying of request messages.

Warm passive replication lessens the recovery delay somewhat by loading state into backup objects periodically during execution; cold passive replication does no loading until the primary member fails.

Another style, STATELESS, applies to objects that have no dynamic state.
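The contrast between active and passive execution can be sketched in a toy model (illustrative only, not the FT CORBA API): in the active style every replica executes the request and duplicate replies are suppressed, while in the passive style only the primary executes and the backups stay idle; the client-visible result is the same either way.

```python
# Toy model of the two replication styles on a replicated counter.

class CounterReplica:
    def __init__(self):
        self.count = 0
    def increment(self):
        self.count += 1
        return self.count

def invoke_active(replicas):
    # Every member executes; duplicate replies are suppressed and
    # a single result is returned to the client.
    results = [r.increment() for r in replicas]
    assert len(set(results)) == 1        # all replicas agree
    return results[0]

def invoke_passive(primary, backups):
    # Only the primary executes; backups lag until a state transfer.
    return primary.increment()

active_group = [CounterReplica() for _ in range(3)]
print(invoke_active(active_group))       # 1 - and all three replicas now hold count == 1

primary, *backups = [CounterReplica() for _ in range(3)]
print(invoke_passive(primary, backups))  # 1 - but backups still hold count == 0
```

The trade-off in the bullet list above falls out directly: the active group paid three executions for instant recovery, while the passive group paid one execution but must restore state before a backup can take over.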
ACTIVE_WITH_VOTING, the only replication style we haven't covered yet, is interesting for two reasons:
- It isn't fully supported by the current specification and represents a possible future extension.
- This mode (and the fact that it's not supported) points out a limitation: the current specification protects only against crash faults, that is, when an object issues no response.
The specification doesn't protect against what it terms commission faults, where an object generates incorrect results, or against what it terms byzantine faults, where an object or host generates incorrect results intentionally. The ACTIVE_WITH_VOTING replication style, used with sophisticated algorithms, can protect against both types of faults, albeit at considerable cost in network traffic.
Strong Replica Consistency is the principle of FT CORBA that guarantees that objects have identical state. Supported by the specification, it guarantees that for passive replication all members of an object group have the same state following each state transfer, and that for active replication all members have the same state at the end of each operation.
For this to work, applications must be deterministic, or must be sanitized to give the appearance of being deterministic: for a given starting state a given invocation with a given set of parameters must produce the same output and the same internal state. (Sanitize means to clean up a set of replicas so that all give the identical answer to an invocation, removing all sources of nondeterministic behavior. Although this may be simple for a routine that queries a database row or performs a calculation, it's not so easy to sanitize an object against a query whose result varies with even slight differences in invocation delivery timing, or depends on a hardware serial number or some other value that differs from one machine to another.)
The specification requires that the FT infrastructure deliver the same set of invocations in the same order to every member of an object group. The deterministic behavior of the object instances, combined with the consistent behavior of the FT infrastructure, guarantees strong replica consistency.
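A tiny example shows why the delivery order matters as much as determinism. The sketch below (illustrative, with hypothetical names) uses two non-commuting operations, a deposit and an interest calculation: deterministic replicas given the same operations in the same order end in identical states, while a different order breaks consistency.

```python
# Why strong replica consistency needs a single total order of
# invocations: deposit and interest do not commute.

class Account:
    def __init__(self):
        self.balance = 0.0
    def apply(self, op, arg):
        if op == "deposit":
            self.balance += arg
        elif op == "interest":
            self.balance *= 1.0 + arg

ops = [("deposit", 100.0), ("interest", 0.5)]

r1, r2 = Account(), Account()
for op in ops:
    r1.apply(*op)            # delivered in order
for op in reversed(ops):
    r2.apply(*op)            # delivered out of order

print(r1.balance)   # 150.0 - deposit first, then 50% interest
print(r2.balance)   # 100.0 - interest on a zero balance, then deposit
```

This is also why the nondeterminism mentioned above (timing, hardware serial numbers) must be sanitized away: with identical, ordered inputs, only deterministic objects are guaranteed to converge on identical state.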
The Replication Manager
Figure 2 shows a sample configuration for a fault-tolerant system. I'll use this configuration to discuss the way each element works.
The Replication Manager, itself replicated as the figure shows by the nested boxes, inherits three interfaces defined separately in the specification: PropertyManager, ObjectGroupManager, and GenericFactory. The PropertyManager interface lets you define fault-tolerance properties for your object groups, possibly using a GUI. Of the properties defined, two are especially relevant here: MembershipStyle and ConsistencyStyle. MembershipStyle defines whether the infrastructure or application controls membership in an object group, and ConsistencyStyle defines whether the infrastructure or the application controls consistency of state for members of an object group.
Fault Detector and Fault Notifier
Figure 3, copied from the specification, shows diagrammatically the sequence of events that ensues when a fault is detected.
As mentioned before, at least one Fault Detector on each host periodically pings each object replica on that host to see if it's functioning. Applications can configure additional Fault Detectors as needed. When a Fault Detector detects a fault, it pushes a report to the Fault Notifier using the FaultNotifier interface, a subset of the Notification Service interface.
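The push step can be modeled very simply. The sketch below is loosely patterned on a Notification Service-style push to registered consumers; the class shape and field names are hypothetical, not the spec's IDL.

```python
# Hypothetical model of a Fault Notifier fanning a structured fault
# report out to its registered consumers.

class FaultNotifier:
    def __init__(self):
        self.consumers = []

    def connect_consumer(self, consumer):
        """Register a callback to receive fault reports."""
        self.consumers.append(consumer)

    def push_structured_fault(self, report):
        """Push one fault report to every registered consumer."""
        for consumer in self.consumers:
            consumer(report)

received = []
notifier = FaultNotifier()
notifier.connect_consumer(received.append)   # e.g., the Replication Manager

notifier.push_structured_fault(
    {"type": "ObjectCrashFault", "host": "host3", "object": "A/2"})
print(received[0]["type"])   # ObjectCrashFault
```

In a real deployment the Replication Manager is one such consumer, which is how a detected fault turns into the recovery actions described below.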
Logging and Recovery
Everything covered so far has been building up to this part: here's where the service repairs your application when an object crashes.
If you're running in one of the passive replication styles (either warm or cold), only the primary member of each of your object groups actually executes an invocation and sends back replies. When the Fault Detector suspects that the primary member has failed, it signals the Replication Manager to restart the primary member or to promote a backup member to primary.
At this point the primary member needs to have its state restored. If you've chosen the application-controlled consistency style, the application must fetch and restore the state of the new primary member. This is easy to describe: it's application-dependent, and you have to code it yourself. If you've chosen infrastructure-controlled consistency style, things are more automatic: using the Checkpointable interface with its operations set_state( ) and get_state( ), the system keeps state current in a log and uses the most recent values to restore the member during recovery.
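The infrastructure-controlled path can be sketched as follows. The operation names get_state() and set_state() come from the spec's Checkpointable interface; the surrounding checkpoint log, object, and failover logic are an illustrative model, not a real implementation.

```python
# Sketch of infrastructure-controlled recovery: the primary checkpoints
# via get_state() into a log; on failure a backup is promoted and
# restored with set_state() from the most recent checkpoint.

class Checkpointable:
    def __init__(self):
        self.total = 0

    def add(self, n):                 # an ordinary application method
        self.total += n

    def get_state(self):              # called periodically by the infrastructure
        return {"total": self.total}

    def set_state(self, state):       # called during recovery
        self.total = state["total"]

checkpoint_log = []
primary, backup = Checkpointable(), Checkpointable()

primary.add(10)
checkpoint_log.append(primary.get_state())   # periodic checkpoint
primary.add(5)                                # ...then the primary crashes

backup.set_state(checkpoint_log[-1])          # promote backup, restore state
print(backup.total)   # 10 - work after the checkpoint must be replayed
```

Note what the last line shows: the checkpoint alone recovers state only up to the last get_state() call, which is why the specification pairs the state log with a log of request messages that are replayed to bring the new primary fully up to date.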
If you've chosen active replication, none of this applies at failure/recovery time; the system just has one less duplicate response than usual and goes on with its processing without missing a beat. However, the logging and recovery mechanisms still use get_state( ) and set_state( ) to maintain a log that is used to synchronize new instances that might be added to an object group at runtime.
Even though reporting is orthogonal to recovery, and your system is (hopefully) chugging along without missing a beat even when redundant copies or hosts fail, you'll still want to know everything that's gone wrong during execution. The Fault Analyzer is the system component that takes care of this. It registers with the Fault Notifier to receive fault reports, then correlates them and generates condensed fault reports using an application-specific algorithm. Although a simple fault analyzer may report every fault it receives, it's much better if it performs correlations and condenses several thousand nearly simultaneous instance fault reports into a single host fault report!
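One plausible correlation rule, sketched below with entirely hypothetical names and thresholds: if many object replicas on the same host report faults at nearly the same time, the analyzer emits a single host fault instead of thousands of object faults.

```python
# Hypothetical fault-correlation rule: collapse a burst of object
# faults on one host into a single host fault report.

from collections import defaultdict

def condense(reports, threshold=3):
    """Group reports by host; bursts become one HostFault each."""
    by_host = defaultdict(list)
    for r in reports:
        by_host[r["host"]].append(r)

    condensed = []
    for host, rs in by_host.items():
        if len(rs) >= threshold:
            condensed.append({"type": "HostFault", "host": host})
        else:
            condensed.extend(rs)     # isolated faults pass through as-is
    return condensed

reports = [{"type": "ObjectFault", "host": "host1", "object": f"obj{i}"}
           for i in range(1000)]
reports.append({"type": "ObjectFault", "host": "host2", "object": "objX"})

print(len(condense(reports)))   # 2: one host fault plus one isolated object fault
```

A production analyzer would correlate on more than host identity (time windows, domains, fault types), but the shape of the problem is the same.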
Limitations and Conformance
Some limitations to FT CORBA are given in the following abbreviated list. By analyzing them you'll gain additional insight into the workings of an FT system.
- Legacy client ORBs: An unreplicated client hosted by a CORBA 2.3 or earlier ORB can invoke methods on a replicated server, but won't participate fully in FT (for reasons that space constraints preclude discussing).
- Common infrastructure: For this first release of the specification, all hosts within a FT domain must use the same FT product and ORBs from the same vendor to ensure FT behavior. Between domains, full FT behavior is guaranteed only if all employ the same FT product and the same ORB, although some improvement can be expected even when different products are used.
- Network partitioning faults: A network partitioning fault divides the system into two parts, each able to operate and communicate within itself but unable to communicate with the other. The inherent nature of the problem doesn't allow assured detection and recovery from these, which are therefore not covered by FT CORBA.
- Commission and byzantine faults: As mentioned earlier, commission faults occur when an object or host generates incorrect results, and byzantine faults occur when this happens intentionally or maliciously. These fault types are neither detected nor corrected by FT CORBA, although the ACTIVE_WITH_VOTING replication style provides a mechanism that can be used to build a solution. The solution, however, will be expensive in terms of resource.
- Correlated faults: These faults cause the same error to occur simultaneously in every replica or host, and typically result from failures in design or coding. They can happen anywhere: application, operating system, storage, network, hardware, or any other replicated part of the system. FT CORBA provides no protection against such faults.
FT CORBA defines two conformance points. An implementation must support at least one, and may support both:

- Passive replication: FT CORBA products support only the COLD_PASSIVE, WARM_PASSIVE, and STATELESS replication styles.
- Active replication: FT CORBA products support only the ACTIVE and STATELESS replication styles.

That's all the fault tolerance we have space for here. I know it isn't enough to get you programming in FT mode, but I think it's more than enough to trigger an investigation into whether FT CORBA can provide your enterprise with the assurance you need to survive in today's world of business dependence on computing.
I'd like to thank Dr. Louise Moser and Dr. Michael Melliar-Smith of Eternal Systems, Inc. (www.eternal-systems.com), for technical review of this article and for their contributions to the Fault Tolerant CORBA specification on which this article is based.
Jon Siegel, director of technology transfer at Object Management Group, also writes articles and presents tutorials and seminars about CORBA. His book, CORBA 3 Fundamentals and Programming, was published last year by John Wiley & Sons.