ORCHESTRA: A Fault Injection Environment for Distributed Systems

This paper reports on ORCHESTRA, a portable fault injection environment for testing implementations of distributed protocols. The paper focuses on architectural features of ORCHESTRA that provide portability, minimize intrusiveness on target protocols, and support testing of real-time systems.

ORCHESTRA is based on a simple yet powerful framework, called script-driven probing and fault injection, for the evaluation and validation of the fault-tolerance and timing characteristics of distributed protocols. ORCHESTRA was initially developed on the Real-Time Mach operating system and later ported to other platforms including Solaris and SunOS, and has been used to conduct extensive experiments on several protocol implementations. A novel feature of the Real-Time Mach implementation of ORCHESTRA is that it utilizes certain features of the Real-Time Mach operating system to quantify and compensate for intrusiveness of the fault injection mechanism. In addition to describing the overall ORCHESTRA architecture and implementation, this paper also describes the experimental evaluation of two protocol implementations: a distributed group membership service on the Sun Solaris operating system, and a real-time audio-conferencing application on Real-Time Mach.


Back to Publications list.

sdawson@engin.umich.edu