Software Tools for Performance and Dependability Evaluation
Graduate students: Seungjae Han and Harold Rosenberg
Research Scientist: C. V. Ravishankar
Faculty: Kang G. Shin
Sponsors: NASA, ONR, NSF
As real-time safety-critical applications and systems become more
sophisticated, the task of testing and evaluating these systems has
become increasingly complex. As a result, there is a great need
for software tools to assist programmers and system designers with
performance and dependability evaluation. We are currently
undertaking a number of research projects in the area of software
tools for performance evaluation, dependability evaluation, and
communication protocol testing. The tools being created as part of
these projects provide us with an environment in which we can test and
evaluate the hardware and software being developed for HARTS. The
evaluation tools are: SWG, a synthetic workload generator; HMON, a
real-time monitor; DOCTOR, an integrated dependability evaluation
environment that includes a powerful software fault injection tool;
and TEMPEST, a tool for automatically generating fault sets for
fault injection experiments.
SWG, The Synthetic Workload Generator:
The performance and dependability of a computing system are directly
affected by the structure and behavior of the workload it is
executing. An ideal tool for experimental evaluation is a synthetic
workload (SW), which is an executable model of an actual program. The
characteristics of the SW are specified in an abstract-level workload
specification language. The SWG compiles this high-level description
of a workload to produce a synthetic workload which can be executed in
a distributed manner. In this way the system may be evaluated under
representative operating conditions or various specific
user-controlled operating conditions. For generality, our model also
includes a number of parameters that describe system-dependent
features.
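
As a rough illustration of the idea of compiling a workload description
into something executable, the sketch below treats a specification as
data and turns each entry into a runnable task. It is not the SWG
specification language or its output format, neither of which is shown
here; the names TaskSpec, build_task, build_workload, and the fields
compute_iters and messages are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TaskSpec:
    """Hypothetical abstract description of one synthetic task."""
    name: str
    compute_iters: int   # amount of CPU work to emulate
    messages: int        # number of messages the task emits


def build_task(spec: TaskSpec) -> Callable[[], List[str]]:
    """'Compile' a spec into an executable closure that emulates the load."""
    def run() -> List[str]:
        acc = 0
        for i in range(spec.compute_iters):   # emulate computation
            acc += i * i
        # Emulate communication by producing message records.
        return [f"{spec.name}:msg{k}" for k in range(spec.messages)]
    return run


def build_workload(specs: List[TaskSpec]) -> List[Callable[[], List[str]]]:
    """Build one executable task per specification entry."""
    return [build_task(s) for s in specs]
```

Changing the numbers in a TaskSpec changes the load the task presents
without touching the task code, which is the property that lets an
experimenter vary operating conditions in a controlled way.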
HMON, A Real-Time Monitor:
To aid in debugging and to measure the performance of distributed
real-time applications, we have developed a real-time monitor. Our
monitor, called HMON, provides continuous and transparent monitoring
activity throughout a real-time system's lifecycle with bounded,
minimal, and predictable overhead, using purely software means. We
have developed a novel approach to monitoring shared variable
references that provides transparent monitoring with low overhead.
The monitor is designed to support tasks such as debugging real-time
applications, helping in real-time task scheduling, and measuring
system performance. We have developed schemes for debugging
distributed and parallel real-time programs by deterministic execution
replay. In addition, HMON is capable of observing
application-specific events, which supports other uses such as
dependability data monitoring.
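
The general principle behind deterministic execution replay can be
sketched as follows: log every nondeterministic input during the
original run, then feed the log back in the same order to reproduce the
run. This is only a toy model under that assumption, not HMON's
mechanism; the functions run, record_run, and replay_run are
hypothetical.

```python
import random
from typing import Callable, List


def run(next_input: Callable[[], int], steps: int) -> int:
    """Toy application whose result depends on the order of its inputs."""
    state = 0
    for _ in range(steps):
        state = state * 31 + next_input()
    return state


def record_run(steps: int, log: List[int]) -> int:
    """Run with nondeterministic inputs, logging each one as it is consumed."""
    def source() -> int:
        value = random.randint(0, 9)
        log.append(value)
        return value
    return run(source, steps)


def replay_run(log: List[int]) -> int:
    """Re-run the application, feeding the logged inputs back in order."""
    it = iter(log)
    return run(lambda: next(it), len(log))
```

Because the replayed run sees exactly the inputs of the recorded run,
it reaches the same final state, so a failure observed once can be
examined repeatedly.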
DOCTOR, An integrateD sOftware fault injeCTiOn enviRonment:
Fault-tolerance mechanisms generally perform a series of steps: fault
detection, identification, isolation, recovery, and reconfiguration.
In particular, the time needed in each of these fault processing steps
greatly affects the dependability of real-time systems. Dependability
also depends heavily on the application being run. By
integrating software implemented fault-injection with our other tools,
we are able to create a powerful environment for validating and
evaluating system dependability. DOCTOR is capable of injecting
various types of faults with a variety of options. Faults can be
injected as many times as desired, with performance and dependability
data automatically collected by HMON. It can use a user-supplied
application, or can assist the user in generating synthetic workloads
using SWG. A comprehensive graphical user interface is provided to
help the user design and control fault-injection experiments. An
important contribution of DOCTOR is its consideration of portability
issues, an essential requirement for reducing excessive
duplication of effort and cost. As a result, fault-injection
experiments can be performed during the early design phase without
developing a new fault injector for each target system.
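
A software-implemented fault-injection experiment of the kind described
above can be sketched as a loop: corrupt a copy of some state, run the
detection mechanism, and count the outcomes. The sketch assumes a
single-bit memory fault and a toy additive checksum as the detector;
neither is taken from DOCTOR, and flip_bit, checksum, and
run_experiments are hypothetical names.

```python
import random
from typing import Tuple


def flip_bit(memory: bytearray, byte_index: int, bit: int) -> None:
    """Inject a single-bit memory fault in place."""
    memory[byte_index] ^= 1 << bit


def checksum(memory: bytes) -> int:
    """Toy error-detection mechanism: 8-bit additive checksum."""
    return sum(memory) & 0xFF


def run_experiments(data: bytes, runs: int, seed: int = 0) -> Tuple[int, int]:
    """Inject one random bit flip per run; return (detected, total)."""
    rng = random.Random(seed)          # seeded so experiments are repeatable
    golden = checksum(data)            # fault-free reference value
    detected = 0
    for _ in range(runs):
        mem = bytearray(data)          # fresh copy for each run
        flip_bit(mem, rng.randrange(len(mem)), rng.randrange(8))
        if checksum(mem) != golden:
            detected += 1
    return detected, runs
```

A single bit flip always changes this checksum, so the detector catches
every injected fault here; the interesting cases in a real experiment
are the fault types and detectors for which that is not true.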
TEMPEST, Testing and Evaluation eMPloying Event/State
Transformations:
To generate the fault sets for fault-injection experiments
automatically, it is necessary to
formally specify the fault models and experiment parameters to be
used. TEMPEST provides a graphical interface that allows an
experimenter to formally specify fault models in terms of the effects
of a fault on the system under test. These specifications are based
on event/state transformations, which describe the effects of a fault
in terms of the ways in which it can be activated, and the erroneous
behaviors that it can cause. Based upon these specifications,
together with specifications of the metrics and statistical goals for
the experiment, TEMPEST automatically generates the fault sets for
fault-injection experiments. These fault sets can then be used by a
run-time fault-injection system, such as DOCTOR, to control the
execution of fault-injection experiments. Because of the generality
of its fault-model specification methodology, TEMPEST can be used as a
front-end for any run-time fault-injection system.
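
The step from a fault-model specification to a fault set can be
sketched as enumerating the combinations a model allows and sampling
from them. This is a simplified illustration, not TEMPEST's
event/state-transformation formalism or its statistical machinery; the
FaultModel class, its fields, and generate_fault_set are hypothetical.

```python
import itertools
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FaultModel:
    """Hypothetical fault model: how a fault is activated and what it causes."""
    name: str
    activations: Tuple[str, ...]   # events that can activate the fault
    behaviors: Tuple[str, ...]     # erroneous behaviors it can cause


def generate_fault_set(models: List[FaultModel], size: int,
                       seed: int = 0) -> List[Tuple[str, str, str]]:
    """Enumerate (fault, activation, behavior) triples and sample `size`."""
    candidates = [
        (m.name, a, b)
        for m in models
        for a, b in itertools.product(m.activations, m.behaviors)
    ]
    rng = random.Random(seed)          # seeded so the fault set is repeatable
    return rng.sample(candidates, min(size, len(candidates)))
```

The resulting list of triples is plain data, so it could be handed to
any run-time injector that understands the fault names, which mirrors
the front-end role described above.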