RTCAST Group Communication Service

RTCAST is a lightweight protocol that is designed to support the real-time process groups programming paradigm. The key primitives common to any process groups service are fault-tolerant multicast and group membership service. In short, the process groups paradigm provides:
o reliable message delivery
o ordered and atomic message delivery
o agreement on group membership
o failure detection and handling.
Moreover, real-time process groups must also ensure message delivery within stated deadlines, bounded-time agreement on membership changes, and timely failure detection and handling. RTCAST provides the following features:
o Support for hard real-time communication
o System consistency in the presence of failures
o Atomic, ordered message delivery semantics
o Support for a flexible event driven system model
o Immediate message delivery without compromising the above
o Separation of timing concerns from consistency concerns
o Efficient integration of membership and communication support
A complete description of the protocol and its implementation may be found in "RTCAST: Lightweight Multicast for Real-Time Process Groups" by Tarek Abdelzaher, Anees Shaikh, Farnam Jahanian, and Kang Shin.

A more complete version of this paper including a performance test of the protocol implementation is available in the RTCAST tech report.

The APIs for the RTCAST protocol and an accompanying service library are available here.

Below we provide a brief overview of the RTCAST service, including the system model and assumptions, overall layering, and software architecture. There is also a diagram of the current demonstration version.


RTCAST System Model


In RTCAST, processors form a logical ring, and each processor has its own unique identifier. It is assumed that a path exists between any two processors in the group and that communication delay is bounded (in the absence of failures). Furthermore, we assume the presence of synchronized clocks.

Processors may suffer performance or crash failures, and messages may suffer performance or omission failures. Permanent link failures are assumed to be handled using spatial redundancy techniques; arbitrary or Byzantine failures are not considered.

In the steady state, RTCAST works as follows:

o Processors on the ring take turns in sending messages
o Each processor sends messages only when it has the logical token (messages need not travel along the ring)
o Upon receiving the token, each processor
    sends multicast data messages, not exceeding the maximum token hold time
    sends a heartbeat, which serves as token handoff to its successor
o Upon receiving a multicast message, each processor
    delivers it to the application in sequence
    crashes if it detects a message omission
o Imposing a maximum token hold time on processors ensures a bounded (maximum) token rotation time
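
The steady-state steps above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the RTCAST implementation: the names `Processor` and `rotate_token` are hypothetical, the maximum token hold time is modeled as a message count rather than a clock interval, a single group-wide sequence number stands in for the protocol's ordering information, and the heartbeat is not modeled (token handoff is implied by the loop moving to the successor).

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    pid: int
    outbox: list = field(default_factory=list)     # payloads waiting to be multicast
    delivered: list = field(default_factory=list)  # payloads handed to the application
    expected_seq: int = 0                          # next sequence number expected
    crashed: bool = False

    def receive(self, seq, payload):
        """Deliver in sequence; on a detected omission, crash."""
        if self.crashed:
            return
        if seq != self.expected_seq:
            # A gap in sequence numbers means a message was missed.
            self.crashed = True
            return
        self.delivered.append(payload)
        self.expected_seq += 1


def rotate_token(ring, max_msgs_per_hold, next_seq=0, dropped=frozenset()):
    """Run one token rotation and return the next unused sequence number.

    dropped is a set of (seq, pid) pairs simulating message omissions
    (an assumption of this sketch, used only for testing).
    """
    for holder in ring:                 # token visits processors in ring order
        if holder.crashed:
            continue
        # The holder sends at most max_msgs_per_hold messages per visit.
        for _ in range(min(max_msgs_per_hold, len(holder.outbox))):
            payload = holder.outbox.pop(0)
            seq = next_seq
            next_seq += 1
            for p in ring:              # multicast goes to every group member
                if (seq, p.pid) not in dropped:
                    p.receive(seq, payload)
        # Leaving this iteration corresponds to handing the token to the successor.
    return next_seq
```

In this sketch one rotation sends at most `len(ring) * max_msgs_per_hold` messages, which is the counting analogue of a bounded token rotation time. A processor that misses a message marks itself crashed when the next message arrives out of sequence, while the remaining processors keep delivering the same messages in the same order.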


RTCAST Middleware Layering


This figure shows a general framework within which RTCAST might be used. These services consist of two major components, a timed atomic multicast, and a group membership service. They are tightly coupled and thus considered a single service, namely RTCAST. Clock synchronization is assumed in the protocol and enforced by the clock synchronization service. To support portability, RTCAST might lie atop a layer exporting an abstraction termed a virtual network interface. Ideally, this interface would provide a mechanism to transparently handle different network topologies each having different connectivity and timing characteristics. In particular it quantifies the underlying network in terms of available bandwidth and maximum path delay between source/destination pairs, hiding its topology and technology. The network is assumed to support unreliable unicast. Finally, the top layer provides functional (API) support for the real-time process group service and interfaces to the lower RTCAST protocol.
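
One way to picture the virtual network interface described above is as an abstract class that exposes only per-pair bandwidth and delay bounds plus an unreliable unicast. The sketch below is purely illustrative: `VirtualNetworkInterface`, `TableNetwork`, `multicast`, and every method name are hypothetical and are not taken from the RTCAST API.

```python
from abc import ABC, abstractmethod


class VirtualNetworkInterface(ABC):
    """Hypothetical interface: exposes per-pair bandwidth and delay, hides topology."""

    @abstractmethod
    def available_bandwidth(self, src, dst):
        """Available bandwidth from src to dst, in bits per second."""

    @abstractmethod
    def max_path_delay(self, src, dst):
        """Upper bound on the src-to-dst path delay, in seconds."""

    @abstractmethod
    def send_unicast(self, src, dst, payload):
        """Unreliable unicast: the message may be lost."""


class TableNetwork(VirtualNetworkInterface):
    """Toy backend that looks up (bandwidth, max delay) per source/destination pair."""

    def __init__(self, links):
        self._links = dict(links)   # (src, dst) -> (bits_per_second, max_delay_seconds)
        self.sent = []              # record of unicasts, for inspection

    def available_bandwidth(self, src, dst):
        return self._links[(src, dst)][0]

    def max_path_delay(self, src, dst):
        return self._links[(src, dst)][1]

    def send_unicast(self, src, dst, payload):
        self.sent.append((src, dst, payload))


def multicast(net, src, group, payload):
    """Build a multicast from unicasts; return the worst-case delay bound in seconds."""
    receivers = [dst for dst in group if dst != src]
    for dst in receivers:
        net.send_unicast(src, dst, payload)
    return max((net.max_path_delay(src, dst) for dst in receivers), default=0.0)
```

A layer written against the abstract class never sees whether the group sits on one Ethernet segment or on several routed networks; it sees only the numbers it needs for timing analysis.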


Software Architecture on OSF MK Operating System


Here we show the software architecture of the current RTCAST implementation. It runs as a user-level server on the Mach MK 7.2 microkernel operating system from the Open Software Foundation Research Institute. Applications using the service submit channel setup requests and send messages via the RTCAST server, which contains the protocols for admission control and schedulability analysis (ACSA) and for multicast.


RTCAST Protocol Demonstration


Our demonstration illustrates several features of the protocol. The processor membership is shown in the main window. Processors participating in RTCAST can be in one of three states:
o running
o crashed
o joining
The slider bar to the right of the ring allows control over the message generation rate at the local node. When real-time scheduling is active, the number of admissions per second should remain relatively constant as the load increases, and the number of messages that miss their deadline should stay very low. In contrast, when scheduling is turned off, the number of messages admitted continues to increase with the load, and the number of missed deadlines also rises.
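
The expected behavior of the two modes can be illustrated with a toy model. It is an assumption-laden sketch and not the ACSA algorithm used by RTCAST: the function `run_second` and the notion of a fixed per-second `capacity` (the number of messages the node can send before their deadlines) are hypothetical simplifications.

```python
def run_second(offered, capacity, scheduling_active):
    """Return (admitted, missed) for one second of generated traffic.

    offered:  messages generated in this second (the slider setting)
    capacity: messages that can be sent before their deadlines (assumed fixed)
    """
    if scheduling_active:
        # Admission control rejects whatever cannot be guaranteed.
        admitted = min(offered, capacity)
    else:
        # Without admission control, everything is accepted.
        admitted = offered
    # Anything admitted beyond the capacity misses its deadline.
    missed = max(0, admitted - capacity)
    return admitted, missed
```

With `capacity = 100`, raising `offered` from 50 to 300 leaves the admitted count flat at 100 with no misses when scheduling is active, while with scheduling off all 300 are admitted and 200 miss their deadline. The real system is expected to show "very low" rather than exactly zero misses, since this model ignores failures and timing jitter.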

The picture of the battleship is constructed from left to right as messages arrive in order at the node. If messages are omitted (or miss their deadline), the picture shows a discontinuity. The communication failure button allows simulation of random message omission.

Our current testbed consists of four Intel P133-based PCs on a private Ethernet LAN. Each machine runs either the OSF MK 7.2 or the CMU RT-Mach operating system. The group communication server is implemented as a user-level protocol stack developed in the x-Kernel environment.

