Hardware Communication Support for Distributed Systems

Graduate Students: Stuart Daniel, Ashish Mehra, Jennifer Rexford, Wu-chang Feng, Khawar Zuberi, and Anees Shaikh

Research Fellow: Jaehyun Park

Faculty: Kang G. Shin

Sponsor: NSF

The performance of a distributed system hinges on efficient communication between its nodes. Consequently, our research focuses on efficient communication in point-to-point message-passing systems. By migrating communication functions to special-purpose hardware, we reduce the overhead that communication places on the host while simultaneously increasing communication performance. To explore the costs and benefits of this migration, we are developing tools for rapidly evaluating and prototyping VLSI implementations of point-to-point communication hardware.

One communication function that parallel systems commonly implement in hardware is packet routing. Implementing routing in hardware greatly reduces overhead on the host and makes switching schemes such as wormhole and virtual cut-through more practical. Making the low-level control of routing and switching flexible lets the communication subsystem adapt to network conditions. We are investigating a number of routing and switching schemes, including hybrid switching, in which the switching method for a packet can change at each step of its route in response to the network state. Along with routing and switching, we are also interested in supporting real-time communication in hardware. With this support, we hope to achieve tighter bounds on the maximum delay through a dedicated, but dynamically changing, point-to-point network for real-time traffic.
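As a rough illustration of the hybrid switching idea, the sketch below makes a per-hop decision from local network state. It is a simplified software model, not the project's hardware design: the names `Action`, `HopState`, `choose_switching`, and `plan_route`, the flit-based buffer accounting, and the specific decision rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CUT_THROUGH = "cut_through"  # forward immediately on the free outgoing link
    BUFFER = "buffer"            # store the whole packet here (virtual cut-through fallback)
    BLOCK = "block"              # stall in place, holding upstream links (wormhole behavior)


@dataclass
class HopState:
    """Hypothetical local state seen by a packet header at one node."""
    link_free: bool         # is the selected outgoing link idle?
    free_buffer_flits: int  # free packet-buffer space at this node, in flits


def choose_switching(packet_flits: int, hop: HopState) -> Action:
    """Pick a switching action for one hop (assumed rule, for illustration only)."""
    if hop.link_free:
        # Both wormhole and virtual cut-through forward without buffering here.
        return Action.CUT_THROUGH
    if hop.free_buffer_flits >= packet_flits:
        # Link busy, but the node can absorb the packet and release upstream links.
        return Action.BUFFER
    # Link busy and no room for the whole packet: stall in the network.
    return Action.BLOCK


def plan_route(packet_flits: int, hops: list[HopState]) -> list[Action]:
    """Apply the per-hop decision along a route, one action per hop."""
    return [choose_switching(packet_flits, hop) for hop in hops]


if __name__ == "__main__":
    route = [HopState(True, 0), HopState(False, 16), HopState(False, 4)]
    for step, action in enumerate(plan_route(8, route)):
        print(step, action.value)
```

The point of the sketch is only that the decision is made hop by hop, so a single packet may be cut through at one node, buffered at the next, and stalled at a third, depending on what it finds.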

Our current design work focuses on the Scalable Point-to-point Interface DrivER (SPIDER). The cornerstone of SPIDER is the programmable routing controller (PRC), a custom ASIC that supports a wide variety of routing and switching algorithms on any topology with connectivity of four or less. The PRC retains flexibility through microprogrammable routing engines, as well as generic interfaces to the memory subsystem, the controlling processor, and the network.