Parallel Performance Project Research Paper
A Hierarchical Approach to Modeling and Improving the Performance of
Scientific Applications on the KSR1
Eric L. Boyd, Waqar Azeem, Hsien-Hsin Lee, Tien-Pao Shih,
Shih-Hao Hung, and Edward S. Davidson
Proceedings of the International Conference on Parallel Processing,
August 1994.
Abstract
We have developed a hierarchical performance bounding methodology that
attempts to explain the performance of loop-dominated scientific applications
on particular systems. The Kendall Square Research KSR1 is used as a running
example. We model the throughput of key hardware units that are common
bottlenecks in concurrent machines. The four units currently modeled are the
memory port, the floating-point unit, the instruction issue unit, and a
loop-carried dependence pseudo-unit. We propose a workload characterization and
derive upper bounds on
the performance of specific machine-workload pairs. Comparing delivered
performance with bounds focuses attention on areas for improvement and
indicates how much improvement might be attainable.
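As a rough illustration of the bounding idea described above, the following Python sketch treats each modeled unit as imposing a minimum number of cycles per loop iteration and takes the largest of these as the bound. It is a minimal sketch under stated assumptions, not the authors' tooling: the class names, field names, operation counts, and machine rates are all hypothetical and are not KSR1 measurements.

```python
from dataclasses import dataclass


@dataclass
class LoopWorkload:
    """Per-iteration characterization of one inner loop (hypothetical counts)."""
    memory_ops: int           # loads plus stores per iteration
    fp_ops: int               # floating-point operations per iteration
    instructions: int         # total instructions issued per iteration
    dependence_cycles: float  # cycles forced by the longest loop-carried dependence


@dataclass
class MachineModel:
    """Assumed peak throughput of each modeled unit, in operations per cycle."""
    memory_ops_per_cycle: float
    fp_ops_per_cycle: float
    instructions_per_cycle: float


def unit_cycles(w: LoopWorkload, m: MachineModel) -> dict:
    """Minimum cycles per iteration demanded by each unit considered alone."""
    return {
        "memory port": w.memory_ops / m.memory_ops_per_cycle,
        "floating-point": w.fp_ops / m.fp_ops_per_cycle,
        "instruction issue": w.instructions / m.instructions_per_cycle,
        "loop-carried dependence": w.dependence_cycles,
    }


def cycle_bound(w: LoopWorkload, m: MachineModel) -> tuple:
    """Return (bottleneck unit, lower bound on cycles per iteration).

    No schedule can finish an iteration faster than its busiest unit allows,
    so the maximum over the units bounds the loop time from below and the
    achievable performance from above.
    """
    cycles = unit_cycles(w, m)
    bottleneck = max(cycles, key=cycles.get)
    return bottleneck, cycles[bottleneck]


# Hypothetical loop and machine, chosen only to make the arithmetic visible.
loop = LoopWorkload(memory_ops=3, fp_ops=2, instructions=8, dependence_cycles=1.0)
machine = MachineModel(memory_ops_per_cycle=1.0, fp_ops_per_cycle=1.0,
                       instructions_per_cycle=2.0)

unit, bound = cycle_bound(loop, machine)   # ("instruction issue", 4.0)
flops_per_cycle_bound = loop.fp_ops / bound  # 0.5: upper bound on performance

measured_cycles = 6.0                      # hypothetical measured cycles per iteration
gap = measured_cycles / bound              # 1.5: measured time is 1.5x the bound
```

In this made-up case the instruction issue unit sets the bound, and the ratio of measured time to bound suggests how much improvement might remain.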
We delineate a comprehensive approach to modeling and improving
application performance on the KSR1. Application of this approach is being
automated for the KSR1 with a series of tools including K-MA and K-MACSTAT
(which enable the calculation of the MACS hierarchy of performance bounds),
K-Trace (which allows parallel code to be instrumented to produce a memory
reference trace), and KPCache (which simulates inter-cache communications based
on a memory reference trace).
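To make the notion of trace-driven simulation of inter-cache communication concrete, the sketch below counts cache-to-cache transfers and invalidations from a small memory reference trace. It does not reproduce K-Trace or KPCache, whose formats and coherence model are not described here; the trace layout `(processor, op, address)`, the function name, and the simplified invalidate-on-write model with unbounded per-processor caches are all assumptions made for illustration.

```python
from collections import defaultdict


def count_intercache_traffic(trace):
    """Count inter-cache events for a memory reference trace.

    trace: iterable of (processor_id, op, address), op being "R" or "W".
    Assumed model: each processor has an unbounded cache; a miss on an
    address held by another cache costs one transfer; a write invalidates
    every other cached copy of that address.
    Returns (transfers, invalidations).
    """
    holders = defaultdict(set)  # address -> processors with a valid copy
    transfers = 0
    invalidations = 0
    for proc, op, addr in trace:
        copies = holders[addr]
        if proc not in copies:
            if copies:
                # Another cache holds the data, so it moves between caches.
                transfers += 1
            copies.add(proc)
        if op == "W":
            # Every other holder loses its copy.
            invalidations += len(copies) - 1
            holders[addr] = {proc}
    return transfers, invalidations


# Hypothetical trace: processor 0 writes a location that processor 1 keeps reading.
example_trace = [
    (0, "W", 0x10),
    (1, "R", 0x10),
    (0, "W", 0x10),
    (1, "R", 0x10),
    (1, "R", 0x20),
]
result = count_intercache_traffic(example_trace)  # (2, 1)
```

Each re-read by processor 1 after a write by processor 0 costs a transfer, which is the kind of communication pattern such a simulation is meant to expose.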