Parallel Performance Project Research Paper

Research Paper

Profile Driven Weighted Decomposition
Karen A. Tomko and Edward S. Davidson
Proceedings of the 1996 ACM International Conference on Supercomputing, May 1996, Philadelphia.

Abstract

Many large-scale computational problems are based on irregular (or unstructured) domains. Some examples are finite element methods in structural analysis, finite volume methods in fluid dynamics, and circuit simulation for VLSI design. Domain decomposition is a common technique for distributing the data and work of irregular scientific applications across a distributed memory parallel machine. To obtain efficiency, the subdomains must be constructed such that the work is divided with a reasonable balance among the processors while the subdomain boundary is kept small. Application and machine specific information can be used in conjunction with domain decomposition to achieve a level of performance not possible with traditional domain decomposition methods. Application profiling characterizes the performance of an application on a specific machine. We present a method that uses curve-fitting of application profile data to calculate vertex and edge weights for use with weighted graph decomposition algorithms. We demonstrate its potential on two routines from a production finite element application running on the IBM SP2. Our method combined with a multilevel partitioning algorithm reduced load imbalance from 52% to less than 10% for one routine in our study.
Back to Publication List, or Parallel Performance Project Home Page