Parallel Performance Project Research Paper
Impact of Load Imbalance on the Design of Software Barriers
A. E. Eichenberger and S. G. Abraham
Proceedings of the 1995 International Conference on Parallel
Processing, Vol. II, pp. 63-72, August 1995.
Abstract
Software barriers have been designed and evaluated for barrier
synchronization in large-scale shared-memory multiprocessors, under
the assumption that all processors reach the synchronization point
simultaneously. When this assumption is relaxed, we demonstrate that
the optimum degree of combining trees is not four, as previously
thought, but increases from four to as much as 128 in a 4K-processor
system as the load imbalance increases. The optimum degree calculated using our
analytic model yields a performance that is within 7% of the optimum
obtained by exhaustive simulation with a range of degrees. We also
investigate a dynamic placement barrier where slow processors migrate
toward the root of the software combining tree. We show that through
dynamic placement the synchronization delay can be reduced by a factor
close to the depth of the tree, when sufficient slack is available.
By choosing a suitable tree degree and using dynamic placement,
software barriers that are scalable to large numbers of processors can
be constructed. We demonstrate the applicability of our results by
performing measurements on a small SOR relaxation program running on a
56-processor KSR1.
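
The trade-off between tree degree and load imbalance can be illustrated with a minimal sketch. It is a toy model and not the analytic model or simulator from the paper: the function names, the serialized-counter assumption, and the unit update cost are all hypothetical choices made for illustration. Each tree node holds a counter, updates to the same counter are serialized, and the processor that completes a node's count moves on to the parent.

```python
def tree_depth(num_procs, degree):
    """Levels needed for a tree of the given degree to cover num_procs leaves."""
    if degree < 2:
        raise ValueError("degree must be at least 2")
    depth, capacity = 0, 1
    while capacity < num_procs:
        capacity *= degree
        depth += 1
    return depth


def sync_delay(arrivals, degree, update_cost=1.0):
    """Delay between the last arrival and completion of the root counter.

    Toy model (an assumption, not the paper's): updates to one counter
    are serialized and each takes update_cost time units.
    """
    if degree < 2:
        raise ValueError("degree must be at least 2")
    level = list(arrivals)
    while len(level) > 1:
        parents = []
        for start in range(0, len(level), degree):
            busy_until = float("-inf")
            for t in sorted(level[start:start + degree]):
                # An update begins once the processor has arrived and the
                # counter is free, then occupies the counter for update_cost.
                busy_until = max(busy_until, t) + update_cost
            parents.append(busy_until)
        level = parents
    return level[0] - max(arrivals)


# Depth for a 4K-processor system at the two degrees named in the abstract.
print(tree_depth(4096, 4), tree_depth(4096, 128))  # prints "6 2"

balanced = [0.0] * 16                    # all processors arrive together
skewed = [10.0 * i for i in range(16)]   # widely spread arrival times
for d in (4, 16):
    print(d, sync_delay(balanced, d), sync_delay(skewed, d))
# prints "4 8.0 2.0" then "16 16.0 1.0"
```

In this sketch, simultaneous arrivals favor the small degree because contention on each counter dominates, while widely spread arrivals favor the large degree because the last processor pays one update per tree level. The same observation is consistent with the dynamic placement result: if the slowest processor sat next to the root, it would perform a single update instead of one per level.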