Parallel Performance Project Research Paper

Iteration Partitioning for Resolving Stride Conflicts on Cache-Coherent Multiprocessors
Karen A. Tomko and Santosh G. Abraham
Proceedings of the International Conference on Parallel Processing, vol. II, pp. 95-102, August 1993.

Abstract

We develop compile-time iteration partitioning techniques for private-cache shared-memory multiprocessors. Our techniques assign loop iterations to a set of processors so that cache coherency traffic due to interprocessor communication is minimized and load balance is maintained. In contrast to most previous research, which has examined uniformly generated dependences, we develop methods for non-uniform dependences that are generated by stride conflicts. Furthermore, we consider the effects of a long cache line size and minimize false coherency traffic. Our methods can handle conflicts between any two integer strides. We have conducted experiments on a 32-processor KSR-1 from Kendall Square Research, which show a 2x performance improvement using our partitioning algorithm over standard contiguous partitioning techniques.
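
To make the notion of a stride conflict concrete, the following minimal sketch models a loop in which iteration i touches array elements s1*i and s2*i, and measures how much interprocessor sharing a standard contiguous partitioning produces. It is not the partitioning algorithm of the paper, only an illustration of the problem the abstract describes. The function names, the loop shape, and the line_elems parameter (cache line size in array elements) are assumptions introduced for this example.

```python
def block_owner(i, n_iters, n_procs):
    """Contiguous partitioning: each processor gets one consecutive chunk."""
    chunk = -(-n_iters // n_procs)  # ceiling division
    return i // chunk


def cross_processor_conflicts(n_iters, n_procs, s1, s2, owner=block_owner):
    """Count iteration pairs (i, j) with s1*i == s2*j on different processors.

    Assumed loop body: iteration i accesses A[s1*i] and A[s2*i], so the
    same element is touched by two different iterations (a stride conflict).
    """
    count = 0
    for i in range(n_iters):
        if (s1 * i) % s2 == 0:
            j = (s1 * i) // s2
            if j < n_iters and owner(i, n_iters, n_procs) != owner(j, n_iters, n_procs):
                count += 1
    return count


def shared_cache_lines(n_iters, n_procs, s1, s2, line_elems, owner=block_owner):
    """Count cache lines touched by more than one processor.

    This includes lines shared only because distinct elements happen to
    fall on the same line (false sharing).
    """
    touched = {}
    for i in range(n_iters):
        p = owner(i, n_iters, n_procs)
        for addr in (s1 * i, s2 * i):
            touched.setdefault(addr // line_elems, set()).add(p)
    return sum(1 for procs in touched.values() if len(procs) > 1)


if __name__ == "__main__":
    # Hypothetical parameters: 12 iterations, 3 processors, strides 2 and 3.
    print(cross_processor_conflicts(12, 3, 2, 3))
    print(shared_cache_lines(12, 3, 2, 3, line_elems=4))
```

Passing a different owner function lets one compare alternative iteration-to-processor assignments against the contiguous baseline using the same two counts.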