Parallel Performance Project Research Paper

Research Paper

Predictability of Load/Store Instruction Latencies
Santosh G. Abraham, Rabin A. Sugumar, Daniel Windheiser, B. R. Rau and Rajiv Gupta
Proceedings of the 26th Annual International Symposium on Microarchitecture, November 93.

Abstract

Due to increasing cache-miss latencies, cache control instructions are being implemented for future systems. We study the memory referencing behavior of individual machine-level instructions using simulations of fully-associative caches under MIN replacement. Our objective is to obtain a deeper understanding of useful program behavior that can be eventually employed at optimizing programs and to motivate architectural features aimed at improving the efficacy of memory hierarchies.

Our simulation results show that a very small number of load/store instructions account for a majority of data cache misses. Specifically, fewer than 10 instructions account for half the misses for six out of nine SPEC89 benchmarks. Selectively prefetching data referenced by a small number of instructions identified through profiling can reduce overall miss ratio significantly while only incurring a small number of unnecessary prefetches.
Back to Publication List, or Parallel Performance Project Home Page