Parallel Performance Project Research Paper
Reducing Conflicts in Direct-Mapped Caches with a Temporality-Based Design
Jude A. Rivers and Edward S. Davidson
Proceedings of the 1996 International Conference on Parallel
Processing, Vol. I, pp. 151-162, August 1996.
Abstract
Direct-mapped caches are often plagued by conflict
misses because they lack the associativity to store more
than one memory block in each set. Moreover, some blocks
that have no temporal locality degrade program
execution by displacing blocks that do exhibit
temporal behavior. In this paper, we present a simple
but efficient novel hardware design called the Non-Temporal
Streaming (NTS) Cache that supplements the conventional
direct-mapped cache with a parallel fully
associative buffer. Every cache block loaded into the main
cache is monitored for temporal behavior by a hardware
detection unit. Cache blocks identified as nontemporal are
allocated to the buffer on subsequent requests. Our simulations
show that the NTS Cache not only provides a
performance improvement over the conventional direct-mapped
cache, but can also save on-chip area. For some
numerical programs like FFTPDE, APPSP and APPBT
from the NAS benchmark suite, an integral NTS Cache of
size 9KB (i.e., 8KB direct-mapped cache plus 1KB NT
buffer) performs as well as a 16KB conventional direct-mapped
cache.
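
The allocation policy described above can be illustrated with a small software model. The following is only a rough sketch, not the authors' hardware design: the abstract does not specify how the detection unit decides that a block is nontemporal, so the sketch assumes a block is nontemporal if no byte of it was referenced more than once while it resided in the main cache. The class name NTSCacheSketch, its parameters, the unbounded detection history, and the LRU replacement in the buffer are all hypothetical choices made for illustration. The sketch also never re-classifies a block once it has been flagged.

```python
from collections import OrderedDict


class NTSCacheSketch:
    """Toy model: direct-mapped main cache plus a small fully associative NT buffer."""

    def __init__(self, num_sets=256, block_size=32, buffer_blocks=32):
        self.num_sets = num_sets
        self.block_size = block_size
        self.buffer_blocks = buffer_blocks
        self.main = {}               # set index -> resident block record
        self.buffer = OrderedDict()  # block address -> None, kept in LRU order (assumption)
        self.nontemporal = set()     # blocks flagged by the detector (unbounded here, an assumption)
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Simulate one reference; return True on a hit, False on a miss."""
        block = addr // self.block_size
        offset = addr % self.block_size
        index = block % self.num_sets

        # Hardware would probe both structures in parallel; here we check them in turn.
        entry = self.main.get(index)
        if entry is not None and entry["block"] == block:
            self.hits += 1
            if offset in entry["touched"]:
                entry["reused"] = True  # same byte referenced again: temporal behavior
            entry["touched"].add(offset)
            return True
        if block in self.buffer:
            self.hits += 1
            self.buffer.move_to_end(block)
            return True

        self.misses += 1
        if block in self.nontemporal:
            # Previously flagged block: allocate it in the buffer, not the main cache.
            self.buffer[block] = None
            if len(self.buffer) > self.buffer_blocks:
                self.buffer.popitem(last=False)  # evict the least recently used block
            return False

        # Otherwise allocate in the main cache; classify the victim as it leaves.
        if entry is not None and not entry["reused"]:
            self.nontemporal.add(entry["block"])
        self.main[index] = {"block": block, "touched": {offset}, "reused": False}
        return False
```

In this model, a streaming block that once displaced a reused block is sent to the buffer on its next miss, so the reused block stays resident in the main cache. Real hit-rate figures depend on the actual detection hardware and workloads, which this sketch does not attempt to reproduce.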