Parallel Performance Project Research Paper

Research Paper

Early Design Cycle Timing Simulation of Caches
Edward S. Tam and Edward S. Davidson
Technical Report CSE-TR-317-96, University of Michigan, November, 1996.

Abstract

Cache performance is a key component of a microprocessor's overall performance, as it is the cache that buffers the high speed CPU from the much slower main memory. Behavioral cache simulators indicate the performance of a given cache configuration in terms of the number of hits and misses experienced when running a piece of code. One can attach a leading edge penalty or "effective" penalty estimate to each miss to get a first order idea of run time. However, individual timing penalties are not assessed within these models, nor are the various factors that can affect each memory access' latency. Our Latency Effects (LE) cache model accounts for additional latencies due to trailing edge effects, bus width considerations, the number of outstanding accesses allowable, and port limitations.

A tool implementing the LE cache model has been built on top of a behavioral cache simulator, DineroIII, in the spirit of the Resource Conflict Methodology developed by J-D Wellman. The tool was combined with Wellman's RCM_brisc processor simulator to provide a realistic interaction of the cache with the processor (including the latency masking effects of processor activity) and to assess the accuracy of the model when simulating the execution of actual programs. The combined tool accurately mirrors the effects of changing a cache's configuration for a given processor configuration running a variety of programs. While the reported execution times do not exactly match the total execution times of the same programs running on actual hardware, the tool provides enough useful information to guide processor/cache designers early in the design cycle toward optimal configurations for target applications. This addition of cache modeling to the RCM_brisc instruction-level processor simulator with perfect cache increases simulation time by only 17% (less than 5% over a constant miss penalty cache model), which is reasonable given the added benefits of using the LE cache model.

Back to Publication List, or Parallel Performance Project Home Page