Parallel Performance Project Research Paper
A Comparative Study of Cache-Coherent Nonuniform Memory Access Systems
Gheith A. Abandah and Edward S. Davidson
High Performance Computing Systems and Applications,
Kluwer Academic Publishers,
In the 12th Ann. Int'l Symp. on High Performance Computing Systems and
Applications (HPCS'98), pp. 267-282, May 1998.
Abstract
We present a comparative study of three important CC-NUMA
implementations: Stanford DASH, Convex SPP1000, and SGI Origin 2000,
to find strengths and weaknesses of current implementations. Although
the three systems share many similarities, they differ significantly
in ways that translate into large performance differences, e.g., in
the number of processors per node, cache configuration, memory
consistency model, location of memory in the node, and cache-coherence
protocol.
In this study, we evaluate the effects of these differences on cache
misses, miss time, and local and internode traffic.
We first model the three systems according to their original
parameters, and show that they have large performance differences due
to their different component speeds and sizes. We then put the three
systems on the same technological level by assigning them components
of similar size and speed while preserving their differences in
organization and coherence protocol. Although the normalized Origin 2000
has the lowest average remote time, it spends the longest time
satisfying its misses because most of them are remote. DASH's Illinois
protocol and SPP1000's interconnect cache reduce their remote misses.
The SPP1000 has the highest average remote time because its coherence
protocol requires more signals to satisfy a miss than either of the
other two protocols. DASH achieves lower miss time, and its relaxed
memory consistency model hides some of that miss time.
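
The trade-off between remote time and the share of misses that are
remote can be illustrated with a simple weighted average. The sketch
below is not the model used in the paper: the function name
average_miss_time and all numeric values are hypothetical, chosen only
to show how a system with a lower remote time can still have a higher
average miss time when most of its misses are remote.

```python
def average_miss_time(remote_fraction, local_time, remote_time):
    """Average time per miss as a weighted mix of local and remote misses.

    remote_fraction: share of misses satisfied remotely (0.0 to 1.0)
    local_time, remote_time: time per local / remote miss, in any
    consistent unit (for example, processor cycles)
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    local_fraction = 1.0 - remote_fraction
    return local_fraction * local_time + remote_fraction * remote_time


# Hypothetical values for illustration only; not measurements from the paper.
# System A: lower remote time, but most of its misses are remote.
system_a = average_miss_time(remote_fraction=0.75, local_time=100, remote_time=300)
# System B: higher remote time, but few of its misses are remote.
system_b = average_miss_time(remote_fraction=0.25, local_time=100, remote_time=400)

print(system_a, system_b)  # prints "250.0 175.0"
```

With these assumed numbers, system A pays less per remote miss yet
spends more time per miss on average, which is the kind of effect the
abstract reports for the normalized Origin 2000.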