Online Quizzes

The purpose of the online quizzes is to ensure that you have read and understood the papers in advance of class.
The quiz questions are not intended to be difficult or tricky; anyone who has read the paper should know the answers or be able to find them easily. However, the questions are designed so that you cannot easily find the answers within five minutes if you have not read the papers in advance, so read the papers before attempting the quizzes.
PDFs of the readings are available in Canvas.
Unit 1: Parallel Computing Models
L1: Introduction
L2: Message Passing & Shared Memory
(1)   M. D. Hill, S. Adve, L. Ceze, M. J. Irwin, D. Kaeli, M. Martonosi, J. Torrellas, T. F. Wenisch, D. Wood, K. Yelick, 21st Century Computer Architecture, CCC Whitepaper, 2012
(2)   David Wood and Mark Hill, Cost-Effective Parallel Computing, IEEE Computer, 1995
L3: Data-level Parallelism
(3)   Larrabee: A Many-Core x86 Architecture for Visual Computing. Siggraph 2008.
L4: GPUs
(4)   H. Kim, R. Vuduc, S. Baghsorkhi, J. Choi, Wen-mei Hwu, Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU), Ch. 1
(5)   C. Bienia, S. Kumar, J. P. Singh, and K. Li, The PARSEC Benchmark Suite: Characterization and Architectural Implications, PACT 2008
L5: Applications
(6)   P. Ranganathan, K. Gharachorloo, S. V. Adve, and L. A. Barroso, Performance of Database Workloads on Shared-Memory Systems with Out-of-Order Processors, ASPLOS 1998
(7)   M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. Popescu, A. Ailamaki, B. Falsafi, Clearing the Clouds: A Study of Emerging Workloads on Modern Hardware, ASPLOS 2012
Unit 2: Synchronization
L6: Synchronization
(1)   Michael Scott, Shared-Memory Synchronization, Synthesis Lectures on Computer Architecture (Ch. 1, 4.0-4.3.3, 5.0-5.2.5)
(2)   Alain Kagi, Doug Burger, and Jim Goodman. Efficient Synchronization: Let Them Eat QOLB, Proc. 24th International Symposium on Computer Architecture (ISCA 24), June, 1997
L7: Transactional Memory
(3)   Michael Scott, Shared-Memory Synchronization, Synthesis Lectures on Computer Architecture (Ch. 9.0-9.2.3)
(4)   Ravi Rajwar and James R. Goodman. Speculative lock elision: enabling highly concurrent multithreaded execution. In Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, Dec. 2001.
(5)   M. Herlihy, Wait-Free Synchronization, ACM Trans. Program. Lang. Syst. 13(1): 124-149 (1991)
Unit 3: Coherence and Consistency
L8: Snooping Cache Coherence
(1)   Daniel J. Sorin, Mark D. Hill, and David A. Wood, A Primer on Memory Consistency and Cache Coherence (Ch. 6 & 7)
L9: Snoop-based Multiprocessors
(2)   Alan Charlesworth, StarFire: Extending the SMP Envelope, IEEE Micro, Jan. 1998.
L10: Directory-based Coherence
(3)   Chaiken et al., Directory-Based Cache Coherence in Large-Scale Multiprocessors, IEEE Computer, 49-58, June 1990.
(4)   Daniel J. Sorin, Mark D. Hill, and David A. Wood, A Primer on Memory Consistency and Cache Coherence, Ch. 8
L11: Coherence Optimization & COMA
(5)   A. Gupta et al. "Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes". ICPP 1990.
(6)   Erik Hagersten, Anders Landin, and Seif Haridi, DDM--A Cache Only Memory Architecture, IEEE Computer, Sep. 1992.
L12: Memory Consistency
(7)   Daniel J. Sorin, Mark D. Hill, and David A. Wood, A Primer on Memory Consistency and Cache Coherence, Ch. 3-4
L13: Relaxed Memory Consistency
(8)   A. Singh, Satish Narayanasamy, Daniel Marino, Todd Millstein, Madanlal Musuvathi. A Safety-First Approach to Memory Models. IEEE Micro, Top Picks from the 2012 Computer Architecture Conferences
(9)   K. Gharachorloo, D. Lenoski, J. Laudon, P. B. Gibbons, A. Gupta, and J. L. Hennessy, Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors, ISCA 1990
L14: Speculative Consistency
(10)   K. Gharachorloo et al. "Two Techniques to Enhance the Performance of Memory Consistency Models". ICPP 1991.
(11)   C. Blundell, M. M. K. Martin, T. F. Wenisch, InvisiFence: Performance-transparent Memory Ordering in Conventional Multiprocessors, ISCA 2009
L15: Speculative Consistency
(12)   B. Boehm, S. Adve, Foundations of the C++ Concurrency Model, PLDI 2008
L17: DeNovo
(13)   B. Choi et al., DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism, PACT 2011
(14)   B. Hechtman, D. Sorin, Exploring memory consistency for massively-threaded throughput-oriented processors, ISCA 2013
Unit 4: Interconnection Networks
L16: Interconnects: Intro
(1)   D. Hower et al., Heterogeneous-race-free memory models, ASPLOS 2014
(2)   Mukherjee et al. The Alpha 21364 Network Architecture, Hot Interconnects 2001.
L17: Interconnects: Topology
(3)   On-Chip Networks, Synthesis Lecture, Jerger and Peh, Ch. 3
(4)   Kim, Dally, & Abts. Flattened Butterfly: A Cost-Efficient Topology for High-Radix Networks. ISCA 2007.
L18: Interconnects: Routing
(5)   Scott & Thorson. The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus, Hot Interconnects 1996.
(6)   On-Chip Networks, Synthesis Lecture, Jerger and Peh, Ch. 4
L19: Interconnects: Flow Control
(7)   On-Chip Networks, Synthesis Lecture, Jerger and Peh, Ch. 5
L20: Interconnects: Router uArch
(8)   On-Chip Networks, Synthesis Lecture, Jerger and Peh, Ch. 6
(9)   Kim, Dally, Towles, & Gupta. Microarchitecture of a High-Radix Router. ISCA 2005.
Unit 5: Modern & Unconventional Multiprocessors
L22: Multithreading
(1)   D. Tullsen et al. "Simultaneous Multithreading: Maximizing On-Chip Parallelism". ISCA 1995.
(2)   Gurindar S. Sohi, Scott E. Breach, and T. N. Vijaykumar. 1995. Multiscalar processors. In Proceedings of the 22nd annual international symposium on Computer architecture (ISCA 95).