# EECS 427 Lecture 19: Memory core and peripherals

Reading: 12.1-12.3

# Overview

- SRAM and DRAM basics
- Decoders
- Sense amplifiers
- Our focus is mainly on SRAM

# A Typical Memory Hierarchy

## By taking advantage of the principle of locality:

- Present the user with as much memory as is available in the cheapest technology
- Provide access at the speed offered by the fastest technology



# **Read-write Memory Review**

- SRAM
  - Data is stored as long as power is supplied
  - Relatively large cells, 6-transistors, lower density (vs. DRAM)
  - Fast; used close to computation
  - Compatible with CMOS technology
- DRAM
  - Data must be periodically refreshed
  - Small cells, 1 transistor, VERY dense
  - Slower; used in larger main memories
  - Process not compatible with standard CMOS

# 1-Transistor DRAM cell review



Write: C<sub>S</sub> is charged or discharged by asserting WL and BL.

Read: Charge redistribution takes place between the bit line and the storage capacitance.

$$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE}) \frac{C_S}{C_S + C_{BL}}$$

Voltage swing is small; typically around 250 mV.
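As a worked example of the charge-redistribution formula above, the following minimal sketch evaluates ΔV for one set of component values. The capacitances and voltages are hypothetical numbers chosen only for illustration; they do not come from the lecture.

```python
def bitline_swing(v_bit, v_pre, c_s, c_bl):
    """Return delta-V = (V_BIT - V_PRE) * C_S / (C_S + C_BL), in volts."""
    return (v_bit - v_pre) * c_s / (c_s + c_bl)

# Hypothetical values, assumed for illustration only.
C_S = 30e-15    # storage capacitance, farads
C_BL = 300e-15  # bit-line capacitance, farads
V_PRE = 1.25    # bit-line precharge voltage, volts

dv_one = bitline_swing(2.5, V_PRE, C_S, C_BL)   # cell holds a full "1"
dv_zero = bitline_swing(0.0, V_PRE, C_S, C_BL)  # cell holds a "0"
print(f"read 1: {dv_one * 1e3:+.1f} mV, read 0: {dv_zero * 1e3:+.1f} mV")
```

Because C<sub>BL</sub> is much larger than C<sub>S</sub>, the ratio C<sub>S</sub>/(C<sub>S</sub> + C<sub>BL</sub>) is small, which is why the bit-line swing is only a fraction of the stored voltage.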

EECS 427 W07

Lecture 19

# CMOS 6T SRAM cell review





# **1D Memory Architecture**



## 2D (array) Memory Architecture

## Problem: ASPECT RATIO or HEIGHT >> WIDTH



Can put column decode above/before sense amps instead
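The aspect-ratio problem can be made concrete with a small calculation. This sketch assumes a hypothetical memory of 4096 words of 8 bits and compares one word per row (the 1D case) with several words per row (the 2D case); the function name and sizes are illustrative only.

```python
def array_shape(num_words, word_bits, words_per_row=1):
    """Return (rows, columns) of cells when words_per_row words share a row."""
    if num_words % words_per_row:
        raise ValueError("words_per_row must divide num_words")
    rows = num_words // words_per_row
    cols = word_bits * words_per_row
    return rows, cols

# Hypothetical memory: 4096 words of 8 bits.
print(array_shape(4096, 8))      # 1D: one word per row, very tall and narrow
print(array_shape(4096, 8, 16))  # 2D: 16 words per row, closer to square
```

With several words in each row, the column decoder is what picks the requested word out of the selected row.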

Total memory access time consists of many components

- Row decode
- Word line drivers
- Cell driving bit line cap
- Sense amp delay
- Column decode
- Output driving circuitry

(which dominate?)

# **3D Memory Architecture**



#### Advantages:

1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
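One way to picture the block organization is as a split of the address into block, row, and column fields. The sketch below assumes the block field sits in the most significant bits and the column field in the least significant bits; that ordering and the field widths are assumptions for illustration, not something the slides specify.

```python
def split_address(addr, block_bits, row_bits, col_bits):
    """Split addr into (block, row, column), with the block field on top."""
    total = block_bits + row_bits + col_bits
    if not 0 <= addr < (1 << total):
        raise ValueError("address out of range")
    col = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = addr >> (col_bits + row_bits)
    return block, row, col

# Hypothetical 12-bit address: 2 block bits, 7 row bits, 3 column bits.
block, row, col = split_address(0b10_0000011_101, 2, 7, 3)
print(block, row, col)
```

Only the block named by the block field needs to drive its word lines and bit lines, which is where the power saving comes from.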

# Peripheral Components of Memories

- Decoders (both row & column)
- Sense amplifiers

Not discussing:

- Input/output buffers
- Control / timing circuitry

# Row Decoders (M to 2<sup>M</sup>)

Simplest visualization: a collection of 2<sup>M</sup> complex logic gates, organized in a regular and dense fashion.

(N)AND Decoder

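$$WL_0 = \overline{A}_0 \overline{A}_1 \overline{A}_2 \overline{A}_3 \overline{A}_4 \overline{A}_5 \overline{A}_6 \overline{A}_7 \overline{A}_8 \overline{A}_9$$

$$WL_{511} = A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 \overline{A}_9$$

Here $A_0$ is assumed to be the least significant bit, so row 511 has $A_9 = 0$ and all other address bits equal to 1.

### **NOR Decoder**

$$WL_{0} = \overline{A_{0} + A_{1} + A_{2} + A_{3} + A_{4} + A_{5} + A_{6} + A_{7} + A_{8} + A_{9}}$$
$$WL_{511} = \overline{\overline{A}_{0} + \overline{A}_{1} + \overline{A}_{2} + \overline{A}_{3} + \overline{A}_{4} + \overline{A}_{5} + \overline{A}_{6} + \overline{A}_{7} + \overline{A}_{8} + A_{9}}$$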
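The AND and NOR forms describe the same one-hot word-line function; by De Morgan's law, each NOR term uses the opposite polarity of every literal. The behavioral sketch below checks this for small address widths. It assumes `addr_bits[0]` is $A_0$ and that $A_0$ is the least significant bit; the function names are illustrative.

```python
def and_decoder(addr_bits):
    """Word lines from AND terms: WL_i ANDs each address bit or its complement."""
    lines = []
    for i in range(2 ** len(addr_bits)):
        # Bit k of row index i decides whether A_k appears true or complemented.
        literals = [a if (i >> k) & 1 else 1 - a for k, a in enumerate(addr_bits)]
        lines.append(int(all(literals)))
    return lines

def nor_decoder(addr_bits):
    """Same word lines from NOR terms, with every literal's polarity flipped."""
    lines = []
    for i in range(2 ** len(addr_bits)):
        literals = [1 - a if (i >> k) & 1 else a for k, a in enumerate(addr_bits)]
        lines.append(int(not any(literals)))
    return lines

# Address 5 on a 3-bit decoder: A0 = 1, A1 = 0, A2 = 1.
bits = [1, 0, 1]
print(and_decoder(bits))
print(nor_decoder(bits))
```

Both functions raise exactly one word line, the one whose index matches the address.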

# **Hierarchical Decoders**

Multi-stage implementation improves performance
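A behavioral sketch of a two-stage decoder: address bits are first predecoded in pairs, and each word line then combines one predecoded line from every pair, so the final gates need far fewer inputs than a single-stage decoder. The grouping into 2-bit predecoders and the names below are assumptions for illustration; the slides do not fix a particular partitioning.

```python
def predecode(bit_lo, bit_hi):
    """2-to-4 predecoder: one-hot list indexed by 2*bit_hi + bit_lo."""
    sel = 2 * bit_hi + bit_lo
    return [int(i == sel) for i in range(4)]

def two_stage_decoder(addr_bits):
    """Decode an even number of address bits (LSB first) to one-hot word lines."""
    if len(addr_bits) % 2:
        raise ValueError("sketch assumes an even number of address bits")
    # First stage: one predecoder per pair of address bits.
    groups = [predecode(addr_bits[k], addr_bits[k + 1])
              for k in range(0, len(addr_bits), 2)]
    lines = []
    # Second stage: word line i ANDs one predecoded line from every group,
    # chosen by the matching 2-bit field of the row index i.
    for i in range(4 ** len(groups)):
        lines.append(int(all(g[(i >> (2 * n)) & 3] for n, g in enumerate(groups))))
    return lines

# Address 9 on a 4-bit decoder, least significant bit first.
print(two_stage_decoder([1, 0, 0, 1]))
```

For a 10-bit address, this arrangement would use five 2-to-4 predecoders feeding 5-input second-stage gates instead of 10-input gates.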



# Decoder design goals

- Only 1 critical transition
  - One word line must go HIGH quickly; returning LOW afterward is less critical
- Can skew the gate sizing
- Can use dynamic logic with precharge

# **Dynamic Decoders**





## 2-input NOR decoder

## 2-input NAND decoder

NOR is faster but consumes more power, since all but one word line pull down each cycle
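The power difference can be illustrated by counting how many precharged lines discharge per access. This sketch assumes both arrays precharge every line high, that a NOR line discharges through any parallel pull-down whose literal mismatches the address, and that a NAND line discharges only when its whole series stack conducts (an active-low select). These modeling choices are assumptions, not details taken from the slides.

```python
def lines_after_evaluate(addr_bits, style):
    """Return line levels (1 = still precharged, 0 = discharged) after evaluate."""
    levels = []
    for i in range(2 ** len(addr_bits)):
        match = all(((i >> k) & 1) == a for k, a in enumerate(addr_bits))
        if style == "nor":
            discharged = not match  # parallel pull-downs: any mismatch conducts
        elif style == "nand":
            discharged = match      # series stack: conducts only on a full match
        else:
            raise ValueError("style must be 'nor' or 'nand'")
        levels.append(0 if discharged else 1)
    return levels

bits = [1, 0]  # 2-bit address with A0 = 1, A1 = 0
for style in ("nor", "nand"):
    levels = lines_after_evaluate(bits, style)
    print(style, levels, "discharged:", levels.count(0))
```

Under these assumptions, a NOR decoder with M address bits discharges 2<sup>M</sup> - 1 lines per cycle, while the NAND decoder discharges one.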

## 4-input pass-transistor based **column** decoder



Often use PMOS pass transistors instead since BLs precharge high (speed penalty)

Advantages:

- Speed (the decode portion doesn't add to overall memory access time)
- Only one extra transistor in the signal path

Disadvantage: large transistor count




# **Sense Amplifiers**



## Idea: Use Sense Amplifier



# **Differential Sense Amplifier**



# SRAM sense amp design




# Summary

- Memory performance is critical to overall system performance
  - Memory hierarchy develops based on speed and size requirements
- On-chip SRAM very common today
  - 6T SRAM cells have become very compact
  - Complete memory architecture involves arrays + row/column decoders, sense amps, output drivers