

Jacob Abraham

Department of Electrical and Computer Engineering The University of Texas at Austin

> VLSI Design Fall 2020

October 29, 2020

Lecture 18. Design for Low Power


#### Power and Energy


Power is drawn from a voltage source attached to the  $V_{DD}$  pin(s) of a chip

**Instantaneous Power:** 

$$P(t) = i_{DD}(t)V_{DD}$$

**Energy:** 

$$E = \int_0^T P(t)dt = \int_0^T i_{DD}(t)V_{DD}dt$$

**Average Power:** 


$$P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t) V_{DD} dt$$

Energy stored in capacitor when it is charged from 0 to  $V_C$ ,

$$E_C = \int_0^\infty I(t)V(t)\,dt = \int_0^\infty C \frac{dV}{dt}V(t)\,dt = C \int_0^{V_C} V\,dV = \frac{1}{2}CV_C^2$$

The capacitor releases this energy when it discharges back to 0.
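As a quick numerical check of the stored-energy result, the sketch below integrates $I(t)V(t)$ for a capacitor charging through a resistor and compares the result with $\frac{1}{2}CV_C^2$. The RC charging waveform, the resistor value, and the 10 fF / 1.2 V numbers are illustrative assumptions, not values from the lecture.

```python
import math

def capacitor_energy_numeric(c, v_dd, r=1e3, steps=20_000, n_tau=20.0):
    """Integrate I(t)*V(t) while C charges through R from 0 V toward v_dd.

    Uses the midpoint rule over n_tau time constants (assumed long enough
    for the capacitor to be fully charged).
    """
    tau = r * c
    dt = n_tau * tau / steps
    energy = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        v = v_dd * (1.0 - math.exp(-t / tau))   # capacitor voltage
        i = (v_dd / r) * math.exp(-t / tau)     # charging current
        energy += i * v * dt
    return energy

C = 10e-15   # 10 fF (illustrative)
VDD = 1.2    # volts (illustrative)

e_num = capacitor_energy_numeric(C, VDD)
e_closed = 0.5 * C * VDD ** 2
print(f"numeric: {e_num:.4e} J, closed form: {e_closed:.4e} J")
```

The result does not depend on the resistor value chosen: the resistance only sets how fast the capacitor charges, not how much energy ends up stored.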



 "Short-circuit" current while both p- and n-MOS networks are partially on

#### Static Dissipation


- Subthreshold leakage (through OFF transistors)
- Gate leakage through gate dielectric
- Junction leakage from source/drain diffusion
- Contention current in ratioed circuits

#### Dynamic Power

- Dynamic power is required to charge and discharge load capacitances when transistors switch
- One cycle involves a rising and falling output
- On rising output, charge  $Q = CV_{DD}$  is required
- On falling output, charge is dumped to GND
- This repeats  $Tf_{sw}$  times over an interval of T
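The bullets above can be combined into an expression for the average dynamic power. A short derivation, assuming each rising transition draws charge $Q = CV_{DD}$ from the supply at voltage $V_{DD}$ and that this happens $Tf_{sw}$ times in the interval $T$:

$$P_{dynamic} = \frac{1}{T}\int_0^T i_{DD}(t)V_{DD}\,dt = \frac{V_{DD}}{T}\left[T f_{sw} C V_{DD}\right] = C V_{DD}^2 f_{sw}$$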



### Activity Factor


- Suppose the system clock frequency is $f$
- Let $f_{sw} = \alpha f$, where $\alpha$ is the activity factor
  - If the signal is a clock, $\alpha = 1$
  - If the signal switches once per cycle, $\alpha = 1/2$
  - Dynamic gates: switch either 0 or 2 times per cycle, $\alpha = 1/2$
  - Static gates: depends on design, but typically $\alpha = 0.1$
- Dynamic power:

$$P_{dynamic} = \alpha C V_{DD}^2 f$$
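A minimal sketch of the dynamic power formula, to show how strongly the activity factor matters. The capacitance, supply voltage, and clock frequency below are hypothetical numbers chosen for illustration.

```python
def dynamic_power(alpha, c_farads, v_dd, f_hz):
    """Average dynamic power in watts: P = alpha * C * V_DD^2 * f."""
    return alpha * c_farads * v_dd ** 2 * f_hz

# Hypothetical block: 1 nF of switched capacitance, 1.2 V supply, 1 GHz clock.
p_static_gates = dynamic_power(alpha=0.1, c_farads=1e-9, v_dd=1.2, f_hz=1e9)
p_clock = dynamic_power(alpha=1.0, c_farads=1e-9, v_dd=1.2, f_hz=1e9)

print(f"alpha = 0.1 (typical static gates): {p_static_gates:.3f} W")
print(f"alpha = 1.0 (clock):                {p_clock:.3f} W")
```

For the same capacitance, a clock net dissipates ten times the power of typical static logic, which is why the clock network is a prime target for capacitance reduction and gating.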


#### **Computing Activity Factors**

 $P_i$ : probability that node *i* is 1  $(1 - P_i \text{ is probability that it is 0})$ Activity factor of node i,  $\alpha_i$ , is the probability that the node is 0 in one cycle and 1 in the next

If probability is uncorrelated from cycle to cycle,  $\alpha_i = \bar{P}_i P_i$ Example: 4-input AND gate
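A small sketch of the 4-input AND case under the uncorrelated-cycle formula. It assumes the four inputs are independent and each is 1 with probability 0.5; those input probabilities are an assumption made here for illustration.

```python
from math import prod

def activity_factor(p_one):
    """alpha = (1 - P) * P for a node whose value is uncorrelated cycle to cycle."""
    return (1.0 - p_one) * p_one

def and_gate_probability(input_probs):
    """Probability that an AND gate output is 1, assuming independent inputs."""
    return prod(input_probs)

p_y = and_gate_probability([0.5, 0.5, 0.5, 0.5])
alpha_y = activity_factor(p_y)

print(f"P(Y = 1) = {p_y}")        # 1/16
print(f"alpha_Y  = {alpha_y}")    # 15/256
```

The output is rarely 1, so its activity factor (about 0.059) is far below the 0.25 of a random input.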



Tools exist to calculate activity factors, either using probabilities, or by monitoring nodes during simulation
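The simulation-based approach can be sketched as counting 0-to-1 transitions on a node, one sample per clock cycle. The function name and the trace below are hypothetical; a real flow would read the samples from a simulator dump.

```python
def activity_from_trace(samples):
    """Fraction of cycle boundaries at which the node goes from 0 to 1."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    rising = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev == 0 and cur == 1
    )
    return rising / (len(samples) - 1)

# Hypothetical per-cycle values of one node.
trace = [0, 1, 1, 0, 0, 1, 0, 0, 1]
alpha = activity_from_trace(trace)
print(f"estimated activity factor: {alpha}")
```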


### Activity Factor Example


Where there is reconvergent fanout, calculating probabilities becomes more difficult
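To illustrate why reconvergent fanout is harder, the sketch below uses a hypothetical circuit $Y = (A \cdot B) + (A \cdot C)$, in which input $A$ fans out to two AND gates whose outputs reconverge at an OR gate. Treating the two AND outputs as independent gives the wrong answer; exhaustive enumeration over the inputs gives the exact one. The circuit and the input probabilities are assumptions, not the example from the slide.

```python
from itertools import product

def naive_probability(p_a, p_b, p_c):
    """Propagate probabilities gate by gate, wrongly assuming X and Z are independent."""
    p_x = p_a * p_b                           # X = A AND B
    p_z = p_a * p_c                           # Z = A AND C
    return 1.0 - (1.0 - p_x) * (1.0 - p_z)    # Y = X OR Z

def exact_probability(p_a, p_b, p_c):
    """Enumerate all input combinations, weighting each by its probability."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        weight = (
            (p_a if a else 1 - p_a)
            * (p_b if b else 1 - p_b)
            * (p_c if c else 1 - p_c)
        )
        y = (a and b) or (a and c)
        total += weight if y else 0.0
    return total

p_naive = naive_probability(0.5, 0.5, 0.5)
p_exact = exact_probability(0.5, 0.5, 0.5)
print(f"naive: {p_naive}, exact: {p_exact}")
```

The naive estimate overstates $P_Y$ because both AND outputs depend on $A$ and are therefore correlated.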




#### Glitches Contribute to Power Consumption





### Short Circuit ("Crowbar") Current


- When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
- Leads to a blip of "short circuit" current.
- Less than 10% of dynamic power if rise/fall times are comparable for input and output



transistors and the input slew




#### Static Power

- Static power is consumed even when chip is quiescent.
  - Ratioed circuits burn power in fight between ON transistors
  - Leakage draws power from nominally OFF devices

$$I_{ds} = I_{ds0} e^{\frac{V_{gs} - V_t}{nv_T}} \left[ 1 - e^{\frac{-V_{ds}}{v_T}} \right]$$
$$V_t = V_{t0} - \eta V_{ds} + \gamma \left( \sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s} \right)$$


 $\eta$  describes drain-induced barrier lowering (DIBL),

 $\gamma$  describes the body effect


For any appreciable  $V_{ds}$ , the term in brackets approaches unity
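The two leakage equations can be evaluated directly. The sketch below is a minimal illustration with hypothetical device parameters (`I_DS0`, `V_T0`, `ETA`, `GAMMA`, `PHI_S`, `N` and the thermal voltage are all assumed values, not data from the lecture). It shows that the bracketed term is already close to 1 at $V_{ds} = 0.1$ V and that DIBL raises leakage as $V_{ds}$ grows.

```python
import math

V_THERMAL = 0.026   # v_T in volts, roughly room temperature (assumed)
I_DS0 = 1e-7        # amps (hypothetical)
V_T0 = 0.3          # volts (hypothetical)
ETA = 0.1           # DIBL coefficient (hypothetical)
GAMMA = 0.2         # body-effect coefficient, V^0.5 (hypothetical)
PHI_S = 0.8         # surface potential, volts (hypothetical)
N = 1.5             # subthreshold slope factor (hypothetical)

def threshold_voltage(v_ds, v_sb):
    """V_t = V_t0 - eta*V_ds + gamma*(sqrt(phi_s + V_sb) - sqrt(phi_s))."""
    return V_T0 - ETA * v_ds + GAMMA * (math.sqrt(PHI_S + v_sb) - math.sqrt(PHI_S))

def bracket_term(v_ds):
    """The factor [1 - exp(-V_ds / v_T)]."""
    return 1.0 - math.exp(-v_ds / V_THERMAL)

def subthreshold_current(v_gs, v_ds, v_sb=0.0):
    """I_ds for a nominally OFF transistor, per the equation above."""
    v_t = threshold_voltage(v_ds, v_sb)
    return I_DS0 * math.exp((v_gs - v_t) / (N * V_THERMAL)) * bracket_term(v_ds)

for v_ds in (0.1, 0.6, 1.2):
    print(f"V_ds = {v_ds:.1f} V: bracket = {bracket_term(v_ds):.4f}, "
          f"I_ds = {subthreshold_current(0.0, v_ds):.3e} A")

# Reverse body bias (V_sb > 0) raises V_t and so reduces leakage.
print(f"with V_sb = 0.5 V: {subthreshold_current(0.0, 1.2, v_sb=0.5):.3e} A")
```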


#### Leakage Example: Estimate Static Power

- Process has two threshold voltages and two oxide thicknesses
- Subthreshold leakage:
  - 20 nA/ $\mu$ m for low  $V_t$
  - 0.02 nA/ $\mu$ m for high  $V_t$
- Gate leakage:
  - 3 nA/ $\mu$ m for thin oxide
  - 0.002 nA/ $\mu$ m for thick oxide
- Memories use low-leakage transistors everywhere, and gates use low-leakage transistors on 80% of logic

High leakage: $(20 \times 10^6)(0.2)(12\lambda)(0.05\,\mu m/\lambda) = 2.4 \times 10^6\,\mu m$

Low leakage: $(20 \times 10^6)(0.8)(12\lambda)(0.05\,\mu m/\lambda) + (180 \times 10^6)(4\lambda)(0.05\,\mu m/\lambda) = 45.6 \times 10^{6}\,\mu m$

$I_{static} = (2.4 \times 10^{6}\,\mu m)[(20\,nA/\mu m)/2 + (3\,nA/\mu m)] + (45.6 \times 10^{6}\,\mu m)[(0.02\,nA/\mu m)/2 + (0.002\,nA/\mu m)] = 32\,mA$

$P_{static} = I_{static} V_{DD} = 38\,mW$

If no low-leakage devices used,  $P_{static} = 749 \ mW$ 
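The arithmetic of this example can be reproduced in a few lines. The supply voltage is not stated in the example; $V_{DD} = 1.2$ V is assumed here because it is the value implied by 38 mW / 32 mA. The function names are hypothetical, and the subthreshold term is halved exactly as in the expression above.

```python
V_DD = 1.2  # volts; ASSUMED (implied by 38 mW / 32 mA), not stated in the example

# Leakage per micrometre of transistor width, in nA/um, from the example.
I_SUB_LOW_VT, I_SUB_HIGH_VT = 20.0, 0.02
I_GATE_THIN, I_GATE_THICK = 3.0, 0.002

def total_width_um(n_transistors, width_lambda, um_per_lambda=0.05):
    """Total transistor width in micrometres."""
    return n_transistors * width_lambda * um_per_lambda

def leakage_amps(width_um, i_sub_na_per_um, i_gate_na_per_um):
    """Static current in amps; subthreshold term halved as in the example."""
    return width_um * (i_sub_na_per_um / 2 + i_gate_na_per_um) * 1e-9

# 20M logic transistors (12 lambda wide, 20% high-leakage), 180M memory (4 lambda).
high_leak_um = total_width_um(20e6 * 0.2, 12)
low_leak_um = total_width_um(20e6 * 0.8, 12) + total_width_um(180e6, 4)

i_static = (leakage_amps(high_leak_um, I_SUB_LOW_VT, I_GATE_THIN)
            + leakage_amps(low_leak_um, I_SUB_HIGH_VT, I_GATE_THICK))
p_static = i_static * V_DD
p_no_low_leak = leakage_amps(high_leak_um + low_leak_um,
                             I_SUB_LOW_VT, I_GATE_THIN) * V_DD

print(f"I_static = {i_static * 1e3:.1f} mA")
print(f"P_static = {p_static * 1e3:.0f} mW")
print(f"P_static without low-leakage devices = {p_no_low_leak * 1e3:.0f} mW")
```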

#### Gloom and Doom Predictions of Increasing Power









#### Low Power Design

- Reduce dynamic power
  - $\alpha$: clock gating, sleep mode
  - $C$: small transistors (especially on clock), short wires
  - $V_{DD}$: lowest suitable voltage
  - $f$: lowest suitable frequency
- Reduce static power
  - Selectively use ratioed circuits
  - Selectively use low-$V_t$ devices
  - Leakage reduction: stacked devices, body bias, low temperature
- Use a combination of techniques at different levels
  - Algorithm
  - Architecture
  - Logic/circuit
  - Technology/circuit


















### Gate Leakage


Affected by voltage across the gate






#### Gate and Subthreshold Leakage in NAND3 (nA)

| Input State (ABC) | I<sub>sub</sub>  | I<sub>gate</sub>  | I<sub>total</sub>  | V<sub>x</sub>  | V<sub>z</sub>  |
|-------------------|------------------|-------------------|--------------------|----------------|----------------|
| 000               | 0.4              | 0                 | 0.4                | stack effect   | stack effect   |
| 001               | 0.7              | 0                 | 0.7                | stack effect   | $V_{DD} - V_t$ |
| 010               | 0                | 1.3               | 1.3                | intermediate   | intermediate   |
| 011               | 3.8              | 0                 | 3.8                | $V_{DD} - V_t$ | $V_{DD} - V_t$ |
| 100               | 0.7              | 6.3               | 7.0                | 0              | stack effect   |
| 101               | 3.8              | 6.3               | 10.1               | 0              | $V_{DD} - V_t$ |
| 110               | 5.6              | 12.6              | 18.2               | 0              | 0              |
| 111               | 28               | 18.9              | 46.9               | 0              | 0              |
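One way to use a table like this is to find the input state with the lowest total leakage, for example as a candidate vector to apply while the gate is idle. The sketch below takes the I<sub>sub</sub> and I<sub>gate</sub> columns from the table, recomputes each total as their sum, and averages over states on the assumption that all eight input states are equally likely. That equal-probability assumption is made here for illustration only.

```python
# (I_sub, I_gate) in nA for each input state ABC, copied from the table.
NAND3_LEAKAGE_NA = {
    "000": (0.4, 0.0),
    "001": (0.7, 0.0),
    "010": (0.0, 1.3),
    "011": (3.8, 0.0),
    "100": (0.7, 6.3),
    "101": (3.8, 6.3),
    "110": (5.6, 12.6),
    "111": (28.0, 18.9),
}

# Total leakage per state, recomputed as I_sub + I_gate.
totals = {state: i_sub + i_gate
          for state, (i_sub, i_gate) in NAND3_LEAKAGE_NA.items()}

best_state = min(totals, key=totals.get)
worst_state = max(totals, key=totals.get)
average_na = sum(totals.values()) / len(totals)   # assumes equiprobable states

print(f"lowest leakage:  {best_state} ({totals[best_state]:.1f} nA)")
print(f"highest leakage: {worst_state} ({totals[worst_state]:.1f} nA)")
print(f"average over states: {average_na:.2f} nA")
```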




#### Controlling Threshold Voltages for Reduced Leakage



- Low- $V_t$  on critical paths, High- $V_t$  on other paths for reduced leakage
- Longer transistors in the caches
- Thicker oxides for I/O transistors









#### RAZOR


- Error-tolerant dynamic voltage scaling (DVS) technique that eliminates the voltage margins required by "always correct" circuit design
- A mismatch between the value in the main flip-flop and the value in the shadow latch indicates a timing error
- Pipeline state is recovered after timing-error detection
- Error detection is done at the circuit level
  - The design overhead is large if timing paths are well balanced in the design
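The shadow-latch comparison can be illustrated with a toy behavioural model. This is only a sketch under stated assumptions, not the actual RAZOR circuit: it assumes the shadow latch samples a fixed delay after the main flip-flop and always sees the correct data, and all names and timing numbers are hypothetical.

```python
def razor_capture(arrival, period, shadow_delay, old_value, new_value):
    """Toy model of one Razor-style register.

    The main flip-flop samples at `period`; the shadow latch samples
    `shadow_delay` later. Data arriving after a sampling instant is missed,
    so that storage element keeps `old_value`.
    Returns (main_value, shadow_value, timing_error).
    """
    main = new_value if arrival <= period else old_value
    shadow = new_value if arrival <= period + shadow_delay else old_value
    return main, shadow, main != shadow

# Data arrives before the clock edge: both elements agree, no error.
on_time = razor_capture(arrival=0.9, period=1.0, shadow_delay=0.3,
                        old_value=0, new_value=1)

# Supply scaled too low, data arrives late: main flip-flop misses it,
# the shadow latch catches it, and the mismatch flags a timing error.
late = razor_capture(arrival=1.1, period=1.0, shadow_delay=0.3,
                     old_value=0, new_value=1)

print("on time:", on_time)
print("late:   ", late)
```

In this model the shadow value is what recovery would restore into the pipeline after an error is flagged.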











#### Simulation Results

- MIPS core implemented in 45nm process
- Optimized to meet target frequency of 1.5GHz
  - Many critical paths
- Power results from HSPICE, PrimeTime and PrimeTime PX





#### Low Power by Design: StrongArm 110

Start with Alpha 21064: 200 MHz @ 3.45V, Power = 26 W

| Technique        | Power reduction | Resulting power |
|------------------|-----------------|-----------------|
| Vdd reduction    | 5.3X            | 4.9 W           |
| Reduce functions | 3X              | 1.6 W           |
| Scale process    | 2X              | 0.8 W           |
| Clock load       | 1.3X            | 0.6 W           |
| Clock rate       | 1.25X           | 0.5 W           |

Source: D. Dobberpuhl
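The chain of reductions multiplies out as follows. This sketch simply divides the 26 W starting point by each factor in turn. The intermediate values are not rounded between steps, which still reproduces the figures quoted above to one decimal place.

```python
# (technique, power reduction factor) in the order listed above.
REDUCTIONS = [
    ("Vdd reduction", 5.3),
    ("Reduce functions", 3.0),
    ("Scale process", 2.0),
    ("Clock load", 1.3),
    ("Clock rate", 1.25),
]

power_w = 26.0   # Alpha 21064 starting point
history = []
for name, factor in REDUCTIONS:
    power_w /= factor
    history.append(round(power_w, 1))
    print(f"{name:17s} {factor:>5}X -> {power_w:.1f} W")

overall = 26.0 / power_w
print(f"overall reduction: about {overall:.0f}X")
```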

#### TransMeta Example

| MHz | Voltage (V) | % Full Power |
|-----|-------------|--------------|
| 700 | 1.65        | 100%         |
| 400 | 1.4         | 41%          |
| 333 | 1.2         | 25%          |

Example: $\frac{400\,MHz}{700\,MHz} \times \frac{(1.4\,V)^2}{(1.65\,V)^2} = 41\%$

- Crusoe processor starts off at 700 MHz
- DVD movie requires b… 400 MHz
- Power is reduced to 2…
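The percentages in the table follow from dynamic power scaling as $V_{DD}^2 f$. A minimal sketch, taking the 700 MHz / 1.65 V point as the 100% reference and ignoring static power (an assumption of this sketch):

```python
def relative_power(f_mhz, v, f_ref_mhz=700.0, v_ref=1.65):
    """Dynamic power relative to the reference point, using P ~ V^2 * f."""
    return (f_mhz / f_ref_mhz) * (v / v_ref) ** 2

for f_mhz, v in [(700, 1.65), (400, 1.4), (333, 1.2)]:
    pct = 100 * relative_power(f_mhz, v)
    print(f"{f_mhz} MHz @ {v} V: {pct:.0f}% of full power")
```

Scaling frequency alone from 700 MHz to 400 MHz would leave 57% of full power. The additional drop to 41% comes from lowering the voltage along with it.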




