# Chip/Package Mechanical Stress Impact on 3-D IC Reliability and Mobility Variations

Moongon Jung, Student Member, IEEE, David Z. Pan, Senior Member, IEEE, and Sung Kyu Lim, Senior Member, IEEE

Abstract-In this paper, we propose a fast and accurate chip/package thermomechanical stress co-analysis tool for through-silicon-via (TSV)-based 3-D ICs. We use our tool for full-stack mechanical reliability as well as stress-aware timing analyses. First, we analyze the stress induced by chip/package interconnect elements, i.e., TSV, $\mu$-bump, and package bump. Second, we explore and validate the principle of lateral and vertical linear superposition of stress tensors (LVLS), considering all chip/package elements. The proposed LVLS method greatly reduces the complexity of stress calculation compared with the conventional finite element analysis method with high enough accuracy for full-chip/package-scale stress simulations and reliability analysis. In addition, we build hole and electron mobility variation maps based on LVLS. Finally, we study the mechanical reliability issues and provide full-stack timing analysis results in practical 3-D chip/package designs including wide-I/O and block-level 3-D ICs.

*Index Terms*—3-D IC, chip/package co-analysis, full-stack timing, mechanical reliability, stress, TSV.

#### I. INTRODUCTION

MOST PREVIOUS works on the thermomechanical stress and reliability of through-silicon-via (TSV)-based 3-D ICs have been done separately in the chip or package domain. The impact of TSV-induced stress due to coefficient of thermal expansion (CTE) mismatch between TSV and substrate materials on device performance [1] and crack growth in TSV [2] were studied in the chip domain. As for the package domain, many works focused on the reliability of package bump (= C4 bump) [3]. Recently, Nakamato *et al.* [4] showed a significant impact of package components on the chip domain stress. They proposed a stress-exchange file to transfer the boundary conditions from package-level to silicon-level analysis.

However, all of these approaches require finite element analysis (FEA) methods, which are computationally expensive or infeasible for full-chip or package analysis. To overcome this limitation of FEA, the linear superposition of stress tensors [5] and the response surface method [6] were utilized. Nonetheless, all of these are limited to chip-domain analysis.

Manuscript received December 21, 2012; revised March 23, 2013; accepted April 23, 2013. Date of current version October 16, 2013. This work was supported in part by the National Science Foundation under Grant CCF-1018216 and Grant CCF-1018750, the Semiconductor Research Corporation under Grant CADTS-2238 and Grant CADTS-2239, an IBM Faculty Award, and Intel Corporation. This paper was recommended by Associate Editor C. C.-N. Chu.

M. Jung and S. K. Lim are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: moongon@gatech.edu; limsk@ece.gatech.edu).

D. Z. Pan is with the ECE Department, University of Texas at Austin, Austin, TX 78712 USA (e-mail: dpan@ece.utexas.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCAD.2013.2265372

The package bumps, underfill, and packaging substrate all add further mechanical stress to the 3-D IC mounted above them in a nontrivial way. To accurately assess thermomechanical reliability problems and device performance variations in 3-D IC/package systems, it is imperative to consider the stress caused by the TSVs and the stress caused by these packaging elements simultaneously, including their interplay. Moreover, to enable chip/package co-design for better reliability and timing under the chip/package stress impact, we need a fast and sufficiently accurate chip/package mechanical stress co-analysis tool.

In this paper, we propose a full-chip/package-scale mechanical stress and reliability co-analysis flow as well as a design optimization methodology to reduce the mechanical reliability problems in TSV-based 3-D ICs. Additionally, we address the mobility and full-stack timing variations caused by the CTE mismatch among the materials in full-chip/package scale. The main contributions of this work include the following.

- 1) Reliability modeling: Compared with existing works, we simulate more detailed 3-D IC structures including both chip and package components and study their interaction and impact on thermomechanical stress and reliability.
- 2) Mobility variation modeling: We study the impact of chip and package stress on hole and electron mobility variations of the devices as well as the impact on full-chip path delay. In addition, we provide a theoretical background on why 2-D stress and 3-D stress models lead to different mobility variations.
- 3) The lateral and vertical linear superposition (LVLS) method: LVLS is our theoretical contribution to handle full-chip/package stress analysis for 3-D IC. We validate the principle of LVLS of stress tensors against FEA simulations. We apply this methodology to obtain stress and reliability maps in full-stack (= chip/package) scale. This LVLS method significantly reduces run time compared with FEA method without losing much accuracy.
- 4) Full-stack timing analysis: We develop a full-stack static timing analysis (STA) flow considering the stress induced by chip/package interconnect elements. We compare this with a 2-D stress model [7] and a 3-D stress model without package components [5].
- 5) Case studies: We study the chip/package reliability issues and full-stack timing variations using practical designs including wide-I/O and block-level 3-D ICs. We demonstrate the effect of high-impact design parameters such as the alignment between TSVs and bumps.

Fig. 1. Impact of bumps and underfill on the stress of device layer (= red line). (a) TSV only [5]. (b) TSV + $\mu$-bump. (c) TSV + package-bump. (d) TSV + $\mu$-bump + package-bump. (e) Deformed structure of (b). (f) Deformed structure of (c). Both (e) and (f) are drawn with a $10\times$ deformation scale factor.

#### II. MOTIVATION

We first examine how various chip/package interconnect components interact and alter the thermomechanical stress distribution on the device layer around TSV caused by the CTE mismatch between TSV and substrate materials. First, we only consider TSV and substrate which most previous works studied. We employ the same simulation structure used in [5] as shown in Fig. 1(a).

Then, we add a  $\mu$ -bump and underfill layer above the substrate as shown in Fig. 1(b). All structures undergo  $\Delta T = -250\,^{\circ}\mathrm{C}$  of thermal load (annealing/reflow 275  $^{\circ}\mathrm{C} \to \mathrm{room}$  temperature 25  $^{\circ}\mathrm{C}$ ). As Fig. 2 shows, by adding the  $\mu$ -bump layer (= dotted red line), we see slightly more tensile (= positive) stress than the TSV-only case (= solid black line). This is because  $\Delta\mathrm{CTE}$  of  $\mu$ -bump and underfill is 24 ppm/K, while that of TSV and substrate is 14.7 ppm/K, hence the deformation of the entire structure is largely determined by the  $\mu$ -bump and underfill layer. Since the top side of  $\mu$ -bump layer is free surface, the entire structure easily bends upward as all the elements shrink from the negative thermal load as shown in Fig. 1(e). Thus, the materials on device layer stretch outward, which results in more tensile stress. side of this  $\mu$ -bump layer would show symmetrical bending behavior.

On the other hand, if we add a package-bump layer below the substrate as shown in Fig. 1(c), the entire structure now bends downward as shown in Fig. 1(f) because the package elements shrink more than the chip elements. The $\Delta$CTE of package bump and underfill is 22 ppm/K. This generates highly compressive (= negative) stress on the device layer. Comparing Fig. 1(b) and (c), we see that the bending direction depends on which layer shrinks more: in both cases, the bump layers shrink more than the silicon substrate.

Fig. 2. Impact of package components on the stress $(\sigma_{rr})$ around TSV on device layer (FEA results).

Fig. 3. Comparison of impact of package-bump on the device layer stress $(\sigma_{rr})$ between 2-D IC and 3-D IC (two-die stack) (FEA results).

Lastly, we include both bump layers as shown in Fig. 1(d). In this case, the  $\Delta$ CTE is almost the same (24 ppm/K on the top, 22 ppm/K on the bottom). However, the overall structure bends down in a similar fashion as shown in Fig. 1(f) because of the sheer volume of package bump layer (= shrinking more than the  $\mu$ -bump layer). This in turn causes compressive stress in the device layer. However, the magnitude is slightly more (= solid green line in Fig. 2) than the package-bump layer only case (= dotted blue line).

One might expect the overall compressive stress to be smaller because the $\mu$-bump layer tries to bend upward while the package-bump layer tries to bend downward (= canceling effect). However, the effect is additive because the $\mu$-bump layer eventually bends down and adds more compressive stress to the device layer. Note that the bending direction of the $\mu$-bump layer is affected by adjacent layers. Since the deformation of the entire structure is now dominated by the package-bump layer, the flexible underfill material in the $\mu$-bump layer easily bends downward. These basic simulations clearly show the importance of considering the impact of package elements on the chip-domain stress distribution.

Fig. 3 shows the stress contributions of package bump and underfill layer to the chips (2-D versus 3-D) mounted on it. For the 3-D IC/package structure, we build a two-die stack chip/package structure similar to Fig. 4(a) excluding TSV and  $\mu$ -bump. This was to examine the impact of package-bump solely. The bottom die (= die0) is thinned, and we examine the device layer of this thin die. One 2-D IC/package structure is also created, where we use a single un-thinned die of  $1000~\mu$ m thickness. We examine the device layer of this un-thinned die.

$$\sigma_v = \sqrt{\frac{(\sigma_{xx} - \sigma_{yy})^2 + (\sigma_{yy} - \sigma_{zz})^2 + (\sigma_{zz} - \sigma_{xx})^2 + 6(\sigma_{xy}^2 + \sigma_{yz}^2 + \sigma_{zx}^2)}{2}}. \tag{1}$$



Fig. 4. Side view of baseline chip/package simulation structures. (a) Two-die stack. (b) Four-die stack.

We observe in Fig. 3 that the 3-D IC experiences more severe compressive stress than the 2-D IC case. The main reason is the thickness and the flexibility of the die that we are monitoring. Even though the entire structure is thicker in the 3-D IC, the thin die (30 $\mu$m thick) and the underfill material above it are much more flexible than the un-thinned substrate in the 2-D IC. Thus, this thin die is highly affected by the package-bump underneath it. This indicates that the impact of package-bump is more significant in 3-D IC.

#### III. MECHANICAL STRESS MODELING

Stress at a point in an object can be defined by the nine-component stress tensor

$$\sigma = \sigma_{ij} = \left[ \begin{array}{ccc} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{array} \right]$$

where the first index i indicates that the stress acts on a plane normal to the i-axis, and the second index j denotes the direction in which the stress acts. If indices i and j are the same, we call this a normal stress; otherwise, it is a shear stress. Since we adopt a cylindrical coordinate system for the cylindrical TSV, $\mu$-bump, and package-bump, indices 1, 2, and 3 represent r, $\theta$, and z, respectively.

We use the von Mises yield criterion [8] as a mechanical reliability metric for TSVs. However, we do not use a specific threshold value for the von Mises criterion in this paper, since it is greatly affected by fabrication process. We compute von Mises stress using (1).
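As a minimal sketch of how (1) can be evaluated, the following Python snippet computes the von Mises stress from a symmetric 3x3 stress tensor. The function name `von_mises` and the sample tensors are illustrative assumptions and are not taken from our tool. Because the von Mises stress is invariant under rotation, the same routine applies to tensors expressed in either Cartesian or cylindrical components.

```python
import numpy as np


def von_mises(stress):
    """Von Mises stress of a symmetric 3x3 stress tensor, following (1)."""
    s = np.asarray(stress, dtype=float)
    # Squared differences of the three normal components.
    normal = (
        (s[0, 0] - s[1, 1]) ** 2
        + (s[1, 1] - s[2, 2]) ** 2
        + (s[2, 2] - s[0, 0]) ** 2
    )
    # Sum of the squared independent shear components.
    shear = s[0, 1] ** 2 + s[1, 2] ** 2 + s[2, 0] ** 2
    return float(np.sqrt((normal + 6.0 * shear) / 2.0))


# Hypothetical test tensors in MPa (not simulation results).
uniaxial = np.diag([100.0, 0.0, 0.0])
hydrostatic = -50.0 * np.eye(3)
pure_shear = np.array([[0.0, 30.0, 0.0], [30.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

for name, tensor in [("uniaxial", uniaxial), ("hydrostatic", hydrostatic), ("pure shear", pure_shear)]:
    print(name, von_mises(tensor))
```

A purely hydrostatic state gives zero von Mises stress, which is why a uniform compressive background alone does not create a reliability hot spot under this metric.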

## A. Chip/Package Co-Simulation Structure

Fig. 4 shows our simulation structure, where the dimensions of our baseline simulation structures are based on published data [4]. In this paper, we specifically examine the stress distribution on the device layer of each die, shown as red lines in Fig. 4. Our baseline TSV diameter, height, landing pad size, Cu diffusion barrier thickness, and dielectric liner thickness are 5 $\mu$m, 30 $\mu$m, 6 $\mu$m, 50 nm, and 125 nm, respectively. We use Ti and SiO<sub>2</sub> as Cu diffusion barrier and liner materials. Also, the diameter/height of the $\mu$-bump and package-bump are 20 $\mu$m and 100 $\mu$m, respectively, unless otherwise specified.

Fig. 5. Impact of die stacking on device layer stress. $\sigma_{rr}$ stress on device layer in each die in four-die stack (FEA results).

Material properties used for our simulations are as follows: CTE (ppm/K)/Young's modulus (GPa) for Cu = (17/110), Si = (2.3/188), SiO<sub>2</sub> = (0.5/71), Ti = (8.6/116), package-bump (SnCu)= (22/44.4),  $\mu$ -bump (Sn<sub>97</sub>Ag<sub>3</sub>) = (20/26.2), underfill = (44/5.6), package substrate (FR-4) = (17.6/19.7).

We use the FEA simulation tool ABAQUS to perform experiments, and all materials are assumed to be linear elastic and isotropic [2], [9]. The entire structure undergoes $\Delta T = -250\,^{\circ}\text{C}$ of thermal load (annealing/reflow 275 $^{\circ}\text{C} \rightarrow$ room temperature 25 $^{\circ}\text{C}$) to represent a fabrication process. In addition, all materials are assumed to be stress free at the annealing/reflow temperature.
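The CTE mismatch values quoted in Section II follow directly from the material properties listed above. The sketch below, with hypothetical names `MATERIALS`, `cte_mismatch`, and `mismatch_strain`, tabulates those properties and computes the mismatch and the corresponding free thermal mismatch strain $\Delta\alpha\,\Delta T$. It is only bookkeeping and does not replace the FEA simulation.

```python
# (CTE in ppm/K, Young's modulus in GPa), as listed in the text.
MATERIALS = {
    "Cu": (17.0, 110.0),
    "Si": (2.3, 188.0),
    "SiO2": (0.5, 71.0),
    "Ti": (8.6, 116.0),
    "package_bump": (22.0, 44.4),
    "micro_bump": (20.0, 26.2),
    "underfill": (44.0, 5.6),
    "package_substrate": (17.6, 19.7),
}

DELTA_T = -250.0  # thermal load in K (275 C down to 25 C)


def cte_mismatch(a, b):
    """Absolute CTE mismatch between two materials in ppm/K."""
    return abs(MATERIALS[a][0] - MATERIALS[b][0])


def mismatch_strain(a, b, delta_t=DELTA_T):
    """Free thermal mismatch strain (dimensionless) for the given thermal load."""
    return cte_mismatch(a, b) * 1e-6 * delta_t


for pair in [("Cu", "Si"), ("underfill", "micro_bump"), ("underfill", "package_bump")]:
    print(pair, round(cte_mismatch(*pair), 3), mismatch_strain(*pair))
```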

## B. Impact of Die Stacking

Previous works on the full-chip thermomechanical stress analysis used the same stress pattern for different dies in a multiple-die stack [1], [5]. In this section, we examine how the thermomechanical stress distribution on the device layer around a TSV differs across dies. We employ a four-die stack structure for this purpose. Also, we use only one TSV,  $\mu$ -bump, and package-bump for each die or layer, respectively, and their center locations are aligned as shown in Fig. 4.

First of all, the stress level, the extent of compression or tension, differs significantly across dies as shown in Fig. 5. The overall stress trend remains similar: the stress is highest at TSV edge and decays then saturates as distance increases from the TSV center. However, the bottom-most die (= die0, solid red line), which is closest to the package-bump layer, shows the most compressive stress among three dies containing TSV. This is because the impact of package-bump is most significant in die0 due to their proximity.

Also, as we go to the upper dies, the stress level becomes closer to the case considering TSV and substrate only. We also see that the stress curve of die0 is very close to the case of TSV + $\mu$-bump + package-bump (= dotted purple line), which does not contain the package substrate and un-thinned top die shown in Fig. 1(d). This also indicates that the stress level in die0 is mostly determined by package-bump. The stress distribution in die3 (un-thinned top die without TSVs) is almost flat ($-110\pm5$ MPa). Since die3 does not contain any TSVs, there is no local von Mises stress peak (= dangerous region) caused by TSVs. Thus, we only consider the dies containing TSVs in this paper.

Fig. 6. Impact of relative position between TSV/$\mu$-bump and package-bump on von Mises stress. (a) Initial position. (b) Final position where TSV/$\mu$-bump are shifted by 300 $\mu$m from package bump center. (c) von Mises stress at TSV edge along the distance between TSV/$\mu$-bump and package-bump (FEA results).

## C. Impact of TSV and Bump Alignment

In this section, we explore the impact of alignment between TSV,  $\mu$ -bump, and package-bump on the mechanical reliability of TSVs. We first examine the impact of relative position between TSV/ $\mu$ -bump and package-bump. We use a two-die stack structure in which center locations of TSV,  $\mu$ -bump, and package-bump are aligned as shown in Fig. 6(a). Then we shift both TSV and  $\mu$ -bump together from the package-bump center with a 25  $\mu$ m step and monitor the von Mises stress at the right edge of TSV.

Fig. 6(c) shows that the von Mises stress is maximum around the package-bump edge region and then decreases and saturates as distance increases. The difference between minimum and maximum is as high as 11.1%. As Fig. 3 shows, the highest stress gradient occurs around the package-bump edge, which results in the highest deformation of the structure near this region. Hence, this higher deformation causes a more severe mechanical reliability problem in the TSV.

We also see the decrease in von Mises stress near the package-bump center. This is because the material around this area is the same (= package-bump material), hence its deformation is relatively smaller than the edge which is the interface between two different materials.

In addition, we examine whether the relative position between $\mu$-bump and TSV/package-bump affects the mechanical reliability of TSV. We fix the location of TSV and package-bump whose centers are aligned, then move $\mu$-bump only with a $5\ \mu$m step up to $30\ \mu$m and monitor the von Mises stress at TSV edges. We observe a similar trend as before. However, the difference between minimum and maximum is only 6.5 MPa (0.8%), which is negligible. Thus, we identify that the relative position between TSV and package-bump is a critical factor that affects the mechanical reliability of TSV.

#### IV. MOBILITY VARIATION MODELING

## A. Need for True 3-D Chip/Package Stress Model

The analytical 2-D radial stress model, known as *Lamé* stress solution, was employed to address the TSV thermomechanical stress. This 2-D plane solution assumes an infinitely long TSV embedded in an infinite silicon substrate and provides stress distribution in silicon substrate region, which can be expressed as follows [10]:

$$\sigma_{rr}^{Si} = -\sigma_{\theta\theta}^{Si} = -\frac{E\Delta\alpha\Delta T}{2} \left(\frac{D_{TSV}}{2r}\right)^{2}$$

$$\sigma_{zz}^{Si} = \sigma_{rz}^{Si} = \sigma_{\theta z}^{Si} = \sigma_{r\theta}^{Si} = 0 \tag{2}$$

where  $\sigma^{Si}$  is stress in silicon substrate, E is Young's modulus,  $\Delta \alpha$  is mismatch in CTE,  $\Delta T$  is differential thermal load, r is the distance from TSV center, and  $D_{TSV}$  is TSV diameter.
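To make the behavior of (2) concrete, the following sketch evaluates the 2-D Lamé solution at a few radii. The function name `lame_stress` is hypothetical. The numerical inputs are assumptions chosen for illustration only: we plug in the Cu modulus (110 GPa) for E, the Cu/Si CTE mismatch (14.7 ppm/K), and the baseline TSV diameter (5 $\mu$m), since the text does not state which modulus is meant in (2).

```python
def lame_stress(r_um, d_tsv_um, e_mpa, delta_alpha, delta_t):
    """Return (sigma_rr, sigma_tt) in MPa in the silicon at radius r, per (2).

    Only valid in the substrate region, i.e. r >= D_TSV / 2.
    """
    if r_um < d_tsv_um / 2.0:
        raise ValueError("(2) describes the silicon region only (r >= D_TSV/2)")
    sigma_rr = -(e_mpa * delta_alpha * delta_t) / 2.0 * (d_tsv_um / (2.0 * r_um)) ** 2
    # The hoop stress has the same magnitude and opposite sign.
    return sigma_rr, -sigma_rr


# Assumed illustrative inputs (see the lead-in above).
E_MPA = 110e3          # Young's modulus in MPa
DELTA_ALPHA = 14.7e-6  # CTE mismatch in 1/K
DELTA_T = -250.0       # thermal load in K
D_TSV = 5.0            # TSV diameter in um

for r in (2.5, 5.0, 10.0, 25.0):
    s_rr, s_tt = lame_stress(r, D_TSV, E_MPA, DELTA_ALPHA, DELTA_T)
    print(f"r = {r:5.1f} um  sigma_rr = {s_rr:8.3f} MPa  sigma_tt = {s_tt:8.3f} MPa")
```

With a negative thermal load the radial stress is tensile and decays as $1/r^2$, so doubling the distance from the TSV center reduces the stress by a factor of four.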

The authors of [7] used this 2-D analytical solution to assess the impact of TSV-induced stress on the mobility variation and full-chip timing. However, [7] considered only the $\sigma_{rr}$ stress term and set the other eight stress tensor elements to zero. When only one normal stress component is considered, we call this uniaxial stress. However, stress is biaxial in nature in an elastic object as (2) indicates: there exist two nonzero normal stress components, i.e., $\sigma_{rr}$ and $\sigma_{\theta\theta}$. Since the mobility variation depends on the piezoresistive effect due to stress, the mobility variation pattern may change depending on the choice of stress mode.

Although this closed-form formula is easy to handle, the 2-D solution applies only to a structure consisting of a TSV and substrate; hence, it is inappropriate for a realistic TSV structure with a Cu diffusion barrier and a dielectric liner. In addition, a large discrepancy in stress magnitude was observed around the TSV edge on the device layer between the 2-D stress model and the 3-D FEA simulations [5]. This is because a 3-D TSV structure cannot be correctly modeled by the 2-D plane solution due to the change in boundary conditions, especially near the top and bottom of the structure. Moreover, packaging elements and die stacking affect the stress distribution on each device layer differently. Therefore, if we consider the 3-D stress tensors, i.e., nine nonzero stress components, as well as packaging elements, the mobility variation pattern can be significantly different from the 2-D stress cases.

## B. Piezoresistivity

TABLE I. Piezoresistive Coefficients ($\mathrm{TPa}^{-1}$) in (100) Si Wafer [12]

| Type      | $\pi_{11}$ | $\pi_{12}$ | $\pi_{44}$ | $\pi'_{11}$ | $\pi'_{12}$ | $\pi'_{44}$ |
|-----------|------------|------------|------------|-------------|-------------|-------------|
| N-type Si | -650       | 330        | -120       | -220        | -100        | -980        |
| P-type Si | -40        | 30         | 970        | 480         | -490        | -70         |

In semiconductors, changes in interatomic spacing resulting from strain affect the bandgaps, making it easier or harder for electrons (depending on the material and strain) to be raised into the conduction band. This results in a change in resistivity of the semiconductor, which can also be translated to a change in mobility as follows [11]:

$$\frac{\Delta R}{R} = -\frac{\Delta \mu}{\mu} = \left[ \pi'_{11} \sigma_{xx} + \pi'_{12} \sigma_{yy} \right] \cos^2 \phi + \left[ \pi'_{12} \sigma_{xx} + \pi'_{11} \sigma_{yy} \right] \sin^2 \phi + \pi_{12} \sigma_{zz} + \pi'_{44} \sigma_{xy} \sin 2\phi \tag{3}$$

where  $\sigma_{ij}$  is the stress in the silicon substrate in Cartesian coordinate system, and  $\phi$  is an angle between the wafer orientation and the transistor channel.

In this paper, we assume the (100) Si wafer with reference axes of [110], [$\bar{1}10$], and [001]. We also assume that the transistor channel direction and the *x*-axis ([110]) are identical. In this setup, $\pi'_{ij}$ is the piezoresistivity coefficient defined along the reference axes of the (100) Si wafer, listed in Table I and related to the crystallographic coefficients by

$$\pi'_{11} = \frac{\pi_{11} + \pi_{12} + \pi_{44}}{2}$$

$$\pi'_{12} = \frac{\pi_{11} + \pi_{12} - \pi_{44}}{2}$$

$$\pi'_{44} = \pi_{11} - \pi_{12}.$$
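The three relations above can be checked against Table I with a few lines of Python. This is only a consistency check of the tabulated values; the function name `rotated_coefficients` is an illustrative assumption.

```python
def rotated_coefficients(pi11, pi12, pi44):
    """Return (pi'_11, pi'_12, pi'_44) from the crystallographic coefficients."""
    pi11_p = (pi11 + pi12 + pi44) / 2
    pi12_p = (pi11 + pi12 - pi44) / 2
    pi44_p = pi11 - pi12
    return pi11_p, pi12_p, pi44_p


# Crystallographic coefficients from Table I, in 1/TPa.
N_TYPE = rotated_coefficients(-650, 330, -120)
P_TYPE = rotated_coefficients(-40, 30, 970)

print("N-type:", N_TYPE)
print("P-type:", P_TYPE)
```

The results reproduce the primed columns of Table I for both carrier types.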

Note that the piezoresistivity coefficients in Table I were obtained under 1.5 GPa biaxial strain [12]. Thus, our mobility analysis results can provide an accurate assessment of stress impact on device performance and full-chip timing variations in deep submicrometer technologies. Many previous works [1], [7] used piezoresistivity coefficients for lightly doped n- and p-type silicon without any strain. From our mobility simulations, the case with piezoresistivity coefficients without strain shows up to 46% more mobility variations than the case with strain. In the latter case, the silicon is already highly stress engineered, hence the impact of TSV stress on the mobility variation reduces.

## C. Mobility Variation: 2-D Versus 3-D Stress

In this section, we examine the impact of different stress cases on the mobility variation around a single TSV. To utilize (3), we first need to convert stress tensors from the cylindrical coordinate system ($S_{r\theta z}$) to the Cartesian coordinate system ($S_{xyz}$)

$$S_{xyz} = \begin{bmatrix} \sigma_{xx} & \sigma_{xy} & \sigma_{xz} \\ \sigma_{yx} & \sigma_{yy} & \sigma_{yz} \\ \sigma_{zx} & \sigma_{zy} & \sigma_{zz} \end{bmatrix}, \quad S_{r\theta z} = \begin{bmatrix} \sigma_{rr} & \sigma_{r\theta} & \sigma_{rz} \\ \sigma_{\theta r} & \sigma_{\theta \theta} & \sigma_{\theta z} \\ \sigma_{zr} & \sigma_{z\theta} & \sigma_{zz} \end{bmatrix}.$$

The transformation matrix Q has the form

$$Q = \begin{bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end{bmatrix}$$

where  $\theta$  is the angle between the x-axis and a line from the origin to the center of a transistor channel. A stress tensor in a cylindrical coordinate system can be converted to a Cartesian coordinate system using conversion matrices:  $S_{xyz} = QS_{r\theta z}Q^{T}$ .
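The conversion $S_{xyz} = QS_{r\theta z}Q^{T}$ can be sketched with NumPy as follows. The function name `cyl_to_cart` and the sample tensor are illustrative assumptions; the tensor is a hypothetical biaxial state with $\sigma_{rr} = -\sigma_{\theta\theta} = 100$ MPa.

```python
import numpy as np


def cyl_to_cart(s_cyl, theta):
    """Rotate a stress tensor from (r, theta, z) to (x, y, z) components."""
    c, s = np.cos(theta), np.sin(theta)
    q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return q @ np.asarray(s_cyl, dtype=float) @ q.T


# Hypothetical biaxial state: sigma_rr = 100 MPa, sigma_tt = -100 MPa.
s_cyl = np.diag([100.0, -100.0, 0.0])

for deg in (0, 45, 90):
    s_xyz = cyl_to_cart(s_cyl, np.deg2rad(deg))
    print(f"theta = {deg:2d} deg  sigma_xx = {s_xyz[0, 0]:7.2f}  "
          f"sigma_yy = {s_xyz[1, 1]:7.2f}  sigma_xy = {s_xyz[0, 1]:7.2f}")
```

At $\theta = 45^{\circ}$ the normal components vanish and the state appears as pure shear in the Cartesian frame, which illustrates why the position of a transistor relative to the TSV matters.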



Fig. 7. Mobility variation map around a single TSV. (a) Hole mobility (2-D biaxial stress). (b) Electron mobility (2-D biaxial stress). (c) Hole mobility in die0 in four-die stack (3-D stress with package components). (d) Electron mobility in die0 in four-die stack (3-D stress with package components). For both (c) and (d) TSV,  $\mu$ -bump, and package-bump are vertically aligned.

Now we examine how different stress cases affect the mobility variation pattern. For each case, we first give the Cartesian stress tensor components that enter (3), converted from the cylindrical coordinate system, and then derive the mobility variation formula. We assume that the x-axis and the transistor channel direction are identical ($\phi = 0$).

1) **2-D uniaxial stress**:  $\sigma_{rr} \neq 0$ , all other stress terms = 0

$$\sigma_{xx} = \sigma_{rr} \cos^2 \theta, \, \sigma_{yy} = \sigma_{rr} \sin^2 \theta, \, \sigma_{zz} = 0$$
$$-\Delta \mu / \mu = \pi'_{11} \sigma_{rr} \cos^2 \theta + \pi'_{12} \sigma_{rr} \sin^2 \theta. \tag{4}$$

2) **2-D biaxial stress**:  $\sigma_{rr} = -\sigma_{\theta\theta} \neq 0$ , all other stress terms = 0

$$\sigma_{xx} = -\sigma_{yy} = \sigma_{rr} \cos 2\theta, \, \sigma_{zz} = 0$$
$$-\Delta \mu / \mu = \pi'_{11} \sigma_{rr} \cos 2\theta - \pi'_{12} \sigma_{rr} \cos 2\theta = \pi_{44} \sigma_{rr} \cos 2\theta. \tag{5}$$

3) **3-D stress**: all stress tensor components  $\neq 0$ 

$$\sigma_{xx} = \sigma_{rr} \cos^2 \theta + \sigma_{\theta\theta} \sin^2 \theta - \sigma_{r\theta} \sin 2\theta$$

$$\sigma_{yy} = \sigma_{rr} \sin^2 \theta + \sigma_{\theta\theta} \cos^2 \theta + \sigma_{r\theta} \sin 2\theta$$

$$\sigma_{zz} \neq 0$$

$$-\Delta \mu/\mu = \pi'_{11} \sigma_{xx} + \pi'_{12} \sigma_{yy} + \pi_{12} \sigma_{zz}. \tag{6}$$

It is clear from the above expressions that the trend of mobility variation is different between these stress cases. Mobility variation maps around a single TSV for the 2-D biaxial stress (2-D biaxial) and the 3-D stress with package components (3-D wPkg) are shown in Fig. 7. We see a significant difference in the electron mobility variation maps, which will be discussed in detail in Section VI.
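To illustrate how (4) and (5) differ, the sketch below evaluates both 2-D cases with the primed coefficients of Table I. The function names and the radial stress of 200 MPa are assumptions for illustration only; a coefficient in TPa$^{-1}$ multiplied by a stress in MPa yields a factor of $10^{-6}$.

```python
import math

# (pi'_11, pi'_12) and pi_44 from Table I, in 1/TPa.
PI_PRIME = {"n": (-220.0, -100.0), "p": (480.0, -490.0)}
PI_44 = {"n": -120.0, "p": 970.0}
TPA_INV_TIMES_MPA = 1e-6  # unit conversion: (1/TPa) * MPa


def mobility_change_uniaxial(carrier, sigma_rr_mpa, theta):
    """Relative mobility change for the 2-D uniaxial case, per (4)."""
    p11, p12 = PI_PRIME[carrier]
    neg_change = (p11 * math.cos(theta) ** 2 + p12 * math.sin(theta) ** 2) * sigma_rr_mpa
    return -neg_change * TPA_INV_TIMES_MPA


def mobility_change_biaxial(carrier, sigma_rr_mpa, theta):
    """Relative mobility change for the 2-D biaxial case, per (5)."""
    return -PI_44[carrier] * sigma_rr_mpa * math.cos(2 * theta) * TPA_INV_TIMES_MPA


SIGMA_RR = 200.0  # assumed tensile radial stress in MPa

for carrier in ("n", "p"):
    for deg in (0, 90):
        t = math.radians(deg)
        print(carrier, deg,
              round(mobility_change_uniaxial(carrier, SIGMA_RR, t), 4),
              round(mobility_change_biaxial(carrier, SIGMA_RR, t), 4))
```

Under these assumptions, the electron mobility change is positive at every angle in the uniaxial case but changes sign with $\theta$ in the biaxial case, and the hole mobility range of the biaxial case is about twice that of the uniaxial case.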

Comparing both 2-D stress cases, we observe that the electron mobility in the 2-D uniaxial stress (2-D uniaxial) improves regardless of angle $\theta$, since both $\pi'_{11}$ and $\pi'_{12}$ are negative for N-type silicon and the $\sigma_{rr}\cos^2\theta$ and $\sigma_{rr}\sin^2\theta$ terms in (4) are nonnegative. On the other hand, the sign of electron mobility variation in the 2-D biaxial case depends on $\theta$, which is shown in Fig. 8(b). We also observe that the 2-D uniaxial case underestimates the hole mobility variation range compared with the 2-D biaxial case. Thus, using the 2-D uniaxial model in [7] may lead to erroneous results.

Fig. 8. Mobility variation range of a single TSV with different stress cases. Mobility variation numbers are collected along the *x*-axis and the *y*-axis from a TSV center on device layers. (a) Hole mobility under 2-D and 3-D stress without package components. (b) Electron mobility under 2-D and 3-D stress without package components. (c) Hole mobility under 3-D stress with package components in four-die stack. (d) Electron mobility under 3-D stress with package components in four-die stack.

As for the 3-D stress without package components case (3-D woPkg) shown in Fig. 8(a) and (b), the hole mobility variation range is larger than the 2-D biaxial case. Also, the electron mobility variation is not symmetric along the x-axis and the y-axis unlike the 2-D biaxial case. This is largely due to the nonzero  $\sigma_{zz}$  term. Note that in cases of 2-D uniaxial, 2-D biaxial, and 3-D woPkg, stress tensors are assumed to be identical across tiers, hence there is no difference in mobility variations in different dies in the 3-D stack.

As we include package components, the electron mobility variation differs across the stack as shown in Fig. 8(d). This is mainly due to the large compressive stress generated by the package-bump. This effect is most significant in die0, which is closest to the package-bump layer shown in Fig. 4. We will discuss more details in Section VI.

#### V. HANDLING FULL-STACK: THE LVLS METHOD

FEA simulation for multiple TSVs, $\mu$-bumps, and package-bumps requires huge computing resources and time; thus, it is not feasible for a full-system-scale analysis. In this section, we present a chip/package thermomechanical stress co-analysis flow in full-chip/package scale. We use the principle of LVLS of stress tensors from individual TSVs, $\mu$-bumps, and package-bumps to enable a full-system-level analysis. This LVLS method provides a fast and accurate view of thermomechanical stress and reliability of the full-chip/package system. Thus, our tool is applicable to a chip/package co-design method to manage the mechanical stress and reliability as well as performance variations in the 3-D system. Before employing our method for full-chip/package-scale analysis, we validate the accuracy of our LVLS method by comparing with FEA simulation results that contain a small number of TSVs and bumps.

## A. Lateral and Vertical Linear Superposition

In [5], authors used the principle of linear superposition of stress tensors to perform a full-chip stress and reliability analysis considering many TSVs. In that case, all stress contributors (= TSVs) are on the same layer, hence we call this lateral linear superposition. However, as we consider the impact of  $\mu$ -bump and package-bump, which are not in the same layer where TSVs are located, this lateral linear superposition cannot be used alone. Fortunately, the principle of linear superposition is not limited to 2-D plane, but applicable to any linearly elastic structures including 3-D structures.

Fig. 9 illustrates our vertical linear superposition method, which enables us to consider the stress induced by elements which are not in the same layer. We first decompose the target structure into four separate structures: TSV only, package-bump only,  $\mu$ -bump only, and background which does not contain TSV and bumps. Next, we obtain stress tensors along the red line on the device layer from aforementioned four separate structures from FEA simulations. Then, we add up the stress tensors from TSV only, package-bump only, and  $\mu$ -bump only structures, and subtract twice the magnitude of the background stress tensors since this background stress is already included in the previous three structures. If the point under consideration is affected by n components, then we need to subtract n-1 times the background stress.

Fig. 10 shows the stress distributions from each structure as well as the stress obtained by the vertical linear superposition. We see that  $\mu$ -bump induces more tensile stress than background and package-bump generates much more compressive stress than background, which is discussed in Section II. We also observe that even without interconnect elements (= background) device layer is in compression due to the shrinking of the underfill material which has the highest CTE (= 44 ppm/K) among all materials in the simulation structure.

Most importantly, our vertical linear superposition method matches well with the target stress distribution. Although we see the maximum error (11 MPa) occurs inside TSV, this is inevitable since we ignore the direct interaction between TSV,  $\mu$ -bump, and package-bump by decomposing the structure. Nonetheless, this error is acceptable for a fast full-system-scale analysis.

To obtain the stress tensor at a point affected by multiple TSVs, $\mu$-bumps, and package-bumps, we apply both lateral and vertical linear superposition (LVLS) as follows:

$$S = \sum_{i=1}^{n_{TSV}} S_{TSV_i} + \sum_{j=1}^{n_{\mu B}} S_{\mu B_j} + \sum_{k=1}^{n_{pkgB}} S_{pkgB_k} - (n_{TSV} + n_{\mu B} + n_{pkgB} - 1) \times S_{bg} \tag{7}$$

where S is the total stress at the point under consideration and $S_{TSV_i}$, $S_{\mu B_j}$, and $S_{pkgB_k}$ are the individual stress tensors at this point due to the *i*th TSV, *j*th $\mu$-bump, and *k*th package-bump, respectively. $S_{bg}$ indicates the background stress at that point.

Fig. 9. Illustration of vertical linear superposition with a two-die stack structure. Stress is extracted along the red line on device layer from each structure using FEA tool.

Fig. 10. Vertical linear superposition of $\sigma_{rr}$ stress in a two-die stack shown in Fig. 9. All stress curves except the vertical superposition are from FEA simulations.
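A minimal sketch of (7) at a single point is given below. It assumes that every per-element tensor was extracted from a structure that already contains the background, as in Fig. 9, and that all tensors are expressed in the same Cartesian frame. The function name `lvls_stress` and all numerical values are hypothetical and are not FEA results.

```python
import numpy as np


def lvls_stress(tsv, micro_bump, pkg_bump, background):
    """Superpose per-element stress tensors at one point, following (7)."""
    bg = np.asarray(background, dtype=float)
    contributions = list(tsv) + list(micro_bump) + list(pkg_bump)
    total = np.zeros_like(bg)
    for s in contributions:
        total += np.asarray(s, dtype=float)
    # Each contribution already includes the background once, so remove
    # the (n - 1) surplus copies. With n = 0 this returns the background.
    return total - (len(contributions) - 1) * bg


# Hypothetical tensors in MPa: background plus an element-specific part.
bg = -100.0 * np.eye(3)
s_tsv = bg + np.diag([50.0, -50.0, 0.0])
s_ubump = bg + np.diag([10.0, 10.0, 0.0])
s_pkg = bg + np.diag([-80.0, -80.0, 0.0])

total = lvls_stress([s_tsv], [s_ubump], [s_pkg], bg)
print(np.diag(total))
```

In this example three elements affect the point, so the background is subtracted twice, which matches the two-die decomposition described for Fig. 9.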

# B. Full-Chip/Package Stress Analysis Flow

In this section, we explain how we perform a full-chip/package stress analysis based on the LVLS method, as shown in Algorithm 1. We first build a stress library from FEA simulations. This library contains stress tensors along an arbitrary radial line on the device layer induced by each interconnect element, i.e., TSV, $\mu$-bump, and package-bump, separately. Given the locations of TSVs, $\mu$-bumps, and package-bumps, we find a stress influence zone for each element. Beyond this zone, the stress induced by the element under consideration is negligible [5]. In this paper, we use five times the diameter of each component as its stress influence zone, a value determined from FEA simulations. Then, we associate each grid point with all the interconnect elements whose stress influence zones cover the point. Next, we apply the LVLS method at the point under consideration to obtain the stress tensor induced by every component found in the association step. In this step, we use coordinate conversion matrices to obtain stress tensors in the Cartesian coordinate system. Finally, we compute the von Mises stress using (1) to assess the mechanical reliability of TSVs, and the mobility maps using (3).
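The per-point steps of this flow (influence-zone test, coordinate conversion, von Mises evaluation) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' C++ implementation: the influence zone is treated as a radius of five diameters around the element center, the conversion matrix is a rotation about the z-axis, and the standard textbook von Mises expression stands in for (1), which is not reproduced in this section. All function names and numbers are hypothetical.

```python
import math


def in_influence_zone(point, center, diameter, factor=5.0):
    """True if `point` lies within factor x diameter of the element center.
    Treating the zone as a radius around the center is an assumption."""
    return math.dist(point, center) <= factor * diameter


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cyl_to_cart(s_cyl, theta):
    """Rotate a tensor from the element's cylindrical frame (r, theta, z)
    to the Cartesian frame (x, y, z): S_cart = Q * S_cyl * Q^T."""
    c, s = math.cos(theta), math.sin(theta)
    q = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    q_t = [list(row) for row in zip(*q)]
    return matmul(matmul(q, s_cyl), q_t)


def von_mises(s):
    """Standard von Mises stress of a symmetric 3x3 stress tensor."""
    normal = ((s[0][0] - s[1][1]) ** 2 + (s[1][1] - s[2][2]) ** 2
              + (s[2][2] - s[0][0]) ** 2)
    shear = s[0][1] ** 2 + s[1][2] ** 2 + s[2][0] ** 2
    return math.sqrt(0.5 * normal + 3.0 * shear)


# Hypothetical example: a grid point 10 um above a 5 um TSV at the origin.
tsv_center, tsv_diameter = (0.0, 0.0), 5.0
point = (0.0, 10.0)
s_cyl = [[100.0, 0.0, 0.0], [0.0, -100.0, 0.0], [0.0, 0.0, 0.0]]  # MPa, made up

if in_influence_zone(point, tsv_center, tsv_diameter):
    theta = math.atan2(point[1] - tsv_center[1], point[0] - tsv_center[0])
    s_cart = cyl_to_cart(s_cyl, theta)
    print(von_mises(s_cart))
```

For a point directly above the element center, the radial direction coincides with the y-axis, so the radial stress component ends up in $\sigma_{yy}$ after conversion. The von Mises value is unchanged by the rotation, which is a useful sanity check on the conversion matrix.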

#### C. Validation of LVLS

In this section, we validate our LVLS method against FEA simulations by varying the number of TSVs,  $\mu$ -bumps, and

```
Algorithm 1: Full-Chip/Package Stress and Reliability Analysis Flow (LVLS)
 input: TSV list T, pkg-bump list P, \mu-bump list M, stress library
 output: stress map, von Mises stress map, carrier mobility map
 for each TSV t, pkg-bump p, and \mu-bump m in T, P, and M do
      (it, ip, im) \leftarrow FindStressInfluenceZone(t, p, m);
      for each point it', ip', and im' in it, ip, and im do
           it'.TSV \leftarrow it'.TSV \cup \{t\};
           ip'.pkg-bump \leftarrow ip'.pkg-bump \cup \{p\};
           im'.\mu-bump \leftarrow im'.\mu-bump \cup \{m\};
      end
 end
 for each simulation point r do
      r.S_{Cart} \leftarrow 0;
      if r.TSV \neq \emptyset or r.pkg-bump \neq \emptyset or r.\mu-bump \neq \emptyset then
           for each (t, p, m) \in (r.TSV, r.pkg-bump, r.\mu-bump) do
                (dt, dp, dm) \leftarrow distance(t, p, m, r);
                S_{cyl}(t, p, m) \leftarrow GetStressTensor(dt, dp, dm);
                S_{cyl}(t, p, m) \leftarrow S_{cyl}(t, p, m) - BGstress;
                \theta(t, p, m) \leftarrow GetAngle(line tr, pr, mr, x-axis);
                Q(t, p, m) \leftarrow SetConversionMatrix(\theta_t, \theta_p, \theta_m);
                S_{Cart}(t, p, m) \leftarrow Q(t, p, m) S_{cyl}(t, p, m) Q(t, p, m)^T;
                r.S_{Cart} \leftarrow r.S_{Cart} + S_{Cart}(t, p, m);
           end
      end
      r.S_{Cart} \leftarrow r.S_{Cart} + BGstress;
      vonMises(r) \leftarrow ComputeVonMises(r.S_{Cart});
      mobility(r) \leftarrow ComputeMobility(r.S_{Cart});
 end
```

package-bumps as well as their arrangement. We set the minimum pitch of TSV, $\mu$-bump, and package-bump to 10, 20, and 200 $\mu$m, respectively, for all the test cases. Stress tensors along the radial line on the device layer induced by each interconnect element (the stress tensor library) are obtained through FEA simulations at 0.25 $\mu$m intervals. In our linear superposition method, the simulation area is divided into a uniform array-style grid with 0.1 $\mu$m pitch. If the stress tensor at the grid point under consideration is not directly available from the stress library, we compute it by linear interpolation between adjacent stress tensors in the library.
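The library lookup with linear interpolation can be sketched as below. The function name `lookup_stress`, the clamping behavior beyond the last sample, and the sample values are assumptions made for illustration; only the 0.25 $\mu$m sampling interval comes from the text. A single stress component is shown, and each tensor component would be handled the same way.

```python
def lookup_stress(library, distance, interval=0.25):
    """Linearly interpolate one stress component at `distance` (um) from an
    element center, given samples at 0, interval, 2*interval, ...

    Distances beyond the last sample return the last sample (an assumption).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    pos = distance / interval
    lower = int(pos)
    if lower >= len(library) - 1:
        return library[-1]
    frac = pos - lower
    return library[lower] + frac * (library[lower + 1] - library[lower])


# Made-up radial samples (MPa) at 0, 0.25, 0.5, and 0.75 um.
sigma_rr_library = [-200.0, -180.0, -150.0, -130.0]

on_sample = lookup_stress(sigma_rr_library, 0.5)   # falls on a library sample
between = lookup_stress(sigma_rr_library, 0.3)     # between 0.25 and 0.5 um
```

Because the grid pitch (0.1 $\mu$m) is finer than the library interval, most grid points fall between library samples and take the interpolated path.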

Table II shows some of our comparisons for die0 in a four-die stack, which shows the largest errors among the three dies containing TSVs due to its proximity to package-bumps. Also, we only list the cases with the minimum pitch for each component, which again show the maximum errors. First, we observe a huge run time reduction with our LVLS method. Note



Fig. 11. Sample stress comparison between FEA and LVLS. (a) Test structure. (b) Close-up shot of von Mises stress map (using LVLS) taken from the red box in (a) on the device layer in die0 in a four-die stack. (c) FEA versus LVLS along the red line in (b).

TABLE II VON MISES STRESS COMPARISON BETWEEN FEA AND LVLS FOR A FOUR-DIE STACK STRUCTURE (DIE0)

| No. of TSV/μ-B/pkg-B | FEA no. of nodes | FEA run time | LVLS no. of grid points | LVLS run time | Max error inside TSV (MPa) | Max error at TSV edge (MPa) | Max error outside TSV (MPa) |
|----------------------|------------------|--------------|-------------------------|---------------|----------------------------|-----------------------------|-----------------------------|
| 1/1/1                | 754K             | 1d2h         | 1M                      | 23s           | -11.4                      | -12.6                       | 7.9                         |
| 2/2/1                | 812K             | 1d2h         | 1M                      | 26s           | -12.7                      | -13.2                       | 7.3                         |
| 5/5/2                | 902K             | 1d6h         | 6M                      | 2m43s         | -14.1                      | -15.3                       | 8.2                         |
| 10/10/4              | 1.3M             | 1d20h        | 9M                      | 6m44s         | -23.1                      | -19.8                       | 9.4                         |
| 10/10/9              | 1.4M             | 2d0h         | 16.8M                   | 11m11s        | -22.5                      | -20.5                       | 11.9                        |

Error = LVLS - FEA. At the TSV edge, the typical von Mises stress level is around 900 MPa.

that we perform FEA simulations using eight CPUs, while only one CPU is used for our linear superposition method. Although the LVLS method performs stress analysis on a 2-D plane (= device layer), whereas the FEA simulation is performed on the entire 3-D structure, we can analyze other planes in a similar way if needed.

Moreover, the error between FEA simulations and LVLS is small. The results show that our LVLS method underestimates the stress magnitude inside the TSV and at the TSV edge, and overestimates it outside the TSV, as shown in Fig. 10. In general, the most critical region for mechanical reliability is the interface between different materials; hence, the TSV edge is the most important in our case. Even though the maximum error at the TSV edge is as high as $-20.5\,\mathrm{MPa}$, the relative error is only -2.24%. Fig. 11 compares the von Mises stress from FEA and LVLS for one test case. The structure has 10 TSVs (5 $\mu$m diameter and 10 $\mu$m pitch), 10 $\mu$-bumps (20 $\mu$m diameter and 40 $\mu$m pitch), and nine package-bumps (100 $\mu$m diameter and 200 $\mu$m pitch). The LVLS result matches the FEA simulation result well.
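As a quick consistency check on the percentage quoted above, and assuming it is taken relative to the FEA value at the same location (the definition is not spelled out here), the two numbers imply an FEA stress close to the typical TSV-edge level noted under Table II:

$$
\frac{\sigma_{\mathrm{LVLS}} - \sigma_{\mathrm{FEA}}}{\sigma_{\mathrm{FEA}}} = -2.24\% \;\Rightarrow\; \sigma_{\mathrm{FEA}} \approx \frac{-20.5\ \mathrm{MPa}}{-0.0224} \approx 915\ \mathrm{MPa}
$$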

#### VI. FULL-STACK TIMING VARIATION ANALYSIS

## A. Full-Stack Device Mobility Variation

From FEA simulations, a highly compressive stress is observed on device layers due to package-bumps, which is induced by the CTE mismatch between package-bumps and underfill. As Fig. 12 shows, die0 (= closest to package-bump layer) experiences the most compressive stress due to its



Fig. 12. Normal stress components induced by package-bump on device layers (FEA results). (a) Stress in die0 along the *x*-direction. (b) Stress in die1 and die2 along the *x*-direction. (c) Stress in die0 along the *y*-direction. (d) Stress in die1 and die2 along the *y*-direction.

proximity. The stress becomes less compressive as we go to upper dies. The stress distribution ( $\sigma_{xx}$  and  $\sigma_{yy}$ ) in die3 (unthinned top die) is almost flat ( $-110\pm 5$  MPa), since die3 does not contain any TSVs, which is discussed in Section III-B. Thus, we only compute the stress in the dies containing TSVs.

In (6), the electron mobility variation is approximately proportional to the sum of  $\sigma_{xx}$  and  $\sigma_{yy}$  due to the same sign (= negative) of  $\pi'_{11}$  and  $\pi'_{12}$ , while the hole mobility variation is roughly proportional to the difference between  $\sigma_{xx}$  and  $\sigma_{yy}$  due to the opposite sign of  $\pi'_{11}$  and  $\pi'_{12}$ . Fig. 12 shows the stress distribution on device layers induced by package-bump only. Although there is a noticeable difference between  $\sigma_{xx}$  and  $\sigma_{yy}$  near the package-bump edge in die0, their difference is almost negligible in other regions. Thus, this package-bump induced stress will not alter the hole mobility variation significantly except near the package-bump edge area. On the other hand, the electron mobility will be degraded under the influence of the package-bump since both  $\sigma_{xx}$  and  $\sigma_{yy}$  are compressive (= negative), which is shown in Fig. 8(d). Furthermore, the level of electron mobility degradation is most severe in die0.
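The sign argument above can be made concrete with a small Python sketch. Equation (6) is not reproduced in this section, so the form $-(\pi'_{11}\sigma_{xx} + \pi'_{12}\sigma_{yy})$ used below is an assumption inferred from the text, and the coefficients are commonly quoted bulk-silicon values rotated to a <110> channel on (100) silicon; the values actually used in the paper may differ.

```python
# Piezoresistive coefficients (pi'_11, pi'_12) in 1/Pa.
# ASSUMPTION: illustrative bulk-silicon values, not necessarily those of (6).
PI_ELECTRON = (-31.2e-11, -17.6e-11)   # same sign (negative)
PI_HOLE = (71.8e-11, -66.3e-11)        # opposite signs


def mobility_variation(sigma_xx, sigma_yy, coeffs):
    """Relative mobility change for in-plane normal stresses given in Pa.

    Assumed form: -(pi'_11 * sigma_xx + pi'_12 * sigma_yy).
    """
    pi11, pi12 = coeffs
    return -(pi11 * sigma_xx + pi12 * sigma_yy)


# Equal compressive stress in x and y (made-up level of -110 MPa).
electron_biaxial = mobility_variation(-110e6, -110e6, PI_ELECTRON)
hole_biaxial = mobility_variation(-110e6, -110e6, PI_HOLE)

# sigma_xx higher (less compressive) than sigma_yy, as between
# package-bumps along the x-direction (made-up values).
hole_x_dir = mobility_variation(-50e6, -150e6, PI_HOLE)
```

Under these assumptions, equal compressive $\sigma_{xx}$ and $\sigma_{yy}$ degrade the electron mobility while nearly cancelling for holes, and the hole mobility responds mainly to the difference between the two components.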



Fig. 13. Mobility variation map with 441 TSVs/ $\mu$ -bumps (black dots) and nine C4 bumps (white circles) (LVLS results). (a) Hole mobility variation map in die0. (b) Hole mobility variation map in die2. (c) Electron mobility variation map in die2.

Fig. 13 shows hole and electron mobility variation maps in a four-die stack with 441 TSVs/$\mu$-bumps at 20 $\mu$m pitch and nine package-bumps at 200 $\mu$m pitch. The variation ranges of both hole and electron mobility are largest in die0 due to the direct impact of package-bump-induced stress. In particular, the hole mobility degrades in between package-bumps along the x-direction and improves along the y-direction. This is because of the difference between the $\sigma_{xx}$ and $\sigma_{yy}$ stress components near the package-bump edge shown in Fig. 12(a): along the x-direction $\sigma_{xx}$ is higher than $\sigma_{yy}$, while along the y-direction $\sigma_{yy}$ is higher than $\sigma_{xx}$. The electron mobility in die0 degrades in most cases, and the worst spot is inside the package-bump area, since the most compressive stress occurs in this region, as shown in Fig. 12.

In addition, Fig. 13 shows that the stress induced by package-bumps affects the mobility variation of a large number of cells, while TSVs generate the mobility variation pattern only for the cells nearby these TSVs. We also observe that as we go to upper dies, mobility variations due to package-bumps are almost negligible, hence the mobility variation pattern is mostly determined by TSVs.

# B. Chip/Package Stress-Aware Timing Analysis

In this section, we present our stress-aware STA flow. First, we build a Verilog netlist and a parasitic extraction file (SPEF) for each die from the 3-D IC layouts. The cell name of each instance in the netlists is replaced by one that encodes the corresponding hole and electron mobility variations obtained from our stress and mobility analysis. For example, INV\_X1 with +4% hole mobility and -8% electron mobility variation becomes INV\_X1\_Hp4\_Em8. Then, we create a top-level Verilog netlist that instantiates each die design and connects the 3-D nets using TSVs. We also create a top-level SPEF file that contains the parasitic models of the TSVs. Lastly, we run Synopsys PrimeTime to perform 3-D STA.
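The renaming convention in the example above can be sketched as a small Python helper. The function name `stressed_cell_name` is hypothetical, and the handling of a 0% variation (encoded here as `p0`) is an assumption, since the text only gives one example.

```python
def stressed_cell_name(base_cell, hole_pct, electron_pct):
    """Encode integer mobility variations (in percent) into a cell name.

    'p' marks a positive variation and 'm' a negative one, following the
    INV_X1_Hp4_Em8 example; zero is encoded as 'p0' (an assumption).
    """
    def tag(value):
        sign = "m" if value < 0 else "p"
        return f"{sign}{abs(value)}"

    return f"{base_cell}_H{tag(hole_pct)}_E{tag(electron_pct)}"


name = stressed_cell_name("INV_X1", 4, -8)
```

A netlist rewriting script would apply such a helper to every instance, using the mobility values looked up at the instance location and rounded to the characterization step.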



Fig. 14. Mobility variation impact on cell FO4 delay. (a) Rise delay dependency on hole mobility variation (INV\_X1). (b) Fall delay dependency on electron mobility variation (INV\_X1). (c) Rise delay dependency on hole mobility variation (NAND\_X1). (d) Fall delay dependency on electron mobility variation (NAND\_X1).

For this stress-aware STA, we build a timing library to capture the mobility variation impact on cell delay. We first obtain both hole and electron mobility variation range affected by multiple TSVs,  $\mu$ -bumps, and package-bumps. Since this range is different across the stack and also affected by the alignment and the pitch of TSVs,  $\mu$ -bumps, and package-bumps, we generate several test cases by varying these knobs. Fig. 13 is one of the test cases. We find that the hole mobility varies from -52% to 52% and the electron mobility ranges from -16% to 8% without any TSV keep-out-zone (KOZ), where devices cannot be placed. Actual mobility variation range is reduced by introducing KOZ. We characterize cell timing with the mobility variation using Cadence Encounter Library Characterizer with 2% mobility step size.

Fig. 14 shows the FO4 delay of INV\_X1 and NAND\_X1 gates with mobility variations. We see that the delay variation range is similar for both gates for the given mobility variations. Note that the rise delay is not affected by the electron mobility variation, and the fall delay is not much influenced by the hole mobility change. Thus, we can fix $\Delta\mu_e/\mu_e$ when we sweep $\Delta\mu_h/\mu_h$, and vice versa. This reduces the number of library characterizations: instead of characterizing 689 (= 53×13 with 2% step size) libraries, we need to prepare only 66 (= 53+13) libraries [7].
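The library counts follow directly from the mobility ranges and the 2% step size, as this short Python check shows (variable names are illustrative):

```python
# Hole mobility variation: -52% to 52% in 2% steps.
hole_steps = list(range(-52, 53, 2))
# Electron mobility variation: -16% to 8% in 2% steps.
electron_steps = list(range(-16, 9, 2))

# Characterizing every (hole, electron) combination.
full_grid = len(hole_steps) * len(electron_steps)
# Sweeping one carrier at a time while the other is held fixed.
separate = len(hole_steps) + len(electron_steps)

print(len(hole_steps), len(electron_steps), full_grid, separate)
```

The separate sweep is valid only because rise and fall delays depend almost exclusively on hole and electron mobility, respectively, as observed in Fig. 14.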

# VII. FULL-STACK RELIABILITY ANALYSIS RESULTS

We implement a chip/package thermomechanical stress and reliability co-analysis flow based on LVLS in C++/STL. We explore the impact of package-bump and  $\mu$ -bump on the reliability in full-system scale. We also examine the reliability concerns in wide-I/O DRAM and block-level 3-D IC designs.

In our simulations, we adopt a regular TSV placement style in which TSVs are placed uniformly across each die or inside TSV blocks with pre-defined pitch. In all cases, the pair of



Fig. 15. von Mises stress map for TSVs (die0 in a four-die stack). Colored dots are TSVs and white circles are package-bumps (LVLS results). (a) Test structure. (b) Close-up shot of red box in (a).



Fig. 16. Impact of package components and die stacking on the mechanical reliability of TSVs. 900 TSVs are placed in each die. (LVLS results).

TSV and  $\mu$ -bump is vertically aligned. Default diameter/height ( $\mu$ m) of TSV,  $\mu$ -bump, and package-bump are 5/30, 10/10, and 100/100, respectively, unless otherwise specified.

## A. Impact of Package-Bump and $\mu$ -Bump

We first study the impact of package-bump and  $\mu$ -bump on the mechanical reliability of different dies in a four-die stack. We also compare this to the case without these components as in the previous work [5] as shown in Fig. 1(a). In this experiment, the pitch of TSV/ $\mu$ -bump and package-bump are  $20\,\mu\text{m}$  and  $200\,\mu\text{m}$ , respectively; the total number of TSV/ $\mu$ -bump and package-bump are 900 and 16, respectively, as shown in Fig. 15(a).

We first observe that unlike the die without package-bumps and  $\mu$ -bumps [Fig. 16(a)] and the upper dies with package components [Fig. 16(c) and (d)], TSVs in die0 [Fig. 16(b)] experience large variations of von Mises stress across the die. This is because die0 is highly affected by package-bumps underneath it, and hence depending on the relative position between TSVs in die0 and package-bumps the von Mises stresses of TSVs change noticeably.

We also identify that higher von Mises stress occurs around package-bump edge and in between package-bumps due to



Fig. 17. Mechanical reliability in wide I/O DRAM. 1024 TSVs are placed in the middle of a chip. (a) Package-bumps are placed underneath TSV arrays. (b) Package-bumps are placed 200  $\mu$ m apart from TSV arrays (not drawn to scale)

TABLE III RELIABILITY IN WIDE-I/O DRAM

| Case | 780–810 | 810–840 | 840–870 | 870–900 | 900–930 | Median (MPa) |
|------|---------|---------|---------|---------|---------|--------------|
| (a)  | 30      | 114     | 52      | 220     | 608     | 944.8        |
| (b)  | 182     | 842     | 0       | 0       | 0       | 856.2        |

Columns 780–810 through 900–930 give the number of TSVs in each von Mises stress range (MPa).

constructive stress interference shown in Fig. 15(b). However, as we see in the center of Fig. 15(b), if the distance between TSV and package-bumps is long enough, the von Mises stress of TSV becomes low.

Interestingly, die1 shows the lowest von Mises stress level among all cases even though die2 is farthest from package-bumps. This is because die2 is affected by the rigid unthinned top silicon substrate above it. Since die0 is the most problematic in terms of mechanical reliability, we only consider die0 in a four-die stack in the subsequent simulations.

# B. Case Study I: Wide-I/O DRAM

Wide-I/O based 3-D DRAM is fast becoming the first mainstream product that utilizes TSV in 3-D ICs, mainly targeting mobile computing applications such as smart phones which need lower power consumption and high data bandwidth. In this section, we evaluate the reliability concerns of TSVs in wide-I/O DRAM.

We follow a TSV placement style similar to that in [13], where TSV arrays are placed in the middle of a chip. We assume that a $2 \times 128$ TSV array (per memory bank) is placed in the middle of the chip, as shown in Fig. 17. We employ four memory banks and 1024 TSVs in total. We set the pitch of TSV/$\mu$-bump and package-bump to 15 $\mu$m and 200 $\mu$m, respectively. We compare two cases: 1) package-bumps are placed right underneath the TSV arrays; 2) package-bumps are placed with 200 $\mu$m spacing from the TSV arrays. This 200 $\mu$m distance is chosen because the effect of a package-bump on TSV reliability is negligible beyond 200 $\mu$m for the 100 $\mu$m diameter package-bump, as shown in Fig. 6.

Table III clearly shows that the chip/package co-design can greatly reduce the mechanical reliability concerns in TSV-based 3-D ICs. With a safe margin of  $200 \,\mu\text{m}$  [= case(b)], von Mises stress magnitude reduces significantly. Thus, given



Fig. 18. Mechanical reliability in block-level 3-D IC. (a) Sample layout of block-level design. (b) von Mises stress map for TSVs in red box in (a) (LVLS result).

the TSV placement, we can find safe locations for package-bumps without affecting the package design much, or vice versa.

# C. Case Study II: Block-Level 3-D IC

In this section, we study the reliability issues in block-level 3-D designs. The 3-D block-level designs are generated using an in-house 3-D floorplanner, which treats a group of TSVs as a block, as shown in Fig. 18. A total of 16 TSV blocks (368 TSVs) are used, and the TSV pitch is 15 $\mu$m. Package-bumps are regularly placed with 200 $\mu$m pitch.

Table IV shows the von Mises stress level in selected TSV blocks. We first observe that larger TSV blocks experience more variation of von Mises stress within the block. This is because the distance between each TSV in the block and the package-bumps, which is a key factor affecting TSV reliability, can vary more than in small TSV blocks.

We also see that TSV blocks of the same size can show quite different characteristics depending on the distance to the nearest package-bump. For example, although TSV blocks 4, 5, and 6 are all 5 × 5 TSV blocks and are located side by side, TSV block 5 shows the lowest von Mises stress level. However, its standard deviation of von Mises stress is the highest among the three blocks. We observe lower von Mises stress if a TSV is placed near the package-bump center or far away from it, whereas we see higher stress in TSVs located around the package-bump edge, as shown in Fig. 6 in Section III-C. In the case of TSV block 5, most TSVs are near the package-bump center, which lowers the von Mises stress level. At the same time, a few TSVs are around the package-bump edge, which increases the standard deviation of von Mises stress inside the TSV block.

From this experiment, we observe two possible ways to reduce the mechanical reliability problems in block-level 3-D designs: 1) Assign TSV blocks right above package-bump center locations if possible. 2) Place package-bumps outside the TSV block locations with a safe margin such as outside the red box in Fig. 18(a). However, other design constraints such as package area and the required number of pins should be carefully considered as well.

TABLE IV MECHANICAL RELIABILITY OF SELECTED TSV BLOCKS IN BLOCK-LEVEL 3-D IC

| TSV block no. | No. of TSVs | von Mises max (MPa) | von Mises min (MPa) | von Mises avg (MPa) | von Mises std dev (MPa) | Blk-bump dist (µm) |
|---------------|-------------|---------------------|---------------------|---------------------|-------------------------|--------------------|
| 3             | 5 × 3       | 901.0               | 811.1               | 859.5               | 26.0                    | 96.4               |
| 4             | 5 × 5       | 939.6               | 853.5               | 902.6               | 24.0                    | 67.6               |
| 5             | 5 × 5       | 908.6               | 816.0               | 858.7               | 33.3                    | 24.1               |
| 6             | 5 × 5       | 942.3               | 874.4               | 910.4               | 22.0                    | 91.4               |
| 11            | 3 × 1       | 896.6               | 855.9               | 871.0               | 18.2                    | 39.3               |
| 16            | 12 × 8      | 943.7               | 806.0               | 877.2               | 33.6                    | 90.7               |

TSV blocks are shown in Fig. 18.



Fig. 19. Cell mobility variation histogram in die0 in four-die stack (ckt2). (a) Electron mobility. (b) Hole mobility.

## VIII. FULL-STACK TIMING ANALYSIS RESULTS

In this section, we investigate the impact of chip/package elements on the full-stack timing results. In our simulations, we build four-die stack 3-D IC designs using Cadence Encounter with the Nangate 45 nm cell library. We adopt a regular TSV placement style in which TSVs are placed uniformly across each die or inside TSV blocks with a pre-defined pitch.

In all cases, a pair of TSV and  $\mu$ -bump is always vertically aligned. The default diameter/height ( $\mu$ m) of TSV,  $\mu$ -bump, and package-bump are 5/30, 10/10, and 100/100, respectively, unless otherwise specified. The package-bump pitch is assumed to be 200  $\mu$ m for all cases.

# A. 2-D versus 3-D Stress Impact on Mobility and Timing

We first examine the impact of different stress cases, i.e., 2-D stress (2-D uniaxial and 2-D biaxial) and 3-D stress (3-D woPkg and 3-D wPkg), on the full-stack timing and mobility variations. We use three circuits listed in Table V with the TSV KOZ size of  $1\,\mu m$ . Note that all benchmark circuits are designed with the timing optimization objective, but the stress impact is not considered in design stages.



Fig. 21. Full-chip layout (die0 in four-die stack) with the highlighted longest path. White squares are TSVs and yellow circles are package bumps. (Cell mobility naming convention: e.g., Em8\_Hp4 = electron mobility minus 8% and hole mobility plus 4%.) (a) Layout of ckt2 (KOZ = 1.0 $\mu$m). (b) Cells in red circle in (a). (c) Close-up shot of green circle (1) in (b). (d) Close-up shot of green circle (2) in (b). (e) Layout of ckt2 (KOZ = 0.3 $\mu$m). (f) Cells in red circle in (e). (g) Close-up shot of green circle (3) in (f). (h) Close-up shot of green circle (4) in (f).



Fig. 20. Impact of 2-D and 3-D stress cases on the longest path delay (LPD) and total negative slack (TNS). Timing numbers are normalized to the no-stress case. TSV KOZ is 1 $\mu$m for all cases. (a) LPD variation. (b) TNS variation.

TABLE V BENCHMARK CIRCUITS WITH TSV KOZ = 1.0 $\mu$m

| Circuit | No. of cell | area $(\mu m \times \mu m)$ | WL (mm) | No. of TSV | TSV pitch ( $\mu$ m) | Profile    |
|---------|-------------|-----------------------------|---------|------------|----------------------|------------|
| ckt1    | 51K         | $290 \times 290$            | 1235    | 1062       | 15                   | DES        |
| ckt2    | 592K        | $800 \times 800$            | 15831   | 2325       | 20                   | 512pt FFT  |
| ckt3    | 1.31M       | $1150\times1150$            | 36842   | 6632       | 25                   | 1024pt FFT |

Fig. 19 shows the cell mobility distribution in die0 in a four-die stack of ckt2. We first observe that the electron mobility is highly concentrated within the 0%–2% range for both 2-D stress cases and the 3-D woPkg case. Note that the 2-D uniaxial case always improves the electron mobility, while the 2-D biaxial case can degrade the electron mobility as well. Most importantly, the electron mobility variation with package components shows quite a different behavior: the mobility variation range is wider than in the other cases, and most cells in die0 experience electron mobility degradation. The degradation is mainly due to the compressive stress from package-bumps. Also, the wider distribution originates from the relative positions between cells, TSVs/$\mu$-bumps, and package-bumps.

As for the hole mobility distribution, all cases show a wider distribution than the electron mobility case, which is expected from Fig. 8. However, the 3-D wPkg case still generates the largest variation, as is clear from Fig. 13(a). Note that as we go to upper dies, the hole mobility distribution of the 3-D wPkg case becomes comparable to that of the 3-D woPkg case.

Fig. 20 shows stress-aware 3-D STA results. We show the LPD and TNS for different stress cases. First, we observe that the 2-D uniaxial case always underestimates the LPD compared with the 2-D biaxial case. Interestingly, the LPD of ckt2 in the 3-D wPkg case shows better timing than other stress cases shown in Fig. 20(a). This can be explained in Fig. 21. As Fig. 21(a) shows, the cells in the critical path are located in between package-bumps in the y-direction. In this case, the hole mobility improves as shown in Fig. 13(a). Moreover, the hole mobility further improves when cells are placed in between TSVs in the y-direction as shown in Fig. 21(d).

The opposite case can also happen as shown in Fig. 21(e), where the cells in the critical path are placed in between package-bumps in the *x*-direction. In this case, the LPD

TABLE VI BLOCK-LEVEL AND WIDE-I/O STYLE 3-D IC DESIGNS

| Circuit     | No. of cells | Area (μm × μm) | WL (mm) | No. of TSVs | TSV pitch (μm) |
|-------------|--------------|----------------|---------|-------------|----------------|
| ckt2_block  | 578K         | 840 × 920      | 16083   | 1769        | 15             |
| ckt2_wideIO | 578K         | 820 × 820      | 15521   | 2116        | 10             |

Package-bump pitch is  $200 \,\mu\text{m}$ .



Fig. 22. Layout and mobility variation map of wide-I/O style design (ckt2\_wide). (a) Layout of die0 in four-die stack with the highlighted cells in the critical path. (b) Close-up shot of red circle in (a). (c) Hole mobility variation map. (d) Electron mobility variation map (LVLS results).

degrades by 5.2% in the 3-D wPkg case compared with the no-stress case, while the 3-D woPkg case degrades the LPD by 2.3%.

The impact of package-bump stress on the mobility in die0 is clear if we compare Fig. 21(c) and (g). Although the relative positions between TSV and cells are similar, the hole mobility variation is significantly different depending on package-bump locations.

The stress impact on timing is more evident in TNS. In the 3-D wPkg case, TNS is up to 22.9% larger than in the no-stress case, as shown in Fig. 20(b). This is because most cells in the design are affected by the stress induced by TSVs, $\mu$-bumps, and package-bumps, and thus undergo mobility variations.
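For reference, TNS and its change relative to the no-stress case can be computed as in the following Python sketch. It assumes the usual STA definition of TNS as the sum of all negative endpoint slacks; the slack values and function names are hypothetical and are not results from the paper.

```python
def total_negative_slack(slacks):
    """Sum of all negative endpoint slacks (zero if no path violates timing)."""
    return sum(s for s in slacks if s < 0)


def tns_increase_percent(tns_stress, tns_reference):
    """Growth of TNS magnitude relative to the reference case, in percent."""
    return 100.0 * (abs(tns_stress) - abs(tns_reference)) / abs(tns_reference)


# Made-up endpoint slacks in ns for the same four paths.
slacks_no_stress = [-0.10, 0.05, -0.20, 0.30]
slacks_with_stress = [-0.12, 0.02, -0.25, 0.28]

tns_ref = total_negative_slack(slacks_no_stress)
tns_stress = total_negative_slack(slacks_with_stress)
increase = tns_increase_percent(tns_stress, tns_ref)
```

Unlike the LPD, which depends only on the cells along one path, TNS accumulates the slack change of every violating path, so widespread mobility variation shows up more strongly in it.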

# B. Case Studies: Block-level and Wide-I/O Style 3-D Designs

In this section, we study the chip/package stress impact on the full-stack timing in the block-level and wide-I/O style designs listed in Table VI. In the case of the block-level design, we observe that the high mobility variation region is limited to the vicinity of TSV blocks. Although the global mobility variation pattern is largely determined by package-bumps, the local mobility minima and maxima are mostly caused by TSVs. Thus, most



Fig. 23. Impact of 2-D and 3-D stress cases on the LPD and TNS in block level and wide-I/O style 3-D IC designs. TSV KOZ is  $1.7\,\mu m$  for all cases. (a) LPD variation. (b) TNS variation.

of the cells inside functional blocks do not experience high mobility variations.

In the case of the wide-I/O style design, we assume that an $8 \times 30$ TSV array (per memory bank) is placed in the middle of the chip. In addition, there are four memory banks; hence, a total of 960 TSVs are employed in die0, as shown in Fig. 22. The hole and electron mobility maps in Fig. 22(c) and (d) clearly show that the high mobility variation region is confined to the inside and vicinity of the TSV array. Thus, as in the block-level design, the majority of cells are not affected by the TSV stress.

Fig. 23 shows 3-D STA results for the block-level and wide-I/O style designs. As for the LPD, we observe an almost negligible impact from all stress cases for both the block-level and wide-I/O style designs, since most cells are not affected by the TSV stress. One exception is the 3-D wPkg case in the wide-I/O style design. This is because the cells in the critical path are placed near the TSV array and right above a package-bump, as shown in Fig. 22(a). Cells that are placed in the vertical direction with respect to TSVs experience electron mobility degradation and hole mobility improvement. However, the electron mobility further decreases inside the package-bump area, as shown in Fig. 13(c); hence, the net effect is timing degradation.

We also observe more TNS variation in the block-level design than that in the wide-I/O style design for 2-D uniaxial, 2-D biaxial, and 3-D woPkg cases. The block-level design contains more TSV blocks than the wide-I/O style design, hence the number of cells nearby these TSV blocks also increases. Thus, more paths are affected by the TSV stress than the wide-I/O style design. However, as we include the impact of package-bumps, all cells in these designs are affected by package-bumps, hence we observe nonnegligible variations in TNS for both design styles.

# IX. CONCLUSION

In this paper, we showed how package elements affect the stress field and the mechanical reliability on top of the TSV-induced stress in 3-D ICs. In addition, we demonstrated how chip and package components affect the mobility and full-stack timing variations in 3-D ICs. We observed that the mechanical reliability of TSVs in the bottom-most die in the stack is highly affected by packaging elements, and that this effect decreases as we go to the upper dies. We also presented an accurate and fast full-chip/package stress and mechanical reliability co-analysis flow based on the principle of lateral and vertical linear superposition of stress tensors (LVLS), considering all chip/package elements. Lastly, we presented a chip/package stress-aware timing analysis method, which is applicable to stress-aware full-stack timing optimization for 3-D ICs.

#### REFERENCES

- [1] K. Athikulwongse, A. Chakraborty, J.-S. Yang, D. Z. Pan, and S. K. Lim, "Stress-driven 3D-IC placement with TSV keep-out zone and regularity study," in *Proc. IEEE Int. Conf. Computer-Aided Des.*, Nov. 2010, pp. 669–674.
- [2] K. H. Lu, S.-K. Ryu, J. Im, R. Huang, and P. S. Ho, "Thermomechanical reliability of through-silicon vias in 3D interconnects," in *Proc. IEEE Int. Rel. Phys. Symp.*, Apr. 2011, pp. 3D.1.1–3D.1.7.
- [3] S. R. Vempati, N. Su, C. H. Khong, Y. Y. Lim, K. Vaidyanathan, J. H. Lau, B. P. Liew, K. Y. Au, S. Tanary, A. Fenner, R. Erich, and J. Milla, "Development of 3-D silicon die stacked package using flip chip technology with micro bump interconnects," in *Proc. IEEE Electron. Components Technol. Conf.*, May 2009, pp. 980–987.
- [4] M. Nakamoto, R. Radojcic, W. Zhao, V. K. Dasarapu, A. P. Karmarkar, and X. Xu, "Simulation methodology and flow integration for 3D IC stress management," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2010, pp. 1–4.
- [5] M. Jung, J. Mitra, D. Z. Pan, and S. K. Lim, "TSV stress-aware full-chip mechanical reliability analysis and optimization for 3D IC," in *Proc. ACM Des. Autom. Conf.*, Jun. 2011, pp. 188–193.
- [6] M. Jung, X. Liu, S. Sitaraman, D. Z. Pan, and S. K. Lim, "Full-chip through-silicon-via interfacial crack analysis and optimization for 3D IC," in *Proc. IEEE Int. Conf. Computer-Aided Des.*, Nov. 2011, pp. 563–570.
- [7] J.-S. Yang, K. Athikulwongse, Y.-J. Lee, S. K. Lim, and D. Z. Pan, "TSV stress aware timing analysis with applications to 3D-IC layout optimization," in *Proc. ACM Des. Autom. Conf.*, Jun. 2010, pp. 803–806.
- [8] J. Zhang, M. O. Bloomfield, J.-Q. Lu, R. J. Gutmann, and T. S. Cale, "Modeling thermal stresses in 3-D IC interwafer interconnects," *IEEE Trans. Semicond. Manuf.*, vol. 19, no. 4, pp. 437–448, Nov. 2006.
- [9] S.-K. Ryu, K.-H. Lu, X. Zhang, J.-H. Im, P. S. Ho, and R. Huang, "Impact of near-surface thermal stresses on interfacial reliability of through-silicon-vias for 3-D interconnects," *IEEE Trans. Device Mater. Rel.*, vol. 11, no. 1, pp. 35–43, Mar. 2011.
- [10] K. H. Lu, X. Zhang, S.-K. Ryu, J. Im, R. Huang, and P. S. Ho, "Thermomechanical reliability of 3-D ICs containing through silicon vias," in *Proc. IEEE Electron. Components Technol. Conf.*, May 2009, pp. 630–634.
- [11] R. C. Jaeger, J. C. Suhling, R. Ramani, A. T. Bradley, and J. Xu, "CMOS stress sensors on (100) silicon," *IEEE J. Solid-State Circuits*, vol. 35, no. 1, pp. 85–95, Jan. 2000.
- [12] W. Xiong, C. R. Cleavelin, P. Kohli, C. Huffman, T. Schulz, K. Schruefer, G. Gebara, K. Mathews, P. Patruno, Y.-M. L. Vaillant, I. Cayrefourcq, M. Kennard, C. Mazure, K. Shin, and T.-J. K. Liu, "Impact of strained-silicon-on-insulator (SSOI) substrate on finFET mobility," *IEEE Electron Device Lett.*, vol. 27, no. 7, pp. 612–614, Jul. 2006.
- [13] J.-S. Kim, C. S. Oh, H. Lee, D. Lee, H.-R. Hwang, S. Hwang, B. Na, J. Moon, J.-G. Kim, H. Park, J.-W. Ryu, K. Park, S.-K. Kang, S.-Y. Kim, H. Kim, J.-M. Bang, H. Cho, M. Jang, C. Han, J.-B. Lee, K. Kyung, J.-S. Choi, and Y.-H. Jun, "A 1.2 V 12.8 GB/s 2Gb mobile wide-I/O DRAM with 4 × 128 I/O using TSV-based stacking," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2011, pp. 496–498.



**Moongon Jung** (S'11) received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 2002, and the M.S. degree in electrical engineering from Stanford University, Stanford, CA, USA, in 2009. He is currently pursuing the Ph.D. degree with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA.

His current research interests include computer-aided design for VLSI circuits, especially physical design methods for low-power 3-D ICs and thermomechanical reliability analysis and optimization of TSV-based 3-D ICs.

Mr. Jung's works were nominated for the Best Paper Award at DAC 2011 and DAC 2012.



**David Z. Pan** (S'97–M'00–SM'06) received the Ph.D. degree (Hons.) in computer science from the University of California at Los Angeles (UCLA), Los Angeles, CA, USA, in 2000.

From 2000 to 2003, he was a Research Staff Member with the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA. He is currently a Professor with the Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA. His current research interests include nanometer physical design, design for manufacturing, vertical integration of technology/CAD/architecture, and CAD for emerging technologies.

Dr. Pan received the SRC 2013 Technical Excellence Award, 10 Best Paper Awards (ICCAD 2013, ASPDAC 2012, ISPD 2011, IBM Research 2010 Pat Goldberg Memorial Best Paper Award in CS/EE/Math, ASPDAC 2010, DATE 2009, ICICDT 2009, SRC Techcon 2012, 2007 and 1998), DAC Top 10 Author in Fifth Decade, DAC Prolific Author Award, ACM/SIGDA Outstanding New Faculty Award in 2005, NSF CAREER Award in 2007, UCLA Engineering Distinguished Young Alumnus Award in 2009, SRC Inventor Recognition Award in 2000 and 2008, IBM Faculty Award in 2004, 2005, 2006, and 2010, Dimitris Chorafas Foundation Research Award in 2000, the eASIC Placement Contest Grand Prize in 2009, the ISPD Routing Contest Award in 2007, the ICCAD CAD Contest Award in 2012, and the ACM Recognition of Service Award in 2007 and 2008. He was an IEEE CAS Society Distinguished Lecturer from 2008 to 2009. He has served as an Associate Editor of the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II, Science China Information Sciences, and Journal of Computer Science and Technology. He was the General Chair of ISPD in 2008. He has also served on the Technical Program Committees of many major VLSI/CAD conferences.



**Sung Kyu Lim** (S'94–M'00–SM'05) received the B.S., M.S., and Ph.D. degrees from the Computer Science Department, University of California, Los Angeles (UCLA), Los Angeles, CA, USA, in 1994, 1997, and 2000, respectively.

He joined the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA, in 2001, where he is currently a Professor. His current research interest is the design and testing of 3-D ICs. He is the author of *Practical Problems in VLSI Physical Design Automation* (Springer, 2008).

Dr. Lim received the National Science Foundation Faculty Early Career Development (CAREER) Award in 2006. He was on the Advisory Board of the ACM Special Interest Group on Design Automation (SIGDA) from 2003 to 2008 and received the ACM SIGDA Distinguished Service Award in 2008. He was an Associate Editor of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (TVLSI) SYSTEMS from 2007 to 2009 and served as a Guest Editor for the ACM Transactions on Design Automation of Electronic Systems (TODAES). His work was nominated for the Best Paper Award at ISPD 2006, ICCAD 2009, CICC 2010, DAC 2011, DAC 2012, ISLPED 2012, and awarded at ATS 2012. He led the Cross-Center Theme on 3-D Integration for the Focus Center Research Program (FCRP), Semiconductor Research Corporation (SRC) from 2010 to 2012.