## **Reliability-aware Global Routing under Thermal Considerations**

Katrina Lu<sup>+</sup>, David Z. Pan

Dept. of ECE, The University of Texas at Austin, Austin, TX 78712

# yiotse@cerc.utexas.edu, dpan@ece.utexas.edu

#### Abstract

Thermal effect is a key factor to interconnect reliability degradation. As technology scales, the distance between the metal layers and substrate continues to shrink and significantly increases the impact of substrate temperature on interconnect reliability. While it is already a concern in 2D ICs, the thermal impact will be more challenging in the emerging 3D ICs architecture. In this paper, we present a reliability-aware global routing with thermal considerations. We propose two techniques, thermal-driven Minimum Spanning Tree (MST) construction and thermal-driven maze routing, to reduce the probability of interconnect failures. Experimental results show that our router effectively reduces the failure rate by approximately 13% on average, with little overhead on the traditional design objectives.

#### **1. Introduction**

The impact of thermal issues on chip performance and reliability has come to attention in the development for highdensity architectures, such as microprocessor designs and integrated networks. With the demand for higher operating frequency and more versatile/advanced electronic devices, modern designs often have more complex circuits and higher power density, which leads to increased thermal concerns. Despite of the issues that have already been brought to the current 2D ICs architecture [1,2], emerging circuit technologies such as 3D ICs (vertical stacking of multiple device layers to achieve performance improvement) will suffer even more severely from the exacerbated thermal problems [3,4]. The amount of researches on thermal concerns in the various levels of design flow [5-7] demonstrates the importance and necessity of thermal management during design.

In our work, we focus on the impact of thermal effects on interconnect reliability. One of the major threats to interconnect reliability comes from electromigration. It is a phenomenon of mass transportation in electrical wires and the well-known Black's equation that models this phenomenon is commonly used to predict the lifetime of interconnects [8]. This model reveals the susceptibility of interconnect to thermal condition and current density. Due to technology scaling, the distance between the metal layers and the substrate continues to shrink, and this tendency remarkably increases the impact of substrate temperature on interconnect reliability [4]. Interconnect self-heating effect due to increased current density also becomes another factor that contributes to a rise in interconnect temperature. These again emphasize the demand for considering thermal effects during design and optimization phases.

Global routing is one of the key stages to address thermal impacts on interconnect reliability due to the following reasons: (1) global routing is the first major stage followed by cell placement, which in turn possesses information on substrate temperature profile, (2) it is a critical step to allocate the routing resources and meanwhile determine the environment of interconnect in the global space, and (3) it is a coarse planning stage which still has the flexibility to find reasonable tradeoff between thermal-dependent reliability and other conventional design objectives such as routability, wire length and timing. Therefore, we will address the thermalbased interconnect reliability problem during global routing stage.

Most of the existing works focus on the techniques of producing a uniform heat distribution over the chip, and there are very few works that directly address the reliability issue during global routing. A recent work in [9] proposed a Lshape approach for the problem, which suggested evaluating the two alternative paths during L-shape routing and choosing the path with a lower maximum temperature. However, Lshape routing is only a part of the routing flow and they did not consider the temperature factor in any other phases. Moreover, their strategy of choosing the path based on maximum temperature only implies a drop in the worst-case interconnect failure rate along that particular path, not the overall system reliability. Therefore, the reliability improvement from their algorithm is limited.

In this paper, we propose a reliability-aware global router with the aim to improve the lifetime of interconnects. Overall, our contribution is as follows:

- Instead of considering the maximum temperature that a path passes through, which is a worst-case failure optimization, we propose a new metric to capture the system failure rate during routing.
- We develop two simple yet essential techniques, reliability-aware minimum spanning tree construction and reliability-aware maze routing to reduce the probability of failures of interconnects.
- Our reliability-aware router significantly outperforms the previous work [9]. On ISPD98 benchmark suite, it reduces the system failure rate by 13% without sacrificing routability, and under tiny runtime and wire length overhead.

<sup>&</sup>lt;sup>+</sup> The author is now affiliated with Intel Corporation.

<sup>1-####-####-4/09/\$25.00 © 2009</sup> IEEE

The rest of the paper is organized as follows. Section 2 presents the preliminaries of this work and Section 3 provides the detailed description of our reliability-aware routing flow and routing techniques. The discussion of our experimental results can be found in Section 4. Finally, Section 5 concludes our work and describes how we can extend our routing algorithms from 2D ICs to 3D ICs

#### 2. Preliminaries

In this section, we first present the general global routing grid graph formulation in Section 2.1, including the traditional metrics in evaluating a global router. In Section 2.2, we discuss the reliability model that we adopt in our work.

#### 2.1. Global Routing Background

A typical formulation for the global routing problem is to partition the circuit into multiple rectangular regions and then build a grid graph for the circuit. Fig.1 shows an example of a partitioned circuit and its corresponding grid graph. Each vertex in the grid graph represents a rectangular region in the circuit, or the so-called global routing cell, and each of the edges corresponds to the boundary between two global routing cells.



Figure 1: Global routing formulation

The global routing problem is to make cell-to-cell connections instead of pin-to-pin connections for each net, which may have two or more pins that fall inside different global routing cells. All these paths should satisfy the various physical constraints, such as maximum routing capacity and metal layer directions. The metrics for evaluating the performance of a global router include routability (in terms of number of overflow), wire length (in the unit of global routing cell), runtime, and many others. In our content, besides measuring the ones that are listed above, we are at the same time measuring the thermal-based interconnect reliability. The detailed description for the model we have adopted to compute reliability will be presented in Section 2.2.

### 2.2. Interconnect Reliability Models

The reliability of interconnects is commonly measured by Mean Time to Failure (MTTF). It is an estimation of the average time for a design to fail for the first time. Black's equation, models the MTTF of interconnects in integrated circuits caused by electromigration [8],

$$MTTF = \frac{A}{J^2} e^{\frac{Ea}{KT}}$$
(1)

which describes a direct relationship between MTTF and the two variables, metal temperature (*T* in Kelvin) and current density (*J*). In the equation are *A*, *Ea* and *K*, where *A* is a constant based on the cross-sectional area of interconnect, *Ea* is the activation energy (e.g. 0.5eV for copper and aluminum interconnects), and *K* is Boltzmann constant ( $8.62x10^{-5}eV/K$ ).

MTTF can also be expressed as a general form under reliability terminologies. Assuming that the failure distribution for interconnects is an exponential distribution [10], the equations for failure F and reliability R are

$$F(t) = 1 - e^{-\lambda t} \quad \text{and} \quad R(t) = 1 - F(t) = e^{-\lambda t} \tag{2}$$

, where t is time and  $\lambda$  is called the constant failure rate. The constant failure rate of an interconnect is dependent on the temperature. From the reliability model in Eq. (2), MTTF can be derived as a function of  $\lambda$  as in Eq. (3).

$$MTTF = \int_0^\infty R(t) \cdot dt = \int_0^\infty e^{-\lambda t} \cdot dt = \frac{1}{\lambda}$$
(3)

#### 3. Reliability-aware Global Routing

In this section, we give a detailed description on our proposed reliability-aware global router. Among the various metrics, our objective is to maximize the estimated system reliability, while satisfying the capacity constraints and minimizing the wire length and runtime overhead.

A high-level framework of our router is shown in Fig. 2. The first step is to pre-process the given temperature profile using the model described in Section 3.1. Next, we decompose the nets using reliability-aware Minimum Spanning Tree (MST) which is described in Section 3.2. Followed by reliability-aware L-shape routing, it quickly determines whether a net can be entirely routed under certain capacity constraints. But instead of considering the maximum temperature on the paths like that was done in [9], we evaluate the contribution of the candidate paths to system reliability in order to choose one path over another. Finally, we apply maze routing to the remaining nets and it requires several iterations before the router can find a high quality solution in terms of the metric we defined. Our proposed reliability-aware maze routing technique, is described in Section 3.3.

#### **3.1. Reliability Metric**

The cost metric that guides a router to find quality routing paths plays an important role in global routing algorithms. The previous work in [9] uses maximum temperature on a given path as its metric in determining the superiority of one routing path over another in terms of reliability. It decreases the worst-case failure rate of a two-pin connection by selecting the path with a smaller cost (i.e. highest temperature that the path passes through). However, it does not take into account the following situations: (1) failure rate of all the grids on a path should be considered as a whole, instead of using a single grid with highest temperature to represent the entire path, and (2) reliability is an exponential function of



Figure 2: Overall flow of our reliability-aware router

temperature, therefore the cost should increase exponentially with temperature as well.

For example, consider a two-pin net shown in Fig. 3, which can be routed in two ways, path1 and path2. Under the maximum temperature metric, L-shape routing will choose path2 because the maximum temperature (96°C) in path2 is higher than the maximum temperature (95°C) in path1. But since the temperatures along path1 are kept consistently high, it is possible that path1 has lower reliability than path2. A careful judgment is necessary in order to handle this kind of circumstance for better reliability.



Figure 3: Demonstration of our reliability metric

To address the issues mentioned above, first, we would like to capture the exponential relationship between interconnect reliability and temperature during routing. For each edge e in grid graph described in Section 2.1, we define a reliability cost r(e). To save the efforts in using specific parameters in Eq. (1), we adopt a relative value for the failure rate, as shown in Eq. (4). The cost, r(e) is a ratio of the MTTF of an edge under room temperature  $T_{rm}$  to the MTTF of an edge under temperature  $T_e$ .

$$r(e) = \frac{MTTF^{rm}}{MTTF^{e}} = \exp\left[\frac{Ea}{K}\left(\frac{1}{T_{rm}} - \frac{1}{T_{e}}\right)\right] = \frac{\lambda_{T_{e}}^{e}}{\lambda_{T_{rm}}^{rm}} \quad (4)$$

Next, we propose a metric to evaluate the total failure rate of a given routing path. Let us assume that a path can be interpreted as a series system of all the edges along that path, since a failure occurred in any of the edges will likely to cause the whole path to fail [9,11]. This means the reliability of a path is a multiplication of the reliability of all the edges along that path as derived in Eq. (5), where a set of edges on a path as denoted as RP. Then the total path failure rate  $\lambda_{rp}$  can be extracted from Eq. (5) and expressed as Eq. (6).

$$R_{rp}(t) = \prod_{\forall e \in RP} R_{T_e}^e(t) = \prod_{\forall e \in RP} e^{-\lambda_{T_e}^e t} = e^{-\lambda_{rp}t}$$
(5)

$$\lambda_{rp} = \sum_{\forall e \in RP} \lambda_{T_e}^e \tag{6}$$

In this paper, instead of using the absolute value for  $\lambda$ , we apply the relative value as shown in Eq. (7) to calculate the failure rate of a path. Recall that r(e) is the reliability cost of an edge.

$$\lambda_{rp} = \sum_{\forall e \in RP} \left( \frac{\lambda_{T_e}^e}{\lambda_{T_{rm}}^{rm}} \right) = \sum_{\forall e \in RP} r(e)$$
(7)

Similarly, for the entire design, we sum up r(e) of all the edges in the global routing grid graph to evaluate system reliability. In this section hereafter, we will use the notation, r(e) throughout our algorithms to refer to the reliability cost function for each edge. Also we will use Eq. (7) to evaluate the reliability of our routing solution.

#### 3.2. Reliability-aware Minimum Spanning Tree

Similar to a congestion-driven tree topology generation [12] or techniques like extreme edge-shift [13] that slides edges out of a congested region, the motivation of a reliability-aware tree construction is to reduce the potential of routing tree edges in high temperature regions. Given that the number of alternatives for a multi-pin net tree topology is numerous, an abrupt tree construction without taking into account the temperature profile may lead to an undesired tree structure and significantly deteriorates the final solution quality. Fig. 4 shows an example of two tree topologies of the same net, where the circles are the net pins and the darkness of the grids indicates the level of temperature. Darker grids indicate a higher temperature and vice versa. Without loosing generality, we construct the trees using MST algorithm to demonstrate the effectiveness of our approach. Nonetheless, the same strategy can be applied to construct a reliabilityaware Steiner tree.

In this example, the tree in Fig. 4 (a) has a minimum wire length. However, wire b-d in the tree passes an extremely high temperature region, which will result in either (1) the way it is now and having a high probability of failures, or (2) making detours around the hotspots and producing longer wire length, which essentially will increase the probability of failures. In contrast, the topology in Fig. 4 (b) has a slightly longer total wire length, but the tree can be safely routed under simple L-shape patterns without crossing the hotspots on the chip. Therefore, this example gives us the motivation to construct trees with balance between reliability and wire length.

Lu et al, Reliability-aware Global Routing under Thermal Considerations





(a) wire length = 8, but wire b-d crosses a hot spot in design

(b) wire length = 12, wires can be routed using L-shape without crossing the hotspot

Figure 4: Different tree topologies can vary in reliability and wire length

Our proposed reliability-aware MST algorithm is as follows. First, we build a weighted clique graph, where the weight of each two-pin connection indicates a combined cost of wire length and reliability penalty, r(e). Our cost function for a MST edge *E*, is defined in Eq. (8). The notation of *E* is capitalized to avoid confusion between MST edge *E*, and routing grid edge *e*.

$$\cos t_{m\,st}(E) = d(E) \left[ 1 + \frac{r_{sum}(E)}{\gamma} \right] \tag{8}$$

In Eq. (8), d(E) is the Manhattan distance between the two pins and  $r_{sum}(E)$  is a probabilistic estimation of the reliability cost for E, which in turn does not require knowing the exact path of the edge. For instance, consider the same net shown in Fig. 4, we want to compute  $r_{sum}(E)$  for tree edge a-d. We first adopt the concept in [14] to find the probabilistic usage of each edge e by assuming the probability of an L-shape path is 0.6 and a Z-shape path is 0.4. This is shown in Fig. 5 (a). At the same time, we have the pre-computed reliability cost of each edge, r(e) as illustrated in Fig. 5 (b). Combining the information, we can calculate  $r_{sum}(E)$  using Eq. (9) and the resulting cost of tree edge a-d is shown in Fig. 5 (d). The Same strategy can be applied to all other tree edges.

$$r_{sum}(E) = \frac{\sum_{\forall e \in bbox(E)} (Pb(e) \cdot r(e))}{\sum_{\forall e \in bbox(E)} Pb(e)}$$
(9)

After we have built a weighted clique graph for the net, Prim's algorithm is applied to construct the MST. In our reliability-aware MST algorithm, the key to maintain a desired balance between reliability and wire length is the value of constant  $\gamma$  in Eq. (8). The smaller the  $\gamma$  is, the higher weight we put on reliability and therefore the bigger margin we can improve for reliability. In our experiments, we find  $\gamma$ = 600 is the value that gives us the best results.





(a) *Pb(e)*, probabilistic usage defined in [14]

(b) r(e), pre-computed edge cost



(c) r(e), pre-computed edge

Figure 5: Example of MST reliability cost computation

#### 3.3. Reliability-aware Maze Routing

Inspired by the recently published global routers [15,16], we adopt negotiation-based A\* search algorithm along with consideration of thermal effect to explore highly reliable paths. Our cost function for each edge e at *i*-th iteration is shown in the equation below.

where  $h_i(e)$ , p(e), and d(e) denote the historic cost, present cost, and the Manhattan distance between e and the target pin respectively.  $\alpha$  and  $\delta$  are the stability scaling factors explained in [15]. All of these are conventions of a negotiation-based  $A^*$  search approach. To take thermal effect into considerations, we introduce our reliability metric r(e) into the present cost, p(e), which originally is solely related to the congestion cost, c(e). Please note that, the maze router will not only use a combination of c(e) and r(e) in the present to select paths, but also remembers both c(e) and r(e) in the historic record,  $h_i(e)$ , to avoid previously unfavorable regions subjecting to both congestion and reliability.

A scaling factor  $\beta$  is also introduced to the cost function in Eq. (10). Similar to  $\gamma$  in MST constructions,  $\beta$  controls the weight of the reliability cost and makes a smooth tradeoff between reliability and other design objectives such as overflow and wire length.

#### 4. Experimental Results

We implemented our reliability-aware global router in C++ and performed all the experiments on 2.66 GHz Intel Dual Core Linux machine with 4GB RAM. ISPD98 IBM benchmark suite [19], presented in Table 1, is applied to show the effectiveness of our global router. The temperature profiles for each benchmark were generated according to the work in [9]. For comparative study, we also implemented two

| Table 2. Comparison of whe length with conventional router and TAGORE on 151 D/6 benchmarks |         |             |       |            |        |         |            |         |         |  |
|---------------------------------------------------------------------------------------------|---------|-------------|-------|------------|--------|---------|------------|---------|---------|--|
|                                                                                             | Conv    | entional Ro | uter  | TAGORE [9] |        |         | Our Router |         |         |  |
| name                                                                                        | WL      | WL          | WL    | WL         | WL     | WL      | WL         | WL      | WL      |  |
|                                                                                             | Total   | T>=1σ       | T=max | Total      | T>=1σ  | T=max   | Total      | T>=1σ   | T=max   |  |
| ibm01                                                                                       | 63428   | 9563        | 18    | 63415      | 9225   | 20      | 63642      | 7608    | 9       |  |
| ibm02                                                                                       | 174348  | 28922       | 34    | 174094     | 27680  | 34      | 174790     | 22931   | 5       |  |
| ibm03                                                                                       | 149183  | 23623       | 37    | 149274     | 22272  | 35      | 150135     | 18218   | 9       |  |
| ibm04                                                                                       | 169662  | 26151       | 9     | 169852     | 24910  | 9       | 170049     | 20540   | 4       |  |
| ibm05                                                                                       | 421079  | 71184       | 11    | 421353     | 62977  | 7       | 429560     | 53072   | 3       |  |
| ibm06                                                                                       | 285111  | 44743       | 34    | 284916     | 42629  | 28      | 286228     | 34036   | 12      |  |
| ibm07                                                                                       | 374783  | 59851       | 12    | 374474     | 54411  | 7       | 377383     | 45494   | 3       |  |
| ibm08                                                                                       | 415071  | 69093       | 13    | 414718     | 64894  | 10      | 417202     | 51646   | 11      |  |
| ibm09                                                                                       | 423178  | 64940       | 36    | 423144     | 61301  | 29      | 425609     | 48413   | 9       |  |
| lbm10                                                                                       | 593725  | 96210       | 36    | 593501     | 88959  | 26      | 599367     | 71617   | 8       |  |
| Total                                                                                       | 3069568 | 494280      | 240   | 3068741    | 459258 | 205     | 3093965    | 373575  | 73      |  |
| Incr.                                                                                       | 1       | 1           | 1     | -0.03%     | -7.09% | -14.58% | 0.80%      | -24.42% | -69.58% |  |

Table 2: Comparison of wire length with conventional router and TAGORE on ISPD98 benchmarks

 Table 1: ISPD98 IBM benchmarks and their associated temperature profile

|       |        | Circ    | Temperature<br>Profile (°C) |       |       |      |
|-------|--------|---------|-----------------------------|-------|-------|------|
| name  | # nets | # grids | vcap*                       | hcap* | mean  | std  |
| ibm01 | 11507  | 64x64   | 12                          | 14    | 89.59 | 15.1 |
| ibm02 | 18429  | 80x64   | 22                          | 34    | 90.71 | 14.7 |
| ibm03 | 21621  | 80x64   | 20                          | 30    | 90.28 | 15.1 |
| ibm04 | 26163  | 96x64   | 20                          | 23    | 90.00 | 14.8 |
| ibm05 | 27777  | 128x64  | 42                          | 63    | 90.44 | 15.1 |
| ibm06 | 33354  | 128x64  | 20                          | 33    | 90.16 | 15.3 |
| ibm07 | 44394  | 192x64  | 31                          | 36    | 90.06 | 14.9 |
| ibm08 | 47944  | 192x64  | 21                          | 32    | 90.28 | 15.2 |
| ibm09 | 50393  | 256x64  | 14                          | 28    | 89.86 | 14.9 |
| ibm10 | 64227  | 256x64  | 27                          | 40    | 89.98 | 15.3 |

\* vertical and horizontal capacity

other routers, a conventional router and the thermal-driven router presented in [9]. The flow of the conventional router is the same as we discussed in Fig. 2, but no reliability related techniques were involved. Also, since the binary of TAGORE presented in [9] is not available, we modified the L-shape stage in conventional router such that it is the exact technique used in that paper. We carried out the experiments under the same parameter settings for all three routers to ensure fair comparison.

First, we studied the wire distribution in different temperature regions. If there is more than one grid subject to the same temperature, we will sum up the total number of wires in grids that have the same temperature. Table 2 shows a comparison of our results with other two routers. "WL Total" is the total wire length, in the unit of global routing "WL T>=1 $\sigma$ " is the number of wires in high grid. temperature regions, which is defined as the grids in temperature of mean value plus one standard deviation and above. "WL T=max" denotes the number of wires in the hottest region, which may also associate with multiple grids. Compared to the conventional router, our router is able to reduce the number of wires subject to high temperatures by 24.42% and the number of wires in the hottest region by 69.58% on average, with less than 1% overhead on the total

wire length. A little increase in total wire length is reasonable because our router may potentially make more detours than the conventional router, in order to find a lower temperature path around hotspots in the chip. Also, powered by both our reliability-aware MST and the flexibility of maze router to explore in larger solution space, our router significantly outperforms TAGORE, which only obtains 7.09% and 14.58% reduction respectively.

Table 3 shows a comparison of all three routers in terms of system reliability (evaluated using the model described in Section 3.1), as well as routability and runtime. "Ovfl" is the total number of overflow, "Failure Rate Ratio" denotes the system reliability explained in Section 3.1, and "CPU" is the router runtime in seconds. As we can see, our router is able to improve the failure rate by 12.81% over the conventional router with merely less than 2x runtime overhead. And meanwhile it successfully routes all the benchmarks without producing any overflow. Compared to TAGORE which only makes 3.75% of improvement, our improvement is substantial.

#### 5. Conclusion

In order to alleviate the increasing thermal impact on interconnect reliability, we present a reliability-aware global router to maximize Mean Time to Failure of interconnects subject to substrate temperature. Our experiments results show that our router is able to achieve reliability enhancement without sacrificing the traditional routing objectives.

Moreover, our reliability-aware router for 2D ICs can be further extended to fulfill the need of a strong reliabilityaware accomplished in two ways: direct 3D routing [7] and hierarchical routing [17,18]. Of course, the former approach has the potential of achieving better solution quality, but at the expense of extremely high computational efforts. Particularly for modern designs that are already difficult in 2D ICs routing, the direct 3D routing approach may become infeasible due to an enlarged solution space from the multiple device layer structure. Therefore, it is natural to follow the latter approach, which divides a 3D routing problem into a set

|       | (    | Conventional Rou | uter   | TAGORE [9] |              |        | Our Router |              |        |
|-------|------|------------------|--------|------------|--------------|--------|------------|--------------|--------|
|       | Ovfl | Failure Rate     | CPU    | Ovfl       | Failure Rate | CPU    | Ovfl       | Failure Rate | CPU    |
|       |      | Ratio (e5)       | (S)    |            | Ratio (e5)   | (s)    |            | Ratio (e5)   | (S)    |
| ibm01 | 0    | 74.84            | 3.43   | 0          | 73.37        | 3.55   | 0          | 66.69        | 6.36   |
| ibm02 | 0    | 217.75           | 10.50  | 0          | 212.86       | 11.33  | 0          | 193.74       | 14.40  |
| ibm03 | 0    | 181.54           | 5.18   | 0          | 176.54       | 6.08   | 0          | 159.51       | 9.59   |
| ibm04 | 0    | 197.26           | 10.65  | 0          | 192.68       | 10.97  | 0          | 175.96       | 17.79  |
| ibm05 | 0    | 522.20           | 28.64  | 0          | 488.11       | 36.68  | 0          | 453.93       | 40.01  |
| ibm06 | 0    | 348.53           | 14.40  | 0          | 339.16       | 15.43  | 0          | 301.22       | 24.72  |
| ibm07 | 0    | 449.05           | 17.59  | 0          | 428.55       | 20.28  | 0          | 393.99       | 28.67  |
| ibm08 | 0    | 508.98           | 17.42  | 0          | 492.58       | 20.74  | 0          | 439.97       | 32.77  |
| ibm09 | 0    | 500.93           | 17.85  | 0          | 486.79       | 20.64  | 0          | 436.11       | 31.39  |
| lbm10 | 0    | 713.94           | 30.70  | 0          | 685.10       | 35.77  | 0          | 618.04       | 45.80  |
| Total | 0    | 3715.02          | 156.36 | 0          | 3575.74      | 181.47 | 0          | 3239.16      | 251.50 |
| Incr. | 1    | 1                | 1      | -          | -3.75%       | 1.16   | -          | -12.81%      | 1.61   |

Table 3: Comparison of reliability with conventional router and TAGORE on ISPD98 benchmark

of subregions to relax the computational complexity, and then integrates the 2D routing solutions with little additional efforts to find a complete 3D routing solution. Along with the benefit of a reduced problem size, the design effort is also reduced because it now becomes viable to reuse the existing 2D routers. Our 2D reliability-aware router is demanded for this reason, due to the capability and flexibility to handle thermal impacts in both 2D and 3D architectures.

#### 6. References

- [1] K. Banerjee, "Thermal Effects in Deep Submicron VLSI Interconnects", In Proceedings of International Symposium on Quality Electronic Design, Mar 2000.
- [2] Y.K. Cheng, P. Raha, C.C. Teng, E. Rosenbaum, and S.-M. Kang, "ILLIADS-T: An Electrothermal Timing Simulator for Temperature-Sensitive Reliability Diagnosis of CMOS VLSI Chips", In IEEE Trans. Computer-Aided Design, Aug 1998, 17.
- [3] T.Y. Chiang, S.J. Souri, C.O. Chui, and K.C. Saraswat, "Thermal Analysis of Heterogeneous 3D ICs with Various Integration Scenarios", In Technical Digest. IEDM, 2001, pp.681-684.
- [4] S. Im and K. Banerjee, "Full Chip Thermal Analysis of Planar (2D) and Vertically Integrated (3D) High Performance ICs", In Technical. Digest IEDM, 2000, pp.727-730.
- [5] J. Cong, G. Luo, J. Wei, Y. Zhang, "Thermal-Aware 3D IC Placement Via Transformation", In Proceeding. of Asia and South Pacific Design Automation Conference, Jan 2007.
- [6] J. Cong, J. Wei, and Y. Zhang, "A Thermal-Driven Floorplanning Algorithm for 3D ICs", In Proceeding of International Conference on Computer-Aided Desgin, pp. 306-313, Nov 2004.
- [7] J. Cong and Y. Zhang, "Thermal-Driven Multilevel Routing for 3-D ICs", In Proceedings of Asia and South Pacific Design Automation Conference, Jan 2005.
- [8] J. Black, "Electromigration A brief survey and some recents results", In IEEE Trans. on Electron Devices, April 1969, 4: pp. 338-347.

- [9] Gupta, N.D. Dutt, F.J. Kurdahi, K.S. Khouri, M.S. Abadir, "Thermal Aware Global Routing of VLSI Chips for Enhanced Reliability", In Proceeding of International Symposium on Quality Electronic Design, Mar 2008.
- [10] L.-T. Wang, C. Stroud, and N. Touba, System-on-Chip Test Architectures: Nanometer Design for Testability, Morgan Kaufmann Publishers, 2007.
- [11] J. Lienig, "Introduction to Electromigration-aware Physical Design", In Proceedings of International Symposium of Physical Design, Apr 2006.
- [12] M. Pan and C. Chu, "FastRoute: A Step of Integrated Global Routing into Placement", In Proceedings of International Conference on Computer-Aided Design, Nov. 2006.
- [13] M.D. Moffitt, "MaizeRouter: Engineering an Effective Global Router", In Proceedings of Asia and South Pacific Design Automation Conference, Jan 2008.
- [14] J. Westra, C. Bartels, and P. Groeneveld, "Is Probabilistic Congestion Estimation Worthwhile?", In Proceedings of International Workshop on System Level Interconnect Prediction, Apr 2005.
- [15] M. Cho, K. Lu, K. Yuan, and D.Z. Pan, "BoxRouter 2.0: Architecture and Implementation of a Hybrid and Robust Global Router", In Proceedings of International Conference on Computer-Aided Design, Nov. 2007.
- [16] M.M. Ozdal and M. Wong, "Archer: A History-driven Global Routing Algorithm", In Proceedings of International Conference on Computer-Aided Design, Nov. 2007.
- [17] C.C. Tong and C.-L. Liu, "Routing in a Threedimensional Chip", IEEE Trans. On Computers, Jan 1995, 44: pp. 106-117
- [18] S. Das, A. Chandrakasan, and R. Reif, "Design Tools for 3-D Integrated Circuits", In Proceedings of Asia and South Pacific Design Automation Conference, Jan 2003.
- [19] http://www.ece.ucsb.edu/~kastner/labyrinth/.

Lu et al, Reliability-aware Global Routing under Thermal Considerations