## Simultaneous Statistical Delay and Slew Optimization for Interconnect Pipelines

Andrew Havlir, ECE Department, University of Texas at Austin, Austin, TX 78712, andy.havlir@amd.com

David Z. Pan, ECE Department, University of Texas at Austin, Austin, TX 78712, dpan@ece.utexas.edu

#### Abstract

Process variation has become a major concern in the design of many nanometer circuits, including interconnect pipelines. This paper develops closed-form models to predict the delay distribution of an interconnect pipeline stage and the slew distributions of all the nets in the circuit. Also, a buffer sizing and re-placement algorithm is presented to minimize the area of interconnect pipelines while meeting the delay and slew constraints. Experiments show that ignoring location dependent variation can cause a timing yield loss of 8.8% in a delay limited circuit, and the area can be improved by over 10% when the location dependent variation and residual random variation are understood and separated. Furthermore, under equivalent area, an interconnect pipeline optimized with only sizing changes may violate the slew constraint on over 50% of the nets, so location change is needed to best optimize these circuits.

#### 1. Introduction

As the physical dimensions in large scale integrated circuits decrease, the interconnect delay becomes more dominant compared to the gate delay. This is especially true for global interconnects, in which the delay scales the worst [1]. Buffer insertion is the primary method to improve the delay of long wires, but this may not be sufficient to send a global signal across the chip. Thus, interconnect circuits must be pipelined to several stages to meet chip frequency targets. [2] and [3] provide methods to model and design interconnect pipelines, but they are limited in their ability to handle process variation. [2] does not consider variation and a statistical extension is not straightforward. [3] assumes independent delays, so the variance is underestimated.

When modeling and designing with variation, the variation of the mean must be considered in addition to the purely random variation [4]. Spatial correlation among process parameters must also be considered [5]. Generally, the gate length is the most critical device parameter, and interconnect width is important as it affects performance.

Statistical static timing analysis has been the primary method to model delay variation [6]- [8]. However, most of the statistical timing methods do not explicitly consider slew, a critical part in the design of interconnect pipelines. [7] is able to statistically propagate slew, but the variability of the input slews is not chain-ruled into the canonical delay model. [8] does consider slew propagation, but interconnect variations are not included. Furthermore, the statistical timing methodologies only present models and not optimization routines. The limited research publications on optimization considering variation [9]- [11] do not include both slew as an optimization constraint and gate location as an optimization lever. We later show that the location is critical in best meeting the slew constraints.

This research provides several contributions that address the aforementioned limitations for interconnect pipelines. First, closed-form equations are developed that predict both the delay distribution of an interconnect pipeline stage and the output slew distribution of every segment. The model considers statistical slew propagation, location dependent process variation (LDPV), and spatial correlation among process parameters. Second, an optimization routine is presented that uses both sizing and placement to meet the delay and slew constraints with the least area. Third, key experiment results show that interconnect pipelines designed when ignoring LDPV may have a timing yield loss of 8.8% in a delay limited circuit. Also, the area of interconnect pipelines can be reduced by over 10% when LDPV is fully understood. Finally, sizing alone under equivalent area may cause over 50% of the nets to violate the slew constraint, so both sizing and location change are needed for the best optimization.

The remainder of the paper is outlined as follows. Section 2 develops the interconnect pipeline model, and section 3 tests its accuracy and precision. Section 4 explains the optimization algorithm. Section 5 presents our experimental results, and finally, section 6 concludes this paper.

#### 2. Interconnect Pipeline Model

The first goal of this research is to develop a closed-form model that predicts the delay distribution of an interconnect pipeline and the slew distribution of all the nets in the circuit. The interconnect pipeline is assumed to be a two-pin circuit. The model includes transistor gate length and wire width as the process parameters that are random variables.

#### 2.1. Base Model

The base model provides the foundation for the statistical models by writing and fitting equations for the delay and output slew of one single interconnect segment, seen in figure 1. The curve fitting technique is widely used. However, approaches such as [12] are not statistical in nature and do not consider slew. Also, the equations must be functions of wire length and gate size for rapid solution evaluation during optimization.

Therefore, the key strategies in developing these equations are to write them as functions of input slew and to include inverter size and wire length for later use as optimization variables. In addition, the equations are linear functions of the random variables so that the future statistical equations remain closed-form. The inputs to this model for segment *i* are as follows: the gate length of the driving inverter ( $L_i$ ), the size of the driving inverter ( $I_i$ ), the input slew from the previous segment ( $S_{i-1}$ ), the size of the load inverter ( $I_{i+1}$ ), the wire width ( $W_i$ ), and the wire length ( $l_i$ ). The equations for the delay and slew of segment *i* are written as:

$$D_{i} = L_{i} \left(\frac{q_{1}}{I_{i}} + q_{2}\right) - W_{i} \left(\frac{q_{3}}{I_{i}} + q_{4}\right) + S_{i-1} \left(\frac{q_{5}}{I_{i}} + q_{6}\right) + I_{i+1} \left(\frac{q_{7}}{I_{i}} + q_{8}\right) + q_{9} \quad (1)$$

$$S_{i} = L_{i} \left( \frac{r_{1}}{I_{i}} + r_{2} \right) - W_{i} \left( \frac{r_{3}}{I_{i}} + r_{4} \right) + S_{i-1} \left( \frac{r_{5}}{I_{i}} + r_{6} \right) + I_{i+1} \left( \frac{r_{7}}{I_{i}} + r_{8} \right) + r_{9} \quad (2)$$

where  $q_k$  and  $r_k$  are fitting coefficients (functions of wire length). We derived the  $q_k$  and  $r_k$  coefficients using HSPICE, the 65nm models from Berkeley PTM [13], and wire resistance and capacitance calculations from [14]. The width and spacing were 200nm and 200nm based on the guidelines from [1]. To preserve the quadratic relationship between delay and wire length, the  $q_k$  and  $r_k$  coefficients were found separately for each wire length and fit to second order polynomial functions of wire length.
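As a minimal sketch of how equations (1) and (2) might be evaluated in software, the function below computes the shared linear form for one segment. The function name, argument names, and the numeric coefficients are hypothetical placeholders chosen for illustration; they are not the fitted values obtained from HSPICE.

```python
def segment_response(coeffs, L, W, S_prev, I, I_next):
    """Evaluate the common form of equations (1) and (2) for one segment.

    coeffs holds the nine fitted values (q_1..q_9 for delay, r_1..r_9 for
    slew), assumed to be already evaluated at this segment's wire length.
    """
    c1, c2, c3, c4, c5, c6, c7, c8, c9 = coeffs
    return (L * (c1 / I + c2)            # gate length term
            - W * (c3 / I + c4)          # wire width term
            + S_prev * (c5 / I + c6)     # input slew term
            + I_next * (c7 / I + c8)     # load inverter term
            + c9)                        # constant term


# Placeholder coefficients for illustration only (NOT the paper's fit).
q_demo = (40.0, 0.5, 0.2, 0.01, 2.0, 0.1, 1.5, 0.3, 5.0)
delay_demo = segment_response(q_demo, L=65.0, W=200.0, S_prev=50.0,
                              I=20.0, I_next=20.0)
```

Because the form is linear in $L_i$, $W_i$, and $S_{i-1}$, a unit change in gate length shifts the result by exactly $q_1/I_i + q_2$, which is the property the statistical equations rely on.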

The transistor gate length and inverter size cause the most error in the fitted equations. We simulated a range of 60nm to 70nm for the gate length and 12 to 26 times the minimum width for the inverter size and found that the fit was acceptable in these ranges. The average errors for the delay and slew prediction were near 0% while the worst error was 6.2% for the delay and -12.9% for the slew. However, the accuracy of the model need not be limited by one single fit for the entire range of parameters. If more accuracy is needed or the range of process variation is too large for one fit, the coefficients may be separately derived for different ranges of the model parameters.

Figure 1. Base segment variables

#### 2.2. Stage Model

Given the base equations, the stage model predicts the delay distribution of a pipeline stage and the output slew distribution for each net. A pipeline stage is defined as a source flip-flop, inverters driving long wire lengths, and a sink flip-flop. When cascading segments together, there are three key points that must be considered. First, spatial correlation among process parameters causes the correlated parameters to move in the same direction and the variance to increase significantly. Second, due to LDPV, each gate length and wire width may have its own distribution. Thus, the process parameters are modeled with a multivariate normal where each parameter may have its own mean and standard deviation. The correlations among the parameters are calculated with the model in [5]. The gate lengths and wire widths are assumed to be independent since they occur at different times during the process flow. The third key consideration is slew propagation. When global interconnect segments are cascaded, the output slew of one segment affects the delay and slew of all downstream segments. Thus, we write the delay and slew of a segment as functions of the upstream random variables.

For the stage model equations, the following substitutions are made to improve readability. The subscript of the coefficient represents the segment number of the coefficient. For the delay coefficients:  $d1_i = \frac{q_{1i}}{I_i} + q_{2i}$ ,  $d2_i = \frac{q_{3i}}{I_i} + q_{4i}$ ,  $d3_i = \frac{q_{5i}}{I_i} + q_{6i}$ , and  $d4_i = I_{i+1} \left(\frac{q_{7i}}{I_i} + q_{8i}\right) + q_{9i}$ . For the slew coefficients:  $s1_i = \frac{r_{1i}}{I_i} + r_{2i}$ ,  $s2_i = \frac{r_{3i}}{I_i} + r_{4i}$ ,  $s3_i = \frac{r_{5i}}{I_i} + r_{6i}$ , and  $s4_i = I_{i+1} \left(\frac{r_{7i}}{I_i} + r_{8i}\right) + r_{9i}$ . First, we write the delay of segment i in a set of cascaded segments as a function of the upstream random variables. We use (2) to recursively propagate the slew from the start of the circuit to segment i. The first segment is denoted with the subscript of 1, and  $S_0$  is the input slew to the interconnect pipeline stage.

$$D_{i} = d1_{i}L_{i} - d2_{i}W_{i} + d4_{i} + d3_{i}\sum_{j=1}^{i-1} \left[\left(\prod_{k=j+1}^{i-1} s3_{k}\right) \left(s1_{j}L_{j} - s2_{j}W_{j} + s4_{j}\right)\right] + d3_{i}S_{0}\prod_{k=1}^{i-1} s3_{k} \quad (3)$$

The delay of a pipeline stage,  $D_{ps}$ , with n segments is formulated as the summation of the segment delays,  $D_{ps} = \sum_{i=1}^{n} D_i$ . We rewrite  $D_{ps}$  as a linear function of the process parameters to facilitate statistical formulation:  $D_{ps} = D_{con} + \sum_{i=1}^{n} d_{L_i} L_i + \sum_{i=1}^{n} d_{W_i} W_i$ , where,

$$D_{con} = \sum_{i=1}^{n} d4_i + \sum_{i=1}^{n} d3_i S_0 \prod_{j=1}^{i-1} s3_j + \sum_{i=2}^{n} \sum_{j=1}^{i-1} d3_i s4_j \prod_{k=j+1}^{i-1} s3_k \quad (4)$$

$$d_{L_i} = d1_i + \sum_{j=i+1}^{n} d3_j \, s1_i \prod_{k=i+1}^{j-1} s3_k \quad (5)$$

$$d_{W_i} = -d2_i - \sum_{j=i+1}^{n} d3_j \, s2_i \prod_{k=i+1}^{j-1} s3_k \quad (6)$$
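The linearization can be checked numerically. The sketch below, under the assumption that each segment is described by tuples `(d1, d2, d3, d4)` and `(s1, s2, s3, s4)`, computes $D_{con}$, $d_{L_i}$, and $d_{W_i}$ and compares the linear form against a direct recursion that propagates slew with equation (2). All names and numeric values are illustrative placeholders, not values from the experiments.

```python
from math import prod


def stage_delay_recursive(d, s, L, W, S0):
    """Sum segment delays while propagating slew segment by segment."""
    total, slew = 0.0, S0
    for (d1, d2, d3, d4), (s1, s2, s3, s4), Li, Wi in zip(d, s, L, W):
        total += d1 * Li - d2 * Wi + d3 * slew + d4   # delay of this segment
        slew = s1 * Li - s2 * Wi + s3 * slew + s4     # output slew, eq. (2)
    return total


def stage_delay_linear(d, s, S0):
    """Return (D_con, d_L, d_W) in the spirit of equations (4) to (6)."""
    n = len(d)
    s3 = [seg[2] for seg in s]
    d_con = sum(seg[3] for seg in d)
    d_con += sum(d[i][2] * S0 * prod(s3[:i]) for i in range(n))
    d_con += sum(d[i][2] * s[j][3] * prod(s3[j + 1:i])
                 for i in range(1, n) for j in range(i))
    # Each upstream variable reaches a downstream segment j through d3 of j.
    d_L = [d[i][0] + sum(d[j][2] * s[i][0] * prod(s3[i + 1:j])
                         for j in range(i + 1, n)) for i in range(n)]
    d_W = [-d[i][1] - sum(d[j][2] * s[i][1] * prod(s3[i + 1:j])
                          for j in range(i + 1, n)) for i in range(n)]
    return d_con, d_L, d_W


# Hypothetical three-segment stage (placeholder numbers).
d_demo = [(2.5, 0.05, 0.30, 10.0), (2.4, 0.04, 0.25, 9.0), (2.6, 0.06, 0.35, 11.0)]
s_demo = [(0.8, 0.02, 0.20, 15.0), (0.7, 0.03, 0.25, 14.0), (0.9, 0.01, 0.15, 16.0)]
L_demo, W_demo, S0_demo = [64.0, 65.0, 66.0], [198.0, 200.0, 202.0], 40.0

d_con, d_L, d_W = stage_delay_linear(d_demo, s_demo, S0_demo)
linear = (d_con + sum(a * x for a, x in zip(d_L, L_demo))
          + sum(a * x for a, x in zip(d_W, W_demo)))
direct = stage_delay_recursive(d_demo, s_demo, L_demo, W_demo, S0_demo)
```

With these inputs `linear` and `direct` agree to floating-point precision, which is what makes the closed-form mean and variance possible.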

The output slew for segment *i* is similarly written:  $S_i = S_{con_i} + \sum_{j=1}^{i} s_{L_j} L_j + \sum_{j=1}^{i} s_{W_j} W_j$ , where,

$$S_{con_i} = S_0 \prod_{k=1}^{i} s3_k + \sum_{j=1}^{i} s4_j \prod_{k=j+1}^{i} s3_k \quad (7)$$

$$s_{L_j} = s1_j \prod_{k=j+1}^{i} s3_k \quad (8)$$

$$s_{W_j} = -s2_j \prod_{k=j+1}^{i} s3_k \quad (9)$$
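The slew coefficients follow from unrolling equation (2) from segment 1 to segment $i$. A short sketch, assuming each segment is given as a tuple `(s1, s2, s3, s4)` and using placeholder numbers, computes the constant and the per-variable coefficients and compares them with the recursion:

```python
from math import prod


def slew_linear(s, i, S0):
    """Return (S_con_i, s_L, s_W) for the output slew of segment i (1-indexed)."""
    s3 = [seg[2] for seg in s]
    s_con = S0 * prod(s3[:i]) + sum(s[j][3] * prod(s3[j + 1:i]) for j in range(i))
    s_L = [s[j][0] * prod(s3[j + 1:i]) for j in range(i)]
    s_W = [-s[j][1] * prod(s3[j + 1:i]) for j in range(i)]
    return s_con, s_L, s_W


def slew_recursive(s, i, L, W, S0):
    """Propagate slew through the first i segments with equation (2)."""
    slew = S0
    for (s1, s2, s3, s4), Lj, Wj in zip(s[:i], L, W):
        slew = s1 * Lj - s2 * Wj + s3 * slew + s4
    return slew


# Hypothetical three-segment example (placeholder numbers).
s_demo = [(0.8, 0.02, 0.20, 15.0), (0.7, 0.03, 0.25, 14.0), (0.9, 0.01, 0.15, 16.0)]
L_demo, W_demo, S0_demo = [64.0, 65.0, 66.0], [198.0, 200.0, 202.0], 40.0

s_con, s_L, s_W = slew_linear(s_demo, 3, S0_demo)
slew_from_linear = (s_con + sum(a * x for a, x in zip(s_L, L_demo))
                    + sum(a * x for a, x in zip(s_W, W_demo)))
slew_from_recursion = slew_recursive(s_demo, 3, L_demo, W_demo, S0_demo)
```

The products of `s3` terms show how the influence of an upstream segment decays as it passes through each downstream stage.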

Finally, the model equations need to be written in statistical form. We formulated the equations in such a way that the delay and slew remain linear with respect to the process parameters even after the slew propagation. Since the process parameters are modeled with multivariate normal distributions, the delay and slew distributions are also normal. Therefore, we have closed-form equations, (10) to (13), to calculate the mean and variance of the delay and slew distributions considering statistical slew propagation and correlations among process parameters.

$$D_{ps,\mu} = D_{con} + \sum_{i=1}^{n} d_{L_i} L_{i,\mu} + \sum_{i=1}^{n} d_{W_i} W_{i,\mu} \quad (10)$$

$$S_{i,\mu} = S_{con_i} + \sum_{j=1}^{i} s_{L_j} L_{j,\mu} + \sum_{j=1}^{i} s_{W_j} W_{j,\mu} \quad (11)$$

$$D_{ps,\nu} = \sum_{i=1}^{n} \left(d_{L_i} L_{i,\sigma}\right)^2 + \sum_{i=1}^{n} \left(d_{W_i} W_{i,\sigma}\right)^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2\rho_{Lij} \left(d_{L_i} L_{i,\sigma}\right) \left(d_{L_j} L_{j,\sigma}\right) + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2\rho_{Wij} \left(d_{W_i} W_{i,\sigma}\right) \left(d_{W_j} W_{j,\sigma}\right) \quad (12)$$

$$S_{i,\nu} = \sum_{j=1}^{i} \left( s_{L_j} L_{j,\sigma} \right)^2 + \sum_{j=1}^{i} \left( s_{W_j} W_{j,\sigma} \right)^2 + \sum_{j=1}^{i-1} \sum_{k=j+1}^{i} 2\rho_{Ljk} \left( s_{L_j} L_{j,\sigma} \right) \left( s_{L_k} L_{k,\sigma} \right) + \sum_{j=1}^{i-1} \sum_{k=j+1}^{i} 2\rho_{Wjk} \left( s_{W_j} W_{j,\sigma} \right) \left( s_{W_k} W_{k,\sigma} \right) \quad (13)$$

Here the subscripts $\mu$, $\sigma$, and $\nu$ denote the mean, standard deviation, and variance of the corresponding quantity.
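The mean and variance expressions are the standard moments of a linear combination of correlated normal variables. A minimal sketch, assuming the correlation coefficients are supplied as nested lists `rho[i][j]` and using hypothetical function names, could look like this:

```python
def linear_moments(coef, mu, sigma, rho):
    """Mean and variance of sum(coef[i] * X[i]) for correlated normals X."""
    n = len(coef)
    mean = sum(coef[i] * mu[i] for i in range(n))
    var = sum((coef[i] * sigma[i]) ** 2 for i in range(n))
    # Cross terms: 2 * rho_ij * (a_i * sigma_i) * (a_j * sigma_j) for i < j.
    var += sum(2.0 * rho[i][j] * coef[i] * sigma[i] * coef[j] * sigma[j]
               for i in range(n - 1) for j in range(i + 1, n))
    return mean, var


def stage_delay_moments(d_con, d_L, d_W, L_mu, L_sigma, W_mu, W_sigma,
                        rho_L, rho_W):
    """Stage delay mean and variance, treating L and W as independent sets."""
    mean_L, var_L = linear_moments(d_L, L_mu, L_sigma, rho_L)
    mean_W, var_W = linear_moments(d_W, W_mu, W_sigma, rho_W)
    return d_con + mean_L + mean_W, var_L + var_W


# Two unit-weight variables: independent versus fully correlated.
_, var_indep = linear_moments([1.0, 1.0], [0.0, 0.0], [1.0, 1.0],
                              [[1.0, 0.0], [0.0, 1.0]])
_, var_corr = linear_moments([1.0, 1.0], [0.0, 0.0], [1.0, 1.0],
                             [[1.0, 1.0], [1.0, 1.0]])
```

In this toy case full correlation doubles the variance relative to independence, which illustrates why assuming independent delays underestimates the spread.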

#### 2.3. Pipeline Model

In VLSI circuits, the maximum operating frequency is set by the worst case flip-flop to flip-flop delay. In a pipelined interconnect circuit with  $\kappa$  pipeline stages, the maximum delay among all the stages is written as  $D_m = \max(D_{ps_1}, D_{ps_2}, ..., D_{ps_\kappa})$ . To estimate this maximum, we use the method of [6], which is based on [15]. The estimate treats the maximum as a normal distribution, which is an approximation because the maximum of normal random variables is not exactly normal. Nonetheless, the estimation is later verified to be accurate. Lastly, given a target delay,  $D_{tg}$ , the timing yield, Y, of the global interconnect pipeline circuit is calculated with the following.

$$Y = \frac{1}{D_{m,\sigma}\sqrt{2\pi}} \int_{0}^{D_{tg}} \exp\left(\frac{-(x - D_{m,\mu})^{2}}{2D_{m,\sigma}^{2}}\right) dx \quad (14)$$
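To make the maximum and yield steps concrete, the sketch below applies Clark's two-variable moment formulas [15] to a pair of correlated normal stage delays and then evaluates the yield integral in (14) with the error function. It handles only two stages; extending it to $\kappa$ stages requires repeating the operation and tracking the correlation of the running maximum with the remaining stages, which is omitted here. Function names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def clark_max(mu1, var1, mu2, var2, rho):
    """Mean and variance of max(X1, X2) for jointly normal X1, X2 (Clark)."""
    a2 = var1 + var2 - 2.0 * rho * sqrt(var1 * var2)
    if a2 <= 0.0:
        # Degenerate case: the two variables differ only by a constant.
        return (mu1, var1) if mu1 >= mu2 else (mu2, var2)
    a = sqrt(a2)
    alpha = (mu1 - mu2) / a
    m1 = mu1 * cdf(alpha) + mu2 * cdf(-alpha) + a * pdf(alpha)
    m2 = ((mu1 ** 2 + var1) * cdf(alpha) + (mu2 ** 2 + var2) * cdf(-alpha)
          + (mu1 + mu2) * a * pdf(alpha))
    return m1, m2 - m1 * m1


def timing_yield(mu, sigma, d_tg):
    """Closed form of the integral in (14): normal mass between 0 and d_tg."""
    return cdf((d_tg - mu) / sigma) - cdf((0.0 - mu) / sigma)


# Sanity case: the maximum of two independent standard normals.
max_mean, max_var = clark_max(0.0, 1.0, 0.0, 1.0, 0.0)
```

For the sanity case the mean equals $1/\sqrt{\pi}$ and the variance equals $1 - 1/\pi$, the known values for the maximum of two independent standard normal variables.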

| Model estimate                       | Avg. error | Error std. dev. | Min. error | Max. error |
|--------------------------------------|------------|-----------------|------------|------------|
| **Output Slew Prediction Errors**    |            |                 |            |            |
| $\mu$                                | -0.533%    | 1.30%           | -5.14%     | 4.02%      |
| $\mu + 1\sigma$                      | -0.813%    | 1.67%           | -6.42%     | 4.47%      |
| $\mu + 2\sigma$                      | -1.09%     | 2.11%           | -7.63%     | 4.85%      |
| $\mu + 3\sigma$                      | -0.940%    | 2.58%           | -8.45%     | 5.59%      |
| **Pipeline Stage Prediction Errors** |            |                 |            |            |
| $\mu$                                | 0.780%     | 0.656%          | -0.627%    | 2.83%      |
| $\mu + 1\sigma$                      | 0.418%     | 0.688%          | -1.08%     | 2.54%      |
| $\mu + 2\sigma$                      | 0.118%     | 0.728%          | -1.64%     | 2.36%      |
| $\mu + 3\sigma$                      | 0.233%     | 0.861%          | -1.85%     | 3.31%      |
| **Maximum Delay Prediction Errors**  |            |                 |            |            |
| $\mu$                                | 0.549%     | 0.516%          | -0.581%    | 1.73%      |
| $\mu + 1\sigma$                      | 0.194%     | 0.541%          | -1.03%     | 1.34%      |
| $\mu + 2\sigma$                      | -0.109%    | 0.573%          | -1.40%     | 1.19%      |
| $\mu + 3\sigma$                      | -0.057%    | 0.725%          | -2.02%     | 1.28%      |

Table 1. Estimation Errors on Test Circuits

#### 3. Model Validation

To test the accuracy and precision of the model over a large number of circuits, 100 random interconnect pipeline circuits were constructed with a varying number of pipeline stages, segments per pipeline stage, total wire length, and inverter sizes. Each segment had a randomly selected gate length and wire width distribution. For each of these 100 random circuits, 2000 random instantiations were created considering spatial correlations. Each random circuit was simulated with HSPICE. Table 1 summarizes the error in predicting the delay and slew distributions at the 50%, 84.1%, 97.7%, and 99.9% points. Overall, the model is suitable to guide an optimization algorithm, as the error percentages in predicting the delay and slew are small and the models have good precision.

#### 4. Interconnect Pipeline Optimization

We propose **SGASPIP** (Statistical Greedy Algorithm for Sizing and Placement of Interconnect Pipelines) to minimize the area of an interconnect pipeline while meeting the delay and slew constraints. The area is defined as the total number of minimum widths in the circuit. The following is the problem statement. The inputs are the number of stages in the pipeline, the number of inverters in each stage, the location and size of the source and sink flip-flops, a set of possible inverter sizes and placement locations, the mean and variance for the gate length and wire width for each potential placement location, the  $X_L$  and  $\rho_b$  values for process parameter spatial correlations [5], and finally the  $\mu + 3\sigma$  delay and slew constraints. The locations and sizes of the source and sink flip-flops cannot change. Otherwise, the sizes and locations of the other flip-flops and inverters may be varied. We assume the wire width and spacing are fixed due to manufacturing limitations, so they are not considered as optimization variables. The algorithm returns the locations and sizes of the inverters and flip-flops from the minimum area circuit that meets the specified delay and slew constraints.

Figure 2. Example of location shifting

Since we consider both sizing and location, this is a non-trivial optimization problem. When LDPV is present, the delay and slew equations are more complicated than quadratic functions of location. Also, the circuit delay and output slew are functions of  $I^{-1}$ . The lack of a closed-form equation to calculate the maximum of the correlated stage delays further complicates the optimization process. Because this is a non-linear and non-convex optimization problem, an intelligent search heuristic is used to find a near optimal solution. Our algorithm is unique because none of the optimization methods in the literature *statistically optimize both the delay and slew* for global nets using *both sizing and location change*.

The first key element of our algorithm is the ability to evaluate the statistical sensitivity of the delay and slew with sizing and location changes. The closed-form equations show how both the mean and variance change with both sizing and location changes. Another key element is the evaluation of the strength of each solution. The delay deviation is defined as:  $DlyDev = D_{m_{\mu+3\sigma}} - DlyCon$ . The slew deviation is the summation of the distances from the constraint for the segments that violate the constraint:  $SlwDev = \sum_{i=1}^{\eta} \max(0, S_{i_{\mu+3\sigma}} - SlwCon)$ , where  $\eta$  is the total number of segments. The slew margin is similarly calculated:  $SlwMar = \sum_{i=1}^{\eta} \max(0, SlwCon - S_{i_{\mu+3\sigma}})$ . For a given area, the algorithm first considers delay deviation, then slew deviation, and finally slew margin in evaluating the strength of a solution.
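The three strength measures can be sketched directly from their definitions. The comparison key below is an assumption made for illustration: it ranks solutions of equal area lexicographically by smaller delay deviation, then smaller slew deviation, then larger slew margin, which is one plausible reading of the stated priority order and not necessarily the exact rule used in SGASPIP. Function names are hypothetical.

```python
def solution_strength(dly_m3s, slews_m3s, dly_con, slw_con):
    """Return (DlyDev, SlwDev, SlwMar) from mu+3sigma delay and slews."""
    dly_dev = dly_m3s - dly_con
    slw_dev = sum(max(0.0, s - slw_con) for s in slews_m3s)   # violations only
    slw_mar = sum(max(0.0, slw_con - s) for s in slews_m3s)   # slack only
    return dly_dev, slw_dev, slw_mar


def strength_key(strength):
    """Assumed ordering: lower key is a stronger solution at equal area."""
    dly_dev, slw_dev, slw_mar = strength
    return (dly_dev, slw_dev, -slw_mar)


# Illustrative numbers in ps (not taken from the experiments).
a = solution_strength(455.0, [58.0, 62.0, 61.0], dly_con=460.0, slw_con=60.0)
b = solution_strength(455.0, [59.0, 60.0, 60.0], dly_con=460.0, slw_con=60.0)
better = min((a, b), key=strength_key)
```

Here both candidates have the same delay deviation, so the second one is preferred because it has no slew violations.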

The **FindOptimalLocations** method is another crucial component of the algorithm. It works by finding the segments with the highest and lowest slews and shifting the appropriate segments, see figure 2, first to minimize the slew deviation and then to maximize the slew margin. This process repeats until the same solution has been visited twice, indicating that cycling occurred. This works well because it aids in meeting the slew constraints without increasing area.

Algorithm 1 SGASPIP Main

The pseudocode of **SGASPIP** is shown in Algorithm 1. In lines 6 to 9, **QuickDownsize** aggressively lowers the area by downsizing all segments with a certain amount of slew margin. In lines 10 to 15, **DownsizeOneBest** reduces area more slowly by downsizing one segment each time and avoiding changes that violate constraints. On line 20, **GlobalUpsize** upsizes all segments that have the worst margin relative to the slew constraint to intelligently reset the solution for global search.

#### 5. Optimization Experiment Results

With the model equations and **SGASPIP**, we next show three key experiment results. Table 2 summarizes the four test circuits for these experiments. The last row is important because it indicates whether the circuits are delay or slew limited, as this has a significant impact on the results.

First, ignoring LDPV during optimization causes both the delay and slew constraints to be violated. We optimized the four test circuits ignoring LDPV. Then, we solved these optimized circuits when the gate length mean increased linearly from 63nm at the sink to 67nm at the source. Table 3 shows that the timing yield loss increased to a significant 8.8% level for the delay limited circuit. Also, all the circuits that are not purely delay limited have a significant percentage of nets that violate the slew constraint.

Table 2. Test circuits for experiments

|                           | Ckt1  | Ckt2  | Ckt3  | Ckt4  |
|---------------------------|-------|-------|-------|-------|
| Total Length (µm)         | 6000  | 7200  | 12800 | 5400  |
| Pipe Stages               | 2     | 3     | 4     | 3     |
| Repeaters per Stage       | 10    | 8     | 10    | 6     |
| Source/Sink FF Size       | 20/20 | 22/22 | 20/20 | 18/18 |
| Placement Resolution (µm) | 10    | 10    | 20    | 10    |
| Dly Constraint (ps)       | 460   | 370   | 500   | 300   |
| Slw Constraint (ps)       | 60    | 60    | 75    | 60    |
| Dly/Slw Limited           | Both  | Both  | Dly   | Slw   |

Table 3. Ignoring LDPV causes violations

|                                      | Ckt1  | Ckt2  | Ckt3  | Ckt4  |
|--------------------------------------|-------|-------|-------|-------|
| **Ignore LDPV**                      |       |       |       |       |
| $Dly_{\mu}$ (ps)                     | 436.6 | 353.8 | 484.7 | 265.7 |
| $Dly_{\sigma}$ (ps)                  | 11.61 | 9.53  | 11.3  | 7.67  |
| Dly Yield Loss                       | 2.19% | 4.46% | 8.79% | 0%    |
| Slew Violations                      | 40.0% | 37.5% | 2.5%  | 41.1% |
| Worst $Slw_{\mu}$                    | 57.2  | 57.5  | 69.8  | 57.0  |
| Worst $Slw_{\sigma}$                 | 1.69  | 1.66  | 1.90  | 1.52  |
| Worst Slw Yield                      | 95.4% | 93.1% | 99.7% | 97.6% |
| **Consider LDPV**                    |       |       |       |       |
| $Dly_{\mu}$ (ps)                     | 427.8 | 344.6 | 471.7 | 261.3 |
| $Dly_{\sigma}$ (ps)                  | 10.64 | 8.36  | 9.41  | 7.04  |
| Dly Yield Loss                       | 0.12% | 0.12% | 0.13% | 0%    |
| **Increase Caused by Ignoring LDPV** |       |       |       |       |
| Dly Yield Loss                       | 17.7x | 37.5x | 67.1x | 0x    |
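As a consistency check on the delay yield losses reported in Table 3, the sketch below treats each maximum delay as normal with the tabulated mean and standard deviation and computes the probability of exceeding the delay constraint from Table 2. The lower integration limit of 0 in (14) is neglected here because the mean is many standard deviations above zero. The dictionary and function names are hypothetical.

```python
from math import erf, sqrt


def delay_yield_loss(mu, sigma, d_con):
    """Probability that a normal delay N(mu, sigma^2) exceeds d_con."""
    return 0.5 * (1.0 - erf((d_con - mu) / (sigma * sqrt(2.0))))


# (mean ps, std dev ps, delay constraint ps) with LDPV ignored, from Tables 2 and 3.
ignore_ldpv = {
    "Ckt1": (436.6, 11.61, 460.0),
    "Ckt2": (353.8, 9.53, 370.0),
    "Ckt3": (484.7, 11.3, 500.0),
    "Ckt4": (265.7, 7.67, 300.0),
}

losses = {name: delay_yield_loss(*vals) for name, vals in ignore_ldpv.items()}
for name, loss in losses.items():
    print(f"{name}: {100.0 * loss:.2f}%")
```

The computed losses land close to the tabulated 2.19%, 4.46%, 8.79%, and 0% values, which suggests the reported yield losses follow from the normal approximation of the maximum delay.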

The constraints are violated because ignoring the LDPV ignores a part of the variability of the circuit. Therefore, the second key experiment shows the area of the circuit can be better reduced when the difference between location dependent variability and residual random variability is understood. For the fully random case, the mean gate length and standard deviation are 65nm and 1.33nm everywhere. In the LDPV case, the mean gate length linearly increases from 63nm at the source to 67nm at the sink. The standard deviation is 0.67nm everywhere. The minimum  $-3\sigma$  and maximum  $3\sigma$  are actually the same for both circuits, but table 4 shows that when part of the variability is understood to be LDPV, the area can be better optimized by over 10%.
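The statement that both cases span the same $\pm 3\sigma$ gate length range can be verified with a few lines of arithmetic. This is a small illustrative check using the values quoted above; the function name is hypothetical.

```python
def three_sigma_range(mean_low, mean_high, sigma):
    """Smallest -3 sigma and largest +3 sigma gate length across the circuit (nm)."""
    return mean_low - 3.0 * sigma, mean_high + 3.0 * sigma


# Fully random case: mean 65nm everywhere, sigma 1.33nm.
all_random = three_sigma_range(65.0, 65.0, 1.33)
# LDPV case: mean ramps from 63nm to 67nm, residual sigma 0.67nm.
with_ldpv = three_sigma_range(63.0, 67.0, 0.67)
```

Both ranges come out at roughly 61nm to 69nm, differing by only about 0.02nm because of rounding in the quoted standard deviations. The total spread is therefore the same, and only its split between systematic and random parts differs.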

Lastly, sizing alone is not sufficient to create an optimal circuit that successfully meets the delay and slew constraints. The area result from the prior LDPV experiment is set as the target area for a size only optimization. An iterative algorithm starts with the maximum sizes and downsizes until it reaches the area target. Table 5 shows that the size only optimization causes a significant percentage of the nets to violate the slew constraint on the circuits that are not purely delay limited. Without the freedom to change location, the circuit cannot be optimized as well under equivalent area since slew greatly depends on wire length.

| Variation Type (total area) | Ckt1  | Ckt2  | Ckt3  | Ckt4  |
|-----------------------------|-------|-------|-------|-------|
| All random                  | 482   | 578   | 830   | 380   |
| Random & LDPV               | 426   | 502   | 730   | 338   |
| Area Decrease               | 11.6% | 13.1% | 12.0% | 11.1% |

Table 4. Area is better optimized when LDPV is understood

Table 5. Size only optimized circuit cannot meet slew constraints

|                 | Ckt1 | Ckt2  | Ckt3 | Ckt4 |
|-----------------|------|-------|------|------|
| Area Limit      | 452  | 538   | 776  | 356  |
| Dly Violation   | No   | Yes   | No   | No   |
| Dly Yield Loss  | N/A  | 0.15% | N/A  | N/A  |
| Slew Violations | 40%  | 46%   | 0%   | 56%  |
| Dly/Slw Limited | Both | Both  | Dly  | Slw  |

#### 6. Conclusion

This research develops accurate closed-form equations to predict the delay distribution of an interconnect pipeline stage and the slew distribution of every net. We present a unique algorithm, **SGASPIP**, that optimizes an interconnect pipeline. Experiments show that circuits optimized ignoring LDPV may have a significant timing yield loss and the area can best be reduced when the LDPV and residual random variation are separated. Lastly, sizing alone is not sufficient to best optimize the circuit because the slew constraints are much more difficult to meet. Future work includes adding a power constraint to the modeling and optimization.

#### References

- [1] "The International Technology Roadmap for Semiconductors," Semiconductor Industry Association, 2003.
- [2] P. Cocchini, "A methodology for optimal repeater insertion in pipelined interconnects," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Dec. 2003, vol. 22, no. 12, pp. 1613-1624.
- [3] L. Zhang, et al., "Statistical timing analysis in sequential circuit for on-chip global interconnect pipelining," *DAC*, June 2004, pp. 904-907.
- [4] M. Orshansky, et al., "Impact of spatial intrachip gate length variability on the performance of high-speed digital circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, May 2002, vol. 21, no. 5, pp. 544-553.
- [5] P. Friedberg, et al., "Modeling within-die spatial correlation effects for process-design co-optimization," *ISQED*, March 2005, pp. 516-521.
- [6] H. Chang and S. Sapatnekar, "Statistical timing analysis considering spatial correlations using a single PERT-like traversal," *ICCAD*, Nov. 2003, pp. 621-625.
- [7] C. Visweswariah, et al., "First-order incremental block-based statistical timing analysis," *DAC*, June 2004, pp. 331-336.
- [8] A. Agarwal, et al., "Statistical delay computation considering spatial correlations," *ASP-DAC*, Jan. 2003, pp. 271-276.
- [9] J. Xiong, et al., "Buffer insertion considering process variation," *DATE*, 2005, pp. 970-975.
- [10] M. Mani, et al., "An efficient algorithm for statistical minimization of total power under timing yield constraints," *DAC*, June 2005, pp. 309-314.
- [11] K. Chopra, et al., "Parametric yield maximization using gate sizing based on efficient statistical power and delay gradient computation," *ICCAD*, Nov. 2005, pp. 1020-1025.
- [12] A.I. Abou-Seido, et al., "Fitted Elmore delay: a simple and accurate interconnect delay model," *IEEE Transactions on VLSI Systems*, July 2004, vol. 12, no. 7, pp. 691-696.
- [13] Y. Cao, et al., "New paradigm of predictive MOSFET and interconnect modeling for early circuit simulation," *Proceedings of the IEEE 2000 Custom Integrated Circuits Conference*, May 2000, pp. 201-204.
- [14] S.C. Wong, et al., "Modeling of interconnect capacitance, delay, and crosstalk in VLSI," *IEEE Transactions on Semiconductor Manufacturing*, Feb. 2000, vol. 13, no. 1, pp. 108-111.
- [15] C.E. Clark, "The Greatest of a Finite Set of Random Variables," *Operations Research*, March-April 1961, pp. 145-162.