OIL: A Nano-photonics Optical Interconnect Library for a New Photonic Networks-on-Chip Architecture

Duo Ding and David Z. Pan
ECE Dept. Univ. of Texas at Austin
1 Univ. Station C0803, Austin, TX 78712
ding@cerc.utexas.edu, dpan@ece.utexas.edu

ABSTRACT
In this paper, we present OIL, a parameterized Optical Interconnect Library of silicon nano-photonic devices for system-level interconnect planning and analysis and for low-power, high-performance design exploration under a new holistic photonic Networks-on-Chip architecture. The architecture places on-chip packet routing (photonic Network-on-Chip) and within-core wire routing (photonic waveguide routing) on a dedicated optical layer, which improves photonic silicon utilization, reduces power dissipation and raises communication throughput as technology scales down. Using OIL characterization, we analyze and discuss the proposed architecture in terms of power efficiency, communication latency and overall performance. We also discuss challenges such as the on-chip memory access bandwidth bottleneck and the cost efficiency of nano-photonic fabrication, together with some explorations of on-chip photonics integration for next generation chip multi-processors.

Categories and Subject Descriptors
B.7.2 [Hardware, Integrated Circuit]: Design Aids

General Terms
Algorithms, Design, Performance

Keywords
Computer Aided Design, Photonic Networks-on-Chip, Low Power, High Performance

1. INTRODUCTION
As noted in the International Technology Roadmap for Semiconductors [3], silicon system complexity grows exponentially with transistor count, fueled by smaller feature sizes and by demand for higher integration and performance at low cost. With such aggressive technology scaling, VLSI interconnect effects play an increasingly important role in the deep sub-micron regime. Below the 45nm technology node, traditional copper interconnect faces several walls, including on/off-chip communication bandwidth bottlenecks, clock frequency bottlenecks, large power dissipation and serious cross-talk noise. To keep up with Moore’s Law in the new Tera-bit super computing era, various alternative interconnect techniques [4,9,34,39,45] have been proposed and analyzed as potential solutions to these bottlenecks. Among them, the optical/photonic interconnect paradigm has attracted intense research (e.g., [4,6,12,17,26,46]) and is considered a potential quantum leap towards next generation VLSI on-chip interconnect technology.

The idea of introducing optical interconnect onto integrated circuit chips was first proposed by [14] in 1984. Although long-haul photonic interconnect based on optical fiber has spanned the globe since the 1990s, it was not until recently that research on intra/inter-chip photonic interconnect truly took off. At the PCB level, [50] proposed a fully embedded board-level optical interconnect scheme, from OWG (optical waveguide) fabrication to device integration. For inter/intra-chip communication, various high performance photonic devices have been researched and developed in both academia (e.g., [32,35]) and industry (e.g., [17,24,46]). EDA based physical synthesis flows for on-chip optical interconnect planning [13,33] have also been published, together with new architectures for high performance on-chip photonic interconnection. One important architecture is the photonic Networks-on-Chip paradigm [19,43], where data packets are routed on an on-chip photonic network with high speed photonic interconnects shared in a Time Division Multiplex (Wavelength Division Multiplex) manner.

As projected by [3,10,32], optical interconnect increasingly outperforms traditional Cu/low-k interconnect as technology scales down, offering high throughput, small propagation delay, low power consumption and a low soft error rate. On-chip optical interconnect also compares favorably with carbon nano-tube bundle interconnect [12] in power dissipation and communication latency/bandwidth. Given the recent advances in silicon micro/nano-photonic fabrication processes (e.g., [18,46,48]), the time is right for intensive research on both CAD synthesis and architecture level exploration for intra-chip photonics integration.

2. RELATED WORK AND OUR CONTRIBUTIONS
On-chip nano-photonic interconnect consists of silicon optical waveguides and opto-electrical/electro-optical conversion devices. Over the past few years, research on on-chip nano-photonic integration has mainly focused on two aspects: device level fabrication (e.g., [17,18]) and network architecture implementation (e.g., [41, 44, 49]). At the device fabrication level, various nano-photonic Giga-scale modulators (e.g., [15, 18, 31, 48]), photo-detectors (e.g., [11, 36, 38]), couplers, switches (e.g., [37, 47]), buffers, on-chip waveguides and on-chip WDM (Wavelength Division Multiplex) devices (e.g., [6, 20, 21]) have been demonstrated in both industry and academia. At the architecture level, inspired by the Network-on-Chip paradigm, many new on-chip photonic architectures have been proposed (e.g., [22, 41, 49]), together with novel network packet routing mechanisms [26, 41] and performance analysis [8]. CAD based performance driven synthesis for on-chip photonic integration has also been proposed, such as timing-driven on-chip optical waveguide routing for 3D system-on-package [33] and a power-driven routing framework for on-chip nanophotonic integration [13].

To further leverage the photonic Network-on-Chip paradigm for future generation Chip Multi-Processors, we first establish OIL: a parameterized library for low-power on-chip photonics integration CAD exploration, utilizing a collection of silicon compatible nano-photonic devices built on silicon-on-insulator. OIL (Optical Interconnect Library) allows us to quantitatively explore CAD optimization methods for on-chip photonics synthesis on system level in terms of power consumption and communication latency, etc., under various data constraints imposed by the device characterizations. To apply OIL, we present a new Photonic Networks-on-Chip architecture, incorporating within-core optical interconnect planning and core-to-core optical network routing onto a single layer for enhanced photonic silicon utilization.

The rest of this paper is organized as follows. Section 3 gives a detailed description of the Optical Interconnect Library, followed by Section 4, which presents a new architecture for photonic on-chip communications. In Section 5, OIL is applied to evaluate our proposed architecture in terms of performance improvement, power consumption, insertion loss and performance scalability. Section 6 concludes the paper with a brief summary and some potential future work.

3. OPTICAL INTERCONNECT LIBRARY

After a comprehensive study of current photonic device fabrication literature, we establish OIL for the first time: an Optical Interconnect Library for systematic and quantitative design explorations of on-chip nanophotonic interconnect. OIL is an extensible open set of devices including on-chip optical modulators, photodetectors, buffers, switches, couplers, optical waveguide model and on-chip WDM devices. Based on recent advancement in nano-photonic fabrication technique, OIL contributes to a closer collaboration between the device fabrication community, architecture design community and CAD optimization community towards a promising on-chip photonic integration solution for next generation Tera-flop super computing, as illustrated in Fig. 1. For more details regarding OIL and future updates, please refer to [2].

3.1 Nanophotonic Modulators

Nanophotonic modulators are used for on-chip electrical to optical data conversion. Under photonic networks-on-chip architecture, a modulator is to be inserted at each gateway (G) on every processing unit of a chip multi-processor. Current nanophotonic modulators in OIL fall into two classes: Mach-Zehnder structure modulator (e.g., [16, 18, 29, 40]) and ring resonator structure (e.g., [31, 52]).

3.1.1 Mach-Zehnder Modulator

Figure 2: (a) Working mechanism for a Mach-Zehnder photonic modulator, with modulation ON state in (b) and OFF state in (c), where state switching is controlled by electrode voltage

The working principle of a typical Mach-Zehnder modulator is briefly illustrated in Fig. 2: the refractive index of arm A is manipulated by control voltages on the electrode, leading to a phase modulation of the optical wave propagating through arm A. The ON condition is depicted in Fig. 2(b), where the phase shift in arm A is an integer multiple of $2\pi$. In Fig. 2(c), the control voltage results in an $n \cdot \pi$ phase shift (n is an odd integer), causing an OFF state on the output port. A modified arm with photonic crystal structure in OIL is visualized in Fig. 3(a), with its simulated electrical field amplitude spectrum in Fig. 3(b).
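The ON/OFF conditions above can be checked numerically with a minimal sketch. It assumes an ideal, balanced and lossless interferometer, for which the normalized output intensity is $\cos^2(\Delta\phi/2)$; this idealization and the function name `mzi_transmission` are illustrative assumptions, not part of OIL.

```python
import math


def mzi_transmission(delta_phi: float) -> float:
    """Normalized output intensity of an ideal, balanced Mach-Zehnder interferometer.

    delta_phi is the phase shift (radians) of arm A relative to the other arm.
    Assumes lossless 50/50 splitting and recombination (an idealization).
    """
    return math.cos(delta_phi / 2.0) ** 2


# Integer multiples of 2*pi give the ON state; odd multiples of pi give OFF.
for label, phi in [("2*pi", 2 * math.pi), ("pi", math.pi), ("3*pi", 3 * math.pi)]:
    print(f"phase shift {label}: T = {mzi_transmission(phi):.3f}")
```

A real device adds insertion loss and a finite extinction ratio, so the OFF state never reaches exactly zero transmission.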

3.1.2 Ring Resonator Modulator

Ring resonators modulate optical signals by selectively coupling signals from the optical waveguide; the selected wavelengths are the resonant wavelengths of the ring. As with the Mach-Zehnder modulator, the resonant wavelength is generally controlled by applying voltages across the ring structure.

A typical micro ring structure resonator is shown in Fig. 4 with OFF state FDTD simulation in Fig. 4(b) and ON state
Table 1: High level parameters of on-chip nano-photonic modulators

<table>
<thead>
<tr>
<th></th>
<th>Mach-Zehnder Optical Modulator</th>
<th>Ring Resonator Modulator</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mod</td>
<td>Mod2</td>
<td>Mod3</td>
</tr>
<tr>
<td>length</td>
<td>Mod2</td>
<td>Mod3</td>
</tr>
<tr>
<td>width</td>
<td>Mod2</td>
<td>Mod3</td>
</tr>
<tr>
<td>modulation rate</td>
<td>10 Gb/s</td>
<td>30 Gb/s</td>
</tr>
<tr>
<td>power consump</td>
<td>5.1 pJ/bit</td>
<td>600 mW</td>
</tr>
<tr>
<td>on-chip loss</td>
<td>12 dB</td>
<td>72 dB</td>
</tr>
<tr>
<td>wavelength</td>
<td>1550 nm</td>
<td>1550 nm</td>
</tr>
</tbody>
</table>

Table 2: High level parameters of on-chip nano-photonic photo-detectors

<table>
<thead>
<tr>
<th></th>
<th>Detector1</th>
<th>Detector2</th>
<th>Detector3</th>
<th>Detector4</th>
<th>Detector5</th>
<th>Detector6</th>
</tr>
</thead>
<tbody>
<tr>
<td>footprint</td>
<td>10X10 um²</td>
<td>7.4X50 um²</td>
<td>4.4X100 um²</td>
<td>10X10 um²</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>BW</td>
<td>29GHz</td>
<td>31.3GHz</td>
<td>29.4GHz</td>
<td>40GHz</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>Bit rate</td>
<td>50 Gbps</td>
<td>40 Gbps</td>
<td>40 Gbps</td>
<td>&gt;40 Gbps</td>
<td>40 Gbps</td>
<td>40 Gbps</td>
</tr>
<tr>
<td>wavelength</td>
<td>850,895nm</td>
<td>1356nm</td>
<td>1356nm</td>
<td>1327nm</td>
<td>1352nm</td>
<td>1352nm</td>
</tr>
<tr>
<td>quantum efficiency</td>
<td>40%</td>
<td>40%</td>
<td>40%</td>
<td>50%</td>
<td>50%</td>
<td>50%</td>
</tr>
<tr>
<td>operating voltage</td>
<td>1.0V</td>
<td>5.0V</td>
<td>2.0V</td>
<td>&lt;4.0V</td>
<td>2V</td>
<td>2V</td>
</tr>
<tr>
<td>dark current</td>
<td>&lt;24nA</td>
<td>169nA</td>
<td>26 nA</td>
<td>100nA</td>
<td>75nA for 1V bias</td>
<td>75nA for 1V bias</td>
</tr>
</tbody>
</table>

Note:

- Estimated/simulated or calculated based on: [16], [29], [40], [18], [31], [52], [25], [53], [11], [36], [51], [7], [21], [30], [47].

- Bias power, estimated from [29].

- RF power consumption for 10GHz signal.

- Theoretical projections or device simulation results using [1].

---

3.2 On-chip Photodetectors

Photodetectors perform the function of optical to electrical data conversions at each terminal node of an optical path. Being the last component on a photonic path, there are several key parameters to characterize for a detector: detecting bit rate serves as an important constraint for photonic communication link design because it imposes an upper-limit to the optical layer data throughput; power consumption is also crucial since detectors are used in large quantity for high fan-out nets; photo-detection power threshold is another key constraint for low power driven CAD optimizations. Under such a constraint, optical modulators and waveguide must be planned optimally to guarantee successful optical-electrical conversion at each terminal node (sink).
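The photo-detection threshold constraint can be sketched as a simple link budget check. This is a minimal illustration under stated assumptions: powers are expressed in dBm, a fan-out of N is modeled as an ideal equal power split costing 10·log10(N) dB, and all numeric values and function names are hypothetical rather than taken from OIL.

```python
import math


def received_power_dbm(launch_dbm: float, path_loss_db: float, fanout: int = 1) -> float:
    """Optical power arriving at one sink of a net.

    An ideal 1:fanout power split is assumed, which costs 10*log10(fanout) dB.
    """
    split_loss_db = 10.0 * math.log10(fanout)
    return launch_dbm - path_loss_db - split_loss_db


def link_is_feasible(launch_dbm: float, path_loss_db: float,
                     threshold_dbm: float, fanout: int = 1) -> bool:
    """True if the sink still receives at least the detection threshold."""
    return received_power_dbm(launch_dbm, path_loss_db, fanout) >= threshold_dbm


# Hypothetical numbers: 0 dBm launched, 8 dB path loss, -15 dBm detector threshold.
for n in (1, 4, 8):
    ok = link_is_feasible(0.0, 8.0, -15.0, fanout=n)
    print(f"fan-out {n}: {received_power_dbm(0.0, 8.0, n):.2f} dBm, feasible={ok}")
```

With these illustrative values a fan-out of 4 still meets the threshold while a fan-out of 8 does not, which is the kind of trade-off a low power driven optimizer has to resolve.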

Table 2 shows that current optical detectors provide fairly high throughput for optical to electrical data conversion within a relatively small footprint. However, ultra low power detectors with even smaller footprints are desired for high density on-chip photonic integration, since photodetectors are present on optical links in large quantities.

3.3 Switches, Couplers and Buffers

3.3.1 On-chip Nanophotonic Switch

Figure 5: (a) A 1×8 transport switch array built with Switch3-Trans in OIL; (b) (c) (d) (e) are simulated results for (a) under different electrode control voltages using [1].

On-chip nanophotonic switches can be employed to achieve division multiplexing functions and to construct core-to-core photonic networks for a chip multi-processor. There are various ways to implement an optical switch. Fig. 5 shows a 1×8 switch constructed from seven 1×2 transport switches from OIL. Ring resonators (Fig. 4) are also favored for switching/re-directing functions because of their compact footprint and relatively low insertion loss.
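The count of seven two-way switches for an eight-output array follows from arranging them as a binary tree. The sketch below assumes that tree arrangement and a uniform per-switch loss; the helper names and the 0.02 dB figure used in the example are illustrative assumptions.

```python
def switch_tree(num_outputs: int) -> tuple[int, int]:
    """Return (stages, switches) for a binary tree of two-way switches.

    num_outputs must be a power of two and at least 2.
    """
    if num_outputs < 2 or num_outputs & (num_outputs - 1):
        raise ValueError("num_outputs must be a power of two >= 2")
    stages = num_outputs.bit_length() - 1   # log2(num_outputs)
    switches = num_outputs - 1              # 1 + 2 + 4 + ... per stage
    return stages, switches


def route_bits(output_port: int, stages: int) -> list[int]:
    """Switch settings (0 = upper branch, 1 = lower branch) from root to leaf."""
    return [(output_port >> (stages - 1 - s)) & 1 for s in range(stages)]


stages, switches = switch_tree(8)
print(stages, switches)          # 3 stages, 7 switches
print(route_bits(5, stages))     # settings that select output port 5
print(stages * 0.02)             # path loss in dB for a hypothetical 0.02 dB per switch
```

Every path crosses one switch per stage, so the insertion loss of such an array grows with log2 of the output count rather than with the output count itself.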

3.3.2 On-chip Nanophotonic Coupler

Nanophotonic switches and couplers are important devices for our proposed holistic photonic NoC architecture since they make the within-core waveguide routing possible in a Gridless Single Layer with Coupling manner. Shown in Fig. 6 is the working principle of an optical coupler. Un-
Table 3: High level parameters of on-chip nano-photonic switches/rings/coupler

<table>
<thead>
<tr>
<th></th>
<th>Switch1_Ring</th>
<th>Switch2_Rings</th>
<th>Transport switch</th>
<th>Switch3_Trans</th>
<th>Optical coupler</th>
</tr>
</thead>
<tbody>
<tr>
<td>ring radius</td>
<td>1.8um</td>
<td>4um×5rings</td>
<td>length ≈ 10um</td>
<td>length ≈ 15um</td>
<td></td>
</tr>
<tr>
<td>coupling gap</td>
<td>0.2um</td>
<td>0.2um</td>
<td>width 2.5um</td>
<td>width &lt;0.6um</td>
<td></td>
</tr>
<tr>
<td>passing loss</td>
<td>&lt;0.01dB³</td>
<td>&lt;0.3dB³</td>
<td>coupling loss ≈ 0.02dB</td>
<td>coupling loss ≈ 0.02dB</td>
<td></td>
</tr>
<tr>
<td>coupling loss</td>
<td>&lt;0.5dB⁴</td>
<td>&lt;2.5dB⁴</td>
<td>OWG bend loss(r=3um)</td>
<td>OWG bend loss ≈ 0.05dB</td>
<td></td>
</tr>
</tbody>
</table>

³ to ⁴: as marked in Table 2

Figure 6: Working mechanism for an optical coupler in OIL, simulated with [1]

Figure 7: Above: A ring-switch based nanophotonic buffer in OIL; Below: FDTD simulation for the on-chip optical buffer.

3.3.3 On-chip Nanophotonic Buffer

Nanophotonic buffers (Table 4) provide on-chip photonic signal delay/buffering for some special purposes. So far, nanophotonic buffers are not commonly used in on-chip applications, since packet switching based core-to-core communication operates in a globally asynchronous manner and there are no sequential logic functions on the photonic Network-on-Chip layer that require buffering to meet timing requirements.

An FDTD simulated buffer with 5 stages of coupling ring switches from OIL is shown in Fig. 7, with a plot of its transient behavior on the bottom.

3.4 On-Chip Optical Waveguide

Table 4: High level parameters of nano-photonic buffers

<table>
<thead>
<tr>
<th></th>
<th>Buffer1_MPF</th>
<th>Buffer2_CROW</th>
<th>Buffer3_CROW</th>
</tr>
</thead>
<tbody>
<tr>
<td>footprint</td>
<td>0.69 mm²</td>
<td>0.045 mm²</td>
<td></td>
</tr>
<tr>
<td>ring radius</td>
<td>6.5 um</td>
<td>6.5 um</td>
<td></td>
</tr>
<tr>
<td>ring number</td>
<td>0.56 A/P</td>
<td>100 CROW</td>
<td></td>
</tr>
<tr>
<td>coupling gap</td>
<td>0.2 um</td>
<td>0.2 um</td>
<td></td>
</tr>
<tr>
<td>insertion loss</td>
<td>0.35 dB</td>
<td>0.35 dB</td>
<td></td>
</tr>
<tr>
<td>buffer cap</td>
<td>10 bits at 20Gbps</td>
<td>1 bit at 9Gbps</td>
<td></td>
</tr>
</tbody>
</table>

³ as in Table 2

Figure 8: Sources of loss for on-chip photonic waveguide

Under Grid-Less Single-Layer Routing with Couplings (details in [13]), within-core optical waveguide routing becomes very flexible. To characterize an optical path, OIL defines 3 types of waveguide loss in equations 1–4, as illustrated in Fig. 8: \( P_{loss} \) is the waveguide propagation loss, proportional to the length of the optical interconnect with a coefficient \( \alpha \); \( B_{loss} \) is the bending loss, which depends on the arc angle \( \theta \) of the optical interconnect (silicon waveguide) bend and on the bend radius \( r \); \( C_{loss} \) is the coupling loss, proportional to the number of couplers (crossings) on the interconnect with a coefficient \( \gamma \) in dB.

\[
\begin{aligned}
Total_{loss} &= P_{loss} + B_{loss} + C_{loss} \quad (1) \\
P_{loss} &= \alpha \cdot length_{path} \quad (2) \\
B_{loss} &= \beta \cdot f(\theta) \cdot g(r) \quad (3) \\
C_{loss} &= \gamma \cdot Num_{couplers} \quad (4)
\end{aligned}
\]
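The loss model above can be sketched in a few lines of Python. The source does not specify \( f(\theta) \) or \( g(r) \), so this sketch substitutes placeholder forms (loss proportional to arc angle, decaying exponentially with radius) purely for illustration; `beta`, `gamma`, `r0_um` and the summation over several bends are likewise assumptions. Only \( \alpha = 1.5 \) dB/cm is a value quoted in the text, for the 450 nm wide waveguide.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Bend:
    theta_deg: float   # arc angle of the bend
    radius_um: float   # bend radius


def total_loss_db(length_cm: float, bends: Sequence[Bend], num_couplers: int,
                  alpha: float = 1.5, beta: float = 1.0,
                  gamma: float = 0.05, r0_um: float = 5.0) -> float:
    """Total_loss = P_loss + B_loss + C_loss, all in dB.

    f(theta) = theta / 90 and g(r) = exp(-r / r0) are PLACEHOLDER forms,
    not the functions used by OIL; B_loss is summed over all bends.
    """
    p_loss = alpha * length_cm
    b_loss = sum(beta * (b.theta_deg / 90.0) * math.exp(-b.radius_um / r0_um)
                 for b in bends)
    c_loss = gamma * num_couplers
    return p_loss + b_loss + c_loss


path = [Bend(90.0, 3.0), Bend(45.0, 20.0)]
print(f"{total_loss_db(0.3, path, num_couplers=2):.3f} dB")
```

Keeping the three terms separate makes it easy to swap in fitted bend-loss curves once simulation data such as Fig. 9 is available.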

Fig. 9 plots the simulated total insertion loss of a bending waveguide, obtained using [1]. At small bending radius (<30um), the sidewall surfaces of the bending waveguide are the dominant source of total insertion loss \( Total_{loss} \); as the radius grows, bending loss \( B_{loss} \) approaches zero and waveguide propagation loss \( P_{loss} \) becomes the major source of loss. In OIL, \( \alpha \) is set to 1.5dB/cm for an optical waveguide with a 450 nm wide, 230 nm thick silicon core on insulator, and to 4.5dB/cm for an optical waveguide with a 200 nm wide, 100 nm thick silicon core \( (n \approx 3.46) \) on silicon dioxide \( (n \approx 1.46) \). OIL calculates \( Total_{loss} \) of

Figure 9: Total simulated insertion loss of a bending optical waveguide (200nm wide, 100nm thick, \( n=3.5 \)) with small bending radius \( (1um–31um) \) and small bending angle (0–60 degrees) in OIL.

3.5 WDM On-Chip

The WDM (Wavelength Division Multiplex) technique has long played an active role in long haul optical communication. In WDM, multiple signals are modulated onto light beams of different wavelengths and transmitted through a single optical fibre (or via free space) in a wavelength multiplexed manner. Although major challenges remain before WDM becomes viable at on-chip scale, recent device fabrication advances such as [6, 7, 21, 27] show promise for major breakthroughs in the near future.

Table 5 lists a few WDM related devices in OIL with some key high level parameters. While on-chip WDM device design and fabrication face major challenges, WDM holds considerable potential for high throughput photonic NoC.
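The throughput argument for WDM can be made concrete with a small sketch. It assumes each wavelength channel is limited by the slower of its modulator and detector, and that channels add up without crosstalk penalty; the channel count of 8 is hypothetical, while 10 Gb/s and 40 Gbps are rates that appear in Tables 1 and 2.

```python
def link_throughput_gbps(num_wavelengths: int,
                         modulator_gbps: float,
                         detector_gbps: float) -> float:
    """Aggregate throughput of one waveguide carrying several wavelengths.

    Each channel runs at the rate of its slowest end point (an idealization
    that ignores crosstalk and multiplexer loss).
    """
    per_channel = min(modulator_gbps, detector_gbps)
    return num_wavelengths * per_channel


# Hypothetical 8-wavelength link with a 10 Gb/s modulator and a 40 Gbps detector.
print(link_throughput_gbps(8, 10.0, 40.0))
```

In this idealized view the modulator, not the detector, caps each channel, so adding wavelengths is the main lever for raising throughput on a single waveguide.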

4. A HOLISTIC PHOTONIC NETWORK-ON-CHIP

Network-on-chip architectures arose as a special class of solutions for chip multi-processor communication efficiency, where high speed electrical wires are shared in a Time Division Multiplex manner on a dedicated electrical network for core-to-core data packet routing. Despite its many advantages, such an architecture brings no true relief of on-chip power dissipation [5] on the electrical layer. Photonic NoC preserves the advantages of electrical NoC while offering higher bandwidth/throughput and lower power consumption on a silicon photonic layer.

Based on previous photonic NoC, our proposed architecture in this paper combines photonic waveguide routing and network routing together onto a dedicated on-chip optical layer for improved core performance and enhanced silicon utilization.

Such a regulated architecture allocates the photonic layer resources in a systematic manner thus contributing towards sustainable high density on-chip photonic integration for future technology nodes. In the long run, the employment of a dedicated photonic silicon layer contributes towards high area utilization, mask reusability for existing CMOS silicon/metal layers and high flexibility for photonic layer design explorations.

4.1 Architecture Overview

Fig. 10(a) shows a chip multi-processor on the electrical layer; Fig. 10(b) shows a generalized conventional photonic NoC architecture. Although a non-blocking photonic NoC requires some extra infrastructure, the overall occupancy of the photonic silicon layer remains low under such an architecture, since only a small portion of the layer is used. Starting from this architecture, our approach aims to further improve photonic silicon utilization for whole-chip performance enhancement.

An overview of our proposed holistic architecture on photonic layer is shown in Fig. 10(c), and whole chip top-view is in Fig. 10(d) with processor electrical layers stacked with our proposed photonic network. Such a new architecture is composed of two major parts:

- A global photonic routing network-on-chip for efficient core-to-core communication, with links shared in Time-DM / Space-DM / Optical-WDM manner
- A set of within-core optical waveguide routings for a properly selected set of nets on the low latency photonic layer, for timing (performance) improvement of each processing unit

The first part aims at the design of high throughput / bandwidth core-to-core communication with low power consumption, while the second part must be properly supported by an optimized CAD flow for performance and/or power driven objectives, subject to various constraints parameterized by OIL.

4.2 Wire and Packet Routing

On-chip optical interconnect differs from traditional copper-based interconnect in many respects, notably lower power dissipation and lower signal propagation latency. While the RC delay of a copper wire increases quadratically with wire length, photonic interconnect latency increases only linearly, at a constant group velocity. In the following sections, we describe our new photonic architecture, which combines optical wire routing (within a processing unit) and optical packet routing (between processing units) as a holistic approach to chip multi-processor performance improvement for next generation super computing.

4.2.1 Within-Core Optical Wire Routing

Given the unique properties of on-chip photonics, we propose a Gridless Single Layer with Coupling based optical routing technique for the optical netlist. The routing rationale is illustrated in Figure 11, where 2 nets are to be routed within a core; a pin is denoted pin-i-j, meaning the jth member of net i. Fig. 11(a) and (b) show two alternatives for conventional routing on the electrical layer, with buffers and/or metal vias inserted to alleviate the timing penalty caused by long wires across the chip. Buffers are inserted because RC delay increases quadratically with electrical wire length. Yet buffer insertion is not all-powerful, and timing on global nets is generally hard to close with traditional copper wires.

Fig. 11(c)-(f) show 4 possible routing geometries for the 2 nets on the optical layer based on our optical routing, where nanophotonic devices such as modulators, photodetectors, couplers and optical waveguides are integrated on a silicon photonic layer, subject to coupling loss, waveguide bending loss and photodetection threshold constraints. Coupling-enabled gridless planar routing suits the optical layer because of the unique properties of photonics; power-driven optical routing can therefore be formulated as a CAD optimization problem under insertion loss constraints (waveguide bending, coupling) and detection constraints (data conversion at photodetectors).

The overall CAD flow for optical netlist mapping/routing is illustrated in Fig. 12, where a timing-driven procedure selects a proper set of global nets from each core to be routed on the photonic layer. An optimized mapping can improve chip timing, since signal propagation delay on optical interconnect greatly outperforms that on copper interconnect as technology scales down and chip frequency scales up [10,32].

After the mapping, performance (power/timing) driven interconnect routing is carried out on the electrical and photonic layers simultaneously. Nets mapped onto the photonic layer are routed in a Gridless Single Layer with Coupling manner, while the remaining nets are routed on metal layers. Please refer to [13] for more details on applying OIL to on-chip optical routing.

4.2.2 Core-to-Core Network/Packet Routing

High throughput core-to-core communication for chip multi-processors has recently been moved onto the photonic layer to enhance performance towards Tera-flop super computing [41, 43]. Fig. 10(a) shows a traditional many-core-on-chip processor on the electrical layer; Fig. 10(b) shows a photonic network architecture on the optical layer providing core-to-core communication for Fig. 10(a). The major nanophotonic components used to construct (b) are shown in Fig. 13, drawn to scale with a 3mm by 3mm core. Several current designs of photonic network router R [8,41,42] are depicted in Fig. 14, where (a)-(c) are non-blocking 4X4 routers with relatively large footprints (∼ 500 um) and (d) is a blocking 4X4 router with a compact footprint (∼ 70 um). High speed optical waveguide is illustrated as thick lines in

Figure 11: Illustration of within-core photonic interconnect planning vs. traditional electrical wire planning, where (a)(b) are 2 possible electrical routing scenarios for pin1-1/2/3 and pin2-1; (c)(d)(e)(f) are 4 possible optical routings for core logic.

**Figure 12:** Illustration for an electro-optic co-synthesis CAD flow for timing improvement of within-core interconnect (targeting at future technology nodes)

**Figure 13:** Major components for constructing photonic Networks-on-Chip, where CORE is the electrical processor, R is the photonic layer network router, G is the gateway connecting electrical layer and photonic silicon layer, OWG stands for optical waveguide (components drawn in scale based on [8,41,42])

**Figure 14:** Four optical routers for photonic Network-on-Chip from [8,41,42], where (a)-(c) are non-blocking photonic network routers and (d) is a blocking photonic network router

In this paper, we adopt the photonic NoC architecture proposed by [41,43] for photonic layer core-to-core communication, as part of our proposed holistic photonic NoC, which will be analyzed and discussed in Section 5.

5. EVALUATION AND DISCUSSION

In this section, we will apply OIL to our proposed holistic photonic networks-on-chip architecture for some performance evaluations and CAD optimization explorations.

5.1 Performance Improvement Analysis


Figure 15: Post routing interconnect delay histogram comparisons between electrical routing and proposed hybrid routing.

Fig. 15 illustrates a qualitative perspective of on-chip processing unit (core) performance improvement with the proposed CAD flow in Fig. 12, compared with pure electrical interconnect planning. On one hand, the low latency property of photonic interconnect contributes to timing improvement (higher clock frequency) of each core-on-chip; on the other hand, there is also modulation and demodulation time overhead (decreasing as device fabrication technology advances) for data to be converted to and from the optical layer. Under such a scenario, the optimal timing improvement (as marked in Fig. 15) corresponds to a proper mapping of a subset of electrical netlist from the processing core metal layers onto the photonic layer, which can be formulated as a CAD optimization problem under various constraints, such as power budget, optical interconnect insertion loss, photodetection threshold and integration density on the photonic layer, etc. Applying OIL, we can explore various trade-offs and CAD optimizations for on-chip photonic integration towards next generation high performance chip multi-processor.
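The mapping trade-off can be sketched as a per-net delay comparison. This is a minimal illustration, not the CAD flow of Fig. 12: the electrical delay uses an unbuffered Elmore-style estimate 0.5·r·c·L², the optical delay is flight time at a constant group velocity plus a fixed conversion overhead, and every parameter value (wire resistance and capacitance, group index, overhead) is a hypothetical placeholder.

```python
C_MM_PER_PS = 0.299792458  # speed of light in vacuum, mm per ps


def electrical_delay_ps(length_mm: float, r_ohm_per_mm: float = 400.0,
                        c_ff_per_mm: float = 200.0) -> float:
    """Unbuffered distributed-RC estimate: 0.5 * r * c * L^2 (ohm * fF = 1e-3 ps)."""
    return 0.5 * r_ohm_per_mm * c_ff_per_mm * length_mm ** 2 * 1e-3


def optical_delay_ps(length_mm: float, group_index: float = 4.2,
                     conversion_overhead_ps: float = 100.0) -> float:
    """Linear flight time plus modulation/demodulation overhead."""
    return length_mm * group_index / C_MM_PER_PS + conversion_overhead_ps


def select_optical_nets(net_lengths_mm: dict[str, float]) -> list[str]:
    """Nets whose optical delay beats their electrical delay."""
    return sorted(name for name, length in net_lengths_mm.items()
                  if optical_delay_ps(length) < electrical_delay_ps(length))


nets = {"n1": 0.5, "n2": 1.0, "n3": 2.0, "n4": 5.0}
print(select_optical_nets(nets))
```

Because the electrical term grows quadratically and the optical term linearly, only nets beyond a crossover length benefit, and that crossover shrinks as the conversion overhead falls. A real flow would also weigh power budget, insertion loss and integration density before mapping a net.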

5.2 Interconnect Insertion Loss Analysis

Insertion loss (power loss) is defined as follows in units of dB, where $Power_{in}$ is the input optical power and $Power_{out}$ is the output optical power of a given device:

$$Loss_{dB} = -10 \log_{10} \frac{Power_{out}}{Power_{in}}$$
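As a quick numerical check of the definition above, the following sketch converts between power ratios and dB. The function names are illustrative; the only fact relied on beyond the formula is that losses of cascaded devices add in dB because their power ratios multiply.

```python
import math


def insertion_loss_db(power_in: float, power_out: float) -> float:
    """Insertion loss in dB from input and output optical power (same units)."""
    return -10.0 * math.log10(power_out / power_in)


def transmitted_fraction(loss_db: float) -> float:
    """Inverse relation: fraction of input power that survives a given loss."""
    return 10.0 ** (-loss_db / 10.0)


print(f"{insertion_loss_db(1.0, 0.5):.2f} dB")   # halving the power
print(f"{transmitted_fraction(12.0):.4f}")       # fraction left after 12 dB
# Two cascaded devices: ratios multiply, so dB values add.
print(f"{insertion_loss_db(1.0, 0.5 * 0.5):.2f} dB")
```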

Using OIL, the photonic network-on-chip routers in Fig. 14 are analyzed as basic building blocks for core-to-core optical routing networks. Best/worst/average losses (considering coupling loss, waveguide crossing loss and waveguide propagation loss) are simulated for each router and reported in Fig. 16. Router(a) is the lossiest of the three non-blocking routers, while blocking routers such as Router(d) usually have a small loss figure owing to their simple design structure (far fewer internal waveguide bends and/or couplings).

Based on these data, various architecture level analyses can be carried out, such as insertion loss distribution analysis over network packet routing paths to detect, test or validate possible optical-to-electrical data conversion failures at certain gateways on a chip, given a specific global network architecture.
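One such analysis can be sketched as a worst-case path loss check. The sketch assumes an n×n mesh with dimension-ordered routing, one router per core and one core-width of waveguide per hop; the mesh topology, the 1.0 dB per-router loss and the 10 dB budget are hypothetical, while the 3 mm core size and the 1.5 dB/cm propagation loss are values quoted elsewhere in the text.

```python
def worst_case_path_loss_db(mesh_size: int, router_loss_db: float,
                            hop_length_cm: float = 0.3,
                            alpha_db_per_cm: float = 1.5) -> float:
    """Loss of the longest corner-to-corner path in a mesh_size x mesh_size mesh.

    The path makes 2*(mesh_size-1) hops and traverses one more router than hops.
    """
    hops = 2 * (mesh_size - 1)
    routers = hops + 1
    return routers * router_loss_db + hops * hop_length_cm * alpha_db_per_cm


def gateway_can_detect(mesh_size: int, router_loss_db: float,
                       loss_budget_db: float) -> bool:
    """True if even the worst path stays within the allowed loss budget."""
    return worst_case_path_loss_db(mesh_size, router_loss_db) <= loss_budget_db


for n in (4, 5):
    loss = worst_case_path_loss_db(n, router_loss_db=1.0)
    print(f"{n}x{n} mesh: {loss:.2f} dB, ok={gateway_can_detect(n, 1.0, 10.0)}")
```

Under these placeholder numbers a 4×4 mesh stays inside the budget while a 5×5 mesh does not, showing how per-router loss bounds the network size a given detector threshold can support.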

5.3 Multi-Core Scalability Discussion


Figure 17: Chip multi-processor scalability bottleneck (curve A,C) and potential improvement targets (curve B,D)

One of the major challenges for multi-core scaling lies in on-chip memory access bandwidth. As illustrated by curve A in Fig. 17, chip performance is expected to decrease as more and more cores are integrated on-chip, because the on-chip memory access bandwidth resource saturates (curve C). Promising solutions have been demonstrated recently, such as 3D IC [28] and a RAM-aware NoC routing methodology [23]. In particular, in the Circa chip project [44], targeted at year 2017, a dedicated 3D IC on-chip memory layer and an on-chip photonic network layer are combined to deliver scalable memory access bandwidth and high throughput optical communication (e.g., achieving curve B and curve D in Fig. 17). Over the years, we expect to see more innovations along this line towards next generation super computing CMP design.

Other challenges that need to be properly addressed for viable on-chip photonic integration include, but are not restricted to, high performance low power architecture design and CAD synthesis, further advancement of device fabrication, and cost reduction for on-chip nanophotonic devices.

6. CONCLUSION

In this paper, we proposed OIL (Optical Interconnect Library), a characterized collection of silicon nano-photonic devices for system level interconnect planning and low power high performance design/synthesis explorations towards a new holistic photonic Networks-on-Chip paradigm. Such an architecture incorporates on-chip packet routing (photonic NoC) and within-core wire routing (optical waveguide planning) onto a single optical layer for better photonic silicon integration towards future generation CMPs. The proposed architecture is analyzed and discussed with OIL components for power efficiency, communication latency and potential future work directions.

7. ACKNOWLEDGMENT

This work is supported in part by the Texas Norman Hackerman Advanced Research Program.

8. REFERENCES

[1] RSoft Photonics CAD Suite version 5.1.7, by RSOFT Inc.