A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS

Sudhir Satpathy; Korey Sewell; Thomas Manville; Yen-Po Chen; Ronald Dreslinski; Dennis Sylvester; Trevor Mudge; David Blaauw

doi:10.1109/ISSCC.2012.6177098

Source

2012 IEEE International Solid-State Circuits Conference, pp. 478-480

Abstract

High-speed and low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to overall throughput and energy efficiency of high-performance systems [1,2]. Conventional routers use distinct logic blocks for routing data and handling arbitration [3,4]. At higher radices, connections between these blocks become a bottleneck, limiting router scalability and degrading performance. Recently, two switch topologies [5,6] merged the data-routing fabric with arbitration control, avoiding this bottleneck. However, [6] relies on centralized control for channel allocation, limiting performance, while [5] is restricted to a small set of fixed priorities, rendering input ports prone to starvation. In addition, ever-larger chip multiprocessors (CMPs) will require continued increases in bandwidth over previous designs. To address these issues, we present a 64×64 single-stage swizzle-switch network (SSN) with 128b data buses (8192 total input/output wires). The SSN can connect any input to any output, including multicast. It has a peak measured throughput of 4.5Tb/s at 1.1V in 45nm SOI CMOS at 25°C. The SSN's key features are: 1) a single-cycle least-recently-granted (LRG) priority arbitration technique that reuses the already present input and output data buses and their drivers and sense amps; 2) an additional 4-level message-based priority arbitration for quality of service (QoS) with 2% logic and 3% wiring overhead; 3) a bidirectional bitline repeater that allows the router to scale to >8000 wires. These features result in a compact fabric (4.06mm²) with a throughput gain of 2.1× over [5] at 3.4Tb/s/W efficiency, which improves to 7.4Tb/s/W at 600mV.
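The LRG and QoS arbitration named in the abstract can be illustrated with a small behavioral model. The following is only a sketch in Python, not the circuit described in the paper: the class name `LRGArbiter`, the method `grant`, and the request format are hypothetical, and the way the two schemes combine (the highest QoS level wins first, with LRG order breaking ties) is an assumption that the abstract does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LRGArbiter:
    """Behavioral model of one output port's arbiter (hypothetical, not the circuit)."""

    num_inputs: int
    # order[0] is the least recently granted input, i.e. the highest priority.
    order: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.order:
            self.order = list(range(self.num_inputs))

    def grant(self, requests: Dict[int, int]) -> Optional[int]:
        """Pick a winner among the requesting inputs.

        requests maps an input index to its QoS level (0..3, higher wins).
        Assumption: QoS level is compared first, LRG order breaks ties.
        """
        if not requests:
            return None
        top_level = max(requests.values())
        for inp in self.order:
            if requests.get(inp) == top_level:
                # Self-update: the winner becomes the most recently granted
                # input and therefore drops to the lowest priority.
                self.order.remove(inp)
                self.order.append(inp)
                return inp
        return None
```

With four inputs, repeated equal-priority requests from inputs 0, 1, and 2 are granted in rotation, so no requester starves; a single request at a higher QoS level is granted ahead of inputs that have waited longer at a lower level.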

Identifiers

Book ISSN:	0193-6530
Book ISBN:	978-1-4673-0376-7
Book e-ISBN:	978-1-4673-0377-4, 978-1-4673-0374-3, 978-1-4673-0375-0
DOI	10.1109/ISSCC.2012.6177098