A 1Tbit/s bandwidth PHY is demonstrated on a 2.5D CoWoS platform. Two chips, an SoC and an eDRAM, have been fabricated in TSMC 40nm CMOS technology and stacked on a silicon interposer chip in 65nm technology. Experimental results confirm a 1024-DQ bus operating at 1.1Gbit/s with Vmin=0.3V. A novel timing compensation mechanism is presented to achieve a low-power and small area eDRAM PHY...
A 3D IC heterogeneous chip integration of a 65nm RF receiver, a 28nm baseband processor, and a 40nm DRAM on a proprietary CoWoS structure is demonstrated. Electrical characterization of the KGS (Known Good Stack) reveals system performance highly comparable to the KGD (Known Good Die) testing data. Moreover, an innovative system BIST (Built-in-Self-Test) scheme and methodology...
This work demonstrates a 3D vertical-gate (3DVG) NAND Flash with circuit-level techniques to overcome degradations in speed, yield, and reliability resulting from cross-layer process variations. The key enablers include: (1) layer-aware program-verify-and-read (LA-PV&R), (2) layer-aware-bitline-precharge (LA-BP), and (3) a wave-propagation (WP) fail-bit detection (FBD) scheme. A fabricated 2-layer...
3D Integrated Circuit (3D-IC) opens architecture opportunities for improved SoC-to-memory interconnect bandwidth between dies. This paper presents the design of a two-tier 3D-IC composed of one NoC-based MPSoC and one multi-channel WideIO mobile SDRAM stacked in a face-to-back configuration. Measurements of the 3D-IC show that the targeted 12.8 GByte/s bandwidth is achieved in worst case conditions,...
A “scalable 3D-FPGA” using TSV interconnects is proposed. This FPGA was designed on the basis of homogeneous 3D-stacking to extend the logic scale in proportion to the number of stacked layers. To improve Z-axis transmission performance, a wafer-to-wafer stacking process that lowers TSV capacitance was developed. An “embedded TSV” design for shorter on-chip wiring was also devised. Moreover,...
A 6-port, 2-lane packet-switched input-buffered wormhole router forms the key building block of a 2×2 2D mesh network-on-chip (NoC). The router operates across a wide frequency (voltage) range of 1GHz (0.85V) to 67MHz (340mV), dissipating 28.5mW to 675µW and achieves 3.3X improvement in energy-efficiency at an optimum supply voltage (VOPT) of 400mV. The resilient router incorporates an end-to-end...
A 0.5V, 10MHz, 9mW image processor with a 320-processing-element (PE) SIMD and a 32bit CPU has been developed using 40-nm CMOS. High voltage clock distribution (HVCD) reduces the number of excessive hold buffers required in a 0.5-V logic circuit design, thereby reducing the area, delay, and energy of the SIMD by 14%, 13%, and 6%, respectively. The 0.5-V SIMD with HVCD achieves an energy efficiency...
We have realized the characterization of MOSFET noise up to 3 GHz by locating a low-noise (LN) transimpedance amplifier (TIA) close to the devices to be tested (DUTs). A noise floor as low as 3 pA/√Hz was achieved by using an external high-voltage input. Moreover, a high-frequency noise probe equipped with a TIA IC was fabricated, with which measurements in a frequency range up to 800 MHz were achieved...
This paper presents the implementation details and silicon results of a 2.6GHz dual-core ARM Cortex A9 manufactured in a 28nm Ultra-Thin Body and BOX FD-SOI technology. The implementation is based on a fully synthesizable standard design flow, and the design exploits the great flexibility provided by FD-SOI technology, notably a wide Dynamic Voltage and Frequency Scaling (DVFS) range, from 0.6V to...
This work demonstrates the first fabricated nonvolatile TCAM using 2-transistor/2-resistive-storage (2T-2R) cells to achieve >10x smaller cell size than SRAM-based TCAMs at the same technology node. The test chip was designed and fabricated in IBM 90nm CMOS technology and mushroom phase-change memory (PCM) process. To ensure reliable search operation with such compact cells, two enabling techniques...
A ternary content-addressable memory (TCAM)-based hardware called nonvolatile “multi-functional CAM (MF-CAM)” is proposed for an ultra-low-energy “full-text search” system in recent data centers. The proposed nonvolatile MF-CAM-based full-text search engine can perform parallel comparison while eliminating leakage energy by hierarchical power gating. By the massively parallel comparison with the hierarchical...
This paper presents a novel 1Mb STT-MRAM for power and area reduction of cache memory in microprocessors. This memory adopts a current-integral sensing scheme for high-speed read, and uses advanced perpendicular STT-MRAM for high-speed write, to achieve 250 MHz operation, 17.8 mW read power and 46.5 mW write power per 256-b I/O. Using a processor simulator, it has been confirmed that the total cache power...
A 1Mb STT-RAM with a 6T2MTJ cell is designed and fabricated using a 90nm CMOS/MTJ process. It operates with a 1.5nsec/2.1nsec random read/write cycle by adopting a background write scheme, which works around the high error rate of MTJ switching over a short period at moderate drive current. The RAM is fast enough to be applicable to embedded memories such as L3 cache.
Resistive RAM (RRAM) faces two major design challenges: 1) cell area versus write current requirements; 2) cell current (ICELL) versus read disturbance. An RRAM using logic-process-based vertical parasitic-BJT (VPBJT) switches and the corresponding cell array (VPBJT-CA) can achieve 4.5+x smaller macro area. To overcome temperature-dependent fluctuation in the base-emitter voltage difference (VBE) of the BJT,...
For 20nm SoC products, we propose an SRAM macro with low dynamic and leakage power. This is achieved by adopting an interleave word-line and hierarchical bit-line scheme, in which minimum portions of circuits are activated when SRAM is accessed. Measured data confirms that the proposed 128kb SRAM realizes 600 mV operation, 2.1 µW/MHz active power and 82 % leakage power reduction.
A 14KB 8T-bitcell SRAM array is demonstrated in 22nm tri-gate CMOS with fine-grain dual-VCC assist techniques. VMIN-limiting 8T-bitcell nodes are boosted selectively during read and write to improve overall chip VMIN. Measurements show 130–270mV lower VMIN with 27–46% lower power at 0.4–1.6GHz for varying amounts of boosting, array activity and voltage regulator efficiency.
This work proposes an 8T cell with dual data-aware write-assist (D2AW) and negative read-WL (NRWL) schemes to increase the figure of merit (FOM): [cell stability (CS)*cycle frequency (f)]/[cell area (A)*minimum VDD (VDDmin)]. The column-based D2AW provides, for the first time, the solution to the trade-off between the row/column half-select (HS)-CS margins and the write margin (WM) thanks to the dual...
The 20-way set associative 2.5MB slice ported L3 cache for the multi-core Xeon® Processor uses a 0.108 µm² cell in a 22nm tri-gate technology with 2.7TB maximum bandwidth. It is protected by double-error correction/triple-error detection ECC. The basic building block is designed to support floorplan style on each processor with large L3 cache. On-die fuse storage enables high resolution repair coverage...