Kim

chapter

End-to-end scalable FPGA accelerator for deep residual networks

Yufei Ma, Minkyu Kim, Yu Cao, Sarma Vrudhula, et al.

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

2017 IEEE International Symposium on Circuits and Systems (ISCAS)

This work presents an efficient hardware accelerator design of deep residual learning algorithms, which have shown superior image recognition accuracy (>90% top-5 accuracy on ImageNet database). Two key objectives of the acceleration strategy are to (1) maximize resource utilization and minimize data movements, and (2) employ scalable and reusable computing primitives to optimize physical design...

chapter

Smart City Service Acceleration on FPGAs

Youngsoo Kim, Marcus Garcia, Alan Chen, Nathan Wong, et al.

2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService) > 243 - 248

2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService)

Today's cyber-physical systems (CPS) for the emerging Smart cities include hardware and software with intelligent sensing and controls. In Smart cities, the use of high definition images, videos, and context information has become a requirement for urban street data collection and processing. Field Programmable Gate Array enabled data centers and processing show great potential for their high...

chapter

A novel zero weight/activation-aware hardware architecture of convolutional neural network

Dongyoung Kim, Junwhan Ahn, Sungjoo Yoo

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 > 1462 - 1467

2017 Design, Automation & Test in Europe Conference & Exhibition (DATE)

It is imperative to accelerate convolutional neural networks (CNNs) due to their ever-widening application areas from server, mobile to IoT devices. Based on the fact that CNNs can be characterized by a significant amount of zero values in both kernel weights and activations, we propose a novel hardware accelerator for CNNs exploiting zero weights and activations. We also report a zero-induced load...

chapter

Accurate high-level modeling and automated hardware/software co-design for effective SoC design space exploration

Wei Zuo, Louis-Noel Pouchet, Andrey Ayupov, Taemin Kim, et al.

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC)

A desirable feature of a development tool for SoC design is that, given the important applications in the domain to be targeted by the SoC, a powerful hardware-software partitioning engine is available to determine which function(s) shall be mapped to hardware. However, to provide high-quality partitioning, this engine must be able to consider a rich design space of possible alternate hardware and...

chapter

A programmable hardware accelerator for simulating dynamical systems

Jaeha Kung, Yun Long, Duckhwan Kim, Saibal Mukhopadhyay

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 403 - 415

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA)

The fast and energy-efficient simulation of dynamical systems defined by coupled ordinary/partial differential equations has emerged as an important problem. The accelerated simulation of coupled ODE/PDE is critical for analysis of physical systems as well as computing with dynamical systems. This paper presents a fast and programmable accelerator for simulating dynamical systems. The computing model...

chapter

Live demonstration: An FPGA based hardware compression accelerator for Hadoop system

Sang Muk Lee, Jung Hwan Oh, Ji Hoon Jang, Seong Mo Lee, et al.

2016 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) > 744 - 745

2016 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)

Hadoop is an emerging data application for big data processing. In a Hadoop system, data compression is a significant part of processing big data effectively. Achieving this in software requires significant compute processing. In this paper we present the detailed design of hardware compression accelerators. We also measure the performance of the hardware accelerators. Our analysis shows that...

chapter

High-performance face detection with CPU-FPGA acceleration

Abinash Mohanty, Naveen Suda, Minkyu Kim, Sarma Vrudhula, et al.

2016 IEEE International Symposium on Circuits and Systems (ISCAS) > 117 - 120

2016 IEEE International Symposium on Circuits and Systems (ISCAS)

Face detection is a critical function in many embedded applications, such as computer vision and security. Although face detection has been well studied, detecting a large number of faces with different scales and excessive variations (pose, expression, or illumination) usually involves computationally expensive classification algorithms. These algorithms may divide an image into sub-windows at different...

chapter

Hardware accelerator design for data centers

Serif Yesil, Muhammet Mustafa Ozdal, Taemin Kim, Andrey Ayupov, et al.

2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) > 770 - 775

2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

As the size of available data is increasing, it is becoming inefficient to scale the computational power of traditional systems. To overcome this problem, customized application-specific accelerators are becoming integral parts of modern system on chip (SOC) architectures. In this paper, we summarize existing hardware accelerators for data centers and discuss the techniques to implement and embed...

chapter

Toward accelerating deep learning at scale using specialized hardware in the datacenter

Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, et al.

2015 IEEE Hot Chips 27 Symposium (HCS) > 1 - 38

2015 IEEE Hot Chips 27 Symposium (HCS)

This article consists of a collection of slides from the authors' conference presentation. Are FPGAs a Promising Target in the Datacenter for Deep Learning? Yes.

chapter

Design and implementation of hardware accelerated VTEP in datacenter networks

Chang-Gyu Lim, Soo-Myung Pahk, Tae-Il Kim, Jong-Hyun Lee

2015 17th International Conference on Advanced Communication Technology (ICACT) > 745 - 748

2015 17th International Conference on Advanced Communication Technology (ICACT)

VXLAN (Virtual extensible Local Area Network) is an edge-overlay model that uses an L2-in-L3 tunneling protocol. It has attracted attention for multi-tenant datacenter networks. For the deployment of VXLAN in legacy networks, networks can include VXLAN gateways which forward traffic between VXLAN and non-VXLAN environments. This paper proposes the design of VXLAN gateways which are not in servers, but...

article

A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services

Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, et al.

IEEE Micro > 2015 > 35 > 3 > 10 - 22

To advance datacenter capabilities beyond what commodity server designs can provide, the authors designed and built a composable, reconfigurable fabric to accelerate large-scale software services. Each instantiation of the fabric consists of a 6 x 8 2D torus of high-end field-programmable gate arrays (FPGAs) embedded into a half-rack of 48 servers. The authors deployed the reconfigurable fabric in...

chapter

Source level offloading for special-purpose hardware accelerators

Shin-gyu Kim, Chaeseok Im, Minwook Ahn, Seungwon Lee

2015 IEEE International Conference on Consumer Electronics (ICCE) > 532 - 533

2015 IEEE International Conference on Consumer Electronics (ICCE)

This paper presents CLOSH, a source level offloading tool for special-purpose hardware accelerators. CLOSH is designed to make it easy to accelerate existing applications when their source code is available. We evaluated CLOSH with one application on our new TV platform, and found that the required cycles decreased by 27.4%.

chapter

Efficient execution of memory access phases using dataflow specialization

Chen-Han Ho, Sung Jin Kim, Karthikeyan Sankaralingam

2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA) > 118 - 130

2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA)

This paper identifies a new opportunity for improving the efficiency of a processor core: memory access phases of programs. These are dynamic regions of programs where most of the instructions are devoted to memory access or address computation. These occur naturally in programs because of workload properties, or when employing an in-core accelerator, we get induced phases where the code execution...

chapter

A reconfigurable fabric for accelerating large-scale datacenter services

Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, et al.

2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA) > 13 - 24

2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA)

Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we have designed and built a composable, reconfigurable fabric to accelerate portions of large-scale software services. Each instantiation of the...

chapter

EMERALD: Characterization of emerging applications and algorithms for low-power devices

Chuanjun Zhang, Glenn G. Ko, Jung Wook Choi, Shang-nien Tsai, et al.

2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) > 122 - 123

2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Compute-intensive applications are emerging in intelligent home, retail store and automotive industries. These applications are becoming more sophisticated with new features rich in audio, video, image, and machine learning capabilities that demand heavy computations. We present the EMERALD (EMERging Applications and algorithms for Low power Device) workload suite. We profile the workloads to show...

chapter

Deduplication in SSDs: Model and quantitative analysis

Jonghwa Kim, Choonghyun Lee, Sangyup Lee, Ikjoon Son, et al.

2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST) > 1 - 12

2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)

In NAND Flash-based SSDs, deduplication can provide an effective resolution of three critical issues: cell lifetime, write performance, and garbage collection overhead. However, deduplication at SSD device level distinguishes itself from the one at enterprise storage systems in many aspects, whose success lies in proper exploitation of underlying very limited hardware resources and workload characteristics...

chapter

A modified nonlinear guidance logic for a leader-follower formation flight of two UAVs

Do-Myung Kim, Suhyun Nam, Jinyoung Suk

2009 ICCAS-SICE > 5734 - 5739

2009 ICROS-SICE International Joint Conference. ICCAS-SICE 2009

This paper presents a guidance algorithm for formation flight of two UAVs. Since the nonlinear guidance algorithm has good properties to follow nonlinear flight trajectory based on geometric and kinematic, the nonlinear guidance algorithm is modified as a leader-follower station keeping formation control law for two UAVs using the relation of the nonlinear guidance algorithm to the proportional navigation...

article

A Fast Method for Designing Time-Optimal Gradient Waveforms for Arbitrary $k$-Space Trajectories

M. Lustig, Seung-Jean Kim, J.M. Pauly

IEEE Transactions on Medical Imaging > 2008 > 27 > 6 > 866 - 873

A fast and simple algorithm for designing time-optimal waveforms is presented. The algorithm accepts a given arbitrary multidimensional k-space trajectory as the input and outputs the time-optimal gradient waveform that traverses k-space along that path in minimum time. The algorithm is noniterative, and its run time is independent of the complexity of the curve, i.e., the number of switches between...

chapter

Trick Play Method for HD H.264 Set-Top Box

Jin-Hwan Jeong, Ok-Gee Min, Yong-Ju Lee, Choon-Seo Park, et al.

2008 Digest of Technical Papers - International Conference on Consumer Electronics > 1 - 2

2008 International Conference on Consumer Electronics (ICCE '08)

We envision a mid-session control with a 2-level speed scaling scheme in which the server can control content playback speed without the client's aid. Also, our scheme doesn't need re-encoding, so the server can still use well-known performance features. Therefore, it promises great advantages for set-top box makers.

chapter

Communication-efficient hardware acceleration for fast functional simulation

Young-Il Kim, Wooseung Yang, Young-Su Kwon, Chong-Min Kyung

Proceedings. 41st Design Automation Conference, 2004. > 293 - 298

Proceedings. 41st Design Automation Conference, 2004.

INFONA - science communication portal

Search results for: Kim
