Search results for: Yu Wang

Items from 1 to 20 out of 24 results

article

Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA

Kaiyuan Guo, Lingzhi Sui, Jiantao Qiu, Jincheng Yu, et al.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and... > 2018 > 37 > 1 > 35 - 47

Convolutional neural network (CNN) has become a successful algorithm in the region of artificial intelligence and a strong candidate for many computer vision algorithms. But the computation complexity of CNN is much higher than traditional algorithms. With the help of GPU acceleration, CNN-based applications are widely deployed in servers. However, for embedded platforms, CNN-based solutions are still...

chapter

Exploring the Granularity of Sparsity in Convolutional Neural Networks

Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, et al.

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) > 1927 - 1934

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Sparsity helps reducing the computation complexity of DNNs by skipping the multiplication with zeros. The granularity of sparsity affects the efficiency of hardware architecture and the prediction accuracy. In this paper we quantitatively measure the accuracy-sparsity relationship with different granularity. Coarse-grained sparsity brings more regular sparsity pattern, making it easier for hardware...

chapter

An FPGA Design Framework for CNN Sparsification and Acceleration

Sicheng Li, Wei Wen, Yu Wang, Song Han, et al.

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) > 28

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)

Convolutional neural networks (CNNs) have recently broken many performance records in image recognition and object detection problems. The success of CNNs, to a great extent, is enabled by the fast scaling-up of the networks that learn from a huge volume of data. The deployment of big CNN models can be both computation-intensive and memory-intensive, leaving severe challenges to hardware implementations...

chapter

Binary convolutional neural network on RRAM

Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, et al.

2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC) > 782 - 787

2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC)

Recent progress in the machine learning field makes low bit-level Convolutional Neural Networks (CNNs), even CNNs with binary weights and binary neurons, achieve satisfying recognition accuracy on ImageNet dataset. Binary CNNs (BCNNs) make it possible for introducing low bit-level RRAM devices and low bit-level ADC/DAC interfaces in RRAM-based Computing System (RCS) design, which leads to faster read-and-write...

chapter

Edge Detection Using Varied Local Edge Pattern Descriptor

Huaixin Yan, Yu Wang, Na Zhang

2016 International Conference on Virtual Reality and Visualization (ICVRV) > 114 - 118

2016 International Conference on Virtual Reality and Visualization (ICVRV)

Edge detection is an active and critical topic in the field of image processing, and plays a vital role for some important applications such as image segmentation, pattern classification, object tracking etc. In this paper, an approach using varied local edge pattern descriptor is proposed for edge detection. This method contains the following steps: firstly, Gaussian filter is used to smooth the...

chapter

Angel-Eye: A Complete Design Flow for Mapping CNN onto Customized Hardware

Kaiyuan Guo, Lingzhi Sui, Jiantao Qiu, Song Yao, et al.

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) > 24 - 29

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Convolutional Neural Network (CNN) has become a successful algorithm in the region of artificial intelligence and a strong candidate for many applications. However, for embedded platforms, CNN-based solutions are still too complex to be applied if only CPU is utilized for computation. Various dedicated hardware designs on FPGA and ASIC have been carried out to accelerate CNN, while few of them explore...

chapter

Low power Convolutional Neural Networks on a chip

Yu Wang, Lixue Xia, Tianqi Tang, Boxun Li, et al.

2016 IEEE International Symposium on Circuits and Systems (ISCAS) > 129 - 132

2016 IEEE International Symposium on Circuits and Systems (ISCAS)

Deep learning, and especially Convolutional Neural Network (CNN), is among the most powerful and widely used techniques in computer vision. Applications range from image classification to object detection, segmentation, Optical Character Recognition (OCR), etc. At the same time, CNNs are both computationally intensive and memory intensive, making them difficult to be deployed on low power lightweight...

chapter

A data locality-aware design framework for reconfigurable sparse matrix-vector multiplication kernel

Sicheng Li, Yandan Wang, Wujie Wen, Yu Wang, et al.

2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) > 1 - 6

2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many applications. For performance improvement, software libraries designated for SpMV computation have been introduced, e.g., MKL library for CPUs and cuSPARSE library for GPUs. However, the computational throughput of these libraries is far below the peak floating-point performance offered by hardware platforms, because...

chapter

Switched by input: Power efficient structure for RRAM-based convolutional neural network

Lixue Xia, Tianqi Tang, Wenqin Huangfu, Ming Cheng, et al.

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC)

Convolutional Neural Network (CNN) is a powerful technique widely used in computer vision area, which also demands much more computations and memory resources than traditional solutions. The emerging metal-oxide resistive random-access memory (RRAM) and RRAM crossbar have shown great potential on neuromorphic applications with high energy efficiency. However, the interfaces between analog RRAM crossbars...

chapter

Performance-centric register file design for GPUs using racetrack memory

Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, et al.

2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC) > 25 - 30

2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC)

The key to high performance for GPU architecture lies in massive threading to drive the large number of cores and enable overlapping of threading execution. However, in reality, the number of threads that can simultaneously execute is often limited by the size of the register file on GPUs. The traditional SRAM-based register file costs so large amount of chip area that it cannot scale to meet the...

chapter

How to initialize the CNN for small datasets: Extracting discriminative filters from pre-trained model

Guanwen Zhang, Jien Kato, Yu Wang, Kenji Mase

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) > 479 - 483

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)

In this paper, we study how to initialize the convolutional neural network (CNN) model for training on a small dataset. Specially, we try to extract discriminative filters from the pre-trained model for a target task. On the basis of relative entropy and linear reconstruction, two methods, Minimum Entropy Loss (MEL) and Minimum Reconstruction Error (MRE), are proposed. The CNN models initialized by...

chapter

People re-identification using deep convolutional neural network

Guanwen Zhang, Jien Kato, Yu Wang, Kenji Mase

2014 International Conference on Computer Vision Theory and Applications (VISAPP) > 3 > 216 - 223

2014 International Conference on Computer Vision Theory and Applications (VISAPP)

One key issue for people re-identification is to find good features or representation to bridge the gaps among different appearances of the same people, which is introduced by large variances in view point, illumination and non-rigid deformation. In this paper, we create a deep convolutional neural network (deep CNN) to solve this problem and integrate feature learning and re-identification into one...

chapter

A speaker recognition algorithm based on factor analysis

Xuanjing Shen, Yujie Zhai, Yu Wang, Haipeng Chen

2014 7th International Congress on Image and Signal Processing > 897 - 901

2014 7th International Congress on Image and Signal Processing (CISP)

Channel interference factor for the identification result is prevalent among the existing speaker recognition algorithms. In order to improve the accuracy of the algorithm, the paper utilizes the technique of latent factor analysis (LFA) to deal with the channel factors in the speaker's Gaussian Mixture Model (GMM). In the endpoint detection phase of speaker recognition, the algorithm introduces the...

chapter

Dynamic Stencil: Effective exploitation of run-time resources in reconfigurable clusters

Xinyu Niu, Jose G. F. Coutinho, Yu Wang, Wayne Luk

2013 International Conference on Field-Programmable Technology (FPT) > 214 - 221

2013 International Conference on Field-Programmable Technology (FPT)

Computing nodes in reconfigurable clusters are occupied and released by applications during their execution. At compile time, application developers are not aware of the amount of resources available at run time. Dynamic Stencil is an approach that optimises stencil applications by constructing scalable designs which can adapt to available run-time resources in a reconfigurable cluster. This approach...

chapter

A Reconfigurable Computing Approach for Efficient and Scalable Parallel Graph Exploration

Brahim Betkaoui, Yu Wang, David B. Thomas, Wayne Luk

2012 IEEE 23rd International Conference on Application-Specific Systems, Architectures and Processors > 8 - 15

2012 IEEE 23rd International Conference on Application-specific Systems, Architectures and Processors (ASAP)

In many application domains, data are represented using large graphs involving millions of vertices and billions of edges. Graph exploration algorithms, such as breadth-first search (BFS), are largely dominated by memory latency and are challenging to process efficiently. In this paper, we present a reconfigurable hardware methodology for efficient parallel processing of large-scale graph exploration...

chapter

Determinations of low breast screening uptake using geographically weighted regression model

Chen Chen, Yu Wang, Huabing Wan, Tao Cheng

2012 20th International Conference on Geoinformatics > 1 - 6

2012 20th International Conference on Geoinformatics

In recent years, the overall breast screening uptake rate in South West London is lower than national average figure. It is well acknowledged that population turnover, minutes for travel time to screening units, deprivation and culture factors impact on breast screening uptake from previous research. This paper focuses on the relationship between breast screening uptake and its determinant factors:...

chapter

Weighted Kernel Fuzzy C-Means Method for Gene Expression Analysis

Yu Wang, Maia Angelova

2012 Spring Congress on Engineering and Technology > 1 - 4

2012 Spring Congress on Engineering and Technology (S-CET)

Many clustering techniques have been proposed for the analysis of gene expression data. However, the optimal method for a given experimental dataset is still not resolved. Fuzzy c-means and kernel fuzzy c-means algorithm have been widely applied to gene expression data, but they give the equal weight to the genes and noises, which lead to results that are not stable or accurate. In this paper, we...

article

Real-Time GPU Surface Curvature Estimation on Deforming Meshes and Volumetric Data Sets

Wesley Griffin, Yu Wang, David Berrios, Marc Olano

IEEE Transactions on Visualization and Computer Graphics > 2012 > 18 > 10 > 1603 - 1613

Surface curvature is used in a number of areas in computer graphics, including texture synthesis and shape representation, mesh simplification, surface modeling, and nonphotorealistic line drawing. Most real-time applications must estimate curvature on a triangular mesh. This estimation has been limited to CPU algorithms, forcing object geometry to reside in main memory. However, as more computational...

chapter

Software-Based Detecting and Recovering from ECC-Memory Faults

Xingjun Zhang, Endong Wang, Dong Zhang, Yu Wang, et al.

2011 Third International Conference on Intelligent Networking and Collaborative Systems > 715 - 719

2011 Third International Conference on Intelligent Networking and Collaborative Systems (INCoS)

According to the problem that the ECC cannot correct the multibit error in ECC memory, this paper proposes a memory error processing method on software level. On the foundation of revising the Linux kernel code, the method can discover this area of influence area of memory error according to seek the process information mapping to the mistaken address. This way can avoid wastage to the user due to...

chapter

Gemma in April: A matrix-like parallel programming architecture on OpenCL

Tianji Wu, Di Wu, Yu Wang, Xiaorui Zhang, et al.

2011 Design, Automation & Test in Europe > 1 - 6

2011 Design, Automation & Test in Europe

Nowadays, Graphics Processing Unit (GPU), as a kind of massive parallel processor, has been widely used in general purposed computing tasks. Although there have been mature development tools, it is not a trivial task for programmers to write GPU programs. Based on this consideration, we propose a novel parallel computing architecture. The architecture includes a parallel programming model, named Gemma,...

INFONA - science communication portal
