Search results for: Guoyang Chen

Items from 1 to 7 out of 7 results

chapter

Sweet KNN: An Efficient KNN on GPU through Reconciliation between Redundancy Removal and Regularity

Guoyang Chen, Yufei Ding, Xipeng Shen

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 621 - 632

Finding the k nearest neighbors of a query point or a set of query points (KNN) is a fundamental problem in many application domains. It is expensive to do. Prior efforts in improving its speed have followed two directions with conflicting considerations: One tries to minimize the redundant distance computations but often introduces irregularities into computations, the other tries to exploit the...

article

Optimizing Data Placement on GPU Memory: A Portable Approach

Guoyang Chen, Xipeng Shen, Bo Wu, Dong Li

IEEE Transactions on Computers > 2017 > 66 > 3 > 473 - 487

Modern GPUs feature complex memory system designs. One GPU may contain many types of memory of different properties. The best way to place data in memory is sensitive to many factors (e.g., program inputs, architectures), making portable optimizations of GPU data placement a difficult challenge. PORPLE is a recently proposed method that overcomes the difficulties by enabling online optimizations of...

chapter

OpenCL-based erasure coding on heterogeneous architectures

Guoyang Chen, Huiyang Zhou, Xipeng Shen, Josh Gahm, et al.

2016 IEEE 27th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 33 - 40

Erasure coding, Reed-Solomon coding in particular, is a key technique to deal with failures in scale-out storage systems. However, due to the algorithmic complexity, the performance overhead of erasure coding can become a significant bottleneck in storage systems attempting to meet service level agreements (SLAs). Previous work has mainly leveraged SIMD (single-instruction multiple-data) instruction...

article

Enabling Portable Optimizations of Data Placement on GPU

Guoyang Chen, Bo Wu, Dong Li, Xipeng Shen

IEEE Micro > 2015 > 35 > 4 > 16 - 24

Modern GPU memory systems manifest more varieties, increasing complexities, and rapid changes. Different placements of data on memory systems often cause significant differences in program performance. Most current GPU programming systems rely on programmers to indicate the appropriate placements, but finding the appropriate placements is difficult for programmers in practice owing to the complexity...

chapter

Free launch: Optimizing GPU dynamic kernel launches through thread reuse

Guoyang Chen, Xipeng Shen

2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) > 407 - 419

Supporting dynamic parallelism is important for GPU to benefit a broad range of applications. There are currently two fundamental ways for programs to exploit dynamic parallelism on GPU: a software-based approach with software-managed worklists, and a hardware-based approach through dynamic subkernel launches. Neither is satisfactory. The former is complicated to program and is often subject to some...

chapter

PORPLE: An Extensible Optimizer for Portable Data Placement on GPU

Guoyang Chen, Bo Wu, Dong Li, Xipeng Shen

2014 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) > 88 - 100

GPU is often equipped with complex memory systems, including global memory, texture memory, shared memory, constant memory, and various levels of cache. Where to place the data is important for the performance of a GPU program. However, the decision is difficult for a programmer to make because of architecture complexity and the sensitivity of suitable data placements to input and architecture changes. This...

chapter

SM-centric transformation: Circumventing hardware restrictions for flexible GPU scheduling

Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, et al.

2014 23rd International Conference on Parallel Architecture and Compilation (PACT) > 497 - 498

To circumvent the limitation from the hardware scheduler on GPU, we create an SM-centric transformation technique. This technique enables complete control of the mapping between tasks and streaming multi-processors (SMs), and enables controlling the number of active thread blocks on each SM. Results show that our approach achieves better speedup than previous ones with kernel co-run cases.

INFONA - science communication portal
