Search results for: Wu-chun Feng

Items from 1 to 13 out of 13 results

chapter

Bridging the FPGA programmability-portability Gap via automatic OpenCL code generation and tuning

Konstantinos Krommydas, Ruchira Sasanka, Wu-chun Feng

2016 IEEE 27th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 213 - 218

2016 IEEE 27th International Conference on Application-specific Systems, Architectures and Processors (ASAP)

Programming FPGAs has been an arduous task that requires extensive knowledge of hardware design languages (HDLs), such as Verilog or VHDL, and low-level hardware details. With OpenCL support for FPGAs, the design, prototyping and implementation of an FPGA is increasingly moving towards a much higher level of abstraction, when compared to the intrinsically low-level nature of HDLs. On the other hand,...

article

MPI-ACC: Accelerator-Aware MPI for Scientific Applications

Ashwin M. Aji, Lokendra S. Panwar, Feng Ji, Karthik Murthy, et al.

IEEE Transactions on Parallel and Distributed Systems > 2016 > 27 > 5 > 1401 - 1414

Data movement in high-performance computing systems accelerated by graphics processing units (GPUs) remains a challenging problem. Data communication in popular parallel programming models, such as the Message Passing Interface (MPI), is currently limited to the data stored in the CPU memory space. Auxiliary memory systems, such as GPU memory, are not integrated into such data movement standards,...

chapter

Bridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfs

Konstantinos Krommydas, Ahmed E. Helal, Anshuman Verma, Wu-Chun Feng

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) > 198

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)

For decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance has come at the significant expense of programmability, i.e., the performance-programmability gap. In particular,...

chapter

Delivering Parallel Programmability to the Masses via the Intel MIC Ecosystem: A Case Study

Kaixi Hou, Hao Wang, Wu-chun Feng

2014 43rd International Conference on Parallel Processing Workshops > 273 - 282

2014 43rd International Conference on Parallel Processing Workshops (ICPPW)

Moore's Law effectively doubles the compute power of a microprocessor every 24 months. Over the past decade, however, this doubling in performance has been due to the doubling of the number of cores in a microprocessor rather than clock speed increases. Perhaps nowhere is this more evident than with the Intel Xeon Phi coprocessor. This many core architecture exhibits not only massive inter-core parallelism...

chapter

SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction

Nataliya Timoshevskaya, Wu-chun Feng

2014 IEEE 4th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) > 1 - 6

2014 IEEE 4th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)

The suffix array and Burrows-Wheeler Transform are critical index structures in next-generation sequence analysis. The construction of such index structures for mammalian-sized genomes can take thousands of seconds (i.e., tens of minutes). Their construction is complicated by computational overheads that come from irregular or complex memory-access patterns. This paper rigorously characterizes the...

chapter

Locality-aware memory association for multi-target worksharing in OpenMP

Thomas R. W. Scogland, Wu-Chun Feng

2014 23rd International Conference on Parallel Architecture and Compilation (PACT) > 515 - 516

2014 23rd International Conference on Parallel Architecture and Compilation (PACT)

Heterogeneity is an ever-growing challenge in computing. The clearest example is the increasing popularity of GPUs, and purpose-designed coprocessors such as Intel Xeon Phi. Even disregarding coprocessors, heterogeneity continues to increase with the rise in CPU core counts, adaptive per-core frequencies, and increasingly hierarchical and complex memory systems. Take a system with four memory nodes,...

chapter

On the Programmability and Performance of Heterogeneous Platforms

Konstantinos Krommydas, Thomas R.W. Scogland, Wu-Chun Feng

2013 International Conference on Parallel and Distributed Systems > 224 - 231

2013 International Conference on Parallel and Distributed Systems (ICPADS)

General-purpose computing on an ever-broadening array of parallel devices has led to an increasingly complex and multi-dimensional landscape with respect to programmability and performance optimization. The growing diversity of parallel architectures presents many challenges to the domain scientist, including device selection, programming model, and level of investment in optimization. All of these...

chapter

Wideband Channelization for Software-Defined Radio via Mobile Graphics Processors

Vignesh Adhinarayanan, Wu-Chun Feng

2013 International Conference on Parallel and Distributed Systems > 86 - 93

2013 International Conference on Parallel and Distributed Systems (ICPADS)

Wideband channelization is a computationally intensive task within software-defined radio (SDR). To support this task, the underlying hardware should provide high performance and allow flexible implementations. Traditional solutions use field-programmable gate arrays (FPGAs) to satisfy these requirements. While FPGAs allow for flexible implementations, realizing an FPGA implementation is a difficult...

chapter

Accelerating fast Fourier Transform for wideband channelization

Carlo del Mundo, Vignesh Adhinarayanan, Wu-chun Feng

2013 IEEE International Conference on Communications (ICC) > 4776 - 4780

ICC 2013 - 2013 IEEE International Conference on Communications

Wideband channelization is a compute-intensive task with performance requirements that are arguably greater than what current multi-core CPUs can provide. To date, researchers have used dedicated hardware such as field programmable gate arrays (FPGAs) to address the performance-critical aspects of the channelizer. In this work, we assess the viability of the graphics processing unit (GPU) to achieve...

chapter

Generalizing the Utility of GPUs in Large-Scale Heterogeneous Computing Systems

Shucai Xiao, Wu-chun Feng

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 2554 - 2557

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Graphics Processing Units (GPUs) have been widely used as accelerators in large-scale heterogeneous computing systems. However, current programming models can only support the utilization of local GPUs. When using non-local GPUs, programmers need to explicitly call API functions for data communication across computing nodes. As such, programming GPUs in large-scale computing systems is more challenging...

chapter

Optimizing Dynamic Programming on Graphics Processing Units via Adaptive Thread-Level Parallelism

Chao-Chin Wu, Jenn-Yang Ke, Heshan Lin, Wu-chun Feng

2011 IEEE 17th International Conference on Parallel and Distributed Systems > 96 - 103

2011 IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS)

Dynamic programming (DP) is an important computational method for solving a wide variety of discrete optimization problems such as scheduling, string editing, packaging, and inventory management. In general, DP is classified into four categories based on the characteristics of the optimization equation. Because applications that are classified in the same category of DP have similar program behavior,...

chapter

Towards accelerating molecular modeling via multi-scale approximation on a GPU

M Daga, Wu-chun Feng, T Scogland

2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) > 75 - 80

2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)

Research efforts to analyze biomolecular properties contribute towards our understanding of biomolecular function. Calculating non-bonded forces (or in our case, electrostatic surface potential) is often a large portion of the computational complexity in analyzing biomolecular properties. Therefore, reducing the computational complexity of these force calculations, either by improving the computational...

chapter

Massively parallel genomic sequence search on the Blue Gene/P architecture

Heshan Lin, P. Balaji, R. Poole, C. Sosa, et al.

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 11

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

This paper presents our first experiences in mapping and optimizing genomic sequence search onto the massively parallel IBM Blue Gene/P (BG/P) platform. Specifically, we performed our work on mpiBLAST, a parallel sequence-search code that has been optimized on numerous supercomputing environments. In doing so, we identify several critical performance issues. Consequently, we propose and study different...

INFONA - science communication portal