Search results

Items from 1 to 9 out of 9 results

chapter

MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks

Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 276 - 286

Today, machine learning based on neural networks has become mainstream in many application domains. A small subset of machine learning algorithms, called Convolutional Neural Networks (CNNs), is considered state-of-the-art for many applications (e.g. video/audio classification). The main challenge in implementing CNNs in embedded systems is their large computation, memory, and bandwidth...

article

A Scalable Massively Parallel Motion and Disparity Estimation Scheme for Multiview Video Coding

Caoyang Jiang, Saeid Nooshabadi

IEEE Transactions on Circuits and Systems for Video Technology > 2016 > 26 > 2 > 346 - 359

Multiview video coding (MVC) has recently received considerable attention. It is proposed as an extension of the H.264/Advanced Video Coding (AVC) standard for multiple video source compression. To resolve the extremely high computational complexity of MVC (and in fact other AVC techniques), suitable parallel algorithms need to be developed that are amenable to implementation on low-cost massively parallel...

chapter

V-PFORDelta: Data Compression for Energy Efficient Computation of Time Series

Abdullah Al Hasib, Juan M. Cebrian, Lasse Natvig

2015 IEEE 22nd International Conference on High Performance Computing (HiPC) > 416 - 425

Chip multiprocessors (CMPs) and heterogeneous architectures have become predominant in all market segments, from embedded to high performance computing. These architectures exacerbate on-chip data requirements, creating additional pressure on the memory subsystem. Consequently, efficient utilization of on-chip memory space becomes critical for data intensive applications. A promising means of addressing...

chapter

Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments

John Jenkins, James Dinan, Pavan Balaji, Nagiza F. Samatova, et al.

2012 IEEE International Conference on Cluster Computing > 468 - 476

2012 IEEE International Conference on Cluster Computing (CLUSTER)

Lack of efficient and transparent interaction with GPU data in hybrid MPI+GPU environments challenges GPU acceleration of large-scale scientific computations. A particular challenge is the transfer of noncontiguous data to and from GPU memory. MPI implementations currently do not provide an efficient means of utilizing data types for noncontiguous communication of data in GPU memory. To address this...

chapter

A Multilevel Parallel Intra Coding for H.264/AVC Based on CUDA

Huayou Su, Nan Wu, Chunyuan Zhang, Mei Wen, et al.

2011 Sixth International Conference on Image and Graphics > 76 - 81

2011 Sixth International Conference on Image & Graphics (ICIG)

In this paper, we propose a multilevel parallel intra coding scheme for H.264/AVC based on the Compute Unified Device Architecture (CUDA). The proposed parallel algorithm improves the parallelism between 4x4 blocks within a macroblock (MB) by discarding some insignificant prediction modes. By partitioning a frame into multiple slices, the parallelism between MBs can be exploited. In addition, a scalable parallel...

chapter

Parallel Streaming Intra Prediction for Full HD H.264 Encoding

Ju Ren, Yi He, Huayou Su, Mei Wen, et al.

2010 5th International Conference on Embedded and Multimedia Computing > 1 - 6

2010 5th International Conference on Embedded and Multimedia Computing (EMC 2010)

Intra prediction is the most compute-intensive component of the H.264 intra-frame coder. Its high computational cost puts heavy pressure on most current embedded programmable processors, especially in real-time HD H.264 video encoding. The stream processing model, an emerging parallel processing model supported by GPUs and most programmable processors, bridges the gap between flexible programmable...

chapter

Optimization of AVS Exp-Golomb Code Algorithm with Stream Processor

Zhenhua Xu, Mei Yu, Gangyi Jiang, Chao Huang

2009 Third International Symposium on Intelligent Information Technology Application > 1 > 485 - 488

Storm processor is a stream-based prototype processor designed for media processing. It has good performance and high efficiency for modern media processing and signal processing applications. It exploits the large amounts of parallelism available in many signal processing applications yet achieves high power efficiency by managing data movement directly with an on-chip register-file hierarchy and...

chapter

Software parallel CAVLC encoder based on stream processing

Ju Ren, Yi He, Wei Wu, Mei Wen, et al.

2009 IEEE/ACM/IFIP 7th Workshop on Embedded Systems for Real-Time Multimedia > 126 - 133

2009 IEEE/ACM/IFIP 7th Workshop on Embedded Systems for Real-Time Multimedia. ESTIMedia 2009

Real-time encoding of high-definition H.264 video is a challenge for current embedded programmable processors. Emerging stream processing methods supported by most GPUs and programmable processors provide a powerful mechanism to achieve surprisingly high performance in media/signal processing, which brings an opportunity to deal with this challenge. However, traditional serial CAVLC has highly input-dependent...

chapter

H.264/AVC motion estimation implementation on Compute Unified Device Architecture (CUDA)

Wei-Nien Chen, Hsueh-Ming Hang

2008 IEEE International Conference on Multimedia and Expo > 697 - 700

2008 IEEE International Conference on Multimedia and Expo (ICME)

Due to the rapid growth of graphics processing unit (GPU) processing capability, using the GPU as a coprocessor to assist the central processing unit (CPU) in computing massive data has become essential. In this paper, we present an efficient block-level parallel algorithm for the variable block size motion estimation (ME) in H.264/AVC with fractional pixel refinement on a compute unified device architecture...

Filter options

Data set: ieee
Keywords: KERNEL, ENCODING, PARALLEL PROCESSING

