Search results

Items from 1 to 15 out of 15 results

chapter

Experiences with UPC on TILE-64 processor

O Serres, A Anbar, S Merchant, T El-Ghazawi

2011 Aerospace Conference > 1 - 9

2011 IEEE Aerospace Conference

The partitioned global address space (PGAS) programming model presents programmers with a globally shared address space with locality awareness and one-sided communication constructs. The shared address space and the one-sided communication constructs enhance the ease of use of PGAS-based languages, and the locality awareness enables programmers and runtime systems to achieve higher performance. Thus PGAS...

chapter

Fine-grain OpenMP runtime support with explicit communication hardware primitives

P Tendulkar, V Papaefstathiou, G Nikiforos, S Kavadias, et al.

2011 Design, Automation & Test in Europe > 1 - 4

2011 Design, Automation & Test in Europe

We present a runtime system that uses the explicit on-chip communication mechanisms of the SARC multi-core architecture to efficiently implement the OpenMP programming model and enable the exploitation of fine-grain parallelism in OpenMP programs. We explore the design space of implementation of OpenMP directives and runtime intrinsics, using a family of hardware primitives: remote stores, remote...

chapter

An open electronic system level multi-SPARC virtual platform and its toolchain

Pin-Hao Fang, Yu-Lin Wang, Zhong-Ho Chen, A W Y Su, et al.

2010 International Computer Symposium (ICS2010) > 478 - 482

2010 International Computer Symposium (ICS 2010)

We present a multi-core virtual platform which follows single-core architecture, SPARC v8, available as an open source development suite. The proposed multi-SPARC system operates at electronic system level to accelerate its simulation speed. TLM channels are devised to connect the processors. To simplify the use of the proposed virtual platform, we define some specific APIs for data transaction and...

chapter

Fussli: A portable framework for exploiting hybrid task, data and pipeline parallelism on multi-cores

Xiaoye Wang, Ting Zhang

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 11 > V11-88 - V11-95

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Parallelism is the most important means to exploit the computation potential of multi-core processors. Real applications, particularly commercial applications, often have strong dependences that have to be respected. In order to achieve reasonably good performance, hybrid parallelism schemes usually need to be applied in these applications. Furthermore, parallel applications with task and pipeline parallelism...

chapter

Designing and Implementing a Portable, Efficient Inter-core Communication Scheme for Embedded Multicore Platforms

Shih-Hao Hung, Wen-Long Yang, Chia-Heng Tu

2010 IEEE 16th International Conference on Embedded and Real-Time Computing Systems and Applications > 303 - 308

2010 IEEE 16th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2010)

In recent years, multicore processor designs have become increasingly popular for embedded applications, but diversified inter-core communication mechanisms have led to difficulties in software development, integration and migration. A unified, portable, and efficient inter-core communication mechanism would have helped reduce these difficulties significantly, but such a solution did not exist...

chapter

Rapid Application Development on Multi-processor Reconfigurable Systems

Linfeng Ye, J Diguet, G Gogniat

2010 International Conference on Field Programmable Logic and Applications > 285 - 290

2010 International Conference on Field Programmable Logic and Applications (FPL 2010)

Considering the ability to perform multi-processor architecture systems on FPGA, partial reconfiguration is an opportunity to improve weak soft-core performances by specializing coprocessors according to context-dependent application needs. But at the application level, there is a need for straightforward programming models that allow applications to be easily mapped on an ad hoc architecture without...

chapter

Processor affinity and MPI performance on SMP-CMP clusters

Chi Zhang, Xin Yuan, Ashok Srinivasan

2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW) > 1 - 8

2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW 2010)

Clusters of Symmetric MultiProcessing (SMP) nodes with multi-core Chip-Multiprocessors (CMP), also known as SMP-CMP clusters, are becoming ubiquitous today. For Message Passing Interface (MPI) programs, such clusters have a multi-layer hierarchical communication structure: the performance of intra-node communication is usually higher than that of inter-node communication; and the performance of intra-node...

chapter

Structuring the execution of OpenMP applications for multicore architectures

Francois Broquedis, Olivier Aumage, Brice Goglin, Samuel Thibault, et al.

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) > 1 - 10

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

The now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user friendliness of shared memory on the one side, and memory access scalability and efficiency on the other side. However, getting high performance out of such machines requires a dynamic mapping of application tasks and data onto the underlying...

chapter

Adaptive Multi-versioning for OpenMP Parallelization via Machine Learning

Xuan Chen, Shun Long

2009 15th International Conference on Parallel and Distributed Systems > 907 - 912

2009 IEEE 15th International Conference on Parallel and Distributed Systems (ICPADS 2009)

The introduction of multi-core architectures generates a higher demand for parallelism in order to fully exploit the potential of modern computers. It is of vital importance that a compiler can allocate parallel workload in a cost-aware manner in order to achieve optimal performance on a multi-core architecture. This paper presents an adaptive OpenMP-based mechanism capable of generating a reasonable...

chapter

NUMA-ICTM: A parallel version of ICTM exploiting memory placement strategies for NUMA machines

M. Castro, L.G. Fernandes, C. Pousa, J.-F. Mehaut, et al.

2009 IEEE International Symposium on Parallel & Distributed Processing > 1 - 8

2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

In geophysics, the appropriate subdivision of a region into segments is extremely important. ICTM (interval categorizer tesselation model) is an application that categorizes geographic regions using information extracted from satellite images. The categorization of large regions is a computationally intensive problem, which justifies the proposal and development of parallel solutions in order to improve...

chapter

Application profiling on Cell-based clusters

H. Dursun, K.J. Barker, D.J. Kerbyson, S. Pakin

2009 IEEE International Symposium on Parallel & Distributed Processing > 1 - 8

2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we examine Cell-centric MPI programs on hybrid clusters containing multiple Opteron and Cell processors per node such as those used in the petascale Roadrunner system. Our implementation incurs less than 3.2 μs of overhead...

chapter

Parallelization with Automatic Parallelizing Compiler Generating Consumer Electronics Multicore API

T. Miyamoto, S. Asaka, H. Mikami, M. Mase, et al.

2008 IEEE International Symposium on Parallel and Distributed Processing with Applications > 600 - 607

2008 IEEE International Symposium on Parallel and Distributed Processing with Applications

Multicore processors have been adopted for consumer electronics like portable electronics, mobile phones, car navigation systems, digital TVs and games to obtain high performance with low power consumption. The OSCAR automatic parallelizing compiler has been developed to utilize these multicores easily. Also, a new consumer electronics multicore application program interface (API) to use the OSCAR...

chapter

Lightweight DMA management mechanisms for multiprocessors on FPGA

A. Tumeo, M. Monchiero, G. Palermo, F. Ferrandi, et al.

2008 International Conference on Application-Specific Systems, Architectures and Processors > 275 - 280

2008 International Conference on Application-Specific Systems, Architectures and Processors (ASAP)

This paper presents a multiprocessor system on FPGA that adopts Direct Memory Access (DMA) mechanisms to move data between the external memory and the local memory of each processor. The system integrates all standard DMA primitives via a fast Application Programming Interface (API) and relies on interrupts having also the possibility to manage a command list. This interface allows to program the...

chapter

MPI-Based Adaptive Task Migration Support on the HS-Scale System

N. Saint-Jean, P. Benoit, G. Sassatelli, L. Torres, et al.

2008 IEEE Computer Society Annual Symposium on VLSI > 105 - 110

2008 IEEE Computer Society Annual Symposium on VLSI

In this article, we present an original MPI-based adaptive task migration support for the HS-Scale system. Our previous communication API was modified in order to be MPI compliant. In order to enable task migration without any MMU, a Position Independent Code compilation technique is implemented. The self-adaptability is based on monitoring information collected at run-time by each processing element...

chapter

Developing Embedded Kernel for System-On-a-Chip Platform of Heterogeneous Multiprocessor Architecture

Jing Chen, Jian-Hong Liu

12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'06) > 246 - 250

12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications

It has been common that modern embedded system products are built on platforms with system-on-a-chip (SOC) in which two or more different processor cores are put into one single chip and form the architecture of heterogeneous multiprocessor. Although providing high performance at low cost, such architecture brings new design challenges as well as increased complexity in developing embedded software...

Filter options

Data set:
ieee
Keywords:
COMPUTER ARCHITECTURE
MULTIPROCESSING SYSTEMS
APPLICATION PROGRAM INTERFACES

INFONA - science communication portal
