Search results

Items from 1 to 10 out of 10 results

article

LAWC: Optimizing Write Cache Using Layout-Aware I/O Scheduling for All Flash Storage

Kalidas Ganesh, Youngjae Kim, Monobrata Debnath, Sungyong Park, et al.

IEEE Transactions on Computers > 2017 > 66 > 11 > 1890 - 1902

Flash memory-based SSD-RAIDs are swiftly replacing conventional hard disk drives by exhibiting improved performance and stability, especially in I/O-intensive environments. However, the variations in latency and throughput caused by uncoordinated internal garbage collection cripple further performance gains. In addition, the unwanted variations in each SSD can influence the overall performance...

chapter

FPGA implementation of vertically parallel minimum and maximum values determination in array of numbers

Ivan Tsmots, Vasyl Rabyk, Oleksa Skorokhoda, Volodymyr Antoniv

2017 14th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM) > 234 - 236


The vertically parallel method and structures for calculating maximum and minimum values in one-dimensional and two-dimensional arrays have been developed. Developed structures have been implemented using FPGA. Parameters of the structures have been estimated.

chapter

Automatic OpenCL Code Generation for Multi-device Heterogeneous Architectures

Pei Li, Elisabeth Brunet, Francois Trahay, Christian Parrot, et al.

2015 44th International Conference on Parallel Processing > 959 - 968

2015 44th International Conference on Parallel Processing (ICPP)

Using multiple accelerators, such as GPUs or Xeon Phis, is an attractive way to improve the performance of large data-parallel applications and to increase the size of their workloads. However, writing an application for multiple accelerators remains challenging today, because going from a single accelerator to multiple ones requires dealing with potentially non-uniform domain decomposition, inter-accelerator...

chapter

GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing

Tetsuya Odajima, Taisuke Boku, Toshihiro Hanawa, Jinpil Lee, et al.

2012 41st International Conference on Parallel Processing Workshops > 97 - 106

2012 41st International Conference on Parallel Processing Workshops (ICPPW)

In this paper, we propose a solution framework to enable the work sharing of parallel processing by the coordination of CPUs and GPUs on hybrid PC clusters based on the high-level parallel language XcalableMP-dev. Basic XcalableMP enables high-level parallel programming using sequential code directives that support data distribution and loop/task distribution among multiple nodes on a PC cluster. XcalableMP-dev...

chapter

An Empirical Performance Study of Chapel Programming Language

Nan Dun, Kenjiro Taura

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 497 - 506

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

In this paper we evaluate the performance of the Chapel programming language from the perspective of its language primitives and features, where the micro benchmarks are synthesized from our lessons learned in developing molecular dynamics simulation programs in Chapel. Experimental results show that most language building blocks have comparable performance to corresponding hand-written C code, while...

chapter

A technique for moving large data sets over high-performance long distance networks

Bradley W. Settlemyer, Jonathan D. Dobson, Stephen W. Hodson, Jeffery A. Kuehn, et al.

2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST) > 1 - 6

2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)

In this paper we look at the performance characteristics of three tools used to move large data sets over dedicated long distance networking infrastructure. Although performance studies of wide area networks have been a frequent topic of interest, performance analyses have tended to focus on network latency characteristics and peak throughput using network traffic generators. In this study we instead...

chapter

Availability-Aware Cache Management with Improved RAID Reconstruction Performance

Suzhen Wu, Bo Mao, Dan Feng, Jianxi Chen

2010 13th IEEE International Conference on Computational Science and Engineering > 229 - 236

2010 IEEE 13th International Conference on Computational Science and Engineering (CSE 2010)

RAID reconstruction performance has a significant impact on the availability of RAID-structured storage systems due to the high disk failure rate. Most existing cache management schemes for RAID-structured storage systems focus on improving performance or energy efficiency, but they do not intend to improve RAID availability by boosting the RAID reconstruction process. In this paper, we...

chapter

Implementation and Performance Evaluation of XcalableMP: A Parallel Programming Language for Distributed Memory Systems

Jinpil Lee, Mitsuhisa Sato

2010 39th International Conference on Parallel Processing Workshops > 413 - 420

2010 39th International Conference on Parallel Processing Workshops (ICPPW)

Although MPI is a de-facto standard for parallel programming on distributed memory systems, writing MPI programs is often a time-consuming and complicated process. XcalableMP is a language extension of C and Fortran for parallel programming on distributed memory systems that helps users to reduce those programming efforts. XcalableMP provides two programming models. The first one is the global view...

chapter

JOR: A Journal-guided Reconstruction Optimization for RAID-Structured Storage Systems

Suzhen Wu, Dan Feng, Hong Jiang, Bo Mao, et al.

2009 15th International Conference on Parallel and Distributed Systems > 609 - 616

2009 IEEE 15th International Conference on Parallel and Distributed Systems (ICPADS 2009)

This paper proposes a simple and practical RAID reconstruction optimization scheme, called JOurnal-guided Reconstruction (JOR). JOR exploits the fact that significant portions of data blocks in typical disk arrays are unused. JOR monitors the storage space utilization status at the block level to guide the reconstruction process so that only failed data on the used stripes is recovered to the spare...

article

Synchronization and computing capabilities of linear asynchronous structures

R. J. Lipton, R. E. Miller, L. Snyder

16th Annual Symposium on Foundations of Computer Science (sfcs 1975) > 1975 > 19 - 28

16th Annual Symposium on Foundations of Computer Science (sfcs 1975)

A model is defined in which questions concerning delay bounded asynchronous parallel systems may be investigated. Persistence and determinacy are introduced for this model. These two conditions are shown to be sufficient to guarantee that a synchronous execution policy can be relaxed to an asynchronous execution policy with no change to the result of the computation. In addition, the asynchronous...

Filter options

Data set: ieee
Keywords: SYNCHRONIZATION, ARRAYS, PERFORMANCE EVALUATION
