The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Distributed computing environments have evolved from in-house clusters to Grids and now Cloud platforms. We, as others, provide HPC benchmarks results over Amazon EC2 that show a lower performance of Cloud resources compared to private resources., So, it is not yet clear how much of impact Clouds will have in high performance computing (HPC). But hybrid Grid/Cloud computing may offer opportunities...
Virtualization technology is currently widely used due to its benefits on high resource utilization, flexible manageability and powerful system security. However, its use for high performance computing (HPC) is still not popular due to the unclearness of the virtualization overheads. It's worthy to evaluate the virtualization cost and to find the performance bottleneck when running HPC applications...
As the system scales up continuously, the problem of power consumption for high performance computing (HPC) system becomes more severe. Heterogeneous system integrating two or more kinds of processors, could be better adapted to heterogeneity in applications and provide much higher energy efficiency in theory. Many studies have shown heterogeneous system is preferable on energy consumption to homogeneous...
The speed of the memory subsystem often constrains the performance of large-scale parallel applications. Experts tune such applications to use hierarchical memory subsystems efficiently. Hardware accelerators, such as GPUs, can potentially improve memory performance beyond the capabilities of traditional hierarchical systems. However, the addition of such specialized hardware complicates code porting...
Network congestion is an important factor affecting the performance of large scale jobs in supercomputing clusters, especially with the wide deployment of multi-core processors. The blocking nature of current day collectives makes such congestion a critical factor in their performance. On the other hand, modern interconnects like InfiniBand provide us with many novel features such as Virtual Lanes...
Although MPI is a de-facto standard for parallel programming on distributed memory systems, writing MPI programs is often a time-consuming and complicated process. XcalableMP is a language extension of C and Fortran for parallel programming on distributed memory systems that helps users to reduce those programming efforts. XcalableMP provides two programming models. The first one is the global view...
In light of its powerful computing capacity and high energy efficiency, GPU (graphics processing unit) has become a focus in the research field of HPC (High Performance Computing). CPU-GPU heterogeneous parallel systems have become a new development trend of super-computer. However, the inherent unreliability of the GPU hardware deteriorates the reliability of super-computer. We have researched on...
With the introduction of the TI-09 platforms at the DoD Supercomputing Resource Centers (DSRCs), users now have access to significantly faster and larger systems for their computations, including the first systems with greater than 10,000 cores. We wanted to benchmark these latest systems and compare to the older platforms while assessing their computational environments.
Virtualization overhead is the main reason for the slow diffusion of virtualization techniques into high-performance computing environments. This paper discusses the issues linked to the performance evaluation of virtual clusters, and presents the implementation of a service that automates the benchmarking and results collection procedure in an existing cloud-on-GRID environment.
Solid state drives (SSDs) allow single-drive performance that is far greater than disks can produce. Their low latency and potential for parallel operations mean that they are able to read and write data at speeds that strain operating system I/O interfaces. Additionally, their performance characteristics expose gaps in existing benchmarking methodologies. We discuss the impact on Linux system design...
Fault tolerance will be a fundamental imperative in the next decade as machines containing hundreds of thousands of cores will be installed at various locations. In this context, the traditional checkpoint/restart model does not seem to be a suitable option, since it makes all the processors roll back to their latest checkpoint in case of a single failure in one of the processors. In-memory message...
In conjunction with Moore's Law, computer speeds are expected to double approximately every two years, but with the current challenges that computer manufacturers are facing to double speeds of individual processors, due to various reasons, such as processor temperatures, multiprocessor architectures have become more popular nowadays. Eventually, this has led to an increased interest in standards...
The use of Field Programmable Gate Arrays (FPGAs) in the area of High Performance Computing (HPC) to accelerate computations is well known. We present here a case where FPGAs can be used to speed up communication instead of computation. Current interconnects for HPC are in particular missing support for fine grain communication, which is increasingly found in various applications. In order to overcome...
Energy efficiency and parallel I/O performance have become two critical measures in high performance computing (HPC). However, there is little empirical data that characterize the energy-performance behaviors of parallel I/O workload. In this paper, we present a methodology to profile the performance, energy, and energy efficiency of parallel I/O access patterns and report our findings on the impacting...
Applications of different categories contain varying levels of data, instruction and thread-level parallelism inherently. It's important to explore the potential coarse-grain thread-level parallelism in different applications to guide the computing resources allocation problem in multicore chips. Up to now, lots of depth researches have been mainly concentrated in the desktop applications. In order...
As multi-core processors become increasingly mainstream, architects have likewise become more interested in how best to make use of the computing capacity of the CPU, for instance, through multiple simultaneous threads or processes of execution with OpenMP or MPI. At the same time, the increasingly mature and prevailing virtualization technique in server consolidation and HPC promotes the emergence...
The introduction of affordable infrastructure on demand, specifically Amazon's Elastic Compute Cloud (EC2), has had a significant impact in the business IT community and provides reasonable and attractive alternatives to locally-owned infrastructure. For scientific computation however, the viability of EC2 has come into question due to its use of virtualization and network shaping and the performance...
Currently, parallel platforms based on large scale hierarchical shared memory multiprocessors with Non-Uniform Memory Access (NUMA) are becoming a trend in scientific High Performance Computing (HPC). Due to their memory access constraints, these platforms require a very careful data distribution. Many solutions were proposed to resolve this issue. However, most of these solutions did not include...
The building blocks of emerging Petascale massively parallel processing (MPP) systems are multi-core processors with four or more cores as a single processing element and a customized network interface. The resulting memory and communication hierarchy of these platforms are now exposed to application developers and end users by creating a hierarchical or multi-core aware message-passing (MPI) programming...
Today's computer systems have become unbelievably complex. Nowadays register-level design is an overwhelming task, especially in the embedded system area where the time-to-market is very short. Platform based design shifts the challenge on how to tune parametric platforms to achieve the best performance at the smallest cost. This task, called multi-objective design space exploration, requires accurate...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.