The growing demands in IT services for improving efficiency and quality at low cost to handle complex compute requirements have led to the integration of high-performance computing (HPC) systems and cloud infrastructure in data centers. Earlier, HPC systems were limited to academic and research institutions and engineering laboratories. However, the emergence of cloud infrastructures and their successful...
This paper presents a detailed performance evaluation of a spanning-tree-based algorithm that automatically exploits parallelism and determines the execution order of multiple kernel programs in a distributed environment. In stream-based computing, efficient parallel execution requires careful scheduling of the invocation of the kernel programs. By mapping a kernel to a node and an I/O stream between...
For the traditional processing tasks of an underwater data analysis system, including finite impulse response filter banks, fast Fourier transform, conventional beamforming and target tracking, the algorithm flow is relatively steady with little alternative control, which is to say that it is highly parallel. In addition, the processing tasks must handle enormous amounts of data, which means great computational...
As the computational power of high performance computing (HPC) systems continues to increase by using a huge number of CPU cores or specialized processing units, extreme-scale applications are increasingly prone to faults. Consequently, the HPC community has proposed many contributions to design resilient HPC applications. These contributions may be system-oriented, theoretical or numerical. In this...
This article consists of a collection of slides from the author's conference presentation. The following slides are presented to introduce the general features of one of our products, rather than to make any commitment about it. It is for information purposes only, and may not be incorporated into any contract. Purchasing decisions should not be based on it. The development, release, and timing...
Massively parallel computers have attracted significant interest from researchers in recent years. These machines have been used for complex and sophisticated simulations, such as the simulation of brain functions. Due to the increasing power demand of massively parallel computers and their move toward exascale computing in the future, temperature and power consumption have become major constraints, many...
Heterogeneous computing is gaining attention from both industry and academia. One driving factor for heterogeneous computing is power efficiency. GPUs and FPGAs have been reported to achieve much higher power efficiency than CPUs on many applications. Comparisons between GPUs and FPGAs show that they have different characteristics in accelerated computing. Some tasks run better on GPU, some...
The design of the GPU (Graphics Processing Unit) is well suited to expressing data-parallel computations because the GPU is specialized for parallel processing. Huge volumes of digital medical images are collected every day, and medical imaging creates demand for improved medical diagnosis and procedures. This survey covers graphics processing computations and the hardware required to...
High performance computing is a new and rapidly growing market for optical interconnects, which are needed to address this segment's steadily increasing bandwidth needs. This tutorial will discuss the trends, requirements, trade-offs and technology for this market.
Graphics Processing Units (GPUs) have enabled significant improvements in computational performance compared to traditional CPUs in several application domains. Until recently, GPUs have been programmed using C/C++ based methods such as CUDA (NVIDIA) and OpenCL (NVIDIA and AMD). Using these approaches, Fortran Numerical Weather Prediction (NWP) codes would have to be completely re-written to take...
Recently, GPGPU has been widely adopted in the High Performance Computing (HPC) field. The limited global memory bandwidth poses a great challenge to many GPGPU programmers trying to exploit parallelism within the CPU-GPU heterogeneous platform. In this paper, we choose SWIM, a typical memory-intensive application from the SPEC OMP 2001 benchmark suite, as a case study. We attempt to optimize the performance...
A parallel algorithm for the lattice Boltzmann method is studied. The data arrangement, communication, and computational process are redesigned by combining the Message Passing Interface with general-purpose graphics processing units. On a single GPU, novel techniques introduced in shader model 3.0, such as frame buffer object (FBO), multiple-channel rendering, and render-to-texture, are used...
This article consists of a collection of slides from the authors' conference presentation. Some of the topics discussed include: a brief introduction to Godson processors; the architecture of the Godson-3 multicore processor; physical implementation; and PetaFLOPS and TeraFLOPs.
In recent years, Field Programmable Gate Arrays (FPGAs) have been used for High Performance Computing (HPC). Because there is a significant difference between the configuration speed of an FPGA and the execution speed of a Central Processing Unit (CPU), performance degrades. To resolve this problem, we proposed MPLD as a new Programmable Logic Device (PLD) architecture with high speed...
This article deals with the communication performance of a multiprocessor system implemented using award-winning BCM 1480 multi-core chips. Our system uses high-performance HyperTransport links to interconnect constituent chips, realizing cache-coherent non-uniform memory access. It takes advantage of hardware support from the BCM 1480 chip to attain very impressive communication performance among constituent...