This paper investigates the acceleration of irregular and regular algorithms on an integrated graphics processing unit (integrated GPU), known as an Accelerated Processing Unit (APU), which is fused on the same die as the CPU, and on a discrete graphics processing unit (GPU), while answering the question of how much potential the APU has for applications with irregular data structures such as trees, knowing that...
With the development of high-performance computing, a shift toward multicore and heterogeneous systems is taking place. However, improving processor performance in line with Moore's Law has run into heat and power bottlenecks, and one or more CPUs alone cannot meet the requirements of large-scale computation. The use of heterogeneous computing platforms is becoming...
Our cloud-based IT world is founded on hypervisors and containers. Containers are becoming an important cornerstone and are used more widely every day. Among the available frameworks, Docker has become one of the most widely adopted containerization platforms in data centers and enterprise servers, owing to its ease of deployment and scaling. Furthermore, the performance benefits of a lightweight...
Hardware errors are no longer exceptions in modern cloud data centers. Although virtualization provides software failure isolation among different virtual machines (VMs), the virtualization infrastructure, including the hypervisor and privileged VMs, remains vulnerable to hardware errors. What makes matters worse is that such errors are unlikely to be contained by the virtualization boundary and may lead to...
In the cloud computing environment, one of the most important modules is the scheduler. As the most popular open-source cloud platform, OpenStack provides a large number of scheduling strategies, but none of them considers the hierarchies of VMs and hosts. We guarantee the security of VMs through these hierarchies. Although OpenStack is abundant in scheduling strategies, none...
Mobile devices play a vital role in handling emergency situations. During an emergency, it is very difficult to collect the necessary information from mobile devices if networks are unavailable. In this work, an Energy Efficient Emergency Management System named E3M is proposed. E3M supports peer-to-peer communication between mobile devices if a mobile device does not find any suitable...
A process-scheduling algorithm is a fundamental operating system function that manages the assignment of processes to the CPU (Central Processing Unit). It aims to make the system efficient, fast, and fair, allowing as many processes as possible to make the best use of the CPU at any given time. Understanding scheduling algorithms and their impact in practice is a challenging and time-consuming task for...
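As a rough illustration of what a process-scheduling algorithm does, the sketch below simulates round-robin scheduling, one common textbook policy. It is not taken from the paper above: the function name `round_robin`, the `quantum` parameter, and the assumption that all processes arrive at time 0 are chosen here only for illustration.

```python
from collections import deque


def round_robin(bursts, quantum):
    """Simulate round-robin scheduling (illustrative sketch only).

    bursts:  dict mapping a process name to its CPU burst time.
    quantum: maximum time slice a process may run before being preempted.
    Assumes every process arrives at time 0.
    Returns a dict mapping each process name to its completion time.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")

    remaining = dict(bursts)   # CPU time each process still needs
    queue = deque(bursts)      # ready queue, in arrival order
    clock = 0
    completion = {}

    while queue:
        name = queue.popleft()
        run = min(quantum, remaining[name])  # run for one slice at most
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            completion[name] = clock         # process finished
        else:
            queue.append(name)               # preempted: back of the queue
    return completion


if __name__ == "__main__":
    print(round_robin({"A": 5, "B": 3, "C": 1}, quantum=2))
```

The fixed time slice is what gives each process a fair turn on the CPU; a smaller `quantum` improves responsiveness at the cost of more switches between processes.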
Accelerators have emerged as an important component of modern cloud, datacenter, and HPC computing environments. However, launching tasks on remote accelerators across a network remains unwieldy, forcing programmers to send data in large chunks to amortize the transfer and launch overhead. By combining advances in intra-node accelerator unification with one-sided Remote Direct Memory Access (RDMA)...
GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model which can have a significant effect on error propagation...
In just a few years, graphics processors (GPUs) have become powerful tools for applications that require massively parallel computing. Current applications include multimedia processing, engineering science, and real-time image processing. GPUs offer many advantages, such as faster processing and lower energy consumption than a CPU of equivalent power. In this paper, we will show...
While many linear algebra libraries have been developed with optimized performance, none considers energy efficiency at library design time. In this paper, we present GreenLA, an energy-efficient linear algebra software package that leverages linear algebra algorithmic characteristics to maximize energy savings with negligible overhead. GreenLA is (1) energy efficient:...
The CG research community has a renewed interest in rendering algorithms based on path space integration, mainly due to new approaches to discover, generate and exploit relevant light paths while keeping the numerical integrator unbiased or, at the very least, consistent. Simultaneously, the current trend towards massive parallelism and heterogeneous environments, based on a mix of conventional computing...
Cloud computing gives users an opportunity to outsource their data and applications. However, data privacy is one of the key challenges for users who outsource data to transparent cloud servers. Data encryption is the best option to protect users' data privacy on the cloud. However, the computation overheads of encryption methods could be expensive for some small computing machines,...
Over three iterations and six years, we have developed a project-based course in HPC for single-box computers tailored to science students in general. The course is based on strong premises: showing that assembly is what actually runs on machines, dividing parallelism into three dimensions (ILP, DLP, TLP), and using them incrementally in a single numerical simulation throughout the course working...
Graph analysis is becoming increasingly important in many research fields (biology, social sciences, data mining) and daily applications (path finding, product recommendation). Many different large-scale graph-processing systems have been proposed for different platforms. However, little effort has been placed on designing systems for hybrid CPU-GPU platforms. In this work, we present HyGraph, a...
The evolution of massively parallel supercomputers makes two issues particularly palpable: load imbalance and poor management of data locality in applications. Thus, with the increase in the number of cores and the drastic decrease in the amount of memory per core, high performance requirements call for particular care with load balancing and, as much as possible, with the locality of data...
This paper presents a detailed performance evaluation of a spanning-tree-based algorithm that automatically exploits parallelism and determines an execution order for multiple kernel programs in a distributed environment. In stream-based computing, efficient parallel execution requires careful scheduling of the invocation of the kernel programs. By mapping a kernel to a node and an I/O stream between...
The Variable Preconditioned (VP) Krylov subspace method with a communication-avoiding (CA) technique is adopted as the solver for a linear system obtained from electromagnetic analysis, and its numerical features are investigated. The large communication time between processing units or GPU/MIC is a problem for parallelization efficiency that needs to be solved. Although the κ-skip Krylov subspace method...
In most computer programs and general-purpose computing environments, the precision of any calculation is limited by the word size of the computer. However, for some applications, such as cryptography, this precision is not sufficient. In these cases, it is necessary to use multiple-precision numbers. Operations on such numbers in most computer software are implemented by third-party libraries that...
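To make the idea of multiple-precision numbers concrete, the sketch below stores a non-negative integer as a list of 32-bit "limbs" (least significant first) and adds two such numbers with carry propagation. This is a generic schoolbook technique and not the method of the paper above; the limb width and the helper names `to_limbs`, `add_limbs`, and `from_limbs` are assumptions made for illustration.

```python
BASE = 1 << 32  # assumed limb size: each limb holds one 32-bit word


def to_limbs(n):
    """Split a non-negative integer into 32-bit limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(n % BASE)
        n //= BASE
        if n == 0:
            return limbs


def add_limbs(a, b):
    """Schoolbook addition of two limb lists with carry propagation."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        s = carry
        s += a[i] if i < len(a) else 0
        s += b[i] if i < len(b) else 0
        result.append(s % BASE)  # low 32 bits stay in this limb
        carry = s // BASE        # overflow moves to the next limb
    if carry:
        result.append(carry)
    return result


def from_limbs(limbs):
    """Recombine limbs into a single integer (used here only for checking)."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


if __name__ == "__main__":
    x, y = 2**100 + 12345, 2**64 - 1
    total = add_limbs(to_limbs(x), to_limbs(y))
    print(from_limbs(total) == x + y)
```

Python's built-in integers are already arbitrary precision, so they serve here only to check the limb arithmetic; in a language with fixed-width integers, the limb list itself would be the number's representation.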
The performance of a ROS application is a function of the individual performance of its constituent nodes. Since ROS nodes are typically configurable (parameterised), the specific parameter values adopted will determine the level of performance generated. In addition, ROS applications may be distributed across multiple computation devices, thus providing different options for node allocation. We address...