Power consumption and compute density are the key factors to consider when building a compute node for the upcoming Exascale revolution. Current architectural designs and manufacturing technologies cannot provide the level of density and power efficiency required to realise an operational Exascale machine. A disruptive change in the hardware design and integration process is needed...
Heterogeneous computing platforms containing a wide range of computing resources, from CPUs to specialized hardware accelerators, are the trend today, driven by the physical limits on processor speed and the increasing demand for computing performance. Hence, many optimization strategies are being studied to obtain better throughput and lower energy consumption in heterogeneous systems. Various memory...
For modern parallel applications, modeling general execution characteristics such as power and time is difficult because of the many factors affecting software-hardware interactions, a difficulty exacerbated by the dearth of measurement and monitoring tools for novel architectures such as Intel Xeon Phi processors. To address this modeling challenge, the present work proposes to employ the...
It is commonly the case that a small number of widely used applications make up a large fraction of the workload of HPC centers. Predicting the performance of important applications running on specific processors enables HPC centers to design the best-performing system configurations and to ensure good performance for the most popular applications on new systems. In the analyses presented in this paper...
Cost models play an important role in the efficient implementation of software systems. These models can be embedded in operating systems and execution environments to optimize execution at run time. Even though non-uniform memory access (NUMA) architectures dominate today's server landscape, there is still a lack of parallel cost models that represent NUMA systems sufficiently. Therefore, the...
Today's applications for HW/SW systems, such as the Internet of Things, often demand SoC architectures in which sophisticated firmware runs on fairly simple processors. Designers face the challenge of meeting high efficiency and dependability requirements for these systems under severe cost constraints. Targeting such applications, this paper presents a new technique to generate...
Exascale computing faces a gap between the ever-increasing demand for application performance and the underlying chip technology, which no longer delivers the expected exponential increases in CPU performance. The industry is now progressively moving towards dedicated accelerators to deliver high performance and better energy efficiency. However, the question of programmability still remains...
By decoupling software and hardware, software radar and cognitive radar (SRadar&CRadar) can alter their functions by remapping different software onto a uniform platform. This feature makes radars more flexible and has become a significant trend in radar design. Efficient mapping requires a framework comprising a resource model and a mapping algorithm, which has been...
The software and hardware architectures of computer systems have become increasingly complex. Meanwhile, cyberattacks are becoming more and more sophisticated and can target any software or hardware component of these systems. Several isolation mechanisms, at both the software and hardware layers, are now available to protect against these widespread attacks. This paper is aimed at...
Organizing unlabeled data into its natural groupings based on intrinsic characteristics is the most sensible way to handle it. Mean shift is a non-parametric mode-seeking algorithm widely used for data clustering, image segmentation and object tracking, but its use in real-time applications is limited by its high computational cost. In this paper, we propose a hybrid, sequentially unfolded,...
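For readers unfamiliar with the algorithm, the sketch below illustrates the basic mean-shift iteration on one-dimensional data with a flat kernel. It is a minimal illustration, not the hybrid scheme proposed in the paper; the function names, bandwidth and tolerance are illustrative assumptions.

    /* Minimal mean-shift sketch (flat kernel, 1-D data) for illustration only;
     * names and parameters (bandwidth, tolerance) are illustrative assumptions. */
    #include <stdio.h>
    #include <math.h>

    /* Shift a query point toward the mean of all samples within `bandwidth`
     * of it, repeating until the shift becomes negligible; the point it
     * converges to is a mode (cluster centre) of the underlying density. */
    static double mean_shift_point(double x, const double *data, int n,
                                   double bandwidth, int max_iter)
    {
        for (int it = 0; it < max_iter; ++it) {
            double sum = 0.0;
            int count = 0;
            for (int i = 0; i < n; ++i) {
                if (fabs(data[i] - x) <= bandwidth) {  /* flat (uniform) kernel */
                    sum += data[i];
                    ++count;
                }
            }
            if (count == 0)
                break;                                 /* no neighbours: stop */
            double next = sum / count;                 /* window mean */
            if (fabs(next - x) < 1e-6)                 /* converged to a mode */
                return next;
            x = next;
        }
        return x;
    }

    int main(void)
    {
        /* Two obvious groups, around 1.0 and 10.0. */
        double data[] = { 0.8, 1.0, 1.2, 9.8, 10.0, 10.2 };
        int n = (int)(sizeof data / sizeof data[0]);
        for (int i = 0; i < n; ++i)
            printf("point %.1f -> mode %.3f\n", data[i],
                   mean_shift_point(data[i], data, n, 2.0, 100));
        return 0;
    }

The inner loop over all samples at every iteration is exactly the quadratic cost that limits real-time use and motivates accelerated variants.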
Energy-efficient computing is crucial to achieving exascale performance. Power capping and dynamic voltage/frequency scaling (DVFS) can be used to achieve energy savings. The Intel Xeon Phi implements a power-capping strategy in which power thresholds are used to set voltage/frequency dynamically at runtime. By default, these power limits are much higher than most applications would ever reach...
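As general background, power capping on Linux systems with Intel RAPL support is typically exposed through the powercap sysfs interface; the sketch below shows how a cap could be written there. This is a generic illustration only, not the Xeon Phi-specific mechanism discussed in the paper, and the sysfs path and wattage are assumptions that vary across systems.

    /* Hedged illustration: write a package power limit (in microwatts) through
     * the Linux powercap/RAPL sysfs interface. The path below is an assumption
     * and differs between systems; the Xeon Phi's own capping mechanism is not
     * shown here. Requires root privileges. */
    #include <stdio.h>

    int main(void)
    {
        const char *limit_file =
            "/sys/class/powercap/intel-rapl:0/constraint_0_power_limit_uw";
        long cap_uw = 100L * 1000 * 1000;        /* illustrative 100 W cap */

        FILE *f = fopen(limit_file, "w");
        if (!f) {
            perror("fopen");
            return 1;
        }
        fprintf(f, "%ld\n", cap_uw);             /* firmware/OS enforces the cap */
        fclose(f);
        return 0;
    }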
PWCS (Probabilistic Write / Copy-Select) is a lock-free synchronization mechanism with wait-free characteristics, proposed by Nicholas Mc Guire at the 13th Real-Time Linux Workshop, which exploits the inherent randomness of modern computer systems. It addresses the multi-reader, single-writer problem in Linux. Based on the original label-based PWCS, we propose a hash-based...
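For context, the conventional blocking solution to the multi-reader, single-writer problem uses a reader-writer lock; the minimal POSIX-threads sketch below shows only that baseline, not the lock-free PWCS mechanism itself.

    /* Lock-based baseline for the multi-reader / single-writer problem using a
     * POSIX reader-writer lock; compile with -pthread. PWCS replaces this kind
     * of blocking synchronization with a lock-free scheme. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
    static int shared_value = 0;

    static void *reader(void *arg)
    {
        (void)arg;
        pthread_rwlock_rdlock(&lock);   /* many readers may hold this at once */
        printf("read %d\n", shared_value);
        pthread_rwlock_unlock(&lock);
        return NULL;
    }

    static void *writer(void *arg)
    {
        (void)arg;
        pthread_rwlock_wrlock(&lock);   /* the single writer gets exclusive access */
        shared_value++;
        pthread_rwlock_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t r[4], w;
        pthread_create(&w, NULL, writer, NULL);
        for (int i = 0; i < 4; ++i)
            pthread_create(&r[i], NULL, reader, NULL);
        pthread_join(w, NULL);
        for (int i = 0; i < 4; ++i)
            pthread_join(r[i], NULL);
        return 0;
    }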
Programming heterogeneous multiprocessor architectures is a real challenge, involving a huge design space. Computer-aided design and development tools try to circumvent this issue by simplifying instantiation mechanisms. However, energy consumption is not well supported in most of these tools because of the difficulty of obtaining fast and accurate power estimates. To this end, this paper proposes and...
This paper investigates the optimal number of processors for executing a parallel job, whose speedup profile obeys Amdahl's law, on a large-scale platform subject to fail-stop and silent errors. We combine traditional checkpointing and rollback recovery strategies with verification mechanisms to cope with both error sources. We provide an exact formula for the execution overhead incurred by...
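As background (the paper's exact overhead formula is not reproduced here), Amdahl's law and the classical first-order checkpointing period can be written as:

    % Amdahl's law: with sequential fraction \alpha, the speedup on p processors
    % and the corresponding parallel execution time are
    S(p) = \frac{1}{\alpha + \frac{1 - \alpha}{p}},
    \qquad
    T(p) = T(1)\left(\alpha + \frac{1 - \alpha}{p}\right).

    % Classical first-order (Young) approximation of the optimal checkpointing
    % period, with checkpoint cost C and platform MTBF \mu:
    T_{\mathrm{opt}} \approx \sqrt{2\,C\,\mu}.

With an individual-node MTBF of \mu_{ind}, the platform MTBF is roughly \mu_{ind}/p, so adding processors shortens T(p) while increasing the failure frequency; balancing these two effects is what yields an optimal processor count.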
Shared memory is a critical issue for large distributed systems. Although several data coherency protocols have been proposed, selecting the protocol that best suits the application requirements and system constraints remains a challenge. The development of multi-coherency systems, in which different protocols can be deployed at runtime, appears to be an interesting alternative. In order...
High performance on modern computing platforms requires programs to be parallel, distributed, and run on heterogeneous hardware. However, programming such architectures is extremely difficult because the application must be implemented using multiple programming models combined in ad-hoc ways. To optimize distributed applications both for modern hardware and for modern programmers...
We present SpExSim, a software tool for quickly surveying legacy code bases for kernels that could be accelerated by FPGA-based compute units. We specifically aim for low development effort by considering the use of C-based high-level hardware synthesis, instead of complex manual hardware designs. SpExSim not only exploits the spatially distributed model of computation commonly used on FPGAs, but...
High-level simulation and modeling techniques have matured significantly over the last years and have become more and more important in practice, e.g., in industrial hardware development and especially the automotive domain. Complex and detailed modeling requires a lot of time for preparation and execution and is quite error prone, thereby increasing the average time-to-market significantly. One...
Split-execution computing leverages the capabilities of multiple computational models to solve problems, but splitting program execution across different computational models incurs costs associated with the translation between domains. We analyze the performance of a split-execution computing system developed from conventional and quantum processing units (QPUs) by using behavioral models that track...
The relentless push in technology scaling driven by Moore's law has yielded fantastic gains in the number of transistors available on a chip. Computer architects have exploited these extra transistors by incorporating several computing cores within a single processor. Heterogeneous processing in particular has become a useful technique for dealing with ever-present power and memory restrictions...