The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We present an accurate online scalability prediction model for data-parallel programs on NUMA many-core systems. Memory contention is considered to be the major limiting factor of program scalability as data parallelism limits the amount of synchronization or data dependencies between parallel work units. Reflecting the architecture of NUMA systems, contention is modeled at the last-level caches of...
Programmable accelerators (PA) are receiving increased attention in domain-specific architecture designs to provide more general support for customization. In a PA-rich system, computational kernels are compiled into predefined PA templates and dynamically mapped to real PAs at runtime. This imposes a demanding challenge on the compiler side - that is, how to generate high-quality PA mapping code...
Graph-structured data has come into wide use in various fields where graphs are the natural data structure to model networks. Therefore, the comparison between two graphs becomes a research focus. Traditional approaches for graph comparison face the common problem: either increasing the runtime for large graphs or simplifying the representation of graphs which ignores part of their topological information...
We are proposing a novel framework that ameliorates locality-aware parallel programming models, by defining hierarchical data locality model extension. We also propose a hierarchical thread partitioning algorithm. This algorithm synthesizes hierarchical thread placement layouts that targets minimizing the program's overall communication costs. We demonstrated the effectiveness of our approach using...
Traditional means of gathering performance data are tracing, which is limited by the available storage, and profiling, which has limited accuracy. Performance modeling is often used to interpret the tracing data and generate performance predictions. We aim to complement the traditional data collection mechanisms with online performance modeling, a method that generates performance models while the...
Taking an existing large-scale simulation model of the German toll system we identify the typical workload by profiling the runtime behavior. Crucial performance hot spots are identified and related to the real-world application to analyze and evaluate the observed efficiency. In a benchmark approach we compare the observed performance to different simulation frameworks.
While Electronic Design Automation made the shift towards system design and high-level design methods keep on emerging, there is hardly any open framework which allows researchers to quickly prototype novel synthesis algorithms. We present System#, an open source system level design framework based on C#. System# tries to bridge the productivity gap by covering modeling, simulation, code transformations...
This paper links a well-investigated formalism for describing dynamic structured discrete event systems and a modelling methodology for runtime reconfigurable systems. The theory behind dynamic structured discrete event systems is used to back a generic SystemC simulation model derived from and developed for runtime reconfigurable systems. The coupling of formalism and model effects in particular...
Data sanitization has been studied in the context of architectures for high assurance systems, language-based information flow controls, and privacy-preserving data publication. A range of sanitization strategies has been developed to address the wide variety of data content and contexts that arise in practice. It is therefore tempting to separate the complex downgrading operations into untrusted...
The many-core revolution brought forward by recent advances in computer architecture has created immense challenges in the writing of parallel programs for High Performance Computing (HPC). Development of parallel HPC programs remains an art, and a universaldoctrine for synchronization, scheduling and execution in general has not been found for many-core/multi-core architectures. These issues are...
As data sizes continue to increase, the concept of active storage is well fitted for many data analysis kernels. Nevertheless, while this concept has been investigated and deployed in a number of forms, enabling it from the parallel I/O software stack has been largely unexplored. In this paper, we propose and evaluate an active storage system that allows data analysis, mining, and statistical operations...
Distributed systems with shared memory and more than one consistency model for shared data are often restricted in use or inflexible for programmers. This paper describes details of our transactional distributed memory system, that provides several consistency models for shared memory. To this end Rainbow OS implements so-called split objects guaranteeing the integrity of heap structures and providing...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.