This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs). Previously available benchmarks for multiprocessors have focused on high-performance computing applications and used a limited number of synchronization methods. PARSEC includes emerging applications in recognition, mining and...
PWCS (Probabilistic Write / Copy-Select) is a new kind of lock-free synchronization mechanism with wait-free characteristics, proposed by Nicholas Mc Guire at the 13th Real-Time Linux Workshop, which utilizes the inherent randomness of modern computer systems. It aims at addressing the multi-reader/single-writer problem in Linux. Based on the original label-based PWCS, we propose a hash-based...
FPGAs have grown considerably in the past years. It is now possible to implement several soft-core processors in one FPGA. This enables considerable parallelism for the developer. Unfortunately, most application code is still available in sequential form. Thus, in this contribution we present a tool that enables the automated transformation of an application into a streaming pipeline using...
Ranking is an important operation in web searching. Among many ranking algorithms, PageRank is one of the most notable. However, sequential PageRank computing on a large web-link graph is not efficient. To address this limitation, parallel PageRank implemented on the Message Passing Interface (MPI) is a viable choice. Generally speaking, MPI-PageRank will be implemented using a root node and many computing,...
We present a hierarchical test and repair flow for shared BISR (Built-In Self-Repair) in asynchronous multi-processors. The flow partitions the memories local to a processor in groups and treats the groups as a whole when doing the repair. The flow runs automatically with few interventions except at the beginning stage. It can be used effectively for practical industrial test and repair. Its test...
Multi-relational concept discovery is a predictive learning task that aims to discover descriptions of a target concept in the light of past experiences. Parallelization has emerged as a solution to deal with efficiency and scalability issues relating to large search spaces in concept discovery systems. In this work, we describe a parallelization method for the ILP-based concept discovery system called...
There has always been a strong relationship between computer applications and computer architectures. Advances in computer architecture enable new usage models; new usage models challenge new architectures. For many decades, the interplay between applications and architectures has resulted in significant progress in computer technologies. Recently, the computer industry has adopted the multi/many-core...
Estimating the execution time of programs has always been a concern in computer science. With the emergence of multi-core processors, this concern has taken on new perspectives, and new parameters affect the runtime performance of parallel applications. To estimate the execution time of parallel applications, we investigate the overheads caused by parallelizing an application by identifying the overheads...
This paper describes the design and application of an execution-driven parallel simulator for predicting the performance of large-scale parallel computers. The simulator can be used in hardware validation and software development for large-scale parallel computers. It simulates the processors of each node, network components and disk I/O components. To illustrate the capabilities of our simulator, we describe...
A personal high-performance computer (PHPC) requires low cost and high performance. Teraflops PHPC systems with special accelerator units such as GPGPUs have been presented, but they have difficulties in programming, compatibility and applicability. In this paper, we present HPP-PHPC, a hybrid architecture of heterogeneous processors connected by a non-coherent off-chip system bus. The performance of...
An automatic parallelization method for tightly nested loops running on multi-core systems has been proposed. First, according to the physical characteristics of multi-core processors, a way has been presented to solve the problem of data locality during data decomposition. Second, for increasing the parallel granularity of tightly nested loops, the method discussed in this article studied computation decomposition...
The development of parallel algorithms for batch and single pattern back propagation training of a multilayer perceptron and the research of their efficiency on a general-purpose parallel computer are presented in this paper. The multilayer perceptron model and the sequential batch and single pattern training algorithms are theoretically described. An algorithmic description of the parallel versions...
In large-scale 3-D prestack Kirchhoff depth migration, the memory required for the imaging space is large, and can exceed the total memory of a single node. This paper therefore proposes a partitioning algorithm based on process groups in a distributed-memory environment. The imaging space is partitioned within a group, and the processes of the group achieve...
Distributed simulation techniques are commonly used to improve the speed and scalability of wireless sensor network simulators. However, accurate simulations of dynamic interactions of sensor network applications incur large synchronization overheads and severely limit the performance of existing distributed simulators. In this paper, we present two novel techniques that significantly reduce such...
A high-bandwidth critical path monitor (1 sample/cycle at 4-5 GHz) capable of providing real-time timing margin information to a variable voltage/frequency scaling control loop is described. The critical path monitor tracks the critical path delay to within 1 FO2 inverter delay with a standard deviation less than 3 FO2 delays over process, voltage, temperature, and workload. The CPM is sensitive...
Many Web services are expected to run with a high degree of security and dependability. To achieve this goal, it is essential to use a Web-services-compatible framework that tolerates not only crash faults, but Byzantine faults as well, due to the untrusted communication environment in which the Web services operate. In this paper, we describe the design and implementation of such a framework, called...
As Grid networks grow, the complexity of resource management in them increases dramatically. To manage Grid resources efficiently, a policy-based resource management system is suitable. However, a policy made at one moment may not be appropriate at another time, because the condition of the Grid resources changes over time. Thus, in this paper, we propose asynchronous policy-based...
This paper addresses the problem of how to adapt an algorithm designed for fixed topology networks to produce the intended results, when run in a network whose topology changes dynamically, in spite of encountering topological changes during its execution. We present a simple and unified procedure, called a reset procedure, which, when combined with the static algorithm, achieves this adaptation....
Fault-free digital systems can fail as a result of metastable behavior when asynchronous inputs have critical timing combinations. The problem of metastable behavior is generally considered to be unavoidable in digital systems that synchronize asynchronous inputs. This correspondence extends previous results on the unavoidability of metastable behavior. The set of inputs to the digital system is generalized...