GPU-based computing, known as GPGPU, has become one of the most popular fields of high-performance computing. This paper focuses on the design and implementation of a uniform GPGPU application that is optimized for both legacy and recent GPU architectures. As a typical example of such a GPGPU application, this paper discusses the uniform implementation of the Caravela platform. Especially...
Two new multiprocessor architectures to accelerate the simulation of multi-agent worlds based on the massively parallel GCA (Global Cellular Automata) model are presented. The GCA model is suited to describe and simulate different multi-agent worlds. The designed and implemented architectures mainly consist of a set of processors (NIOS II) and a network. The multiprocessor systems allow the implementation...
Dependability of many-core processors is a very important topic. To improve the dependability, we propose the Smart Core system, which is a smart many-core system with redundant cores and multifunction routers. The multifunction router has three functions: copying packets, changing the destinations of packets, and rendezvousing and comparing two packets from different nodes. Using these additional...
As supercomputers scale to a million processor cores and beyond, the underlying resource management architecture needs to provide a flexible mechanism to manage the wide variety of workloads executing on the machine. In this paper we describe the novel approach of the Blue Gene/Q (BG/Q) supercomputer in addressing these workload requirements by providing resource management services that support both...
Computing is now shifting towards multiprocessing. The fundamental goal of multiprocessing is improved performance through the introduction of additional hardware threads or cores (referred to as “cores” for simplicity). Modern network stacks can exploit parallel cores to allow either message-based parallelism or connection-based parallelism as a means to enhance performance. OpenSolaris has redesigned...
Modern embedded multiprocessors are complex systems that often require years to design and verify. A significant factor is that engineers must allocate a disproportionate share of their effort to ensuring that modern FPGA chip architectures behave correctly. This paper proposes the design and creation of an embedded multiprocessor architecture system, focusing on its design area and performance. Embedded...
The complexity of upcoming MP2SoC architectures is such that many issues arise simultaneously, including multicore programming, system performance, reliability, and scalability. The key to solving these issues is self-adaptability: future chips have to integrate the software and hardware means required to monitor and react autonomously to the various kinds of events that are likely to occur during chip's...
Summary form only given. Dynamic reconfiguration of hardware means changing the hardware while the system is operating. Its benefit is adaptation to different computing requirements. For instance, improved use of communication networks can be achieved: many networks have the characteristic that connections between specific communication partners show a smaller latency than others...
This paper presents a new dimension-oriented routing algorithm for Mesh-of-Tree (MoT) based Network-on-Chip (NoC) architecture. The addressing scheme is considerably simplified, which enables us to reduce the minimum flit size to 16 bits, compared to 32 bits in previously reported works. The same level of throughput and average latency could be achieved with a 43.86% reduction in area and 43% reduction...
The microprocessor of the year 2020 will have 1000 cores on it, and unless you get involved, it will either just be an array of cores thrown over the transom for you to figure out what to do with, or it will be easy to use but run like a turtle, compared to what it could do. These two extremes are not unlikely, unless those with applications get involved. Most of the gurus of computer architecture...
Parallelism is the most important means of exploiting the computational potential of multi-core processors. Real applications, particularly commercial ones, often have strong dependences that must be respected. To achieve reasonably good performance, hybrid parallelism schemes usually need to be applied in these applications. Furthermore, parallel applications with task and pipeline parallelism...
This article describes a high-speed logic associative multiprocessor for concurrently analyzing information represented in analytic, graph, and table forms of associative relations in order to search, recognize, and make decisions in an n-dimensional vector discrete space. Vector-logical process models of actual applications, for which the quality of solution is estimated by the proposed integral non-arithmetical...
Recently, the GPU has evolved into a highly parallel, multithreaded, many-core processor with tremendous computational capability and very high memory bandwidth. At the same time, multi-core CPU evolution has continued, and today's CPUs have 4-8 cores, which offer dramatically increased performance and power-saving characteristics. We are aware of very few works that consider both devices cooperating to solve...
Technology scaling is having an increasingly detrimental effect on microprocessor reliability, with increased variability and higher susceptibility to errors. At the same time, as integration of chip multiprocessors increases, power consumption is becoming a significant bottleneck that could threaten their growth. To deal with these competing trends, energy-efficient solutions are needed to deal with...
With the increase in the design complexity of MPSoC architectures, estimating power consumption at lower levels of abstraction is very complex and time consuming. We propose a methodology using ArchC, named Power-ArchC, for fast high-level estimation of processor power consumption. Power values are obtained by an instruction-level power characterization at gate level. The requirements for power evaluation...
Next-generation real-time systems will be increasingly based on heterogeneous MPSoC design paradigms, where predictability and performance will be key issues to deal with. Such issues can be tackled both at the hardware level, by embedding technologies such as TDMA busses, and at the OS level, where suitable scheduling techniques can improve performance and reduce energy consumption. Among these,...
Early design space exploration (DSE) is a key ingredient in system-level design of MPSoC-based embedded systems. The state of the art in this field typically still explores systems under a single, fixed application workload. In reality, however, the applications are concurrently executing and contending for system resources in such systems. As a result, the intensity and nature of application demands...
Silicon technology scaling is continuously enabling denser integration capabilities. However, this comes at the expense of higher variability and susceptibility to wear-out. With an escalating number of on-chip components expected to be defective in near-future chips, modern parallel systems, such as Chip Multi-Processors (CMP), become especially vulnerable to these faults. Just a single link failure...
The following topics are dealt with: high performance architecture; synchronous interfaces; cache architecture; cryptography; real-time systems; signal processing; multiprocessor systems; and networks-on-chip.
This paper examines the initial parallel implementation of SCATTER, a computationally intensive inelastic neutron scattering routine with polycrystalline averaging capability, for the General Utility Lattice Program (GULP). Of particular importance to structural investigation on the atomic scale, this work identifies the computational features of SCATTER relevant to a parallel implementation and presents...