Networks-on-chip in many-core embedded systems consume large portions of the chip's area, cost, delay and power. In real-time embedded systems, meeting the real-time targets is critical. Therefore, networks-on-chip must provide a communication infrastructure whose worst-case delays are low enough to meet the deadlines. This requirement directly translates into scalable networks with low diameters...
Oil reservoir simulation helps in extracting oil and in optimal well placement. This paper presents the parallelization, development, performance analysis, and profiling of a 2-phase oil-water reservoir simulator on a heterogeneous multi-core STI Cell computer. Despite the largely interdependent nature of the oil reservoir model equations, we obtained speedups of 6x with 1D reservoir data. We boosted...
Massively deployed inside Sony PS3 platforms, the STI Cell Broadband Engine is a multi-core processor with a PowerPC host processor (PPE) and 8 synergic processor engines (SPEs). In this paper, we describe three image processing applications which we implemented on the Cell BE. We report the performance measured on one Cell blade with varying numbers of synergic processor engines enabled, and with...
Following the recent design trend among major chip vendors, multicore systems are being deployed with multilevel caches to achieve higher levels of performance. Supporting real-time applications on multicore systems is a great challenge because caches are power hungry and make execution time less predictable. Studies show that timing predictability can be improved using cache locking...
The IBM Cell Broadband Engine (BE) is a multi-core processor with a PowerPC host processor (PPE) and 8 synergic processor engines (SPEs). The Cell BE architecture is designed to improve upon conventional processors in terms of memory latency, bandwidth and compute power. In this paper, we discuss the parallelization, implementation and performance of a video surveillance application on the IBM...
Signal processing has been implemented in many computing devices that are successfully being used in mission-critical NASA programs, military operations, and medical devices. The popularity of and demand for signal processing systems are increasing in many other domains, including commercial products. Many applications in signal processing systems need a tremendous amount of processing speed. In addition...
This paper makes the case for the Hyper-Ring (HR) as the interconnect or NoC for many-cores. While other prominent candidates for many-core interconnect such as the torus and mesh have superior bisection bandwidth to the HR, their cost, number of links and chip area are much higher than those of the HR. The worst-case latency or maximum hop count is relatively inferior on the mesh, while that of the HR is comparative...
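To make the comparison criteria above concrete, the following is a minimal sketch that computes link count, diameter (maximum hop count) and bisection width for a k x k 2D mesh and a k x k 2D torus using the standard textbook formulas. The function names are hypothetical, and the sketch assumes an even k of at least 4 and bidirectional links counted once. It does not model the Hyper-Ring, whose topology is not described in the abstract.

```python
def mesh_metrics(k):
    """Standard metrics for a k x k 2D mesh (assumes even k >= 4)."""
    return {
        "links": 2 * k * (k - 1),   # k rows and k columns of (k - 1) links each
        "diameter": 2 * (k - 1),    # corner-to-corner hop count
        "bisection_links": k,       # links cut when splitting into two halves
    }


def torus_metrics(k):
    """Standard metrics for a k x k 2D torus (assumes even k >= 4)."""
    return {
        "links": 2 * k * k,         # every node owns one link per dimension
        "diameter": 2 * (k // 2),   # wraparound halves the distance per dimension
        "bisection_links": 2 * k,   # wraparound links double the cut
    }


if __name__ == "__main__":
    for k in (4, 8):
        print(k, "mesh ", mesh_metrics(k))
        print(k, "torus", torus_metrics(k))
```

For the same node count, the torus buys a lower diameter and a wider bisection at the price of more links, which is the kind of cost trade-off the abstract weighs against the HR.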
Artificial neural networks have been employed in diverse applications ranging from control to pattern recognition and classification. While password detection can be implemented with a digital electronic circuit with non-volatile memory, this implementation is prone to hacking. In this paper, we present a 3-layer feedforward neural network which we have designed, trained and tested for secure password...
The IBM Cell Broadband Engine (BE) is a multi-core processor with a PowerPC host processor (PPE) and 8 synergic processor engines (SPEs). The Cell BE architecture is designed to improve upon conventional processors in terms of memory latency, bandwidth and compute power. In this paper, we discuss the parallelization, implementation and performance of the edge detection image processing application...
The IBM Cell Broadband Engine (BE) is a multi-core processor with a PowerPC host processor (PPE) and 8 synergic processor engines (SPEs). The Cell BE architecture is designed to improve upon conventional processors in terms of memory latency, bandwidth and compute power. In this paper, we describe a 2D graphics algorithm for image resizing which we parallelized and developed on the Cell BE. We report...
We analyze COSBI's open source mark (OSMark) benchmark in measuring CPU performance in desktop personal computers. We focus on selected tests of the benchmark that target the CPU, which we also profile. We run the benchmark on two personal computers, one single-core and one dual-core. We then collect performance event counts for the selected tests to characterize the benchmark's workload and we correlate...
Cache memory improves performance by reducing the speed gap between the CPU and the main memory. However, the execution time becomes unpredictable due to the cache's adaptive and dynamic behavior. Real-time applications are subject to operational deadlines and predictability is considered necessary to support them. Studies show that for embedded systems, cache locking helps determine the worst case...
In [1], we proposed a heterogeneous 16-core architecture and a nearest-neighbor, priority-based thread scheduling algorithm which values core affinity and permits inter-core thread migration. In this paper, we run simulation experiments under localized core affinities limited to a very small number of specific cores within core classes. These localized schedules represent applications that...
In this paper, an approach to vehicle license plate localization is described. The algorithm starts by isolating objects in the image that are possible candidates for characters in a license plate. It then uses distances between objects and their relative positions to identify possible groupings (series) of characters that could belong to a license plate. The algorithm then uses a novel character...
We modified the Java code of the MOSS simulator to develop a robust virtual memory simulator which allows the user to easily switch between different page replacement algorithms including FIFO, LRU, and optimal replacement algorithms. The simulator clearly demonstrates the behavior of the page replacement algorithms in a virtual memory system, and provides a convenient way to obtain their page fault...
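As an illustration of what such a simulator compares, here is a minimal sketch that counts page faults for FIFO, LRU and optimal replacement on a page reference string. It is written in Python rather than the Java of the MOSS simulator, and the function names are hypothetical; it is not the authors' code.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, frames):
    """Count page faults when the page loaded earliest is evicted first."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            resident.discard(order.popleft())
        resident.add(page)
        order.append(page)
    return faults


def lru_faults(refs, frames):
    """Count page faults when the least recently used page is evicted."""
    resident, faults = OrderedDict(), 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)      # mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)    # drop the least recently used
        resident[page] = True
    return faults


def optimal_faults(refs, frames):
    """Count page faults when the page used farthest in the future is evicted."""
    resident, faults = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            future = refs[i + 1:]

            def next_use(p):
                # Pages never referenced again sort as farthest away.
                return future.index(p) if p in future else len(future)

            resident.discard(max(resident, key=next_use))
        resident.add(page)
    return faults


if __name__ == "__main__":
    refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    for name, fn in (("FIFO", fifo_faults), ("LRU", lru_faults),
                     ("OPT", optimal_faults)):
        print(name, fn(refs, 3))   # prints FIFO 15, LRU 12, OPT 9
```

The optimal policy needs the full future reference string, so it serves only as a lower bound against which FIFO and LRU fault counts can be judged.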
We present a framework for cross-layer optimization in small, resource-constrained systems which require a high degree of optimization. We argue that often these systems allow for a departure from conventional network stack design principles opening up broad opportunities for optimizations. We examine these new opportunities and propose a design strategy to take advantage of them. Simulation results...
Artificial intelligence and physics are two components of 3D games. While they bring the laws of nature and more realism to games, they may be intensive enough to leave a sizable dent in the game's performance and lead to a serious reduction in frame rate. In this paper, we discuss a game implementation solution with multi-core architectures which hides the computational load of the game intelligence...
With stunning visual effects, 3DMark® emerged as the leading PC benchmark for 3D gaming performance. Its tests are at the cutting edge of consumer graphics and push the limit of 3D rendering with spectacular scenes and state-of-the-art lighting techniques. The benchmark scores help quickly differentiate the platforms with state-of-the-art graphics cards and processors from those with older components...