Mobility management is an important issue for publish/subscribe systems to support mobile clients. The objectives of mobility management for publish/subscribe are to achieve short handoff delay and low message overhead, while at the same time guaranteeing reliable message delivery. Although mobility management has been extensively studied, the indirect communication style of publish/subscribe systems...
Modern interconnects and corresponding high-performance MPIs have been feeding the surge in the popularity of compute clusters and computing applications. Recently, with the introduction of the iWARP (Internet Wide Area RDMA Protocol) standard, RDMA and zero-copy data transfer capabilities have been introduced and standardized for Ethernet networks. While traditional Ethernet networks had largely been...
Clusters of symmetric multiprocessors (SMP) are more commonplace than ever in achieving high performance. Scientific applications running on clusters employ collective communications extensively. Using shared memory communication among co-located processes on SMP nodes as well as remote direct memory access (RDMA) operations for inter-node communication and trying to overlap them is a proven technique...
With the development of network technologies, many mechanisms have been introduced to improve system performance in cluster systems by exploiting remote idle memory. However, none of them can satisfy the requirements of different applications. Most methods can improve the performance of only a particular type of application but not others. One important reason is that they fail to provide unified...
The problem of scheduling divisible loads on a distributed bus network with start-up costs is considered. In this paper, we present a novel algorithm to obtain one-round installment load distribution solutions by utilizing multiple data transfer streams. We demonstrate how our proposed algorithm works by means of multiple illustrative examples. We compare the makespan generated by our proposed algorithm...
The Spidergon scheme is a commercial NoC (Network-on-Chip) proposed recently to address the demand for a fixed and optimized topology to realize cost-effective multi-processor SoC (MPSoC) development. The increasing diversity of applications' quality-of-service requirements may, however, inhibit employing a particular architecture for a wide range of applications, unless the performance it delivers...
Compute clusters are consuming more power at higher densities than ever before. This results in increased thermal dissipation, the need for powerful cooling systems, and ultimately a reduction in system reliability as temperatures increase. Over the past several years, the research community has reacted to this problem by producing software tools such as HotSpot and Mercury to estimate system thermal...
A new deadlock-free adaptive routing algorithm is proposed for n-dimensional meshes with only two virtual channels, where a virtual channel can be shared by two consecutive planes without any cyclic channel dependency. A message is routed along a series of planes. The proposed planar adaptive routing algorithm is enhanced to a fully adaptive routing version for 3-dimensional meshes using the idle...
Quality of service (QoS) mechanisms allowing users to request turnaround-time guarantees for their jobs have recently generated much interest. In our previous work we designed a framework, QoPS, to allow for such QoS. This framework provides an admission control mechanism that only accepts jobs whose requested deadlines can be met and, once accepted, guarantees these deadlines. However, the...
The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing. However, existing research shows that such a reactive fault tolerance approach can only improve system productivity marginally. Leveraging the recent progress made in the field of failure prediction, we propose fault-driven...
A traditional application scheduler running on a parallel cluster only supports static scheduling, where the number of processors allocated to an application remains fixed throughout the lifetime of the job. Due to unpredictability in job arrival times and varying resource requirements, static scheduling can result in idle system resources, thereby decreasing the overall system throughput. In this paper...
As genome databases grow in size, there are an increasing number of methods for detecting sequence similarity and increasing demand for genome sequence search and alignment services. It is a challenge to scale up computer systems to serve these demands in a timely manner. This paper tackles this problem from a novel perspective, which treats the sequence search requests as content requests...
Real-time image formation in future high-end synthetic aperture radar systems is an example of an application that puts new demands on computer architectures. The initial question is whether it is at all possible to meet the demands with state-of-the-art technology or foreseeable new technology. It is therefore crucial to understand the computational flow, with its associated memory, bandwidth...
We propose and evaluate a novel approach for automatic parallelization. The approach uses traces as units of parallel work. We discuss the benefits and challenges of the use of traces and propose an execution model for automatic parallelization based on traces. We implement a system that demonstrates the benefits and addresses the challenges of using traces for data-parallel applications in an offline...
Although data parallelism is a well-known computational model, there are few programming systems that are both easy to program (for simple applications) and able to work across administrative domains. For data sets (e.g., collections of image data) that are often inherently distributed, there is a need for a simple data-parallel programming system. We describe the design, implementation, and an evaluation...
A flooding-based search mechanism is often used in unstructured P2P systems. Although a flooding-based search mechanism is simple and easy to implement, it is vulnerable to overlay distributed denial-of-service (DDoS) attacks. Most previous security techniques protect networks from network-layer DDoS attacks, but cannot be applied to overlay DDoS attacks. Overlay flooding-based DDoS attacks can be...
NFS has traditionally used TCP or UDP as the underlying transport. However, the overhead of these stacks has limited both the performance and scalability of NFS. Recently, high-performance networks such as InfiniBand have been deployed. These networks provide low latency of a few microseconds and high bandwidth of up to 20 Gbps for large messages. Because of the unique characteristics of NFS protocols,...
Broadcast in radio-based wireless networks has been a difficult problem. When a node broadcasts, all nodes within its radio coverage will attempt to relay the message by rebroadcasting, causing excessive radio communication in the region that leads to the broadcast storm problem. Most previous solutions assumed a perfect radio condition with a static, circular coverage. However, in real situations,...
The fast and accurate evaluation of transcendental functions (e.g. exp, log, sin, and atan) is quite important in many domains. We implement a software inline function library that can be called from the KernelC programming language to compute 8 typical functions on the Imagine architecture. By exploiting some of the key features of the Imagine architecture, we have been able to provide single-precision transcendental...
There is a growing trend to insert application intelligence into network devices. Processors in this type of application-oriented networking (AON) device are required to handle both packet-level network I/O-intensive operations and XML message-level CPU-intensive operations. In this paper, we investigate the performance effect of dual processing via (1) hyperthreading, (2) uni-processor to dual-...