Previous research has demonstrated the potential benefits of thermal-aware load placement and thermal mapping in cooling-intensive environments such as data centers. However, it has proved difficult to apply existing techniques to live data centers because the models are either unrealistic or require extensive sensing instrumentation, or because their creation is disruptive to the data center services...
The dark side of Moore's Law is our society's insatiable need to constantly upgrade our computing devices. The high cost in manufacturing energy, materials and disposal is more worrisome given the increasing number of smartphones. Repurposing smartphones for educational purposes is a promising idea and has shown success in recent years. Our previous work has shown that although different components in smartphones...
In late 2009, the National Institute for Computational Sciences placed in production the world's fastest academic supercomputer (third overall), a Cray XT5 named Kraken, with almost 100,000 compute cores and a peak speed in excess of one Petaflop. Delivering over 50% of the total cycles available to the National Science Foundation users via the TeraGrid, Kraken has two missions that have historically...
Large parallel machines with hundreds of thousands of processors are being built. Recent studies have shown that ensuring good load balance is critical for scaling certain classes of parallel applications on even thousands of processors. Centralized load balancing algorithms suffer from scalability problems, especially on machines with a relatively small amount of memory. Fully distributed load balancing...
It is imperative to consider the concept of sustainable portable computing as the role of such devices increases in our lives. With the emergence of the cloud computing paradigm, there will be an increased reliance on wireless communication from portable computing devices to more powerful centralized servers. This paradigm shift to 'thin-clients' presents an opportunity to make portable computing more...
In this work we focus on solutions to an emerging threat to cloud-based services, namely that of data seizures within a shared multi-customer architecture. We focus on the problem of securing distributed data storage in a cloud computing environment by designing a specialized multi-tenant data-storage architecture. The architecture we present not only provides high degrees of availability and confidentiality...
Grid and cloud schedulers benefit from predictable service for their choices in allocating jobs on remote servers/clusters. Predictable service on local clusters supports fairness and user satisfaction. The paper looks into servers that employ batch scheduling and support time sharing and/or space partitioning of the available resources among different parallel-job workloads. This provides the basis...
Providing recall-guaranteed search is critical for P2P networks. While building a semantic overlay improves search performance, existing designs suffer from a tradeoff between search time and search quality (i.e. high recall). Moreover, they incur high control overhead for overlay maintenance. In this paper, we present rSearch to achieve fast search with guaranteed high recall. The rSearch-enabled...
High End Computing (HEC) systems are being deployed with eight to sixteen compute cores, with 64 to 128 cores/node being envisioned for exascale systems. MVAPICH2 is a popular implementation of MPI-2 specifically designed and optimized for InfiniBand, iWARP and RDMA over Converged Ethernet (RoCE). MVAPICH2 is based on MPICH2 from ANL. Recently MPICH2 has been redesigned with an effort to optimize...
The current trend towards multi-core/manycore and accelerated architectures presents challenges, both in portability, and also in the choices that developers must make on how to use the resources that these architectures provide. This paper explores some of the possibilities that are enabled by the Open Computing Language (OpenCL), and proposes a programming model that will allow developers and scientists...
Balancing fairness, user performance, and system performance is a critical concern when developing and installing parallel schedulers. Sandia uses a customized scheduler to manage many of their parallel machines. A primary function of the scheduler is to ensure that the machines have good utilization and that users are treated in a "fair" manner. A separate compute process allocator (CPA)...
Data storage technologies have been recognized as one of the major dimensions of information management along with the network infrastructure and applications. The prosperity of cloud computing requires the migration from server-attached storage to network-based distributed storage. Along with various advantages, distributed storage also poses new challenges in creating a secure and reliable data...
Exploiting the performance of today's processors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in thread and cache topology. LIKWID is a set of command-line utilities that addresses four key problems: Probing the thread and cache topology of a shared-memory node, enforcing thread-core affinity on a program, measuring performance counter...
We propose a performance estimation technique for a multi-core segmented bus platform, SegBus. The technique enables us to assess the performance aspects of any specific application on a particular platform configuration, modeled in Unified Modeling Language (UML). We present methods to transform Packet Synchronous Data Flow (PSDF) and Platform Specific Model (PSM) models of the application into Extensible...
Many SoCs adopt multicore architectures. As a result, embedded programmers are also facing the challenge of parallel programming. We propose a parallel skeleton library that can be used on embedded multicores. Our library is implemented in standard C++ using template features. We propose two parallel skeletons to support common program patterns on multicores. In our skeleton library, programmers can...
With the growing interest in cloud computing and carbon emission reduction, how to build an energy-efficient cloud architecture has become a critical issue for service providers. In this paper, we propose a power-aware cloud architecture based on DRBL (Diskless Remote Boot in Linux), cpufreqd and xenpm. We also introduce a low-cost smart metering system based on the open-hardware Arduino board. Composing with...
Positioning is the key technique for developing geographic applications, such as location-based services. The Global Positioning System (GPS) is a common approach for positioning in vehicular navigation. Although GPS can provide absolute position information, its accuracy is not sufficient for personal navigation. What is worse, GPS does not work well indoors. Instead, Inertial Measurement...
Intrusion detection is one of the most important services in a smart home, and it requires monitoring intrusion events and reacting to them. A Wireless Sensor and Actor Network (WSAN) has a set of sensor nodes for monitoring events and a set of high capability nodes, called actor nodes, for reacting to the events. It can provide an infrastructure for building the intrusion detection system in...
Task scheduling is one of the most prominent problems in the era of parallel computing. We find scheduling algorithms in every domain of computer science, e.g., mapping multiprocessor tasks to clusters, mapping jobs to grid resources, or mapping fine-grained tasks to cores of multicore processors. Many tools exist that help understand or debug an application by presenting visual representations of...
One of the main issues in heterogeneous reconfigurable computing is the well-known processor/memory bottleneck. Because of memory bandwidth limitations, the execution performance of an application can increase dramatically through efficient usage of the memory. In this paper, we present tQUAD, a new tool for memory bandwidth usage analysis. This tool is capable of delivering detailed temporal...