It is common for real-world applications to analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have been proposed for processing big graphs on disk, the high disk I/O overhead could significantly reduce performance. In this paper, we propose GraphH to enable high-performance...
A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne platforms, etc.), observational facilities (meteorological, eddy covariance, etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such...
Property graphs are becoming popular for Intrusion Detection Systems (IDSs) because they make it possible to leverage distributed graph processing platforms to identify malicious network traffic patterns. However, a benchmark for studying their performance when operating on big data has not yet been reported. In general, benchmarking a system involves the execution of workloads on datasets, where both...
Platform as a Service (PaaS) clouds provide part of the hardware/software stack and related services to tenant applications. Increased load is handled elastically by scaling, which either modifies the number of instances an application has available on the cloud or increases their available resources. However, because all these instances run inside isolated containers, experience gained by the first...
Nowadays, the Graphics Processing Unit (GPU) is essential for general-purpose high-performance computing because of its dominant parallel-computing performance compared to that of the CPU. There have been many successful efforts to use GPUs in virtualized environments. In particular, NVIDIA Docker provides a practical way to bring the GPU into container-based virtualized environments. However, most...
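For context on the container-based GPU virtualization the abstract refers to: modern Docker (19.03+) exposes NVIDIA GPUs to a container through the `--gpus` flag, which superseded the original `nvidia-docker` wrapper. A minimal illustrative invocation (the image tag is a placeholder, not tied to this paper):

```shell
# Run nvidia-smi inside a CUDA base container with all host GPUs visible.
# Requires the NVIDIA Container Toolkit on the host.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```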
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are...
We present EclipseMR, a novel MapReduce framework prototype that efficiently utilizes a large distributed memory in cluster environments. EclipseMR consists of double-layered consistent hash rings: a decentralized DHT-based file system and an in-memory key-value store that employs consistent hashing. The in-memory key-value store in EclipseMR is designed not only to cache local data but also remote...
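Consistent hashing, the building block this abstract names, can be illustrated with a minimal sketch (hypothetical node names; not EclipseMR's actual implementation). Keys and nodes are hashed onto a ring, and each key belongs to the first node point found clockwise, so adding or removing a node only remaps the keys adjacent to its points:

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Map a string to a point on the hash ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` virtual points on the ring
        # to smooth out the key distribution.
        self.ring = sorted((h(f"{n}#{i}"), n)
                           for n in nodes for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first virtual point at or after the key's
        # hash, wrapping around at the end of the ring.
        i = bisect.bisect(self.points, h(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("block-42")
```

The wrap-around in `lookup` is what gives the scheme its stability: removing a node leaves every other node's points, and hence every other key's owner, unchanged.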
The last few years saw the emergence of 64-bit ARM SoCs targeted for mobile systems and servers. Mobile-class SoCs rely on the heterogeneous integration of a mix of CPU cores, GPGPU cores, and accelerators, whereas server-class SoCs instead rely on integrating a larger number of CPU cores with no GPGPU support and a number of network accelerators. Previous works, such as the Mont-Blanc project, built...
In high-performance computing (HPC), end-to-end workflows are typically utilized to gain insights from scientific simulations. An end-to-end workflow consists of a scientific simulation and data analysis, and can be executed in situ, in transit, or offline. Existing studies on end-to-end workflows have largely focused on high-performance execution approaches. However, the emerging heterogeneous...
Various extensions of TCP/IP have been proposed to reduce network latency; examples include Explicit Congestion Notification (ECN), Data Center TCP (DCTCP) and several proposals for Active Queue Management (AQM). Combining these techniques requires adjusting various parameters, and recent studies have found that it is difficult to do so while obtaining both high performance and low latency. This is...
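The techniques this abstract names each have a standard Linux knob, and combining them is a matter of setting those knobs consistently. A hypothetical configuration sketch (requires root; `eth0` is a placeholder device name, and the exact AQM discipline and thresholds are exactly the kind of parameters the abstract says are hard to tune):

```shell
# Switch the congestion-control algorithm to DCTCP.
sysctl -w net.ipv4.tcp_congestion_control=dctcp
# Negotiate ECN on outgoing and incoming connections.
sysctl -w net.ipv4.tcp_ecn=1
# Install an AQM queueing discipline on the interface.
tc qdisc replace dev eth0 root fq_codel
```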
Studying the interaction among applications, MPI runtimes, and the fabric they run on is critical to understanding application performance. There exists no high-performance and scalable tool that enables understanding this interplay on modern multi-petaflop systems. Designing such a tool is non-trivial and involves multiple components including 1) data profiling/collection from network/MPI library,...
Scientific data sets, which grow rapidly in volume, are often accompanied by plentiful metadata, such as information about the associated experiment or simulation. When this metadata is not managed well, the data become difficult to utilize and their value is lost over time. Ideally, metadata should be managed along with its corresponding data by a single storage system, and should be directly accessible and updatable. However, existing...
In situ workflows contain tasks that exchange messages composed of several data fields. However, a consumer task may not necessarily need all the data fields from its producer. For example, a molecular dynamics simulation can produce atom positions, velocities, and forces; but some analyses require only atom positions. The user should decide whether to specialize the output of a producer task for...
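The output specialization described here can be sketched in a few lines (hypothetical field names drawn from the molecular-dynamics example; this is an illustration, not the paper's mechanism): the producer emits only the fields a given consumer subscribed to, rather than the full message.

```python
# A producer message with several data fields, as in the MD example.
frame = {
    "positions":  [(0.0, 0.0, 0.0), (1.2, 0.4, 0.9)],
    "velocities": [(0.1, 0.0, 0.0), (0.0, 0.2, 0.0)],
    "forces":     [(0.0, -9.8, 0.0), (0.0, -9.8, 0.0)],
}

def specialize(message: dict, needed: set) -> dict:
    """Keep only the data fields a consumer actually requested."""
    return {field: message[field] for field in needed if field in message}

# An analysis that only needs atom positions receives a smaller message.
analysis_input = specialize(frame, {"positions"})
```

The trade-off the abstract raises is whether this filtering should happen in the producer (smaller messages, but a specialized producer) or in the consumer (generic producer, but full-size transfers).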
Stream processing applications continuously process large amounts of online streaming data in real-time or near real-time. They have strict latency constraints, but they are also vulnerable to failures. Failure recoveries may slow down the entire processing pipeline and break latency constraints. Upstream backup is one of the most widely applied fault-tolerant schemes for stream processing systems...
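Upstream backup, the fault-tolerance scheme this abstract builds on, can be sketched minimally (illustrative only; class and method names are hypothetical): an upstream operator buffers every tuple it emits until the downstream operator acknowledges it, and replays the unacknowledged suffix after a failure.

```python
from collections import deque

class UpstreamBackup:
    def __init__(self):
        self.buffer = deque()  # (seq, tuple) pairs not yet acknowledged
        self.seq = 0

    def emit(self, item):
        # Assign a sequence number and keep a backup copy before sending.
        self.seq += 1
        self.buffer.append((self.seq, item))
        return self.seq, item

    def ack(self, seq):
        # Downstream has durably processed everything up to `seq`;
        # trim the backup buffer accordingly.
        while self.buffer and self.buffer[0][0] <= seq:
            self.buffer.popleft()

    def replay(self):
        # After a downstream failure, resend all unacknowledged tuples.
        return [item for _, item in self.buffer]
```

The latency concern the abstract raises follows directly from this sketch: during recovery the pipeline must reprocess the whole replayed suffix before it can resume real-time progress.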
Relational databases are well suited for vertical scaling; however, specialized hardware can be expensive. Conversely, NewSQL and NoSQL data stores are designed to scale horizontally. NewSQL databases provide ACID transaction support; however, joins are limited to the partition keys, resulting in restricted query expressiveness. On the other hand, NoSQL databases are designed to scale out on commodity...
Efficiently programming shared-memory machines is a difficult challenge because mapping application threads onto the memory hierarchy has a strong impact on performance. However, optimizing such thread placement is difficult: architectures are becoming increasingly complex, and application behavior changes with implementations and input parameters, e.g., problem size and number of threads. In this work,...
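One common interface for the thread placement discussed here is the OpenMP binding environment variables, which control where threads land on the memory hierarchy. A hypothetical invocation (the application name is a placeholder; this is a generic illustration, not this paper's approach):

```shell
# Pin 16 threads to cores. "spread" distributes threads across the
# machine (better aggregate bandwidth on NUMA systems); "close" would
# pack them near the master thread (better sharing through caches).
export OMP_NUM_THREADS=16
export OMP_PLACES=cores
export OMP_PROC_BIND=spread
./my_app
```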
Most applications running on supercomputers achieve only a fraction of a system's peak performance. It has been demonstrated that co-scheduling applications can improve overall system utilization. However, for this approach to work, applications need to fulfill certain criteria so that their mutual slowdown is kept to a minimum. In this paper, we present an HPC scheduler that applies co-scheduling...
Resource usage data, collected using tools such as TACC_Stats, capture the resource utilization by nodes within a high performance computing system. We present methods to analyze the resource usage data to understand the system performance and identify performance anomalies. The core idea is to model the data as a three-way tensor corresponding to the compute nodes, usage metrics, and time. Using...
Almost all performance analysis tools in the HPC space perform some form of aggregation to compute summary information of a series of performance measurements, from summations to more complex operations like histograms. Aggregation not only reduces data volumes and consequently storage space requirements and overheads, but is also crucial to extract insights from recorded measurement data. In current...
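The two ends of the aggregation spectrum mentioned here, a plain summation and a histogram, can be sketched with a toy measurement series (values are made up for illustration):

```python
from collections import Counter

# Hypothetical series of per-call latency measurements (microseconds).
samples = [12, 15, 9, 120, 14, 11, 310, 13, 10, 16]

# Simplest aggregation: a single summary value.
total = sum(samples)

def histogram(values, bin_width):
    """Count samples per fixed-width bin, keyed by the bin's lower edge."""
    counts = Counter(v // bin_width for v in values)
    return {b * bin_width: counts[b] for b in sorted(counts)}

# A richer aggregation that preserves the distribution's shape
# (e.g. the two outliers above 100 us) at a fraction of the raw volume.
hist = histogram(samples, bin_width=100)
```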
Accelerated clusters, which are distributed-memory systems equipped with accelerators, have been used in various fields. For accelerated clusters, programmers often implement their applications using a combination of MPI and CUDA (MPI+CUDA). However, this approach suffers from programming complexity. This paper introduces the XcalableACC (XACC) language, which is a hybrid model of XcalableMP (XMP) and...