The in-memory data processing framework Apache Spark has been stealing the limelight for low-latency interactive applications and for iterative and batch computations. Our early experience study [17] has shown that Apache Spark can be enhanced to leverage advanced features (e.g., RDMA) on high-performance networks (e.g., InfiniBand and RoCE) to improve the performance of the shuffle phase. With the fast evolving...
The limited local storage space in HPC environments has placed an unprecedented demand on the performance of the underlying shared parallel file systems. This has necessitated a scalable solution for running Big Data middleware (e.g., Hadoop) on HPC clusters. In this paper, we propose Boldio, a hybrid and resilient key-value store-based Burst-Buffer system Over Lustre for accelerating I/O-intensive...
The most popular Big Data processing frameworks today are Hadoop MapReduce and Spark. The Hadoop Distributed File System (HDFS) is the primary storage for these frameworks. Big Data frameworks like Hadoop MapReduce and Spark launch tasks based on data locality. In the presence of heterogeneous storage devices, when different nodes have different storage characteristics, only locality-aware data...
Virtualization has taken on a central role in HPC Cloud due to easy management and the low cost of computation and communication. Recently, Single Root I/O Virtualization (SR-IOV) technology has been introduced for high-performance interconnects such as InfiniBand and can attain near-native performance for inter-node communication. However, the SR-IOV scheme lacks locality-aware communication support,...
An increasing number of MPI applications are being ported to take advantage of the compute power offered by GPUs. Data movement on GPU clusters continues to be the major bottleneck that keeps scientific applications from fully harnessing the potential of GPUs. Earlier, GPU-GPU inter-node communication had to move data from GPU memory to host memory before sending it over the network. MPI libraries like...
Hadoop is the de facto standard platform for large-scale data analytics applications. In spite of its high availability and reliability guarantees, the Hadoop Distributed File System (HDFS) suffers from huge I/O bottlenecks when storing tri-replicated data blocks. The I/O overheads intrinsic to the HDFS architecture degrade application performance. In this paper, we present a novel design (MEM-HDFS)...