The non-contiguous access pattern of many scientific applications results in a large number of I/O requests, which can seriously limit the data-access performance. Collective I/O has been widely used to address this issue. However, the performance of collective I/O could be dramatically degraded in today's high-performance computing systems due to the increasing shuffle cost caused by highly concurrent...
A scientific workflow involves data generation, data analysis, and knowledge discovery. As the data volume exceeds a few terabytes (TB) in a single simulation run, data movement among data generation, data analysis, and knowledge discovery becomes a bottleneck in most scientific big data applications. Our previous work shows that reusing the analysis results can have a significant...
Scientific I/O libraries, such as PnetCDF, ADIOS, and HDF5, are commonly used to facilitate array-based scientific dataset processing. The underlying physical data layout information, however, is usually hidden from the upper layer's logical access. Such a mismatch can lead to poor I/O performance. In this research, we have observed performance degradation in the case of concurrent sub-array accesses,...
High-end computing (HEC) applications and simulations have become increasingly data intensive. The pressure on storage system capability has increased substantially in recent years. Traditional hard disk drives (HDDs) are the dominant storage devices in HEC, but suffer from seek-time delays and rotational latencies. Emerging storage-class memory, such as solid-state drives (SSDs), provides a new promising...
The non-contiguous access pattern of many scientific applications results in a large number of I/O requests, which can seriously limit the data-access performance. Collective I/O has been widely used to address this issue. However, the performance of collective I/O could be dramatically degraded in today's high-performance computing systems due to the increasing shuffle cost caused by highly concurrent...
Scientific datasets and libraries, such as HDF5, ADIOS, and NetCDF, have been used widely in many data intensive applications. These libraries have their special file formats and I/O functions to provide efficient access to large datasets. When the data size keeps increasing, these high level I/O libraries face new challenges. Recent studies have started to utilize database techniques such as indexing...
Scientific datasets, such as HDF5 and PnetCDF, have been used widely in many scientific applications. These data formats and libraries provide essential support for data analysis in scientific discovery and innovations. In this research, we present an approach to boost data analysis, namely Fast Analysis with Statistical Metadata (FASM), via data subsetting and integrating a small amount of statistics...
Many high-end computing applications in critical areas of science and technology are becoming more and more data intensive. These applications transfer large amounts of data from storage nodes to compute nodes for processing, which is costly and bandwidth consuming. The data movement often dominates the applications' run time. Active storage provides a promising solution for these applications by...
Active storage provides an effective method to mitigate the I/O bottleneck problem of data intensive high performance computing applications. It can reduce the amount of data transferred as the application runs by moving appropriate computations close to the data. Prior research has achieved considerable progress in developing several active storage prototypes. However, existing studies have neglected...