The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Carrying out even the simplest performance benchmark requires considerable knowledge of statistics and computer systems, and painstakingly following many error-prone steps, which are distinct skill sets yet essential for getting statistically valid results. As a result, many performance measurements in peer-reviewed publications are flawed. Among many problems, they fall short in one or more of the...
In archival systems, storage media are often replaced much earlier than their expected service life in exchange for other benefits of new media, such as higher capacity, bandwidth, and I/O operations per second, or lower costs. In an era of decreasing media density growth rates, retiring media early by considering only short-term benefits while discarding potential long-term cost benefits could have...
Benchmarks are widely used to perform apples-to-apples comparison in a controlled and reliable fashion. Benchmarks must model real world workload behavior. In recent years, to meet web scale demands, Key-Value (KV) stores have emerged as a vital component of cloud serving systems. The Yahoo! Cloud Serving Benchmark (YCSB) has emerged as the standard benchmark for evaluating key-value systems, and...
Shingled Magnetic Recording (SMR) is a means of increasing the density of hard drives that brings a new set of challenges. Due to the nature of SMR disks, updating in place is not an option. Holes left by invalidated data can only be filled if the entire band is reclaimed, and a poor band compaction algorithm could result in spending a lot of time moving blocks over the lifetime of the device. We...
Maintaining information privacy is challenging when sharing data across a distributed long-term datastore. In such applications, secret splitting the data across independent sites has been shown to be a superior alternative to fixed-key encryption; it improves reliability, reduces the risk of insider threat, and removes the issues surrounding key management. However, the inherent security of such...
High-performance parallel storage systems, such as those used by supercomputers and data centers, can suffer from performance degradation when a large number of clients are contending for limited resources, like bandwidth. These contentions lower the efficiency of the system and cause unwanted speed variances. We present the Automatic Storage Contention Alleviation and Reduction system (ASCAR), a...
Growth in disk capacity continues to outpace advances in read speed and device reliability. This has led to storage systems spending increasing amounts of time in a degraded state while failed disks reconstruct. Users and applications that do not use the data on the failed or degraded drives are negligibly impacted by the failure, increasing the perceived performance of the system. We leverage this...
For three decades, Kryder's law correctly predicted an exponential increase in bit density on disk platters, leading to an exponential drop in cost per gigabyte, and thus to an entrenched expectation that if data could be stored for a few years the incremental cost of storing it forever would be minimal. However, disk now is over 7 times as expensive as Kryder's law would have predicted, and industry...
Most encryption used today obfuscates data behind a secret key or a problem believed to be computationally complex. One can fundamentally think of it as delayed release for a determined adversary. This approach is not well suited for long-term archival of sensitive data. Additionally, issues such as key rotation, and lost or exposed keys, make keeping such archives up to date very difficult. As a...
Metadata snapshots are a common method for gaining insight into file systems due to their small size and relative ease of acquisition. Since they are static, most researchers have used them for relatively simple analyses such as file size distributions and age of files. We hypothesize that it is possible to gain much richer insights into file system and user behavior by clustering features in metadata...
There is a large body of work-such as system administration and intrusion detection-that relies upon storage system logs and snapshots. These solutions rely on accurate system records, however, little effort has been made to verify the correctness of logging instrumentation and log reliability. We present a solution, called ExDiff, that uses expectation differencing to validate storage system logs...
This paper explores the benefits and limitations of in-storage processing on current Solid-State Disk (SSD) architectures. While disk-based in-storage processing has not been widely adopted, due to the characteristics of hard disks, modern SSDs provide high performance on concurrent random writes, and have powerful processors, memory, and multiple I/O channels to flash memory, enabling in-storage...
Deduplicating in-line data on primary storage is hampered by the disk bottleneck problem, an issue which results from the need to keep an index mapping portions of data to hash values in memory in order to detect duplicate data without paying the performance penalty of disk paging. The index size is proportional to the volume of unique data, so placing the entire index into RAM is not cost effective...
Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of filelevel activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific...
Tracking archival usage and data migration in a long term supercomputing system is critical to understanding not only how users' needs and habits have changed over time, but also how the archive itself evolves in response to these external factors. Yet this type of study has not previously been performed. To address this need, we conducted an in-depth comparison of user initiated file activity on...
Shingled Magnetic Recording technology is expected to play a major role in the next generation of hard disk drives. But it introduces some unique challenges to system software researchers and prototype hardware is not readily available for the broader research community. It is crucial to work on system software in parallel to hardware manufacturing, to ensure successful and effective adoption of this...
The ever-growing amount of data requires highly scalable storage solutions. The most flexible approach is to use storage pools that can be expanded and scaled down by adding or removing storage devices. To make this approach usable, it is necessary to provide a solution to locate data items in such a dynamic environment. This paper presents and evaluates the Random Slicing strategy, which incorporates...
Storage Class Memory (SCM) has become increasingly popular in enterprise systems as well as embedded and mobile systems. However, replacing hard drives with SCMs in current storage systems often forces either major changes in file systems or suboptimal performance, because the current block-based interface does not deliver enough information to the device to allow it to optimize data management for...
Ultimately the performance and success of a shingled write disk (SWD) will be determined by more than the physical hardware realized, but will depend on the data layouts employed, the workloads experienced, and the architecture of the overall system, including the level of interfaces provided by the devices to higher levels of system software. While we discuss several alternative layouts for use with...
Flash memory has become increasingly popular in today's storage systems. However, replacing hard drives with flash memory in current systems often either requires major file system changes or causes performance degradation due to the limitations of block-based interface and out-of-place updates required by flash. To alleviate this problem, we propose an object-based model for flash memory that gives...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.