The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
The following topics are dealt with: solid-state devices; data security; data reliability; file systems; quality of service; disk recording; memory recording; and data access.
We take an overview of the CERN Advanced Storage (CASTOR) version 2 system and its usage at CERN while serving the High Energy Physics community. We further explore some of the observations made between 2005 and 2010 while managing this multi-petabyte distributed storage system.
A challenging issue in performance evaluation of parallel storage systems through trace-driven simulation is to accurately characterize and emulate I/O behaviors in real applications. The correlation study of inter-arrival times between I/O requests, with an emphasis on I/O-intensive scientific applications, shows the necessity to further study the self-similarity of parallel I/O arrivals. This paper...
The block erase requirement in NAND Flash devices leads to the need for garbage collection. Garbage collection results in write amplification, that is, to an increase in the number of physical page programming operations. Write amplification adversely impacts the limited lifetime of a NAND Flash device, and can add significant system overhead unless a large spare factor is maintained. This paper proposes...
Continuous data stream processing systems have offered limited support for data persistence in the past, for three main reasons: First, online, real-time queries examine current streaming data and (under the assumption of no server failures) do not require access to past data; second, stable storage devices are commonly thought to be constraining system throughput and response times when compared...
Storage is increasingly becoming a vector for data compromise. Solutions for protecting on-disk data confidentiality and integrity to date have been limited in their effectiveness. Providing authenticated encryption, or simultaneous encryption with integrity information, is important to protect data at rest. In this paper, we propose that disks augmented with non-volatile storage (e.g., hybrid hard...
The NAND flash memory based storage has faster read, higher power savings, and lower cooling cost compared to the conventional rotating magnetic disk drive. However, in case of flash memory, read and write operations are not symmetric. Write operations are much slower than read operations. Moreover, frequent update operations reduce the lifetime of the flash memory. Due to the faster read performance,...
Solid State Drives (SSDs) are now becoming a part of main stream computers. Even though disk scheduling algorithms and file systems of today have been optimized to exploit the characteristics of hard drives, relatively little attention has been paid to model and exploit the characteristics of SSDs. In this paper, we consider the use of SSDs from the file system standpoint. To do so, we derive a performance...
The I/O performances of flash memory solid-state disks (SSDs) are increasing by exploiting parallel I/O architectures. However, the reliability problem is a critical issue in building a large-scale flash storage. We propose a novel Redundant Arrays of Inexpensive Disks (RAID) architecture which uses the delayed parity update and partial parity caching techniques for reliable and high-performance flash...
Solid State Drives (SSD's) have shown promise to be a candidate to replace traditional hard disk drives, but due to certain physical characteristics of NAND flash, there are some challenging areas of improvement and further research. We focus on the layout and management of the small amount of RAM that serves as a cache between the SSD and the system that uses it. Of the techniques that have previously...
This paper presents our replacement algorithm named RED for storage caches. RED is exclusive. It can eliminate the duplications between a storage cache and its client cache. RED is high performance. A new criterion Resident Distance is proposed for making an efficient replacement decision instead of Recency and Frequency. Moreover, RED is non-intrusive to a storage client. It does not need to change...
Data deduplication systems discover and remove redundancies between data blocks. The search for redundant data blocks is often based on hashing the content of a block and comparing the resulting hash value with already stored entries inside an index. The limited random IO performance of hard disks limits the overall throughput of such systems, if the index does not fit into main memory. This paper...
The significant IO improvements of Solid State Disks (SSD) over traditional rotational hard disks makes it an attractive approach to integrate SSDs in tiered storage systems for performance enhancement. However, to integrate SSD into multi-tiered storage system effectively, automated data migration between SSD and HDD plays a critical role. In many real world application scenarios like banking and...
While there are many financial and practical reasons to prefer tape storage over disk for various applications, the difficultly of using tape in a general way is a major inhibitor to its wider usage. We present a file system that takes advantage of a new generation of tape hardware to provide efficient access to tape using standard, familiar system tools and interfaces. The Linear Tape File System...
As disk volume grows rapidly with terabyte disk becoming a norm, RAID reconstruction time in case of a failure takes prohibitively long time. This paper presents a new RAID architecture, S2-RAID, allowing the disk array to reconstruct very quickly in case of a disk failure. The idea is to form skewed sub RAIDs (S2-RAID) in the RAID structure so that reconstruction can be done in parallel dramatically...
Reduction of disk drive power consumption is a challenging task, particularly since the most prevalent way of achieving it, powering down idle disks, has many undesirable side-effects. Some hard disk drives support acoustic modes, meaning they can be configured to reduce the acceleration and velocity of the disk head. This reduces instantaneous power consumption but sacrifices performance. As a result,...
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at...
Power consumption is an increasingly impressing concern for data servers as it directly affects running costs and system reliability. Prior studies have shown most memory space on data servers are used for buffer caching and thus cache replacement becomes critical. Temporally concentrating memory accesses to a smaller set of memory chips increases the chances of free riding through DMA overlapping...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.