TR-07-6.pdf

"DiskSeen: exploiting disk layout and access history to enhance I/O prefetch"
 
Xiaoning Ding, Song Jiang, Feng Chen, Kei Davis, and Xiaodong Zhang

Proceedings of the 2007 USENIX Annual Technical Conference (USENIX'07),
Santa Clara, California, June 17-22, 2007. 

Abstract

Current disk prefetch policies in major operating systems track access 
patterns at the level of the file abstraction. While this is useful for 
exploiting application-level access patterns, file-level prefetching cannot 
realize the full performance improvements achievable by prefetching. 
There are two reasons for this. First, certain prefetch opportunities can 
only be detected by knowing the data layout on disk, such as the contiguous 
layout of file metadata or data from multiple files. Second, non-sequential 
access of disk data (requiring disk head movement) is much slower than 
sequential access, and the penalty for mis-prefetching a "random" block is
correspondingly higher than that for mis-prefetching a sequential block.

To overcome the inherent limitations of prefetching at the logical file level,
we propose to perform prefetching directly at the level of disk layout, and 
in a portable way. Our technique, called DiskSeen, is intended to be 
supplementary to, and to work synergistically with, file-level prefetch 
policies, if present. DiskSeen tracks the locations and access times of disk 
blocks, and based on analysis of their temporal and spatial relationships, 
seeks to improve the sequentiality of disk accesses and overall prefetching 
performance.
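
The idea of tracking block locations and access times can be illustrated with
a minimal sketch. This is not the DiskSeen implementation, which lives in the
Linux kernel; it is a toy model under stated assumptions. The class name
BlockAccessTracker, the parameters window, max_gap, and depth, and the rule
"several recent accesses with small forward steps form a sequence" are all
hypothetical choices made for illustration only.

```python
from collections import deque


class BlockAccessTracker:
    """Toy model: remember which disk blocks were accessed and when.

    All names and thresholds here are illustrative assumptions,
    not taken from the DiskSeen paper or the Linux kernel.
    """

    def __init__(self, window=4, max_gap=2, depth=4):
        self.window = window    # number of recent accesses examined
        self.max_gap = max_gap  # largest forward step still treated as sequential
        self.depth = depth      # number of blocks suggested per prefetch
        self.clock = 0          # logical time, incremented on every access
        self.last_access = {}   # block number -> logical time of latest access
        self.recent = deque(maxlen=window)

    def record(self, lbn):
        """Record an access to logical block number lbn.

        Returns a list of block numbers suggested for prefetching
        (empty if no sequence is detected).
        """
        self.clock += 1
        self.last_access[lbn] = self.clock
        self.recent.append(lbn)
        return self._suggest()

    def _suggest(self):
        if len(self.recent) < self.window:
            return []
        blocks = list(self.recent)
        steps = [b - a for a, b in zip(blocks, blocks[1:])]
        # Spatial test: each access moved forward on disk by a small step.
        # The deque order supplies the temporal ordering of those accesses.
        if all(0 < s <= self.max_gap for s in steps):
            start = blocks[-1] + 1
            # Skip blocks already seen, a stand-in for "already cached".
            return [b for b in range(start, start + self.depth)
                    if b not in self.last_access]
        return []


tracker = BlockAccessTracker()
suggestions = [tracker.record(lbn) for lbn in (100, 101, 102, 103, 500)]
```

Because the tracker sees only block numbers, a run such as 100 to 103 is
detected even if those blocks belong to different files or to file metadata,
which is the kind of opportunity a purely file-level policy cannot observe.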

Our implementation of the DiskSeen scheme in the Linux 2.6 kernel shows that 
it can significantly improve the effectiveness of prefetching, reducing 
execution times by 20%-53% for micro-benchmarks and real applications such 
as grep, CVS, and TPC-H.