TR-05-14.pdf

"DULO: an effective buffer cache management scheme to exploit both
temporal and spatial localities"

Song Jiang, Xiaoning Ding, Feng Chen, Enhua Tan, and Xiaodong Zhang

Proceedings of the 4th USENIX Conference on File and Storage Technologies
(FAST'05), San Francisco, CA, December 14-16, 2005.


Abstract

A hard disk drive has non-uniform access performance that is heavily
affected by the locations of the requested blocks and of the disk head
during accesses. Sequentiality of requested blocks on disk, or their
spatial locality, is critical to disk performance: accesses to
sequentially placed disk blocks can have an order of magnitude higher
bandwidth than accesses to randomly placed blocks.
While I/O scheduling and prefetching can effectively exploit spatial
locality and dramatically improve disk bandwidth for workloads with
dominant sequential accesses, their ability to deal with workloads that
mix sequential and random data accesses, such as those in Web services,
databases, and scientific computing applications, is very limited.
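
The bandwidth gap between sequential and random accesses can be
illustrated with a toy cost model. This is only a sketch: the function
name access_time_ms and both parameter values (8 ms per head
repositioning, 0.08 ms to transfer one block) are assumptions chosen for
illustration, not measurements from the paper.

```python
def access_time_ms(block_numbers, positioning_ms=8.0, transfer_ms_per_block=0.08):
    """Estimate the service time of a request stream on a toy disk.

    Assumed model (not from the paper): a positioning delay is paid
    whenever a block does not directly follow the previous one, and every
    block pays a fixed transfer time.
    """
    total = 0.0
    prev = None
    for block in block_numbers:
        if prev is None or block != prev + 1:
            total += positioning_ms      # seek + rotational delay (assumed)
        total += transfer_ms_per_block   # data transfer for one block (assumed)
        prev = block
    return total


sequential = list(range(1000))                # 1000 adjacent blocks
scattered = [i * 1000 for i in range(1000)]   # 1000 blocks far apart

speedup = access_time_ms(scattered) / access_time_ms(sequential)
```

Under these assumed numbers, the sequential stream costs roughly 88 ms
and the scattered stream roughly 8 s, so the same number of blocks is
served well over ten times faster when they are placed sequentially.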

Since spatial locality of cached blocks is largely ignored and only
temporal locality is considered in system buffer cache management, disk
performance for workloads without dominant sequential accesses can be
seriously degraded. To address this problem, we propose a scheme called
DULO (DUal LOcality), which exploits both temporal and spatial localities
in buffer cache management.  Leveraging the filtering effect of the buffer
cache, DULO can influence the I/O request stream by making the requests
passed to disk more sequential, significantly increasing the effectiveness
of I/O scheduling and prefetching for disk performance improvements.
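
The idea of weighing spatial locality during replacement can be sketched
as follows. This is a simplified illustration, not the DULO algorithm
itself: the names contiguous_runs and choose_victims, the window
parameter, and the rule "evict the longest contiguous runs first" are
assumptions made for this example. The intuition is that blocks in a
long on-disk sequence are cheap to fetch again, while isolated blocks
are expensive, so the isolated blocks are the ones worth keeping cached.

```python
from typing import List


def contiguous_runs(blocks: List[int]) -> List[List[int]]:
    """Group unique block numbers into runs of consecutive disk addresses."""
    runs: List[List[int]] = []
    for block in sorted(blocks):
        if runs and block == runs[-1][-1] + 1:
            runs[-1].append(block)
        else:
            runs.append([block])
    return runs


def choose_victims(lru_order: List[int], window: int, count: int) -> List[int]:
    """Pick up to `count` victims among the `window` least recently used blocks.

    `lru_order` lists cached block numbers from least to most recently
    used. Temporal locality limits the candidates to the cold end of the
    list; spatial locality then orders them, longest contiguous run first.
    """
    candidates = lru_order[:window]
    runs = contiguous_runs(candidates)
    runs.sort(key=len, reverse=True)  # stable sort: ties stay in address order
    victims: List[int] = []
    for run in runs:
        for block in run:
            if len(victims) == count:
                return victims
            victims.append(block)
    return victims


# Hypothetical cache contents, coldest block first.
cache = [7, 100, 101, 42, 102, 103, 900, 5]
evicted = choose_victims(cache, window=7, count=4)
```

In this example a plain LRU policy would evict blocks 7, 100, 101, and
42, leaving two scattered blocks to be refetched later. The sketch
instead evicts 100 through 103, which a single sequential disk request
can bring back.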

DULO has been extensively evaluated through both trace-driven simulations
and an implementation in the most recent Linux kernel, 2.6.11. In
simulations and system measurements, various application workloads have
been tested, including Web server workloads, TPC benchmarks, and
scientific application programs. Our experiments show that DULO can
significantly increase system throughput and reduce program execution
times.