TR-11-2.pdf

"Essential roles of exploiting internal parallelism of flash memory based
solid state drives in high-speed data processing",


Feng Chen, Rubao Lee, and Xiaodong Zhang 

Proceedings of the 17th International Symposium on High Performance
Computer Architecture (HPCA-17), San Antonio, Texas, February 12-16, 2011.

Abstract

Flash memory based solid state drives (SSDs) have
shown great potential to change storage infrastructure
fundamentally through their high performance and low
power consumption. Most recent studies have focused on addressing
the technical limitations caused by special requirements
for writes in flash memory. However, a unique merit of an
SSD is its rich internal parallelism, which allows us to offset
most of the performance loss related to these technical
limitations by significantly increasing data processing
throughput.

In this work we present a comprehensive study of the essential
roles of internal parallelism of SSDs in high-speed data
processing. Besides substantially improving I/O bandwidth
(e.g., by 7.2x), we show that by exploiting internal parallelism,
SSD performance is no longer highly sensitive to access
patterns, but rather to other factors, such as data access
interference and physical data layout. Specifically, through
extensive experiments and thorough analysis, we obtain the
following new findings in the context of concurrent data
processing in SSDs. (1) Write performance is largely independent
of access patterns (regardless of being sequential
or random), and writes can even outperform reads, which is
contrary to the long-standing common understanding of
writes on SSDs. (2) One performance concern comes from
interference between concurrent reads and writes, which
causes substantial performance degradation. (3) Parallel
I/O performance is sensitive to physical data-layout mapping,
which is largely not observed without parallelism. (4)
Existing application designs optimized for magnetic disks
can be suboptimal for running on SSDs with parallelism.
Our study is further supported by a group of case studies
in database systems as typical data-intensive applications.
With these critical findings, we give a set of recommendations
to application designers and system architects for
exploiting internal parallelism and maximizing the performance
potential of SSDs.
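
As a rough illustration of what issuing concurrent I/O requests can look like at the application level, the following Python sketch reads several blocks of a file from a pool of threads. It is not taken from the paper: the names `BLOCK_SIZE`, `read_block`, and `parallel_read`, the 4 KB request size, and the use of a thread pool are all assumptions made for this example. It relies on `os.pread`, which is available on Unix-like systems only.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096  # assumed request size for this sketch, not a value from the paper


def read_block(fd, block_index):
    """Read one block at the given block index with a positional read."""
    return os.pread(fd, BLOCK_SIZE, block_index * BLOCK_SIZE)


def parallel_read(path, block_indices, workers):
    """Issue the reads from up to `workers` threads; return blocks in request order."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves the order of block_indices in its results,
            # even though the individual reads may complete out of order.
            return list(pool.map(lambda i: read_block(fd, i), block_indices))
    finally:
        os.close(fd)
```

Because `os.pread` takes an explicit offset, the threads can share one file descriptor without coordinating a file position. Note that reads served from the operating system's page cache never reach the device, so this sketch only shows the request pattern; measuring device-level parallelism would additionally require bypassing the cache and timing the requests.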