TR-09-2.pdf

"Understanding intrinsic characteristics and system implications of
flash memory based solid state drives"


Feng Chen, David Koufaty, and Xiaodong Zhang

Proceedings of the 2009 ACM SIGMETRICS Conference on Measurement and Modeling
of Computer Systems (SIGMETRICS/Performance 2009), Seattle, WA, June 15-19,
2009.


Abstract

The flash memory based solid state drive (SSD) has been called a "pivotal
technology" that could revolutionize data storage systems. Since the SSD shares
a common interface with the traditional hard disk drive (HDD), both
physically and logically, effective integration of SSDs into the storage
hierarchy is very important. However, details of SSD hardware implementations
tend to be hidden behind such narrow interfaces. In fact, since
sophisticated algorithms are usually, of necessity, adopted in SSD controller
firmware, more complex performance dynamics are to be expected in SSDs than
in HDD systems. Most existing literature and product specifications on
SSDs provide only high-level descriptions and standard performance data,
such as bandwidth and latency.

In order to gain insights into the unique performance characteristics of SSD, 
we have conducted intensive experiments and measurements on different types 
of state-of-the-art SSDs, from low-end to high-end products. We have 
observed several unexpected performance issues and uncertain behavior of 
SSDs, which have not been reported in the literature. For example, we found
that fragmentation could degrade performance by a factor of over
14 on a recently announced SSD. Moreover, contrary to the common
belief that SSD performance is uncorrelated with access patterns, we
found a strong correlation between performance and the randomness of data
accesses, for both reads and writes. In the worst case, average latency
could increase by a factor of 89 and bandwidth could drop to only 0.025 MB/sec.
Our study reveals several unanticipated aspects in the performance dynamics 
of SSD technology that must be addressed by system designers and 
data-intensive application users in order to effectively place it in the 
storage hierarchy.