Random Read/Write Speed

The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.

Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo randomly generated data for each write as well as fully random data to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.

Desktop Iometer - 4KB Random Read (4K Aligned)

Low queue depth random read performance sees a significant regression compared to the Vertex 4. OCZ derives the Vector's specs at a queue depth of 32, at which it'll push 373MB/s of 4KB random reads. As Intel has established in the past, low queue depth random read performance of around 40 - 50MB/s is sufficient for most client workloads as we'll soon see in our trace based storage bench suite.

Desktop Iometer - 4KB Random Write (4K Aligned) - 8GB LBA Space

Low queue depth random write performance is a very different story, here the Vector pretty much equals the Vertex 4's already excellent score.

Many of you have asked for random write performance at higher queue depths. What I have below is our 4KB random write test performed at a queue depth of 32 instead of 3. While the vast majority of desktop usage models experience queue depths of 0 - 5, higher depths are possible in heavy I/O (and multi-user) workloads:

Desktop Iometer - 4KB Random Write (8GB LBA Space QD=32)

Crank up the queue depth and the Vector does well, but Samsung's SSD 840 Pro manages a nearly 10% performance advantage here.

Steady State 4KB Random Write Performance

OCZ will surely derive enterprise versions of the Vector and its Barefoot 3 controller, but I was curious to see what steady state 4KB random write performance looked like on the drive. I grabbed some of our Enterprise Iometer results from the S3700 review and trimmed out the non-SATA drives. The results are hugely improved compared to the Vertex 4:

Enterprise Iometer - 4KB Random Write

Keep in mind this isn't an enterprise drive, and thus it's not too surprising to see significantly higher numbers here from other enterprise drives but the improvement over the Vertex 4 is substantial. Note that Samsung's SSD 840 Pro lands somewhere in between the Vector and Vertex 4.

Introduction & The Drive Sequential IO Performance
POST A COMMENT

151 Comments

View All Comments

  • dj christian - Thursday, November 29, 2012 - link

    What is SZ80/100 in the graphs, what do they stand for? Reply
  • Anand Lal Shimpi - Wednesday, November 28, 2012 - link

    You are correct, I ran a 100% span of the 4KB/QD32 random write test. The right way to do this test is actually to gather all IO latency data until you hit steady state, which you can usually do on most consumer drives after just a couple of hours of testing. The problem is the resulting dataset ends up being a pain to process and present.

    There is definitely a correlation between spare area and IO consistency, particularly on drives that delay their defragmentation routines quite a bit. If you look at the Intel SSD 710 results you'll notice that despite having much more spare area than the S3700, consistency is clearly worse.

    As your results show though, for an emptier drive IO consistency isn't as big of a problem (although if you continued to write to it you'd eventually see the same issues as all of that spare area would get used up). I think there's definitely value in looking at exactly what you're presenting here. The interesting aspect to me is this tells us quite a bit about how well drives make use of empty LBA ranges.

    I tend to focus on the worst case here simply because that ends up being what people notice the most. Given that consumers are often forced into a smaller capacity drive than they'd like, I'd love to encourage manufacturers to pursue architectures that can deliver consistent IO even with limited spare area available.

    Take care,
    Anand
    Reply
  • jwilliams4200 - Wednesday, November 28, 2012 - link

    Anand wrote:
    "As your results show though, for an emptier drive IO consistency isn't as big of a problem (although if you continued to write to it you'd eventually see the same issues as all of that spare area would get used up)."

    Actually, all of my tests did use up all the spare area, and had reached steady state during the graph shown. Perhaps you have misunderstood how I did my tests. I just overprovisioned it so that it had almost as much spare area as the Intel S3700. Otherwise, I was doing the same thing as you did in your tests.

    The conclusion to be drawn is that the Intel S3700 is not all that special. You can approach the same performance as the S3700 with a consumer SSD, at least with a Samsung 840 Pro, just by overprovisioning enough.

    Look at this one again:

    http://i.imgur.com/Vvo1H.png

    It reaches steady state somewhere between 80 and 120GB. The spare area is used up at about 62GB and the speed drops precipitously, but then there is a span where the speed actually increases slightly, and then levels out somewhere around 80-120GB.

    Note that steady state is about 110MB/sec. That is about 28K IOPS. Not as good as the Intel S3700, but certainly approaching it.
    Reply
  • Ictus - Wednesday, November 28, 2012 - link

    Hey J, thanks for taking the time to reply to me in the other comment.
    I think my question is even more noobish than you have assumed.

    "I just overprovisioned it so that it had almost as much spare area as the Intel S3700. Otherwise, I was doing the same thing as you did in your tests."

    I am confused because I thought the only way to "over-provision" was to create a partition that didn't use all the available space??? If you are merely writing raw data up to the 80% full level, what exactly does over provisioning mean? Does the term "over provisioning" just mean you didn't fill the entire drive, or you did something to the drive?
    Reply
  • jwilliams4200 - Wednesday, November 28, 2012 - link

    No, overprovisioning generally just means that you avoid writing to a certain range of LBAs (aka sectors) on the SSD. Certainly one way to do that is to create a partition smaller than the capacity of the SSD. But that is completely equivalent to writing to the raw device but NOT writing to a certain range of LBAs. The key is that if you don't write to certain LBAs, however that is accomplished, then the SSD's flash translation table (FTL) will not have any mapping for those LBAs, and some or all SSDs will be smart enough to use those unmapped-LBAs as spare area to improve performance and wear-leveling.

    So no, I did not "do something to the drive". All I did was make sure that fio did not write to any LBAs past the 80% mark.
    Reply
  • gattacaDNA - Sunday, December 02, 2012 - link

    "The conclusion to be drawn is that the Intel S3700 is not all that special. You can approach the same performance as the S3700 with a consumer SSD, at least with a Samsung 840 Pro, just by overprovisioning enough."

    WOW - this is an interesting discussion which concludes that by simply over-provisioning a consumer SSD by 20-30% those units can approach the vetted S3700! I had to re-read those posts 2x to be sure I read that correctly.

    It seems some later posts state that if the workload is not sustained (drive can recover) and the drive is not full, that the OP has little to no benefit.

    So is an best bang really just not fill the drives past 75% of the available area and call it a day?
    Reply
  • jwilliams4200 - Sunday, December 02, 2012 - link

    The conclusion I draw from the data is that if you have a Samsung 840 Pro (or similar SSD, I believe several consumer SSDs behave similarly with respect to OP), and the big one -- IF you have a very heavy, continuous write workload, then you can achieve large improvements in throughput and huge improvements in maximum latency if you overprovision at 80% (i.e., leave 20% unwritten or unpartitioned)

    Note that such OP is not needed for most desktop users, for two reasons. First, most desktop users will not fill the drive 100% and as long as they have TRIM working, and if the drive is only filled to 80% (even if the filesystem covers all 100%), then it should behave as if it were actually overprovisioned at 80%. Second, most desktop users do not continuously write tens of Gigabytes of data without pause.
    Reply
  • gattacaDNA - Sunday, December 02, 2012 - link

    Thank You. That's what my take-away is as well. Reply
  • jwilliams4200 - Wednesday, November 28, 2012 - link

    By the way, I am not sure why you say the data sets are "a pain to process and present". I have written some test scripts to take the data automatically and to produce the graphs automatically. I just hot-swap the SSD in, run the script, and then come back when it is done to look at the graphs.

    Also, the best way to present latency data is in a cumulative distribution function (CDF) plot with a normal probability scale on the y-axis, like this:

    http://i.imgur.com/RcWmn.png

    http://i.imgur.com/arAwR.png

    One other tip is that it does not take hours to reach steady state if you use a random map. This means that you do a random write to all the LBAs, but instead of sampling with replacement, you keep a map of the LBAs you have already written to and don't randomly select the same ones again. In other words, write each 4K-aligned LBA on a tile, put all the tiles in a bag, and randomly draw the tiles out but do not put the drawn tile back in before you select the next tile. I use the 'fio' program to do this. With an SSD like the Samsung 840 Pro (or any SSD than can do 300+ MB/s 4KQD32 random writes), you only have to write a little more than the capacity of the SSD (eg., 256GB + 7% of 256GB) to reach steady state. This can be done in 10 or 20 minutes on fast SSDs.
    Reply
  • Brahmzy - Wednesday, November 28, 2012 - link

    I consistently over-provision every single SSD I use by at least 20%. I have had stellar performance doing this with 50-60+ SSDs over the years.

    I do this on friend's/family's builds and tell anybody I know to do this with theirs. So, with my tiny sample here, OP'ing SSDs is a big deal, and it works. I know many others do this as well. I base my purchase decisions with OP in mind. If I need 60GB of space, I'll buy a 120GB. If I need 120GB of usable space, I'll buy a 250GB drive, etc.

    I think it would be valuable addition to the Anand suite of tests to account for this option that many of us use. Maybe a 90% OP write test and maybe an 80% OP write test. Assuming there's a constitent difference between the two.
    Reply

Log in

Don't have an account? Sign up now