Inside the Drive

I already went over the S3700's underlying architecture, including the shift from a B-tree indirection table to a direct-mapped flat indirection table, which helped enable this increase in performance consistency. I'll point you at that article for more details on what's going on underneath the hood. For high-level drive details, the excerpts below should give you most of what you need.

The S3700 comes in four capacities (100, 200, 400 and 800GB) and two form factors (2.5" and 1.8"). The 1.8" version is only available at 200GB and 400GB capacities. Intel sees market potential for a 1.8" enterprise SSD thanks to the increasing popularity of blade and micro servers. The new controller supports 8 NAND channels, down from 10 in the previous design as Intel had difficulty hitting customer requested capacity points at the highest performance while populating all 10 channels. 6Gbps SATA and AES-256 are both supported by the new controller.

The S3700's chassis is 7.5mm thick and held together by four screws. The PCB isn't screwed into the chassis; instead, Intel uses three plastic spacers to keep the board in place once the drive is put together. Along one edge of the drive Intel uses two 35V 47µF capacitors, enough to allow the controller to commit any data (and most non-data) to NAND in the event of a power failure. The capacitors in the S3700 are periodically tested by the controller. In the event that they fail, the controller disables all write buffering and throws a SMART error flag. Intel moved away from surface mount capacitors with the S3700 to reduce PCB real estate, which as you can see is very limited on the 2.5" drive. The S3700 supports operation on either 12V, 5V or both power rails - a first for Intel. Power consumption is rated at up to 6W under active load (peak power consumption can hit 8.2W), which is quite high and will keep the S3700 from being a good fit for a notebook.

The S3700 is a replacement for the Intel SSD 710 (the 710 will be EOLed sometime next year), and thus uses Intel's 25nm HET-MLC (High Endurance Technology) NAND. The S3700 is rated for a full 10 drive writes per day (4KB random writes) for 5 years.

Intel SSD DC S3700 Endurance (4KB Random Writes, 100% LBA)
                | 100GB          | 200GB          | 400GB          | 800GB
Rated Endurance | 10DW x 5 years | 10DW x 5 years | 10DW x 5 years | 10DW x 5 years
Endurance in PB | 1.825 PB       | 3.65 PB        | 7.3 PB         | 14.6 PB
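
The petabyte figures follow directly from the rating. Here's a minimal sketch of the arithmetic, assuming decimal units (1 PB = 1,000,000 GB) and 365-day years; the function name is mine, not Intel's:

```python
def rated_endurance_pb(capacity_gb, drive_writes_per_day=10, years=5):
    """Total rated writes in PB, assuming 1 PB = 1,000,000 GB and 365-day years."""
    days = 365 * years
    return capacity_gb * drive_writes_per_day * days / 1_000_000


for capacity_gb in (100, 200, 400, 800):
    print(f"{capacity_gb}GB: {rated_endurance_pb(capacity_gb):.3f} PB")
```

Doubling the capacity doubles the rated endurance, since the rating is expressed in full drive writes rather than an absolute amount of data.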

That's the worst-case endurance for the drive. If your workload isn't purely random, you can expect even more writes out of the S3700 (around 2x that for a sequential workload). Intel sent us a 200GB sample, which comes equipped with 264GB of 25nm Intel HET-MLC NAND. Formatted capacity of the drive is 186GB in Windows, giving you a total of 78GB (78GiB technically) of spare area for wear leveling, block recycling, redundancy and bad block replacement. Note that the percentage of overprovisioning on the S3700 is tangibly less than on the 710:

Intel HET-MLC SSD Overprovisioning Comparison
                   | Advertised Capacity | Total NAND On-Board | Formatted Capacity in Windows | MSRP
Intel SSD 710      | 200GB               | 320GB               | 186GB                         | $800
Intel SSD DC S3700 | 200GB               | 264GB               | 186GB                         | $470
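
To put a number on that difference, here's a quick sketch that works out the spare area for both drives. It assumes overprovisioning is expressed as spare area relative to formatted capacity, which is one common convention and not necessarily the one Intel uses; the function name is hypothetical:

```python
def spare_area(total_nand_gib, formatted_gib):
    """Return (spare GiB, spare as a percentage of formatted capacity)."""
    spare = total_nand_gib - formatted_gib
    return spare, 100 * spare / formatted_gib


# Figures from the table above: (total NAND, formatted capacity in Windows)
for name, total, formatted in (("Intel SSD 710", 320, 186),
                               ("Intel SSD DC S3700", 264, 186)):
    spare, percent = spare_area(total, formatted)
    print(f"{name}: {spare}GiB spare ({percent:.1f}% of formatted capacity)")
```

By this measure the 710 carries roughly 72% spare area against roughly 42% on the S3700.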

Intel is able to guarantee longer endurance on the S3700 than on the 710, with less spare area and the same 25nm HET-MLC NAND technology. The key difference here is the maturity of the process and firmware/controller. Both have improved to the point where Intel is able to do more with less.


Because of the odd amount of NAND on board, there are 14 x 16GB, 1 x 32GB and 1 x 8GB NAND packages on this 200GB PCB. Each package uses between one and four 8GB NAND die. Note that this is Intel's first SSD to use BGA mounted NAND devices. The controller itself is also BGA mounted, and the underfill from previous generations is gone. The 8-channel controller is paired with 256MB of DDR3-1333 DRAM (the second pad is for a second DRAM used for the 800GB drive to reach 1GB of total DRAM capacity). Intel does error correction on all memories (NAND, SRAM and DRAM) in the S3700.

Pricing is much more reasonable than the Intel SSD 710. While the 710 debuted at around $6.30/GB, the Intel SSD DC S3700 is priced at $2.35/GB. It's still more expensive than a consumer drive, but the S3700 launches at the most affordable cost per GB of any Intel enterprise SSD. A non-HET version would likely be well into affordable territory for high-end desktop users. The S3700 is sampling to customers now; however, widespread availability won't be here until the end of the year or the beginning of Q1 2013.

Intel SSD DC S3700 Pricing (MSRP)
      | 100GB | 200GB | 400GB | 800GB
Price | $235  | $470  | $940  | $1880

The S3700's performance is much greater than that of any previous generation Intel enterprise SATA SSD:

Enterprise SSD Comparison
                                          | Intel SSD DC S3700      | Intel SSD 710      | Intel X25-E       | Intel SSD 320
Capacities                                | 100 / 200 / 400 / 800GB | 100 / 200 / 300GB  | 32 / 64GB         | 80 / 120 / 160 / 300 / 600GB
NAND                                      | 25nm HET MLC            | 25nm HET MLC       | 50nm SLC          | 25nm MLC
Max Sequential Performance (Reads/Writes) | 500 / 460 MBps          | 270 / 210 MBps     | 250 / 170 MBps    | 270 / 220 MBps
Max Random Performance (Reads/Writes)     | 76K / 36K IOPS          | 38.5K / 2.7K IOPS  | 35K / 3.3K IOPS   | 39.5K / 600 IOPS
Endurance (Max Data Written)              | 1.83 - 14.6PB           | 500TB - 1.5PB      | 1 - 2PB           | 5 - 60TB
Encryption                                | AES-256                 | AES-128            | -                 | AES-128
Power Safe Write Cache                    | Y                       | Y                  | N                 | Y

Intel is also promising performance consistency with its S3700. At steady state Intel claims the S3700 won't vary its IOPS by more than 10 - 15% for the life of the drive. Most capacities won't see more than a 10% variance in IO latency (or performance) at steady state. Intel has never offered this sort of a guarantee before because its drives would vary quite a bit in terms of IO latency.

Intel also claims to be able to service 99.9% of all 4KB random IOs (QD1) in less than 500µs.
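
If you log per-IO latencies from your own testing, checking a claim like this is straightforward. The sketch below is only an illustration: the function name is hypothetical and the sample data is made up, not measured from an S3700.

```python
def fraction_within(latencies_us, limit_us=500):
    """Fraction of IOs that completed in strictly less than limit_us microseconds."""
    within = sum(1 for latency in latencies_us if latency < limit_us)
    return within / len(latencies_us)


# Made-up samples: 9,995 fast IOs and 5 slow outliers (not real S3700 data).
samples_us = [120] * 9995 + [900] * 5
share = fraction_within(samples_us)
print(f"{share:.2%} of IOs under 500µs, meets 99.9% target: {share >= 0.999}")
```

The point of a percentile target, as opposed to an average, is that a handful of very slow IOs can't hide behind a large number of fast ones.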

The lower latency operation is possible through the use of a flat indirection table (and some really well done firmware) in the new controller. What happens now is there's a giant array with each location in the array mapped to a specific portion of NAND. The array isn't dynamically created and, since it's a 1:1 mapping, searches, inserts and updates are all very fast. The other benefit of being 1:1 mapped with physical NAND is that there's no need to defragment the table, which immediately cuts down the amount of work the controller has to do. Drives based on this new controller only have to keep the NAND defragmented (the old controller needed to defragment both logical and physical space).
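
To make the idea concrete, here's a minimal sketch of a flat, 1:1 indirection table. This is not Intel's firmware; the class name, the sentinel value and the page-level granularity are all assumptions for illustration:

```python
from array import array

UNMAPPED = 0xFFFFFFFF  # hypothetical sentinel meaning "never written"


class FlatIndirectionTable:
    """One fixed slot per logical page; the slot index is the logical address."""

    def __init__(self, num_logical_pages):
        # Allocated once up front and never resized or rebalanced.
        self.table = array("L", [UNMAPPED]) * num_logical_pages

    def lookup(self, logical_page):
        # A single indexed read: no tree traversal, no searching.
        physical = self.table[logical_page]
        return None if physical == UNMAPPED else physical

    def update(self, logical_page, physical_page):
        # Overwrite in place; the table never fragments.
        self.table[logical_page] = physical_page


table = FlatIndirectionTable(num_logical_pages=8)
table.update(3, 42)
print(table.lookup(3), table.lookup(4))
```

Every lookup and update costs the same regardless of how the drive has been written, which is the property that matters for consistent latency.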

The downside to all of this is the DRAM area required by the new flat indirection table. The old binary tree was very space efficient, while the new array is just huge. It requires a large amount of DRAM depending on the capacity of the drive. In its largest implementation (800GB), Intel needs a full 1GB of DRAM to store the indirection table. By my calculations, the table itself should require roughly 100MB of DRAM per 100GB of storage space on the drive itself. There's a bit of space left over after you account for the new indirection table. That area is reserved for a cache of the controller's firmware so it doesn't have to read from slow flash to access it.
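
For what it's worth, a rough back-of-the-envelope version of that estimate looks like the sketch below. The 4KB mapping granularity and 4-byte entry size are my assumptions, not figures Intel has confirmed:

```python
def flat_table_bytes(capacity_bytes, page_bytes=4096, entry_bytes=4):
    """Size of a flat table with one entry per logical page.

    page_bytes and entry_bytes are assumed values, not confirmed by Intel.
    """
    return (capacity_bytes // page_bytes) * entry_bytes


for capacity_gb in (100, 200, 400, 800):
    size_mb = flat_table_bytes(capacity_gb * 10**9) / 10**6
    print(f"{capacity_gb}GB drive: ~{size_mb:.0f}MB table")
```

Under those assumptions the table comes out just under 100MB per 100GB, and the 800GB drive's table leaves some room to spare inside 1GB of DRAM.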

Once again, there's no user data stored in the external DRAM. The indirection table itself is physically stored in NAND (just cached in DRAM), and there are two large capacitors on-board to push any updates to non-volatile storage in the event of power loss.

It sounds like a simple change, but building this new architecture took quite a bit of work. With a drive in hand, we can put Intel's claims to the test. And there's one obvious place to start...


  • JonnyDough - Thursday, November 15, 2012

    There are a ton of new technologies that could replace NAND. There might even be a "betamax" or "HD DVD" in there that miss the mark and lose out to some better or cheaper tech. We'll just have to wait and see what comes to market and catches on. It won't be mere enthusiasts or gamers who decide, it will be the IT industry. It usually is.
  • mckirkus - Tuesday, November 06, 2012

    One interesting point to note is that if you run benchmarks on a RAMDisk, you get random 4K write throughput in the neighborhood of 600MB/s. So in that regard, flash has a long way to go before the 6Gbit/s limitations of SATA 3.0 really hurt enterprise performance.
  • extide - Tuesday, November 06, 2012

    I am not sure I understand this. First of all, random 4K against a ramdisk will be HIGHLY dependent on the hardware, and I am sure you could see way better numbers than 600MB/sec. Also, 600MB/sec is pretty close to 6Gbit/sec anyway.
  • jwilliams4200 - Friday, November 09, 2012

    I think mckirkus is trying to say that there is a lot of headroom before sustained 4KiB random I/O SSD throughput will saturate a SATA 6Gbps link.

    For example, the sustained QD32 4KiB random write speed for the S3700 is apparently less than 150MB/s (35K IOPS). It will need to double and double again before it saturates a 6Gbps SATA link.
  • mayankleoboy1 - Saturday, November 10, 2012

    How long do we have to wait before SATA Express drives and interfaces become commercially available?
  • justaviking - Saturday, November 10, 2012

    If I read the "Update" section correctly, Oracle recommends modifying their settings to change the way the log files are written.

    Would it be possible to re-run the Swingbench tests using the modified settings? I'd love to see how performance changes, especially on THIS drive, and then also on some others for comparison purposes.
  • blackbrrd - Saturday, November 10, 2012

    I am guessing most people will run their Oracle database behind a raid card with some nvram to cache, which would remove the problem if the raid controller combined the writes. It would be interesting to see the performance behind a typical raid controller card with nvram cache.
  • iwod - Sunday, November 11, 2012

    I am a regular Anandtech reader (actually it is on my RSS feeds, so I read it every day) and I don't recall Anand ever doing a review of a Toshiba SSD. So when I saw the performance of the MK4001, I had to look it up on Google to learn that it is a SAS SLC enterprise SSD.

    The article did eventually have a brief mention of its spec, but I thought it came very late in the article. It would have helped if the spec had been listed beforehand.

    It seems to me the magic is actually in the software and not the hardware. A 1:1 mapping of the NAND address table that makes random reads and writes behave consistently seems more like software magic, and could easily be done on any other SSD controller with enough RAM. The only hardware side of things that this tweak requires is ECC memory.

    And again we are fundamentally limited by port speed.
  • mmrezaie - Monday, November 12, 2012

    I agree!
  • alamundo - Monday, November 12, 2012

    Given the enterprise focus, this drive seems to be competitive with the Intel 910 PCI card. It would be interesting to see the 3700 benchmarked against the 910.
