Endurance Ratings: How They Are Calculated

One of the questions I quite often face is about the manufacturers' endurance ratings. Go back two or three years and nobody had any endurance limits in their client SSDs but every SSD released in the past year or so has an endurance limitation associated with it. Why did that happen? Let's open up the situation a bit.

A few years ago, many enterprises would just go and buy regular consumer SSDs and use them in their servers. Generally there is nothing wrong with that because there are scenarios where enterprises can get by with client-grade hardware, but the problem was that a share of the enterprises knew that the drives weren't durable enough for their needs. However, they also knew that if they wore out the drive before the warranty ran out, the manufacturer would have to replace it.

Obviously that wasn't very good business for the manufacturers because for every drive sold, more than one had to be given away for free. At the same time, fewer customers were buying the more expensive, high-profit enterprise drives. Rather than disrupt the client market by either increasing prices or reducing quality, the manufacturers decided to start including a maximum endurance rating, which voids the warranty if exceeded.

The equation for endurance is rather simple. All you need to take into account is the capacity of the drive, the P/E cycles of the NAND, and the wear leveling and write amplification factors. When all that is put into an equation, it looks like this:

TBW = (Capacity × P/E Cycles) / (WLF × WAF)

Note that the correct term for TBW is TeraBytes Written, not Total Bytes Written, although both are fairly widely used. The hardest part in calculating the TBW is figuring out the wear leveling and write amplification factors because these are workload dependent. Hence manufacturers often use a worst-case 4KB random write scenario to come up with the TBW figure, as this ensures that the end-user cannot have a more demanding workload with higher write amplification.
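As a rough sketch of the calculation described above, the snippet below assumes the commonly cited JEDEC-style form, where capacity times P/E cycles is divided by the product of the wear leveling and write amplification factors. The function name and the example drive are hypothetical and only there to illustrate the arithmetic.

```python
def tbw_terabytes(capacity_gb, pe_cycles, wlf, waf):
    """Estimate endurance in decimal terabytes written (TBW).

    capacity_gb: drive capacity in decimal gigabytes
    pe_cycles:   rated program/erase cycles of the NAND
    wlf:         wear leveling factor (most-worn block vs. average)
    waf:         write amplification factor (NAND writes per host write)
    """
    if wlf <= 0 or waf <= 0:
        raise ValueError("WLF and WAF must be positive")
    # Total data the NAND can absorb before reaching its rated cycles, in GB.
    total_nand_writes_gb = capacity_gb * pe_cycles
    # Both factors reduce how much of that is available to host writes.
    host_writes_gb = total_nand_writes_gb / (wlf * waf)
    return host_writes_gb / 1000  # GB -> TB


# Hypothetical drive: 240GB, 3,000 P/E cycles, WLF of 1.1, WAF of 3.0
print(round(tbw_terabytes(240, 3000, 1.1, 3.0), 1))
```

Because the factors sit in the denominator, a worst-case workload with a high write amplification factor gives the lowest, and therefore the safest, TBW figure.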

For the uninitiated, the wear leveling factor (WLF) in this context means the maximum stress that the wear leveling method would put onto the most heavily cycled block compared to the average number of cycles. A factor of two would mean that the most heavily cycled block would have twice the number of cycles compared to the average. Write amplification factor (WAF), on the other hand, refers to the ratio of NAND writes to host writes. A factor of two would in this case mean that for every megabyte that the host writes, two megabytes are written to the NAND. These two factors go hand in hand in the sense that a small WLF results in a higher WAF, because the drive will do more internal reorganization operations to cycle all blocks equally, which consumes NAND writes.
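To make the two definitions concrete, here is a minimal sketch that computes both factors from made-up counters. The function names and the sample numbers are assumptions for illustration; real drives expose this kind of data, if at all, through vendor-specific SMART attributes.

```python
def write_amplification_factor(host_bytes, nand_bytes):
    """NAND writes divided by host writes."""
    return nand_bytes / host_bytes


def wear_leveling_factor(block_cycles):
    """Cycles on the most heavily cycled block divided by the average."""
    average = sum(block_cycles) / len(block_cycles)
    return max(block_cycles) / average


# The host wrote 1MB and the drive wrote 2MB to the NAND: WAF of 2.0
print(write_amplification_factor(host_bytes=1_000_000, nand_bytes=2_000_000))

# Average is 150 cycles and the most-worn block has 300: WLF of 2.0
print(wear_leveling_factor([100, 100, 100, 300]))
```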

The interesting part about TBWs is that they actually give us a way to estimate the combined wear leveling and write amplification factor of the drive. In the case of the 120GB M500DC, that would be a surprising 0.72x. Obviously you cannot go lower than 1x without using some form of compression, but the 120GB M500DC actually has 192GiB of NAND onboard, which extends the endurance. If we used that figure to calculate the combined WLF and WAF, it would be 1.24x, which is much more reasonable. For some reason the JEDEC spec defines the capacity as the usable capacity even for endurance calculations, but in the end it doesn't matter which figure you change as they are all related to each other (e.g. with 120GB used as the capacity, the P/E cycles could be higher than 3,000 because the over-provisioned NAND adds cycles).
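The back-calculation above can be sketched as follows. The 3,000 P/E cycles and the two capacities come from the text; the 500TB rating is an assumption here, chosen because it is the value that reproduces the 0.72x and 1.24x figures, and it is not stated in this section.

```python
GB = 10**9    # decimal gigabyte, used for the advertised capacity
GIB = 2**30   # binary gibibyte, used for the raw NAND onboard


def combined_factor(capacity_bytes, pe_cycles, tbw_bytes):
    """Combined WLF x WAF implied by a TBW rating."""
    return capacity_bytes * pe_cycles / tbw_bytes


PE_CYCLES = 3000
TBW_BYTES = 500 * 10**12  # assumed 500TB rating (not given in the text)

usable = combined_factor(120 * GB, PE_CYCLES, TBW_BYTES)
raw = combined_factor(192 * GIB, PE_CYCLES, TBW_BYTES)

print(f"usable capacity: {usable:.2f}x, raw NAND: {raw:.2f}x")
```

The same rating looks implausible or reasonable depending only on which capacity is plugged in, which is why the choice of usable versus raw capacity matters when comparing ratings.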

Ultimately none of the manufacturers are willing to disclose the exact details of how they calculate their endurance ratings, but at a high level this is how it's done according to JEDEC's standards. Furthermore, I wouldn't rule out the possibility that some OEMs artificially lower the ratings for their consumer drives just to make sure they are not used by enterprises. In the end, there isn't really a way for us to find out whether the TBW is accurate or not since the efficiency factors are not easily measurable by third parties like us.


37 Comments


  • Samus - Wednesday, April 23, 2014

    I think the price is ridiculous, nearly twice as expensive as the reliable Intel S3500 and almost as expensive as the uber-superior S3700. Makes no sense.
  • ZeDestructor - Wednesday, April 23, 2014

    Lots of "lack of time" in some sections...

    Granted, new benchmarks, but IMO that should be split off to a separate article and the entire thing delayed for publishing to get the tests done. Otherwise, excellent reviewing as always.
  • okashira - Wednesday, April 23, 2014

    If you want a drive with good speed, low price and amazing endurance, just pick up a used Samsung 830 for cheap.
    People have tested them to 25,000 cycles. That's 10+ PB for a 512GB drive, for just $300 or less. And I suspect their data retention is superior as well.
  • Solid State Brain - Wednesday, April 23, 2014

    Thing is, while older consumer drives with quality MLC NAND might appear to have an exceptional P/E rating until failure (which occurs when wear is so high that data retention gets so short and the uncorrectable bit error rate so extreme that the controller can't keep the drive in a working state anymore, not even when powered), there's no way their manufacturer will guarantee such usage.

    On a related note, all consumer Samsung 840 drives (with TLC memory) I've seen pushed through stress endurance tests posted on the internet have reached at least ~3200-3500 P/E cycles before failure and didn't start showing any SMART errors before 2800-2900 cycles, which means that the approximate ~1800-2000 P/E rating (for the stated TBW endurance with sequential workloads) for Samsung's TLC-NAND datacenter SSDs (at 3 months of data retention) makes a lot of sense. But again, there's no way Samsung will offer any guarantee for such usage with consumer or workstation drives. They will just tell you they are tested for consumer/light workloads.

    Real endurance figures for NAND memory in the SSD market have to be one of the industry's best kept secrets.
  • AnnonymousCoward - Friday, April 25, 2014

    Ever think of doing a real world test, measuring "time"? Everyone should know synthetic benchmarks for hard drives are meaningless. Why don't you do a roundup of drives and compare program load time, file copy time, boot time, and encoding time. Am I a freakin genius to think of this?
  • MrPoletski - Saturday, April 26, 2014

    Why does every single performance consistency graph say 4KB random write QD 32?
  • markoshark - Sunday, April 27, 2014

    I'm wondering if any testing is done with a 30/70 read/write ratio. Most I've seen is 70% read.
    With enterprise drives, they are often rebadged and used in SANs. It would be interesting to see how they compare in write-intensive environments (VDI).
