Investigation: Is Your SSD More Reliable Than A Hard Drive?

Data Center Feedback: Fewer Than 100 SSDs

Cost per gigabyte continues to be the barrier that prevents even large institutions from procuring thousands of SSDs at a time. But just because we don’t have access to truly massive deployments of solid-state drives doesn’t mean we can’t shed some light on real-world SSD reliability using feedback from smaller organizations. We put the call out to many of our IT friends and got some interesting responses from a few data centres.

No Support Linux Hosting doesn’t get specific about the number of drives it has installed, but company reps tell us it uses "an extensive number" of SSDs. We know the company is dealing with fewer than 100 SSDs, and usage breaks down as follows:

  • 40 GB X25-Vs are used as mirrored boot volumes for blade servers and ZFS servers.
  • 160 GB X25-Ms are used as cache (L2ARC) drives in ZFS servers.
  • 32 GB X25-Es are used as mirrored ZIL volumes in ZFS servers.

All of these drives have seen at least one year of use, and some have recently passed the two-year mark. As of this writing, the company hasn't seen any failures.

When we asked what benefits the company is seeing from SSDs that couldn't be achieved with mechanical storage, we received the following response: "With ZFS and hybrid storage, SSD drives allow for huge performance increases over old-style spinners. We still use spinners for primary storage, so we are able to retain most of the cost benefits of spinners, while getting the performance benefits of SSDs. Eventually, we are planning to switch all of our SANs to purely SSD-based storage. For 2011, we will stick with hybrid storage using ZFS."

InterServer only uses SSDs in its database servers. Specifically, it has Intel's X25-E (SSDSA2SH032G1GN) in its Xeon machines to take full advantage of high data throughput. How much performance are we talking about? InterServer tells us it is achieving an average of 4514 MySQL queries per second. On an older Xeon server equipped with IDE-based drives, it's looking at roughly 200-300 MySQL queries per second. We know these drives have been in use since 2009, and there are no reported failures thus far.
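As a rough back-of-the-envelope check, the figures InterServer quotes can be turned into a speedup range. The variable names below are ours, and the comparison assumes the two servers are otherwise comparable, which the older Xeon machine almost certainly is not, so treat the result as an upper bound on what storage alone contributes.

```python
# Figures as quoted by InterServer; variable names are illustrative.
ssd_qps = 4514               # average MySQL queries/s on the X25-E servers
ide_qps_range = (200, 300)   # rough queries/s on the older IDE-based server

# Dividing by the high end of the IDE range gives the conservative speedup,
# dividing by the low end gives the optimistic one.
speedup_low = ssd_qps / ide_qps_range[1]
speedup_high = ssd_qps / ide_qps_range[0]

print(f"{speedup_low:.1f}x to {speedup_high:.1f}x")  # prints "15.0x to 22.6x"
```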

InterServer provided the following statement concerning SSD use:

"Intel SSDs are night and day in failure rates when it comes to some other drives. For example, the SuperTalent SSD drives have had an extremely high failure rate, including models FTM32GL25H, FTM32G225H, and FTM32GX25H. I estimate about two-thirds of these drives have failed since being put into service. With these failures, however, the drives were not recoverable at all. They generally disappeared completely, no longer being readable. Spinners die much more gracefully with an easier disk recovery. I cannot compare this to the Intel SSDs yet since I have not experienced any failures."
Comment from the forums
  • AlexIsAlex
    The 'drive completely dead, data unrecoverable' failure mode is not the worst; I can restore yesterday's image and lose, at most, a day's data (acceptable for my usage - obv. tailor backup frequency etc. to what's acceptable to you).

    The worst is what happened to my last SSD. For weeks I thought the problems I was seeing were software issues: the occasional crash, the odd SxS error in the event log, a game failing Steam file validation, an old email showing half garbled. Eventually, I managed to diagnose the problem.

    Old, untouched, files on the SSD were being corrupted at a very low rate (a few bytes per GB, I'd estimate). A file could be written and verified after writing, but days later might fail a checksum test when read. Without any error notification, SMART or otherwise, to indicate that the data read was anything other than perfect.

    Now that was a problem. Who knows when the last backup image without any corruption was? How can you even tell? The vast majority of files will be fine, but some will be backed up corrupt, and may have been for some time. With much manual effort I eventually did recover everything important, but my new backup regime involves checksumming everything on the SSD weekly. If something has changed data but not changed timestamp, this time I'm going to get some red flags!

    I can't say for certain that this failure mode is SSD specific, but it happened on my first SSD, and never on any of my spinners. Not enough data to be statistically significant, but enough to make me cautious.
  • Anonymous
    Can second the findings with regard to OCZ Vertex 2 drives. Mine has just gone and without any warning - all data lost after a year of light use. OCZ are completely useless in helping to fix it. It's like they know that their SSDs fail a lot and aren't at all surprised. Have gone onto Intel 320 SSD based on the findings.
  • dyvim
    Thanks Andrew, that's an interesting article even for a layman operating a single SSD ^^
    So far my OCZ Vertex 2 is doing fine, but then failure is always only a probability. System drives shouldn't be used to store important data in my eyes anyways.
    If not having mechanical parts doesn't really lower the percentage of dying drives, that only means that backup is just as important (and as often forgotten) as it always was.
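The weekly checksum regime AlexIsAlex describes (flagging files whose data changed while the timestamp did not) could be sketched roughly as follows. This is a minimal illustration, not the commenter's actual script: the function names, the choice of SHA-256, and the manifest layout are all assumptions made for the example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to (mtime_ns, size, sha256)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (info.st_mtime_ns, info.st_size, sha256_of(full))
    return manifest


def silent_changes(old, new):
    """Return paths whose hash changed while mtime and size stayed the same.

    A legitimate edit normally updates the timestamp, so a hash mismatch
    with an identical timestamp is the red flag worth investigating.
    """
    flagged = []
    for rel, (mtime, size, digest) in new.items():
        if rel in old:
            old_mtime, old_size, old_digest = old[rel]
            if mtime == old_mtime and size == old_size and digest != old_digest:
                flagged.append(rel)
    return sorted(flagged)
```

In use, one would store the manifest from each weekly run (for example as JSON, on a different drive from the one being checked) and pass last week's and this week's manifests to silent_changes. Any path it returns is worth restoring from an older backup before the corrupt copy overwrites the last good one.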