RAID 5 May Be Doomed in 2009
A story appearing online is forecasting the doom of RAID 5 in 2009. Apparently, with the capacity of modern SATA hard drives now reaching 2 terabytes, a read error during a RAID 5 disk reconstruction is becoming practically unavoidable.
According to ZDNet, SATA drives often have an unrecoverable read error (URE) rate of one in 10^14, which implies that a drive will fail to read a sector about once every 100,000,000,000,000 bits read. With hard drive capacities expected to reach two terabytes in 2009, a read error becomes practically unavoidable when recovering from a single drive failure in a 7-drive RAID 5 array, because the rebuild has to read all 12 terabytes on the six surviving drives. Upon encountering such a read error during reconstruction, it is claimed that the array volume will be declared unreadable and the recovery process will be halted. Apparently all 12 terabytes of data stored on the drives will be lost... or at least will require some extra effort and knowledge to recover.
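The arithmetic behind that claim can be sketched in a few lines of Python. This is a rough model, not something taken from the ZDNet story: it assumes read errors are independent, that they occur at exactly the quoted rate of one per 10^14 bits, and that a terabyte is 10^12 bytes. The function name is ours.

```python
import math

def p_read_error(bytes_read, bits_per_error=1e14):
    """Chance of at least one unrecoverable read error while reading bytes_read bytes."""
    bits = bytes_read * 8
    # 1 - (1 - 1/N)^bits, computed with log1p/expm1 to avoid floating-point loss
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

TB = 10**12
# Rebuilding a 7-drive RAID 5 of 2 TB disks means reading the 6 surviving drives.
rebuild = p_read_error(6 * 2 * TB)
print(f"12 TB rebuild: {rebuild:.0%} chance of hitting a read error")
```

Under these assumptions the figure comes out a little above 60 percent, so "practically unavoidable" overstates it somewhat, but the odds are clearly uncomfortable.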
RAID 5 is described as a striped set with distributed parity, which protects against a single disk failure. When a drive fails in a RAID 5 set, the failed drive can be replaced, the data can be rebuilt from the distributed parity and the array can eventually be restored. If more than one drive fails, however, the array will suffer data loss. For some, this can make the reconstruction process after a single drive failure a stressful event, as the array is vulnerable to further drive failures during that time.
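How parity makes that rebuild possible can be shown with a toy example. The sketch below assumes the usual XOR parity and models a single stripe with three data blocks and one parity block; the helper name and the sample data are ours, and a real controller also rotates the parity block from drive to drive.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across a 4-drive set: three data blocks plus one parity block.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# The drive holding data[1] fails; XOR of the surviving blocks and parity restores it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'five'
```

The same property explains the weakness: if any one of the surviving blocks cannot be read, there is nothing left to XOR the missing block back from.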
While using RAID 6 instead, which tolerates two drive failures rather than just one, may seem like a solution, the increased redundancy may not be cost effective. Also, as hard drive capacities continue to increase exponentially, year after year, even RAID 6 may soon become prone to the same problems. When single disk drives reach 12 terabytes in size, even a direct drive-to-drive copy may commonly encounter these read errors. Using drives with smaller capacities and better unrecoverable read error rates could be one way to avoid these potential headaches.
The problem comes from the increasingly tight data density packed onto drive platters. With traditional (longitudinal) recording, the magnetic polarity of one bit can leak onto adjacent bits, flipping an otherwise good bit. Manufacturers have switched to perpendicular recording to avoid such problems and increase density, but even this method has its physical limits. Manufacturers will have to find more creative solutions down the road if drives are going to exceed 2TB in size.

creative methods... like SSDs?
Not a RAID expert, but shouldn't sticking to drives under 2-terabytes mitigate this problem and save money too?
I imagine that this problem would only be of significant concern to Enterprise RAID systems where data integrity and performance are critical, rather than for hobbyists. Most Enterprise systems generally use drives with capacities not much more than 20% (300GB) of the capacity of current consumer drives around 1.5TB.
Additionally, should these issues be a concern, one way of dealing with the capacity of such drives would be to split each physical drive into partitions and create multiple smaller arrays, i.e. an array of 2TB drives could be split into 4 smaller arrays made of 512GB partitions, which could be used, maintained and rebuilt independently to minimise the impact of such errors.
Not to mention other creative methods of increasing the security of data.
RAID 6 shouldn't suffer the same issues at all. The chances of getting a read error on the same data on 2 separate disks are astronomically small.
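That intuition can be put into rough numbers. The sketch below is only a back-of-the-envelope model with assumptions of our own: 512-byte sectors, a stripe treated as one sector per drive, independent errors at one per 10^14 bits, and a 7-drive RAID 6 of 2 TB disks with one drive already failed. The function name is hypothetical.

```python
import math

def p_double_error(drive_bytes, survivors, sector_bytes=512, bits_per_error=1e14):
    """Chance that some stripe has read errors on two or more surviving drives."""
    # chance that a single sector is unreadable
    p = -math.expm1(sector_bytes * 8 * math.log1p(-1.0 / bits_per_error))
    # chance that at least 2 of the surviving sectors in one stripe are unreadable
    p_stripe = sum(
        math.comb(survivors, k) * p**k * (1 - p) ** (survivors - k)
        for k in range(2, survivors + 1)
    )
    stripes = drive_bytes // sector_bytes
    return -math.expm1(stripes * math.log1p(-p_stripe))

TB = 10**12
# One drive is gone, so RAID 6 can absorb one more unreadable sector per stripe.
p_raid6 = p_double_error(2 * TB, survivors=6)
print(f"{p_raid6:.1e}")
```

Under these assumptions the result is on the order of 1 in 10 billion per rebuild. It says nothing about a second whole-drive failure during the rebuild, which is a separate risk.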
Okay, first off, I am not a hardware expert (hobbyist with a 6-disk RAID 5), but as far as I understand, splitting disks into partitions and then using said partitions to create the RAID(s) would create a bigger performance hit than leaving it alone, and you would still be stuck with the problems of disk failure (just worse if you use multiple partitions to make one RAID).
Whether said failure affects only one disk or more than one comes down to probability (more tightly packed bits means a higher probability of a fault). Also, it doesn't get around the fear of having a second failure during RAID reconstruction (RAID 6 does, slightly, but the same disk drive issues still apply), not to mention that if you were to have, say, 4 disks with 2 or more partitions per disk to create two RAID 5 arrays, both would need reconstruction upon a single disk failure, no?
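Both sides of the partitioning argument can be checked with the same kind of rough model. The sketch below assumes independent read errors at one per 10^14 bits, a 7-drive RAID 5 of 2 TB disks split into 4 equal partition arrays, and that a read error costs the whole array it occurs in; all names are ours.

```python
import math

def p_read_error(bytes_read, bits_per_error=1e14):
    """Chance of at least one unrecoverable read error while reading bytes_read bytes."""
    return -math.expm1(bytes_read * 8 * math.log1p(-1.0 / bits_per_error))

TB = 10**12
drives, drive_size, parts = 7, 2 * TB, 4
to_read = (drives - 1) * drive_size      # data read from the survivors in a rebuild

p_whole = p_read_error(to_read)          # one big array
p_part = p_read_error(to_read // parts)  # each of the smaller arrays
p_any_part = 1 - (1 - p_part) ** parts   # at least one small array hits an error

# Expected amount of data lost, in TB, if an error costs the whole affected array.
expected_loss_whole = p_whole * to_read / TB
expected_loss_split = parts * p_part * (to_read // parts) / TB
print(f"any error: {p_whole:.2f} vs {p_any_part:.2f}")
print(f"expected loss: {expected_loss_whole:.1f} TB vs {expected_loss_split:.1f} TB")
```

In this model all four arrays do need rebuilding after a single disk failure, and the chance of hitting an error somewhere is unchanged. What splitting changes is how much data each error takes down with it.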
Enterprise may be similar size-wise, but that's where it stops... their stuff is, well, was nicer last time I was in "that" loop.
I vaguely remember reading about a RAID 5 on SSD, and it had issues with sustained large file reading and writing... or I fell asleep while reading online news and dreamt it :S
Sorry if it sounded like I was rambling...
I am amazed that you guys discover this now. It has probably been well hidden by RAID hardware vendors. So here are the facts (nothing new and nothing specifically related to 2TB drives):
On a hard drive, sectors go bad routinely. When bad sectors are detected, the hard drive maps good sectors in their place. All drives have extra space for that. The catch here is the "when bad sectors are detected". If a sector is deemed good when you write to it but goes bad afterwards, you lose your data even if the sector is then replaced.
On a single disk, sectors containing data are lost in this way, which results in damaged files. With one or two bad sectors on a single disk, you will probably never notice, since the damaged files may not be used very often or may be a text file, an image or a movie that can take some damage before becoming noticeably broken.
Now, when you put these disks in a RAID 5 array, they will also end up with one or two bad sectors here and there. These sectors are not detected if they are never accessed. Then one drive fails, you replace it with a new one and launch the reconstruction. The reconstruction process will stop on the first bad sector it detects because it does not have enough redundancy to reconstruct it. Instead of causing damage to a single file, the one bad sector kills your entire RAID 5 array.
This is not theory; it happened to me on a Dell server with a PERC RAID backplane.
It is also a well-known fact to hardware makers (at least Dell). Not so long ago, Dell introduced on its servers a service that continuously scans the disks in a RAID array to make sure all sectors are accessed from time to time and that bad sectors get detected quickly. This does not eliminate the problem but makes it less likely, albeit with a performance hit because your drives get scanned all the time.
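The idea behind that kind of background scan can be sketched with a toy model. This is not Dell's implementation: it assumes XOR parity, represents each disk as a list of blocks with None standing in for an unreadable sector, and keeps parity on the last disk for simplicity. All names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def scrub(disks):
    """Read every block; rebuild any unreadable one (None) from the rest of its stripe."""
    repaired = []
    for stripe in range(len(disks[0])):
        bad = [d for d, disk in enumerate(disks) if disk[stripe] is None]
        if len(bad) == 1:  # a single loss per stripe is still recoverable
            others = [disk[stripe] for d, disk in enumerate(disks) if d != bad[0]]
            disks[bad[0]][stripe] = xor_blocks(others)
            repaired.append((bad[0], stripe))
    return repaired

# Toy array: two data disks and one parity disk, two stripes each.
d0 = [b"ab", b"cd"]
d1 = [b"ef", b"gh"]
parity = [xor_blocks([d0[i], d1[i]]) for i in range(2)]
disks = [d0, d1, parity]

disks[1][0] = None      # a latent bad sector nobody has read yet
fixed = scrub(disks)    # found and repaired while all drives are still healthy
print(fixed)            # [(1, 0)]
```

The point of scanning early is that the repair happens while the stripe still has its full redundancy, instead of being discovered in the middle of a rebuild.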
With respect to partitioning, there are some RAID controllers which support splitting disks into sections rather than simply using the entire contents of the disk. The benefit here would be that you can have 2 or more discrete arrays of space; if one becomes corrupted, the others should be unaffected by general data corruption. As far as performance is concerned, one would arrange the partitions of the disk such that the least frequently used data is positioned on the outside of the disk and the more frequently used data is positioned near the centre, which then minimises the possibility of spanning. I am amazed that anyone would consider having just one huge partition on every disk in an array to be any more beneficial, since this gives rise to significant distribution of data to avoid fragmentation. The OS will position data in the first space large enough to accommodate it, which means that over time data becomes spread across a significantly larger area. By using such partitioning, even on simple disks without RAID, it is possible to organise the data more constructively.
I have seen a number of large corporations buy large capacity storage servers with hundreds if not thousands of disks and throw these into one or (more occasionally) two huge arrays. They then scratch their heads and wonder why performance is dismal, far below expectations, and blame the software and database for the performance issues, and yes, can you imagine the look on their faces when their storage server suddenly loses a disk and fails to rebuild...
It was also reported a long time ago that the contents of disks may degrade over time as the magnetically encoded information on the surface of the disk weakens. This, coupled with the density and the potential for adjacent bits to corrupt each other, requires that additional steps such as disk refreshes take place regularly, as mentioned above.
At least the most sensible method of reinforcing against such data failure is to employ a mixture of mirroring and RAID 0, RAID 5 or RAID 6, where there may be two or more complete duplicates of each RAID 0/5/6 array, thus protecting against failures in individual disks and allowing rebuilds to take place with minimal degradation in performance.
And for those who are more technically minded and might criticize this as uneconomical, or still fraught with potential for failure, the really paranoid individuals can construct each RAID 0/5/6 array over multiple controllers to guard against hardware failure and can have each mirrored copy in a completely separate storage array with redundant power, network and network switches, guarding against all aspects of hardware or power failure.
It just strikes me as a little stupid, though, that rebuilding an array should be halted by a single bad sector. Imagine Chkdsk telling you that you have a bad sector and junking the entire hard disk because the filesystem is not 100% intact. If a single sector or a small group of sectors is damaged or corrupted, then the system should alert the administrator, complete the rebuild, and highlight the files or data impacted by the damage. If the data is lost and is unrecoverable even from backups, then that is a separate issue; however, the damaged hardware must be replaced and the system must continue on.
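The behaviour asked for here, finishing the rebuild and reporting what was hit, can be sketched with a toy model. This is not how any particular controller works: it assumes XOR parity, represents each surviving disk as a list of blocks with None marking an unreadable sector, and uses hypothetical names throughout.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(survivors, stop_on_error=False):
    """Rebuild the failed disk stripe by stripe; None marks an unreadable block."""
    rebuilt, damaged = [], []
    for stripe in range(len(survivors[0])):
        blocks = [disk[stripe] for disk in survivors]
        if any(block is None for block in blocks):
            if stop_on_error:            # the behaviour complained about above
                raise IOError(f"unreadable block in stripe {stripe}")
            rebuilt.append(None)         # leave a hole and carry on
            damaged.append(stripe)       # report it to the administrator later
            continue
        rebuilt.append(xor_blocks(blocks))
    return rebuilt, damaged

# Toy array: disk 0 has failed; disk 1 and the parity disk survive.
d0 = [b"ab", b"cd"]
d1 = [b"ef", b"gh"]
parity = [xor_blocks([d0[i], d1[i]]) for i in range(2)]
d1[1] = None                             # latent bad sector on a survivor

new_disk, damaged = rebuild([d1, parity])
print(new_disk, damaged)                 # [b'ab', None] [1]
```

Mapping the damaged stripes back to the files they belong to would need help from the filesystem, which this sketch leaves out.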
It's about cost-cutting bull, you see?
I got F'ed by Seagate and their cheap HDDs, and Hitachi before them... now I only buy WD, and I'm not even into the RAID business as that's more problems.
This lot just want to squeeze the most out of the area they have; now they limit the HDDs to the fewest platters to save cost. And I'm thinking, damn, the old HDDs were built like a tank and didn't lose that much data. It's as though they don't really give a, you know, about people's data, like it's worthless. Talk about disrespect.
Still, keeping the HDDs cooler seems to help things, so the cooler the better.
I've lost data many times and know the dangers. Many of my friends think their iPod is invincible and that there's no possibility of losing it or even being robbed, hence they live in a fantasy world!