RAID 0

RAID 0 takes two or more disk drives and writes data in a "stripe" across each disk. Data is accessed by requesting the stripe from the array, resulting in the disks more or less simultaneously feeding their portion of the data back to the controller. The overall capacity of the array is equal to the sum of the formatted capacities of all drives, and disk usage is more or less spread evenly among all drives in the array.
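The striping scheme can be sketched in a few lines. The following toy model (the chunk size and disk count are arbitrary illustrative values, not anything a real controller uses) deals fixed-size chunks round-robin across the member drives and reads them back in rotation:

```python
# Toy model of RAID 0 striping -- purely illustrative, not a real
# controller's implementation.

CHUNK = 4   # stripe chunk size in bytes; real arrays use e.g. 64 KiB
DISKS = 3   # number of member drives

def stripe(data: bytes) -> list[list[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: list[list[bytes]] = [[] for _ in range(DISKS)]
    for i in range(0, len(data), CHUNK):
        disks[(i // CHUNK) % DISKS].append(data[i:i + CHUNK])
    return disks

def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading the chunks back in rotation."""
    out = bytearray()
    depth = max(len(d) for d in disks)
    for row in range(depth):
        for d in disks:
            if row < len(d):
                out += d[row]
    return bytes(out)
```

Because consecutive chunks land on different drives, a large sequential read keeps every spindle busy at once, which is where the throughput gain comes from.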


The net result is that the system sees much faster sustained transfer rates for both read and write operations compared to a single drive. File access time, however, is not measurably improved by spreading data across multiple disks, so systems that frequently access small, non-contiguous files (as is often the case in desktop configurations) generally do not benefit from RAID 0.

RAID 0 is an excellent choice for video editing and large-scale "solving" applications, where large files need to be read and written in a continuous manner.

Perhaps the greatest drawback to RAID 0 is that the array is rendered inaccessible when any single drive in it fails. In that sense, RAID 0 isn't actually RAID at all, as it lacks the "Redundant" part of the equation. Data reliability also decreases rapidly as drives are added, since every additional drive is another single point of failure - so unless frequent backups are made, or the data is not regarded as even remotely important, RAID 0 should be approached with caution.
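To put a rough number on that risk (the 97% figure below is an assumed illustration, not a measured drive statistic): the array survives only if every member drive survives, so reliability falls off as a power of the drive count:

```python
# Illustrative arithmetic only: assume each drive independently has a
# 97% chance of surviving some period. A RAID 0 array needs EVERY
# member drive alive, so its survival probability is p_drive ** n.
p_drive = 0.97
for n in (1, 2, 4, 8):
    print(f"{n} drive(s): array survival = {p_drive ** n:.3f}")
```

Under that assumption an eight-drive stripe already has a roughly one-in-five chance of losing everything over the same period in which a single drive would fail only 3% of the time.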

Pros:
  • Excellent streaming performance
  • Maximum capacity available for users (sum of all disks)
Cons:
  • No redundancy of data
  • Negligible performance benefits for many users
RAID 1

RAID 1 sits at the other extreme of the spectrum. It makes a continuous copy of all data from one disk (which is written to and read from by the system) onto another physical disk which is in "standby" mode. This "standby" disk is held in reserve by the controller for when a failure is detected on the first disk. At that point in time, the controller "fails over" to the second disk in the system, with all data still available to the user.


While RAID 1 usually offers no performance benefits (and indeed, it often slightly degrades performance in some situations), it does increase the uptime of the host computer by allowing it to remain online even after a disk in the system has failed. This makes it an extremely popular option for mirroring operating systems on enterprise-class servers, and for small office users without the need for massive amounts of data storage but a requirement for constant uptime.

Higher-quality RAID 1 controllers can outperform single-drive implementations by making both drives active for read operations. This can in theory reduce file access times (requests are sent to whichever drive is closer to the desired data) as well as potentially double read throughput (both drives can read different data simultaneously). Most consumer RAID 1 controllers do not provide this level of sophistication, however, resulting in performance that is at best on par with, and sometimes slightly worse than, a single drive. Software RAID 1 solutions also lack support for reading from both drives in a RAID 1 set simultaneously.
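The read-balancing trick such controllers use can be sketched as a simple dispatcher. This toy version alternates requests between the two mirrors (a real controller would instead pick the drive whose heads are nearest the requested block, which this sketch does not model):

```python
# Toy sketch of RAID 1 read balancing: both mirrors hold identical data
# by definition, so independent reads can be served by alternating drives.
import itertools

mirror_a = {0: b"blk0", 1: b"blk1", 2: b"blk2", 3: b"blk3"}
mirror_b = dict(mirror_a)              # identical copy -- that's RAID 1

drives = itertools.cycle([("A", mirror_a), ("B", mirror_b)])

def read(block: int) -> tuple[str, bytes]:
    name, disk = next(drives)          # round-robin between the mirrors
    return name, disk[block]

served = [read(b)[0] for b in (0, 1, 2, 3)]
# each drive ends up handling half of the independent read requests
```

Writes, by contrast, must always go to both drives, which is why RAID 1 offers no write speedup.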

Pros:
  • Redundancy of data
  • Lowest cost data redundancy available (one additional disk)
  • Simple operations make it easy to implement solution using software only
Cons:
  • Poor usage of drive capacity (only 50% of purchased hard drive capacity available)
  • Typically no performance benefit over a single hard disk

41 Comments


  • munim - Friday, September 7, 2007 - link

    I don't understand what the parity portion is in RAID 5, anyone care to explain?
  • drebo - Friday, September 7, 2007 - link

    Simply: It's a piece of data that the RAID controller can use to calculate the value of either of the other pieces of data in the chunk in the event of a disk failure.

    Example: You have a three-disk RAID 5 array. A file gets written in two pieces. Piece A gets written to disk one. Piece B gets written to disk two. The parity between the two is then generated and written to disk three. If disk one dies, the RAID controller can use Piece B and the parity data to regenerate what would have been Piece A. If disk two dies, the controller can regenerate Piece B. If disk three dies, the controller still has the original pieces of data. Thus, any single disk can fail without any data loss occurring.

    RAID 5 is what is known as a distributed parity system, so the disk that holds the parity alternates with each write. If a second file is written in the above example, disk one would get Piece A, disk two would get the parity, and disk three would get Piece B. This ensures that regardless of which disk dies, you always have two of the three pieces of data, which is all you need to reconstruct the original.
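    The parity drebo describes is, concretely, a bytewise XOR: since XOR is its own inverse, any one missing piece can be rebuilt from the other two. A minimal sketch (the piece contents here are made up for illustration):

```python
# RAID 5 parity as bytewise XOR: P = A ^ B, and XOR undoes itself,
# so any one of A, B, P can be recomputed from the other two.
def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

piece_a = b"DATA"                    # written to disk one
piece_b = b"MORE"                    # written to disk two
parity  = xor(piece_a, piece_b)      # written to disk three

# simulate losing each disk in turn and rebuilding its contents
assert xor(piece_b, parity) == piece_a   # disk one died
assert xor(piece_a, parity) == piece_b   # disk two died
assert xor(piece_a, piece_b) == parity   # disk three died
```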
  • Zan Lynx - Tuesday, September 11, 2007 - link

    The reason RAID-5 uses distributed parity is to balance the disk accesses.

    Most RAID-5 controllers do not read the parity data except during verification operations or when the array is degraded.

    By rotating the parity blocks between disks 1-3, read operations can use all three disks instead of having a parity-only disk which is ignored by all reads.
  • ChronoReverse - Friday, September 7, 2007 - link

    I understand how the fault tolerance in the best case is half the drives in the 1+0 scenario, but that's still not worse than the RAID 5 scenario where you can't lose more than 1 drive no matter what.

    So why was RAID 5 given a "Good" description while RAID 10/01 given a "Minimal" description?
  • drebo - Friday, September 7, 2007 - link

    RAID 0+1 (also known as a mirror of stripes) turns into a straight RAID 0 after one disk dies. The only way it will support a two disk failure is if disks on the same leg of the mirror die. If one on each side dies, you lose everything. After one disk failure, you lose all remaining fault tolerance. RAID 10 (or a stripe of mirrors) will sustain two disk failures if the disks are on different legs of the array. If it loses both disks on a single leg, you lose everything. Thus, it is far more likely that you'll lose the wrong two disks.
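    That difference can be checked by brute force. The sketch below enumerates every two-disk failure in a minimal four-disk array (the disk numbering and grouping are illustrative) and counts which failures each layout survives:

```python
# Enumerate all two-disk failures in a 4-disk array split into two
# 2-disk groups, and count survivals for RAID 10 vs RAID 0+1.
from itertools import combinations

disks = [0, 1, 2, 3]
groups = [(0, 1), (2, 3)]

def raid10_survives(failed):   # stripe of mirrors
    # each group is a mirror pair; data survives while every pair keeps a disk
    return all(any(d not in failed for d in g) for g in groups)

def raid01_survives(failed):   # mirror of stripes
    # each group is a stripe leg; data survives while at least one leg is intact
    return any(all(d not in failed for d in g) for g in groups)

two_disk = list(combinations(disks, 2))
print(sum(raid10_survives(set(f)) for f in two_disk), "/", len(two_disk))  # 4 / 6
print(sum(raid01_survives(set(f)) for f in two_disk), "/", len(two_disk))  # 2 / 6
```

    So in this minimal case RAID 10 survives four of the six possible two-disk failures, while RAID 0+1 survives only two - which is drebo's point about losing "the wrong two disks."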

    In a RAID 5 array, any single disk can be lost and you'll not lose anything--its position is irrelevant. Not only that, but a RAID 5 has better disk-to-storage efficiency (nx-1) when compared to RAID 0+1 and RAID 10 (nx/2). It's also less expensive to implement.

    Overall, RAID 5 is one of the best fault tolerant features you can put into a system. RAID 6 is better, but it is also much more expensive.
  • ChronoReverse - Saturday, September 8, 2007 - link

    So in RAID 5 you can lose any ONE drive.

    In RAID 1+0/0+1, you can also lose any ONE drive.


    In RAID 5 you CANNOT lose a second drive.

    In RAID 1+0/0+1, there's a CHANCE you might survive a second drive failure.


    Therefore, RAID 5 is more fault-tolerant.
  • Brovane - Saturday, September 8, 2007 - link

    Yes, apparently to some people. Also, one of the big bonuses of RAID 0+1 is that when you lose a drive you do not suffer any performance degradation, unlike RAID 5, which takes a big performance hit until the array is rebuilt after a drive failure. If you are running an Exchange cluster you cannot afford to take this performance hit in the middle of a busy work day unless you really do not like your helpdesk people. I think the one argument you could make is that a RAID 0+1 needs more drives in an array to offer the same amount of storage as a RAID 5 volume, so maybe you could make the statistical argument that the RAID 0+1 could be less fault tolerant. However, to me this seems very tenuous.
  • Dave Robinet - Saturday, September 8, 2007 - link

    Good comments, and thanks for reading.

    You are, however, taking things a little out of context. Take a 6 drive configuration, for example. If you do a RAID 5 with four drives and two hotspares, you'll end up with the same usable capacity as a 6 disk RAID 0+1 - but with the "ability" to lose 3 drives.

    Your comment about rebuilding is, however, completely backwards. You spend FAR more time rebuilding the mirrored set of a RAID 0+1 after a failed disk, because you need to rebuild the entire mirrored portion of the array once again (since the data has presumably changed, there's no parity, etc.). (Don't believe me? Try it. ;)

    Your general observation about being able to lose one disk in either configuration is correct. You do need to compare apples-to-apples - a RAID 5 will offer you far more capacity for the same number of drives as a 0+1, and a 0+1 will give you more performance. Apples-to-apples, though, you're going to get better redundancy OPTIONS out of the additional RAID 5 flexibility than you will with a 0+1.
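    The capacity side of that apples-to-apples comparison is simple arithmetic (six 500GB drives is just an assumed example):

```python
# Usable capacity for n equal drives of size s (assumed example figures):
n, s = 6, 500                            # six 500 GB drives
print("RAID 5  :", (n - 1) * s, "GB")    # one drive's worth goes to parity
print("RAID 0+1:", (n // 2) * s, "GB")   # half the drives mirror the other half
```

    With six drives, RAID 5 yields 2500GB usable against 1500GB for 0+1 - the nx-1 versus nx/2 efficiency drebo mentioned above.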

    Again, though, good points, and thanks again for reading.

    dave
  • ChronoReverse - Sunday, September 9, 2007 - link

    What do you mean by hotspares? Do you mean a drive that the array can swap in immediately after a drive fails? If that's the case, while in terms of economics it might be three "spares", in terms of actual data redundancy it's still just one drive. If a second drive fails while the spare is filling, you've still lost data.

    In any case, it's quite clear that RAID 5 will certainly give you more capacity, but the question was about the comment on data redundancy. I'll have to specify that to me it means how safe my data is in terms of absolute drive failures.
  • Brovane - Monday, September 10, 2007 - link

    If you are using a SAN (Storage Area Network), you might have over 100+ disks where you build storage groups to assign to your servers that are connected to the SAN. This SAN will usually have various combinations of RAID 0, 1, 1_0, and 5. In this SAN you might have various types of disks, say 300GB 10K FC and 146GB 15K FC disks. You will keep a couple of these disks as hot spares. If, say at 2AM, the SAN detects that a disk has failed in one of your RAID 1_0 storage groups, it will grab one of the hot spare disks and start rebuilding the RAID. The SAN will usually also send an alert to the various support teams that this has happened so the bad disk can be replaced. The SAN doesn't care where the hot spare is plugged into the SAN.

    The biggest issue that I see as a ding against RAID 5 vs RAID 0+1 is the performance hit when a drive fails in a RAID 5. With a RAID 0+1 you suffer no performance hit when a drive fails because there is no parity rebuilding. With RAID 5 you can take a good performance hit until the RAID is rebuilt because of the parity calculation. Also, with a SAN setup you can mirror a RAID 0+1 between physical DAEs so the storage group will stay up in the unlikely event of a complete DAE failure. And even though in a RAID 0+1 you will have to rebuild the complete disk in the event of a drive failure, with 15K RPM drives and a 4GB FC backbone on the SAN this happens faster than you would think, even when dealing with 500GB volumes. If you are very concerned about losing another disk before the rebuild is complete, you could use SnapView on your SAN to take a snapshot of the disk and copy this data to your backup volume.
