Wednesday 18 June 2008

Understanding RAID


RAID: Redundant Array of Independent (or Inexpensive) Disks. RAID can be implemented in Software, in Hardware, or as a combination of both. Generally speaking, Software RAID tends to offer duplication or mirroring, whilst Hardware RAID offers Parity-based protection.

Software RAID uses more system resources, as more disk ports and channels are required and the host is subject to additional load during write and copy operations. Software RAID may cost less than Hardware RAID because it has no dedicated RAID controller, but it may not offer the same hot-swap or performance capabilities. Software RAID is needed for mirroring to remote locations.

Hardware RAID offloads Parity generation and checking from the host, and also leaves the host unaffected by internal operations such as rebuilds. Hardware RAID allows for greater disk capacity per disk port. Hardware RAID requires the expense of a RAID controller per subsystem. Hardware RAID systems themselves can also be mirrored with software mirroring.

RAID Levels:
RAID 0 (Striped): RAID Level 0 requires a minimum of 2 drives to implement.

RAID 0 implements a striped disk array: the data is broken down into blocks and each block is written to a separate disk.

I/O performance is greatly improved by spreading the I/O load across many drives.

Best performance is achieved when data is striped across multiple controllers with only one drive per controller.

No parity calculation overhead is involved.

Very simple design & easy to implement.

Failure of just one drive will result in all data in an array being lost.
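
The striping described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names `stripe_location` and `stripe` are made up for this example, each "disk" is just a list of blocks, and blocks are dealt out round-robin.

```python
def stripe_location(block_index, num_disks):
    """Map a logical block number to (disk, offset) using round-robin striping."""
    disk = block_index % num_disks
    offset = block_index // num_disks
    return disk, offset


def stripe(data, num_disks, block_size):
    """Split data into fixed-size blocks and deal them out across the disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _ = stripe_location(start // block_size, num_disks)
        disks[disk].append(data[start:start + block_size])
    return disks


# Six 2-byte blocks spread over three hypothetical disks.
disks = stripe(b"ABCDEFGHIJKL", num_disks=3, block_size=2)
for number, blocks in enumerate(disks):
    print("disk", number, blocks)
```

Because consecutive blocks land on different disks, a large read or write keeps every drive busy at once. It also shows why one failed drive is fatal: every disk holds a slice of the data and none holds a spare copy.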

RAID 1:
RAID 1 (Mirroring): RAID Level 1 requires a minimum of 2 drives to implement.

RAID level 1 provides fault tolerance. This level is also known as disk mirroring.
All data written to the primary disk is written to the mirror disk. It also generally improves read performance (but may degrade write performance).

One Write or two Reads possible per mirrored pair.

100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk.

Simplest RAID storage subsystem design.

May not support hot swap of failed disk when implemented in "software".
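
As a rough sketch of the mirroring behaviour described above, the toy class below models a mirrored pair in Python. The class name `MirroredPair` and its methods are hypothetical, each disk is modelled as a dictionary, and a failed disk is represented by `None`. It assumes at most one disk has failed when `replace` is called.

```python
class MirroredPair:
    """Toy model of a two-disk mirror; each disk maps block number -> data."""

    def __init__(self):
        self.disks = [{}, {}]
        self.next_read = 0  # alternate reads between the two copies

    def write(self, block, data):
        # One logical write costs a physical write on every working disk.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # Either copy can serve a read; skip a failed disk.
        for _ in range(2):
            disk = self.disks[self.next_read]
            self.next_read = 1 - self.next_read
            if disk is not None:
                return disk[block]
        raise IOError("both disks have failed")

    def fail(self, index):
        self.disks[index] = None

    def replace(self, index):
        # Recovery is a straight copy from the surviving disk, no parity maths.
        self.disks[index] = dict(self.disks[1 - index])


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.fail(0)
print(mirror.read(0))   # still served by the surviving copy
mirror.replace(0)
```

The `write` method touches both disks while `read` touches only one, which is where the "one Write or two Reads per mirrored pair" rule of thumb comes from.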

RAID 2:
RAID 2 (Hamming Code ECC) :

RAID level 2 uses an error-correcting algorithm (Hamming code) together with a disk-striping strategy that breaks data down to the bit level and spreads it across multiple disks. The error-correction method requires several additional disks to hold the ECC bits.

It is not as efficient as other RAID levels and is not generally used.

"On the fly Data error correction"

Extremely high data transfer rates possible

Very few commercial implementations have existed / not commercially viable
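
To give a feel for the Hamming code ECC mentioned above, here is a minimal Python sketch of the classic Hamming(7,4) code. It is an illustration, not how any particular RAID 2 array was built: the function names are hypothetical, and the assumption is that each of the 7 bits of a code word would sit on its own disk (4 data disks plus 3 ECC disks).

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit code word."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Bit positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code_word):
    """Repair at most one flipped bit and return the 4 data bits."""
    c = list(code_word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # checks positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # checks positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3  # 0 means no error, else the bad position
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
code = hamming74_encode(data)
damaged = list(code)
damaged[4] ^= 1  # pretend the bit on the fifth disk was read incorrectly
assert hamming74_correct(damaged) == data
```

The check bits identify which position is wrong, so the error can be fixed as the data is read. The price is 3 extra disks for every 4 data disks in this small example, which hints at why the level is rarely used.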

RAID 3:
RAID 3 (Parallel Transfer with Parity):

RAID level 3 is similar to RAID level 2 in that data is striped in very small units across all the data disks (RAID 3 is usually described as byte-level striping), but it requires only one disk for parity data. Because all parity data is written to a single drive, RAID 3 suffers from a write bottleneck, although it still provides some read and write performance improvement.

Stripe parity is generated on Writes, recorded on the parity disk and checked on Reads.

RAID Level 3 requires a minimum of 3 drives to implement.

Very high Read & Write data transfer rate.

Controller design is fairly complex.
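
The parity used here is a bytewise XOR across the data disks, which is also what lets a lost disk be rebuilt. The Python sketch below shows the idea with three made-up data disks and one parity disk. The helper name `xor_blocks` and the sample bytes are assumptions for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, two bytes each.
data_disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]

# Parity is generated on writes and stored on the dedicated parity disk.
parity = xor_blocks(data_disks)

# Simulate losing disk 1: XOR the survivors with the parity to rebuild it.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
assert rebuilt == data_disks[1]
```

Checking on reads works the same way: XOR all data disks together with the parity, and any non-zero byte signals a mismatch.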

RAID 4:
RAID 4 (Independent Data Disks with Shared Parity Disk) :

Each entire block is written onto a single data disk. Parity for same-rank blocks (the blocks at the same position on each data disk) is generated on Writes, recorded on the parity disk and checked on Reads.

Very high Read data transaction rate

Quite complex controller design

Worst Write transaction rate and Write aggregate transfer rate, because every write must also update the single shared parity disk
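
The poor write rate follows from how a small write updates parity: read the old data and old parity, XOR the change in, then write both back. The Python sketch below counts the I/Os per disk for a few small writes. The names `update_parity`, `small_write` and `io_count`, and the one-block disks, are assumptions made for this example.

```python
from collections import Counter


def update_parity(old_parity, old_data, new_data):
    """new parity = old parity XOR old data XOR new data, byte by byte."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Three hypothetical data disks with one block each, plus one parity disk.
data = [[b"\x01"], [b"\x02"], [b"\x04"]]
parity = [b"\x07"]  # 0x01 ^ 0x02 ^ 0x04
io_count = Counter()


def small_write(disk, block, new_data):
    """Read-modify-write of a single block, counting I/Os on each disk."""
    old_data = data[disk][block]
    parity[block] = update_parity(parity[block], old_data, new_data)
    data[disk][block] = new_data
    io_count[f"data{disk}"] += 2  # one read and one write
    io_count["parity"] += 2       # one read and one write


# Three writes that each go to a different data disk.
small_write(0, 0, b"\x10")
small_write(1, 0, b"\x20")
small_write(2, 0, b"\x40")
print(dict(io_count))
```

The data disks share the work, but the parity disk takes part in every single write, so it becomes the bottleneck as the write load grows.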

RAID 5:
RAID 5 (Independent Data Disks with Distributed Parity Blocks):

RAID level 5 is known as striping with parity. This is the most popular RAID level.
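The data redundancy is provided by the parity information. The data and parity information are arranged on the disk array so that, within each stripe, the two are always on different disks. RAID level 5 generally offers better usable capacity and read throughput than RAID level 1 while still tolerating the failure of a single disk, though small writes are slower because parity must be updated.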

Highest Read data transaction rate.

Medium Write data transaction rate.
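
To illustrate how distributed parity spreads the parity blocks around, the Python sketch below prints one possible layout for a four-disk array. Real controllers use several different rotation schemes. The one here (parity starting on the last disk and moving one disk to the left per stripe) and the function name `raid5_layout` are assumptions chosen for the example.

```python
def raid5_layout(num_disks, num_stripes):
    """Return rows of labels: 'P' for a parity block, 'D<n>' for data blocks."""
    layout = []
    block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity moves one disk to the left on each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Each stripe has exactly one parity block and it sits on a different disk from stripe to stripe, so parity updates are shared by all the drives instead of queuing on one dedicated parity disk as in RAID 4.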

