Protection - RAID Systems
RAID technology (an acronym for Redundant Array of Inexpensive Disks, or sometimes Redundant Array of Independent Disks) allows the user to form one storage unit from several hard drives. Depending on how it is configured, the resulting unit (called a cluster) is either highly fault-tolerant (high availability) or has a higher I/O capacity. Distributing data over several hard drives can therefore increase data security and make the associated services more reliable.
This technology was developed in 1987 by three researchers (Patterson, Gibson and Katz) at the University of California (Berkeley). Since 1992, the RAID Advisory Board has managed these specifications. The idea is to build the equivalent of a large-capacity (and therefore expensive) drive out of smaller, cheaper drives (whose MTBF, Mean Time Between Failures, is lower).
In RAID technology, the assembled drives can be used in different ways, which are called RAID levels. The University of California defined 5 levels, to which levels 0 and 6 were later added. Each one of these levels describes the manner in which the data are distributed over the drives:
Each of these levels constitutes a way of using the cluster, according to:
The RAID-0 level, called striping (sometimes mistakenly written "stripping"), consists in storing data by spreading them over all of the cluster's drives. This level has no redundancy and is therefore not fault-tolerant: if one of the drives fails, all of the data spread over the drives are lost.
However, given that each drive of the cluster has its own controller, this solution offers a higher data rate.
RAID-0 consists of the logical juxtaposition (aggregation) of several physical hard drives. In RAID-0 mode, data are written in stripes:
The term "striping factor" is used to characterize the relative size of the fragments (stripes) stored on each physical unit. The average throughput depends on this factor (the smaller each stripe, the better the throughput).
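As an illustration of how stripes can be dealt out to the drives, here is a minimal sketch assuming simple round-robin placement. The function names `locate_block` and `stripe` are hypothetical and do not correspond to any real RAID controller API.

```python
def locate_block(logical_block, num_drives):
    """Map a logical stripe number to (drive index, row on that drive).

    Assumes plain round-robin placement, which is an illustrative choice.
    """
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_drives, logical_block // num_drives


def stripe(data, num_drives, stripe_size):
    """Cut data into stripe_size chunks and deal them out to the drives in turn."""
    if stripe_size < 1:
        raise ValueError("stripe_size must be at least 1")
    drives = [bytearray() for _ in range(num_drives)]
    for offset in range(0, len(data), stripe_size):
        drive, _row = locate_block(offset // stripe_size, num_drives)
        drives[drive] += data[offset:offset + stripe_size]
    return [bytes(d) for d in drives]


if __name__ == "__main__":
    # Nine one-byte stripes over three drives.
    print(stripe(b"ABCDEFGHI", num_drives=3, stripe_size=1))
    # [b'ADG', b'BEH', b'CFI']
```

Since no drive holds a complete copy of the data, losing any one of the three lists above makes the original byte string unrecoverable, which is why this level is not fault-tolerant.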
If one of the elements of the cluster is bigger than the others, the system for filling the drives with data will be blocked when the smallest drive is full. With two drives, the final size is therefore equal to double the capacity of the smaller of the two (more generally, the number of drives multiplied by the capacity of the smallest one).
It is recommended that two drives of identical size be used for RAID-0 because otherwise, the drive with the larger capacity will not be fully exploited.
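The capacity rule can be checked with a short calculation. This is only a sketch: `raid0_capacity` is a hypothetical helper, and the 20 GB and 27 GB drive sizes are made-up example values.

```python
def raid0_capacity(drive_sizes_gb):
    """Usable RAID-0 size: every drive contributes only as much as the smallest one."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("RAID-0 needs at least two drives")
    return len(drive_sizes_gb) * min(drive_sizes_gb)


if __name__ == "__main__":
    sizes = [20, 27]  # example sizes in GB (assumed values)
    usable = raid0_capacity(sizes)
    print(usable)               # 40
    print(sum(sizes) - usable)  # 7 GB of the larger drive stay unused
```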
The goal of level 1 is to duplicate the information and store it on several drives. The terms mirroring or shadowing are used to describe this procedure.
However, RAID-1 technology is very expensive, given that only half of the raw storage capacity is actually usable when two drives mirror each other.
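The mirroring principle can be sketched in a few lines. The `Mirror` class below is a toy in-memory model written for illustration, not a real driver: each "drive" is a dictionary, every write goes to all healthy drives, and a read is served by the first healthy one.

```python
class Mirror:
    """Toy RAID-1 model: every block is written to all healthy drives."""

    def __init__(self, num_drives=2):
        if num_drives < 2:
            raise ValueError("mirroring needs at least two drives")
        self.drives = [dict() for _ in range(num_drives)]
        self.failed = set()

    def write(self, block, data):
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def fail(self, index):
        """Simulate the loss of one drive and of everything stored on it."""
        self.failed.add(index)
        self.drives[index].clear()

    def read(self, block):
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise OSError("all mirrored drives have failed")


if __name__ == "__main__":
    mirror = Mirror()
    mirror.write(0, b"payload")
    mirror.fail(0)
    print(mirror.read(0))  # b'payload', served by the surviving drive
```

The model also makes the cost visible: two dictionaries hold the same single block, so half of the raw capacity carries no additional data.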
RAID level 2 is now obsolete: it relies on Hamming codes for error correction (ECC, Error Correction Code), and this kind of error correction is now integrated directly into hard drive controllers.
This technology consists in storing data according to the same principle as in RAID-0, but with the ECC check bits written to separate units (normally 3 ECC drives are used for 4 data drives).
RAID-2 technology offers mediocre performance but a high level of security.
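The ratio of 3 ECC drives to 4 data drives matches the Hamming(7,4) code, which protects 4 data bits with 3 check bits. The sketch below shows that code on a single 7-bit word, under the assumption that each bit position corresponds to one drive. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single corrupted bit (one faulty "drive")
    print(hamming74_decode(word))  # [1, 0, 1, 1]
```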
Level 3 RAID technology stripes data at the byte level across the drives and dedicates one of the drives to storing parity.
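Parity of this kind is commonly computed with a byte-wise XOR, which makes it possible to rebuild the content of any single lost drive from the surviving drives and the parity drive. Here is a minimal sketch of that idea; `xor_parity` and `rebuild` are hypothetical names and the block contents are arbitrary.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the single missing block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # three data drives
    parity = xor_parity(data)                       # content of the parity drive
    recovered = rebuild([data[0], data[2]], parity)  # drive 1 is lost
    print(recovered == data[1])  # True
```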
Level 4 RAID technology is very similar to level 3. The difference lies in the granularity of the striping: level 4 uses block-level striping with a dedicated parity drive, whereas level 3 uses byte-level striping.
To read a small number of blocks, the system does not have to access every physical drive, only those on which the data are actually stored. Conversely, the drive holding the control (parity) data is involved in every write, so it must be fast enough to absorb the combined write traffic of the other drives; otherwise it limits the performance of the whole cluster.
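The reason the parity drive is touched on every write can be seen in the usual read-modify-write update for XOR parity: the new parity is the old parity XOR the old data XOR the new data. The sketch below assumes XOR parity and uses hypothetical helper names.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity, old_data, new_data):
    """Parity after overwriting one data block, without reading the other drives."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


if __name__ == "__main__":
    blocks = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
    parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

    new_block = b"\xff\x00"
    parity = update_parity(parity, blocks[1], new_block)  # touches the parity drive
    blocks[1] = new_block                                 # touches one data drive

    full = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])
    print(parity == full)  # True
```

Only one data drive is accessed per small write, but the parity drive is accessed every time, which is what turns it into the bottleneck described above.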
Level 5 is similar to level 4, i.e. parity is calculated at the block level but is spread over all of the cluster's drives.
That way, RAID 5 greatly improves access to data (both in writing and reading) because access to parity bits is spread over the cluster's different drives.
RAID-5 provides performance close to that of RAID-0, particularly for reads, while tolerating the failure of one drive. This is why it is one of the best RAID modes in terms of the balance between performance and reliability.
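To make the idea of distributed parity concrete, the sketch below prints one possible layout in which the parity block moves to a different drive on each row. The rotation rule used here is an assumption chosen for illustration (real implementations offer several layouts), and the function names are hypothetical.

```python
def parity_drive(row, num_drives):
    """Drive holding the parity block for a given row (assumed rotation rule)."""
    return (num_drives - 1 - row % num_drives) % num_drives


def raid5_layout(num_drives, num_rows):
    """Return a grid of labels: 'Dk' for data block k, 'Pr' for the parity of row r."""
    if num_drives < 3:
        raise ValueError("RAID-5 needs at least three drives")
    layout = []
    block = 0
    for row in range(num_rows):
        p = parity_drive(row, num_drives)
        cells = []
        for drive in range(num_drives):
            if drive == p:
                cells.append(f"P{row}")
            else:
                cells.append(f"D{block}")
                block += 1
        layout.append(cells)
    return layout


if __name__ == "__main__":
    for cells in raid5_layout(num_drives=4, num_rows=4):
        print("  ".join(f"{cell:>3}" for cell in cells))
```

With four drives and four rows, each drive ends up holding exactly one parity block, so parity updates are no longer concentrated on a single drive as in level 4.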
Given that the usable drive space in a cluster of n drives is equal to n-1 drives, it is best to have a large number of drives in order to make RAID-5 "cost-effective".
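The effect of the number of drives on this n-1 rule can be tabulated with a short calculation. `raid5_usable_fraction` is a hypothetical helper, and identical drive sizes are assumed.

```python
def raid5_usable_fraction(num_drives):
    """Share of the raw capacity available for data: (n - 1) / n."""
    if num_drives < 3:
        raise ValueError("RAID-5 needs at least three drives")
    return (num_drives - 1) / num_drives


if __name__ == "__main__":
    for n in (3, 4, 8):
        print(n, f"{raid5_usable_fraction(n):.1%}")
    # 3 66.7%
    # 4 75.0%
    # 8 87.5%
```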
Level 6 was added to the levels defined by the Berkeley researchers. It defines the use of two parity functions, and therefore their storage on two dedicated drives. This level preserves redundancy even if two drives fail simultaneously. It also means that at least 4 drives are needed to implement a RAID-6 system.
The RAID solutions that are generally used are levels 1 and 5.
Choosing a RAID solution depends on three criteria:
There are several different ways to implement a RAID solution on a server: