RAID levels: what are they and how do they work?
It improves system performance and increases the safety of stored data: it’s RAID. Not the famous anti-insect spray, but the acronym of Redundant Array of Inexpensive Disks (today more often read as Redundant Array of Independent Disks). It’s a method of managing data securely by using two or more hard disks in parallel.
By spreading or duplicating data across multiple disks, I/O (input/output) operations can overlap, which improves the performance of the system.
There are different RAID levels, each optimized for a specific situation.
RAID was first described by David Patterson, Garth Gibson and Randy Katz in the paper “A Case for Redundant Arrays of Inexpensive Disks”, published in 1988.
The intention of the three computer scientists was to demonstrate that using two or more older, cheap hard drives could be more cost-effective, faster and safer than using a single hard drive. Apparently, Patterson, Gibson and Katz were absolutely right.
How does RAID work?
How performance, security and tolerance against possible failures are handled varies depending on the strategy chosen. So, to understand which RAID configuration is best suited to your needs, it helps to know how many types of RAID exist and what they are.
There are the basic RAID types, ranging from level 0 to level 7 (levels 0 to 6 are the standard ones, while RAID 7 is a proprietary design), and the nested RAID types. The nested types are nothing more than basic RAID types combined with one another, so they exploit the characteristics of each.
Below is an overview of the most popular RAID levels (home and business): RAID 0, RAID 1, RAID 5, RAID 6 and RAID 10.
RAID level 0 (Disk striping)
RAID 0, sometimes called striping, is the first and most basic level of RAID. The data to be saved are divided into blocks of equal size (the striping unit) and allocated sequentially across the various disks that make up the RAID system.
In this case data redundancy isn’t provided, only an improvement in system performance, given that it’s possible to read from and write to multiple disks at the same time. RAID 0 is used when you need to create a small number of virtual disks from a large number of physical hard disks. It’s useful, therefore, when you need to create NFS servers in one location and when data redundancy is irrelevant.
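The round-robin allocation described above can be illustrated with a minimal Python sketch. The function names `stripe` and `unstripe`, and the tiny 2-byte striping unit, are assumptions made for the example; real controllers use much larger units and work on raw blocks, not Python lists.

```python
def stripe(data: bytes, num_disks: int, unit: int) -> list[list[bytes]]:
    """Cut data into striping units and deal them to the disks in turn."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), unit):
        unit_index = offset // unit
        disks[unit_index % num_disks].append(data[offset:offset + unit])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Read the units back in the same round-robin order."""
    out = []
    for row in range(max(len(d) for d in disks)):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


# Hypothetical example: 10 bytes, 3 disks, 2-byte striping unit.
layout = stripe(b"ABCDEFGHIJ", num_disks=3, unit=2)
print(layout)            # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF']]
print(unstripe(layout))  # b'ABCDEFGHIJ'
```

Because consecutive units land on different disks, a large read or write can keep all disks busy at once, but no unit exists in more than one place.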
Advantages of RAID level 0
RAID 0 offers great performance in both read and write operations, because there is no overhead caused by parity controls. All storage capacity is used and the technology is easy to implement.
However, it’s not a true RAID because level 0 offers no protection against failure: if one disk breaks, all the data in the array are immediately lost. Moreover, the probability of losing the array grows with the number of disks used, since the failure of any single disk is enough.
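To get a feel for how the risk grows with the number of disks, here is a small sketch. It assumes that disks fail independently and with the same probability, which is a simplification, and the 3% figure is a made-up illustration, not a measured failure rate.

```python
def array_failure_probability(p_disk: float, num_disks: int) -> float:
    """Chance that at least one disk fails, i.e. that a RAID 0 array is lost.

    Assumes independent failures with the same probability p_disk per disk.
    """
    return 1.0 - (1.0 - p_disk) ** num_disks


# Hypothetical per-disk failure probability of 3% over some period.
for n in (1, 2, 4, 8):
    print(n, round(array_failure_probability(0.03, n), 4))
```

For small per-disk probabilities the result is close to the number of disks times the single-disk probability, which is why the risk is often described as roughly proportional to the disk count.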
RAID 0 is ideal for non-critical storage of data that have to be read/written at a high speed, such as on an image retouching or video editing station.
If you want to use RAID 0 purely to combine the storage capacity of two drives in a single volume, consider mounting one drive in the folder path of the other drive. This is supported in Linux, OS X as well as Windows and has the advantage that a single drive failure has no impact on the data of the second disk or SSD drive.
RAID level 1 (Mirroring)
RAID 1, or mirrored data array, creates an exact copy (mirror) of all data from one disk on a second, “support” disk. In this case, redundancy is preferred to performance.
Data are stored twice by writing them to both the data drive (or set of data drives) and a mirror drive (or set of drives). If a drive fails, the controller uses either the data drive or the mirror drive for data recovery and continues operation. You need at least 2 drives for a RAID 1 array.
RAID 1 offers excellent read speed and a write speed that is comparable to that of a single drive. In case a drive fails, data don’t have to be rebuilt; they just have to be copied to the replacement drive. RAID 1 is a very simple technology.
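The mirroring behaviour can be sketched as a toy model in Python. The class `MirrorPair` and its methods are hypothetical names for this example; each "drive" is just a dictionary from block number to bytes, so this only illustrates the logic, not a real driver.

```python
class MirrorPair:
    """Toy model of a two-drive RAID 1 set."""

    def __init__(self) -> None:
        self.drives = [{}, {}]        # block number -> bytes, one dict per drive
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to all healthy drives.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy drive can serve the read.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block]
        raise IOError("both drives have failed")

    def fail(self, i: int) -> None:
        self.failed[i] = True
        self.drives[i] = {}

    def replace(self, i: int) -> None:
        # No parity maths needed: copy the surviving mirror onto the new drive.
        if self.failed[1 - i]:
            raise IOError("no surviving mirror to copy from")
        self.drives[i] = dict(self.drives[1 - i])
        self.failed[i] = False


pair = MirrorPair()
pair.write(0, b"ledger")
pair.fail(0)
print(pair.read(0))   # b'ledger', served by the mirror
pair.replace(0)       # plain copy onto the replacement drive
```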
The main disadvantage is that the effective storage capacity is only half of the total drive capacity because all data get written twice.
Software RAID 1 solutions don’t always allow a hot swap of a failed drive. That means the failed drive can only be replaced after powering down the computer it is attached to. For servers that are used simultaneously by many people, this may not be acceptable. Such systems typically use hardware controllers that do support hot swapping.
RAID 1 is ideal for mission critical storage, for instance for accounting systems. It’s also suitable for small servers in which only two data drives will be used.
RAID level 5
RAID level 5 can be considered, to all intents and purposes, the most common secure RAID level. It requires at least 3 drives but can work with up to 16. Data blocks are striped across the drives and, for each stripe, a parity checksum of the block data is written to one of the drives. The parity data aren’t written to a fixed drive; they are spread across all drives, as the drawing below shows. Using the parity data, the computer can recalculate the data of one of the other data blocks, should those data no longer be available. That means a RAID 5 array can withstand a single drive failure without losing data or access to data. Although RAID 5 can be achieved in software, a hardware controller is recommended. Often extra cache memory is used on these controllers to improve the write performance.
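The recalculation from parity can be shown in a few lines of Python. RAID 5 parity is a bytewise XOR of the data blocks in a stripe; the sample block contents below are arbitrary, and the rotation rule in `parity_drive` is only one possible layout, chosen here as an assumption for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive(stripe_index: int, num_drives: int) -> int:
    """One possible rotation: parity moves to a different drive each stripe."""
    return (num_drives - 1 - stripe_index) % num_drives


# One stripe on a hypothetical 4-drive array: three data blocks plus parity.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails: XOR of the survivors restores it.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])                        # True
print([parity_drive(s, 4) for s in range(5)])      # [3, 2, 1, 0, 3]
```

The same XOR that produces the parity also recovers a missing block, which is why exactly one lost block per stripe can be tolerated and a second loss cannot.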
Read data transactions are very fast while write data transactions are somewhat slower (due to the parity that has to be calculated).
If a drive fails, you still have access to all data, even while the failed drive is being replaced and the storage controller rebuilds the data on the new drive.
Drive failures have an effect on throughput, although this is still acceptable. RAID 5 is also a complex technology: if one of the disks in an array using 4TB disks fails and is replaced, restoring the data (the rebuild time) may take a day or longer, depending on the load on the array and the speed of the controller. If another disk goes bad during that time, data are lost forever.
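As a rough back-of-the-envelope check of that rebuild time, the sketch below divides the disk capacity by an effective rebuild rate. The 50 MB/s rate is purely an assumed figure for an array that is still serving users; real rates depend on the controller, the disks and the load. Decimal units are used (1 TB = 10^12 bytes, 1 MB = 10^6 bytes).

```python
def rebuild_hours(disk_capacity_tb: float, rebuild_rate_mb_s: float) -> float:
    """Hours needed to rewrite one whole replacement disk at a steady rate."""
    capacity_bytes = disk_capacity_tb * 1e12
    rate_bytes_per_s = rebuild_rate_mb_s * 1e6
    return capacity_bytes / rate_bytes_per_s / 3600


# Hypothetical: one 4 TB disk rebuilt at an effective 50 MB/s.
print(round(rebuild_hours(4, 50), 1))   # 22.2
```

Halving the effective rate doubles the time, so a busy array can easily stay exposed to a second failure for more than a day.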
RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance. It’s ideal for file and application servers that have a limited number of data drives.
RAID level 6 (Striping with double parity)
RAID 6 is like RAID 5 but the parity data are written to two drives. In this way it’s possible to withstand the simultaneous failure of two disks, unlike RAID level 5 which tolerates at most one.
In RAID level 6 the minimum number of disks goes up to four, while the usable capacity is equal to that of the smallest disk multiplied by the total number of disks minus two.
For example, if you build a RAID level 6 with four disks of 2 terabytes each, the usable capacity will be 2 TB x (4 – 2) = 2 TB x 2 = 4 terabytes.
Unfortunately, just like in RAID level 5, in RAID level 6 the failure of a single disk affects the overall performance of the entire RAID system. Therefore, even in this case, restoring the entire RAID structure can take quite a long time.
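The capacity rule for RAID 6 can be written as a small helper. The function name `raid6_capacity_tb` is a hypothetical one used only for this sketch.

```python
def raid6_capacity_tb(disk_sizes_tb: list[float]) -> float:
    """Usable capacity: smallest disk times (number of disks minus two)."""
    if len(disk_sizes_tb) < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return min(disk_sizes_tb) * (len(disk_sizes_tb) - 2)


print(raid6_capacity_tb([2, 2, 2, 2]))   # 4
print(raid6_capacity_tb([2, 2, 2, 4]))   # 4, the larger disk is only partly used
```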
RAID 6 is a good all-round system that combines efficient storage with excellent security and decent performance. It’s preferable over RAID 5 in file and application servers that use many large drives for data storage.
RAID level 10
RAID 10 combines RAID 1 and RAID 0: the disks are grouped into mirrored sets (RAID 1) and the data are striped across those sets (RAID 0). Thanks to this structure, RAID level 10 offers very high performance, which depends on the number of branches in the RAID 0 layer, together with an equally high level of security, which comes from the mirroring in the RAID 1 layer.
It’s therefore suitable for all those applications that require high performance and, at the same time, fault tolerance. RAID level 10 needs at least four disks, and its usable capacity is equal to that of the smallest disk multiplied by the total number of disks, all divided by two.
If, for example, you build a RAID level 10 with four disks of 2 TB each, the usable capacity will be (2 TB x 4 disks) / 2 = 8 TB / 2 = 4 TB.
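The same calculation for RAID 10 as a short sketch. The function name `raid10_capacity_tb` is hypothetical, and the check for an even number of disks is an assumption that the array is built from two-way mirrored pairs.

```python
def raid10_capacity_tb(disk_sizes_tb: list[float]) -> float:
    """Usable capacity: smallest disk times number of disks, divided by two."""
    n = len(disk_sizes_tb)
    if n < 4:
        raise ValueError("RAID 10 needs at least four disks")
    if n % 2:
        raise ValueError("this sketch assumes two-way mirrors, so an even disk count")
    return min(disk_sizes_tb) * n / 2


print(raid10_capacity_tb([2, 2, 2, 2]))   # 4.0
```

With the same four 2 TB disks, RAID 6 and RAID 10 end up with the same 4 TB of usable space; the difference between them lies in performance and in which combinations of failures they survive.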
RAID 10 is the ideal choice for database management solutions that need to read and write a large number of small files on a volume’s disks. The exceptional levels of IOPS and data protection offered by the RAID 10 level guarantee excellent reliability of database management solutions in terms of both file protection and access speed.