RAID Technology Concepts


RAID TECHNOLOGY



There are several key concepts in RAID:
1. Mirroring (copying data to more than one hard disk)
2. Striping (splitting data across more than one hard disk)
3. Error correction, where redundant data is stored so that errors and problems can be
detected and possibly corrected (more commonly referred to as fault tolerance techniques).

The main purpose of using RAID is to increase reliability, which is critical for protecting
important information in many business fields, such as databases, or to improve performance,
which matters for workloads that serve many users at once, such as presenting video on
demand to a large audience.

HOW RAID WORKS:


The year was 1987. The size of a disk drive was measured in megabytes, not gigs or
terabytes. An 80-MB drive was considered a bit of a luxury.

A team of computer scientists at the University of California at Berkeley had just
released a paper suggesting it might be a good idea to string together a number of
average-size disk drives to derive larger capacities that would otherwise require high-end
and very expensive equipment.
For instance, since there was no such thing as an 800-MB disk drive, why not stack up
ten 80-MB drives in a rack and design a special disk controller that would make the whole
thing look as if it were a single device with 10 times the capacity?

That, in fact, wasn't rocket science. The technology was sufficiently advanced to make
this happen without too much trouble.

The real problem was the greater probability of hardware failure. Indeed, every disk
drive has a certain probability of failure. To use simplistic numbers, let's say that a typical disk
drive has a 1% chance of failing within the first year. If you group 10 of these in an array and
the drives fail independently, the chance that at least one of them fails within a year is
roughly 10 times higher, or close to a 10% probability of failure.
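
The "10 times the odds" figure is an approximation. A minimal sketch of the exact calculation, assuming drives fail independently and using the simplistic 1% figure from the text (the function name is made up for illustration):

```python
def array_failure_probability(p_single, n_drives):
    """Probability that at least one of n independent drives fails."""
    # The array survives only if every single drive survives.
    p_all_survive = (1.0 - p_single) ** n_drives
    return 1.0 - p_all_survive


# Ten drives, each with a 1% chance of failing within the year.
print(f"{array_failure_probability(0.01, 10):.4f}")  # 0.0956
```

The exact value is slightly below 10% because the simple multiplication counts the rare cases where several drives fail in the same year more than once.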

RAID LEVELS:
RAID 0:
Also known as a stripe set or striped volume, it splits ("stripes") data evenly across two
or more disks, without parity information, redundancy, or fault tolerance. Because the data is
striped across all disks, the failure of one drive causes the entire array to fail, resulting in
total data loss. This configuration is typically implemented with speed as the intended goal:
RAID 0 is normally used to increase performance, although it can also be used to create a large
logical volume out of two or more physical disks.
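
As a rough illustration of striping, the sketch below maps a logical block number to a disk and an offset on that disk. It assumes a simple round-robin layout with one block per stripe unit; real implementations use configurable stripe sizes, and the function name is hypothetical.

```python
def locate_block(logical_block, n_disks):
    """Return (disk index, block offset on that disk) for a logical block.

    Assumes round-robin striping: block 0 on disk 0, block 1 on disk 1, ...
    """
    disk = logical_block % n_disks
    offset = logical_block // n_disks
    return disk, offset


# Consecutive logical blocks land on alternating disks of a two-disk set.
for block in range(4):
    print(block, locate_block(block, 2))
```

Because neighbouring blocks sit on different disks, a long sequential transfer can keep all disks busy at once, which is where the speed gain comes from.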
A RAID 0 setup can be created with disks of differing sizes, but the storage space added
to the array by each disk is limited to the size of the smallest disk. For example, if a 120 GB
disk is striped together with a 320 GB disk, the size of the array will be 120 GB × 2 = 240 GB.
However, some RAID implementations allow the remaining 200 GB to be used for other
purposes.
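
The capacity rule above can be sketched as follows (sizes in GB; the function name is made up for illustration):

```python
def raid0_capacity_gb(disk_sizes_gb):
    """Return (usable array size, space left unused by the array) in GB."""
    smallest = min(disk_sizes_gb)
    # Every member contributes only as much space as the smallest disk.
    usable = smallest * len(disk_sizes_gb)
    leftover = sum(disk_sizes_gb) - usable
    return usable, leftover


print(raid0_capacity_gb([120, 320]))  # (240, 200)
```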
RAID 1:
It consists of an exact copy (or mirror) of a set of data on two or more disks; a classic
RAID 1 mirrored pair contains two disks. This configuration offers no parity, striping, or
spanning of disk space across multiple disks, since the data is mirrored on all disks belonging
to the array, and the array can only be as big as the smallest member disk. This layout is useful
when read performance or reliability is more important than write performance or the
resulting data storage capacity.
The array will continue to operate so long as at least one member drive is operational.
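
A minimal in-memory sketch of the mirroring behaviour described above. The `Mirror` class and its methods are hypothetical and only model the logic: every write goes to all working members, and a read can be served by any working member.

```python
class Mirror:
    """Toy RAID 1 array: each member disk is modelled as a dict of blocks."""

    def __init__(self, n_disks):
        self.disks = [dict() for _ in range(n_disks)]
        self.failed = set()

    def fail(self, disk_index):
        """Mark one member disk as failed."""
        self.failed.add(disk_index)

    def write(self, block, data):
        # A write must reach every working member, which is why write
        # performance does not improve with more disks.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block):
        # Any working member holds a full copy of the data.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirror members have failed")


array = Mirror(2)
array.write(0, b"payload")
array.fail(0)
print(array.read(0))  # still readable from the surviving member
```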

RAID 2:
Rarely used in practice, it stripes data at the bit (rather than block) level, and
uses a Hamming code for error correction. The disks are synchronized by the controller to
spin at the same angular orientation (they reach index at the same time), so it generally
cannot service multiple requests simultaneously. However, with a high-rate Hamming code,
many spindles would operate in parallel to transfer data simultaneously, so that "very high
data transfer rates" are possible, as for example in the DataVault, where 32 data bits were
transmitted simultaneously.
With all hard disk drives implementing internal error correction, the complexity of an
external Hamming code offered little advantage over parity, so RAID 2 has been rarely
implemented; it is the only original level of RAID that is not currently used.
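
To give a feel for how a Hamming code locates and corrects a single bad bit, here is a small sketch using the Hamming(7,4) code. The text does not say which Hamming code RAID 2 implementations used; (7,4) is chosen here only because it is the smallest common example. Each of the seven bit positions can be thought of as one disk in a bit-striped set.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into 7 bits (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code_bits):
    """Return (corrected bits, position of the flipped bit or 0 if none)."""
    syndrome = 0
    for position, bit in enumerate(code_bits, start=1):
        if bit:
            syndrome ^= position
    fixed = list(code_bits)
    if syndrome:
        # A non-zero syndrome is the 1-based position of the bad bit.
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome


def hamming74_data(code_bits):
    """Extract the 4 data bits from a 7-bit code word."""
    return [code_bits[2], code_bits[4], code_bits[5], code_bits[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a wrong bit coming from the "disk" at position 5
fixed, bad_position = hamming74_correct(word)
print(bad_position, hamming74_data(fixed))  # 5 [1, 0, 1, 1]
```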

RAID 3:
Rarely used in practice, it consists of byte-level striping with a dedicated parity
disk. One of the characteristics of RAID 3 is that it generally cannot service multiple
requests simultaneously. This happens because any single block of data will, by definition,
be spread across all members of the set and will reside in the same physical location on
each disk. Therefore, any I/O operation requires activity on every disk and usually
requires synchronized spindles.
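
The layout can be sketched in a few lines, assuming XOR parity (the usual choice for a single parity disk) and zero padding so that every disk receives the same number of bytes. The function names are hypothetical.

```python
from functools import reduce


def xor_columns(blocks):
    """XOR equally sized byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_bytes(data, n_data_disks):
    """Spread data byte by byte over the data disks and build the parity disk."""
    padding = (-len(data)) % n_data_disks
    padded = data + bytes(padding)
    # Byte i goes to disk i % n_data_disks.
    disks = [padded[i::n_data_disks] for i in range(n_data_disks)]
    return disks, xor_columns(disks)


def rebuild_disk(disks, parity, lost_index):
    """Recompute the contents of one lost data disk from the survivors."""
    survivors = [d for i, d in enumerate(disks) if i != lost_index]
    return xor_columns(survivors + [parity])


disks, parity = stripe_bytes(b"RAID3 demo", 3)
print(rebuild_disk(disks, parity, 1) == disks[1])  # True
```

Note how even a ten-byte request touches every disk, which is the reason a RAID 3 set cannot serve unrelated requests in parallel.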
This makes it suitable for applications that demand the highest transfer rates in long
sequential reads and writes, for example uncompressed video editing. Applications that make
small reads and writes from random disk locations will get the worst performance out of this
level.
The requirement that all disks spin synchronously (in a lockstep) added design
considerations to a level that provided no significant advantages over other RAID levels, so it
quickly became useless and is now obsolete. Both RAID 3 and RAID 4 were quickly replaced
by RAID 5. RAID 3 was usually implemented in hardware, and the performance issues were
addressed by using large disk caches.
RAID 4:
Consists of block-level striping with a dedicated parity disk. As a result of its layout,
RAID 4 provides good performance of random reads, while the performance of random writes
is low due to the need to write all parity data to a single disk.
In diagram 1, a read request for block A1 would be serviced by disk 0. A simultaneous
read request for block B1 would have to wait, because it resides on the same disk, but a read
request for B2 could be serviced concurrently by a different disk.
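
Since the diagram is not reproduced here, the sketch below assumes a four-disk set in which each stripe holds three data blocks on disks 0 to 2 (A1, A2, A3, then B1, B2, B3, and so on) and the last disk holds all parity. Under that assumption it shows which disk serves a block and why random writes pile up on the parity disk. Function names are hypothetical.

```python
from collections import Counter


def raid4_data_disk(logical_block, n_disks):
    """Disk holding a data block; the last disk is reserved for parity."""
    return logical_block % (n_disks - 1)


def raid4_write_load(logical_blocks, n_disks):
    """Count how many writes each disk receives for the given block writes."""
    parity_disk = n_disks - 1
    load = Counter()
    for block in logical_blocks:
        load[raid4_data_disk(block, n_disks)] += 1
        load[parity_disk] += 1  # every write also updates the parity disk
    return load


# Blocks 0, 3 and 4 correspond to A1, B1 and B2 in the assumed layout.
print(raid4_data_disk(0, 4), raid4_data_disk(3, 4), raid4_data_disk(4, 4))
print(raid4_write_load(range(6), 4))
```

In this model A1 and B1 share disk 0 while B2 sits on disk 1, and six block writes produce six writes on the parity disk but only two on each data disk.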

RAID 5:
Consists of block-level striping with distributed parity. Unlike in RAID 4, parity
information is distributed among the drives. It requires that all drives but one be present to
operate. Upon failure of a single drive, subsequent reads can be calculated from the
distributed parity such that no data is lost. RAID 5 requires at least three disks.
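
A minimal sketch of distributed parity and of a read after a single drive failure. XOR parity is assumed, and the rotation rule in `parity_disk` is only one possible layout; the text does not specify which rotation a given implementation uses. All names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe, n_disks):
    """Assumed rotation: parity moves one disk to the left on each stripe."""
    return (n_disks - 1 - stripe) % n_disks


def build_stripe(stripe, data_blocks, n_disks):
    """Return the stripe as a list indexed by disk, parity included."""
    row = list(data_blocks)
    row.insert(parity_disk(stripe, n_disks), xor_blocks(data_blocks))
    return row


def read_failed_member(row, failed_disk):
    """Recompute the block of the failed disk from all surviving blocks."""
    survivors = [block for i, block in enumerate(row) if i != failed_disk]
    return xor_blocks(survivors)


row = build_stripe(1, [b"AAAA", b"BBBB"], 3)  # parity lands on disk 1
print(read_failed_member(row, 2))  # b'BBBB'
```

The same XOR recovers either a data block or the parity block, which is why the array keeps working with any one drive missing.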
In comparison to RAID 4, RAID 5's distributed parity evens out the stress of a dedicated
parity disk among all RAID members. Additionally, write performance is increased since all
RAID members participate in the serving of write requests. Although it won't be as efficient
as a striping (RAID 0) setup, because parity must still be written, this is no longer a bottleneck.
Since parity calculation is performed on the full stripe, small changes to the array
experience write amplification: in the worst case, when a single logical sector is to be written,
the original sector and the corresponding parity sector need to be read, the original data is
removed from the parity, the new data is calculated into the parity, and both the new data
sector and the new parity sector are written.
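
The read-modify-write sequence just described can be sketched as follows, assuming XOR parity. "Removing" old data from the parity and "calculating in" the new data are both XOR operations. The function names are made up for illustration.

```python
def xor_bytes(a, b):
    """XOR two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data, old_parity, new_data):
    """Return the (data, parity) sectors to write for a single-sector update.

    Costs two reads (old data, old parity) and two writes (new data,
    new parity), regardless of how many disks are in the stripe.
    """
    parity_without_old = xor_bytes(old_parity, old_data)
    new_parity = xor_bytes(parity_without_old, new_data)
    return new_data, new_parity


# Three data sectors and their parity.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
parity = xor_bytes(xor_bytes(d0, d1), d2)

# Update only d1 without reading d0 or d2.
d1_new, parity_new = small_write(d1, parity, b"\xaa\xbb")
print(parity_new == xor_bytes(xor_bytes(d0, d1_new), d2))  # True
```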

RAID 6:
Extends RAID 5 by adding another parity block; thus, it uses block-level striping with
two parity blocks distributed across all member disks.
According to the Storage Networking Industry Association (SNIA), the definition of
RAID 6 is: "Any form of RAID that can continue to execute read and write requests to all of a
RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods,
including dual check data computations (parity and Reed-Solomon), orthogonal dual parity
check data and diagonal parity, have been used to implement RAID Level 6."
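
One of the methods named in the definition, parity plus Reed-Solomon, can be sketched as below. This is an illustration under stated assumptions, not a description of any particular product: arithmetic is done in GF(2^8) with the reduction polynomial 0x11D and generator 2, a commonly used choice, and all function names are hypothetical. P is plain XOR parity; Q weights data block i by 2^i.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # In GF(2^8), a**255 == 1 for non-zero a, so a**254 is the inverse.
    return gf_pow(a, 254)


def accumulate_pq(blocks, p, q, skip=()):
    """XOR the P and Q contributions of the given data blocks into p and q."""
    for i, block in enumerate(blocks):
        if i in skip:
            continue
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return p, q


def compute_pq(data_blocks):
    length = len(data_blocks[0])
    p, q = accumulate_pq(data_blocks, bytearray(length), bytearray(length))
    return bytes(p), bytes(q)


def recover_two(data_blocks, p, q, x, y):
    """Rebuild lost data blocks x and y (entries x and y are ignored)."""
    # Remove the survivors' contributions, leaving only those of x and y.
    p_xy, q_xy = accumulate_pq(data_blocks, bytearray(p), bytearray(q), (x, y))
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx = bytes(gf_mul(inv, qb ^ gf_mul(gy, pb)) for pb, qb in zip(p_xy, q_xy))
    dy = bytes(pb ^ xb for pb, xb in zip(p_xy, dx))
    return dx, dy


data = [b"disk0", b"disk1", b"disk2", b"disk3"]
p, q = compute_pq(data)
damaged = [data[0], None, data[2], None]
print(recover_two(damaged, p, q, 1, 3))
```

Losing one data disk plus P, or one data disk plus Q, is handled by simpler formulas that are omitted here to keep the sketch short.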

APPLICATIONS:
RAID, which stands for Redundant Array of Inexpensive Disks, is a technology that uses
two or more hard disk drives simultaneously to achieve greater levels of
performance, reliability, and/or larger data volume sizes.
This provides good performance for video imaging, geophysics, life sciences, or other
sequential processing applications.
