The shift in focus from processor-centric to data-centric applications is driving the increased importance of data storage systems. At the same time, the limited throughput and fault tolerance typical of such systems have always been pressing problems in need of a solution.

In the modern computer industry, magnetic disks are widely used as a secondary data storage system, because, despite all their shortcomings, they have the best characteristics for the corresponding type of device at an affordable price.

The way magnetic disks are built has led to a significant gap between the performance growth of processor modules and that of the disks themselves. In 1990 the best production drives were 5.25″ models with an average access time of 12 ms and a latency of 5 ms (at a spindle speed of about 5,000 rpm 1); today the lead belongs to 3.5″ drives with an average access time of 5 ms and a latency of 1 ms (at 10,000 rpm). That is an improvement of roughly 100%, while processor performance over the same period grew by more than 2,000%. Such growth was possible largely because processors benefit directly from VLSI (Very Large Scale Integration): it allows not only higher clock frequencies but also more components to be integrated on a chip, which in turn enables architectural features that support parallel computation.

1 - Average data.

The current situation can be characterized as a secondary storage system I/O crisis.

Increasing performance

The impossibility of significantly increasing the technological parameters of magnetic disks entails the need to search for other ways, one of which is parallel processing.

If a block of data is spread across N disks of an array and laid out so that the pieces can be read simultaneously, the block can be read N times faster (not counting the time needed to assemble it). Since all the data is transferred in parallel, this architectural solution is called a parallel-access array.

Parallel arrays are typically used for applications that require large data transfers.

Some workloads, by contrast, consist of a large number of small requests; database processing is a typical example. By distributing database records across the disks of the array, the load can be spread out, with each disk positioning independently. This architecture is usually called an independent-access array.

Increasing fault tolerance

Unfortunately, as the number of disks in an array grows, the reliability of the array as a whole drops. Assuming independent failures and an exponential distribution of time between failures, the mean time to failure of the entire array is MTTF(array) = MTTF(disk) / N, where MTTF(disk) is the mean time to failure of one disk and N is the number of disks.
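As a small illustration of this relationship, the sketch below applies the formula; the per-disk MTTF and the array size are made-up numbers, not measurements:

```python
def array_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    """MTTF of an array of independent, identical disks with exponential
    failure times: MTTF(array) = MTTF(disk) / N."""
    return disk_mttf_hours / n_disks

# Hypothetical figures: 500,000-hour disks in a 10-disk array.
print(array_mttf(500_000, 10))  # 50000.0 hours, i.e. roughly 5.7 years
```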

Thus, there is a need to increase the fault tolerance of disk arrays. To increase the fault tolerance of arrays, redundant coding is used. There are two main types of encoding that are used in redundant disk arrays - duplication and parity.

Duplication, or mirroring, is most often used in disk arrays. Simple mirror systems use two copies of data, each copy located on separate disks. This scheme is quite simple and does not require additional hardware costs, but it has one significant drawback - it uses 50% of the disk space to store a copy of information.

The second way to implement redundant disk arrays is redundant encoding based on parity. Parity is computed by XOR-ing all the symbols of the data word. Using parity in redundant disk arrays reduces the overhead to HP = 1/N, where HP is the overhead fraction and N is the number of disks in the array.

History and development of RAID

Although storage systems based on magnetic disks have been produced for 40 years, mass production of fault-tolerant systems began only recently. Redundant disk arrays, commonly called RAID (redundant arrays of inexpensive disks), were introduced by the researchers Patterson, Gibson and Katz at the University of California, Berkeley in 1987. But RAID systems became widespread only when disks suitable for use in redundant arrays became available and sufficiently fast. Since the original RAID paper in 1988, research into redundant disk arrays has boomed in an attempt to cover a wide range of cost-performance-reliability trade-offs.

The RAID acronym itself has a curious history. At the time the original paper was written, all disks used in PCs were called inexpensive disks, as opposed to the expensive disks used in mainframes. For RAID arrays, however, rather expensive hardware was needed compared with a typical PC configuration, so RAID was later reinterpreted as redundant array of independent disks 2.

2 - Definition of RAID Advisory Board

RAID 0 was introduced by the industry as the definition of a non-fault-tolerant disk array. Berkeley defined RAID 1 as a mirrored disk array. RAID 2 is reserved for arrays that use Hamming code. RAID levels 3, 4, 5 use parity to protect data from single faults. It was these levels, including level 5, that were presented at Berkeley, and this RAID taxonomy was adopted as a de facto standard.

RAID levels 3, 4 and 5 are quite popular and use disk space efficiently, but they have one significant drawback: they tolerate only single faults. This matters especially with a large number of disks, when the probability of more than one device failing at the same time increases. In addition, they are characterized by long rebuild times, which also imposes some restrictions on their use.

Today, a fairly large number of architectures have been developed that ensure the operation of the array even with the simultaneous failure of any two disks without data loss. Among the whole set, it is worth noting two-dimensional parity and EVENODD, which use parity for encoding, and RAID 6, which uses Reed-Solomon encoding.

In a scheme using two-dimensional parity, each data block participates in the construction of two independent codewords. Thus, if a second disk fails within the same codeword, the other codeword is used to reconstruct the data.

The minimum redundancy in such an array is achieved with an equal number of columns and rows (the array is then arranged as a "square") and equals 2 x sqrt(N), where N is the number of data disks.

If the two-dimensional array is not organized as a "square," the redundancy of this scheme will be higher.

The EVENODD architecture has a fault tolerance scheme similar to two-dimensional parity but a different placement of information blocks, which guarantees minimal use of redundant capacity. As in two-dimensional parity, each data block participates in the construction of two independent codewords, but the words are placed so that the redundancy is constant (unlike the previous scheme) and equals two disks, the theoretical minimum for tolerating a double fault.

By using two check symbols, parity and a non-binary code, a data word can be protected against double faults. This design is known as RAID 6. The non-binary code, based on Reed-Solomon encoding, is usually computed with lookup tables or as an iterative process using linear feedback shift registers, a relatively complex operation that requires specialized hardware.
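For illustration, here is a minimal sketch of the common textbook P/Q construction behind this idea: P is ordinary XOR parity and Q is a Reed-Solomon-style syndrome over GF(2^8) with generator 2. The field representation, generator choice and one-byte "blocks" are illustrative assumptions; real RAID 6 hardware may use different encodings.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing polynomial x^8 + x^4 + x^3 + x^2 + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
    return result

def gf_pow2(i: int) -> int:
    """Compute g^i for the generator g = 2 in GF(2^8)."""
    x = 1
    for _ in range(i):
        x = gf_mul(x, 2)
    return x

def raid6_syndromes(data: list[int]) -> tuple[int, int]:
    """P = XOR parity of the data bytes, Q = sum of g^i * d_i in GF(2^8)."""
    p, q = 0, 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(d, gf_pow2(i))
    return p, q

# One byte per disk, four data disks: losing any two of the stored values
# still leaves a solvable system, which is what gives double-fault tolerance.
print(raid6_syndromes([0x11, 0x22, 0x33, 0x44]))
```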

Given that classic RAID variants, while providing sufficient fault tolerance for many applications, often show unacceptably low performance, researchers regularly propose techniques aimed at speeding RAID systems up.

In 1996, Savage and Wilkes proposed AFRAID (A Frequently Redundant Array of Independent Disks). This architecture trades some fault tolerance for performance. To compensate for the small-write problem typical of RAID level 5 arrays, striping can be left without parity calculation for a certain period of time: if the disk designated for a parity write is busy, the parity write is deferred. It has been shown theoretically that a 25% reduction in fault tolerance can increase performance by 97%. AFRAID effectively changes the failure model of single-fault-tolerant arrays, because a codeword whose parity has not yet been updated is vulnerable to disk failures.

Instead of sacrificing fault tolerance, you can use traditional performance techniques such as caching. Given that disk traffic is bursty, a write-back cache can hold data while the disks are busy, and if the cache is built from non-volatile memory, the data survives a power failure. In addition, deferred disk operations make it possible to combine small blocks into larger, more efficient disk operations.

There are also many architectures that sacrifice volume to increase performance. Among them are delayed modification on the log disk and various schemes for modifying the logical placement of data into the physical one, which allow you to distribute operations in the array more efficiently.

One such option is parity logging, which addresses the small-write problem and uses the disks more efficiently. Parity logging defers RAID 5 parity updates by recording them in a FIFO log located partly in the controller's memory and partly on disk. Given that accessing a full track is on average about 10 times more efficient than accessing a single sector, parity logging accumulates large amounts of modified parity, which is then written out together, a whole track at a time, to a disk dedicated to storing parity.

The floating data and parity architecture allows the physical placement of disk blocks to be reallocated. Free sectors are reserved on each cylinder to reduce rotational latency, and data and parity are placed into these free slots. To survive a power failure, the data and parity map must be kept in non-volatile memory; if the placement map is lost, all data in the array is lost.

Virtual striping is a floating data and parity architecture combined with a write-back cache, taking advantage of both techniques.

In addition, there are other ways to improve performance, such as offloading RAID operations to the drives. At one time Seagate built support for RAID operations into its Fibre Channel and SCSI drives, which made it possible to reduce the traffic between the central controller and the disks in RAID 5 systems. This was a fundamental innovation in RAID implementations, but the technology never took off, since certain features of the Fibre Channel and SCSI standards weaken the failure model of disk arrays.

For the same RAID 5, the TickerTAIP architecture was proposed. It works as follows: a central control mechanism, the originator node, receives user requests, chooses a processing algorithm and then hands the disk and parity work off to worker nodes. Each worker node handles a subset of the disks in the array. As in the Seagate model, worker nodes exchange data among themselves without involving the originator node. If a worker node fails, the disks it served become unavailable; but if the codeword is constructed so that each of its symbols is handled by a different worker node, the fault tolerance scheme matches RAID 5. To protect against failures of the originator node, it is duplicated, which yields an architecture tolerant to the failure of any of its nodes. For all its advantages, this architecture suffers from the "write hole" problem: an error arises when several users modify the same codeword simultaneously and a node fails.

We should also mention a fairly popular method for quickly rebuilding a RAID: using a spare disk. If one of the disks in the array fails, the RAID can be rebuilt onto the spare disk in place of the failed one. The main feature of this implementation is that the system returns to its previous, fault-tolerant state without external intervention. With a distributed sparing architecture, the logical blocks of the spare disk are physically distributed across all disks in the array, which removes the need to rebuild the array onto a single dedicated disk when one fails.

To avoid the recovery problem typical of classic RAID levels, an architecture called parity declustering has been proposed. It distributes the stripes of a logical array built from fewer, larger drives over a greater number of smaller physical drives, so that reconstruction work is spread across them. With this technique, the system's response time to requests during reconstruction improves by more than half, and the reconstruction time itself is significantly reduced.

Architecture of Basic RAID Levels

Now let's look at the architecture of the basic levels of RAID in more detail. Before considering, let's make some assumptions. To demonstrate the principles of constructing RAID systems, consider a set of N disks (for simplicity, we will assume that N is an even number), each of which consists of M blocks.

We will denote data blocks as D m,n, where m is the data block number and n is the number of the subblock within block D.

Disks can connect to either one or several data transfer channels. Using more channels increases system throughput.

RAID 0. Striped Disk Array without Fault Tolerance

It is a disk array in which data is divided into blocks, and each block is written (or read) to a separate disk. Thus, multiple I/O operations can be performed simultaneously.
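A minimal sketch of the underlying block-to-disk mapping is shown below; the disk count and the round-robin convention are illustrative assumptions, since controllers may differ in detail:

```python
def raid0_locate(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Return (disk_index, block_on_disk) for a simple round-robin stripe layout."""
    return logical_block % n_disks, logical_block // n_disks

# Consecutive logical blocks land on different disks, which is what allows
# several I/O operations to proceed in parallel.
for lb in range(8):
    print(lb, raid0_locate(lb, n_disks=4))
```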

Advantages:

  • highest performance for applications requiring intensive processing of I/O requests and large data volumes;
  • ease of implementation;
  • low cost per unit volume.

Flaws:

  • not a fault-tolerant solution;
  • The failure of one drive results in the loss of all data in the array.

RAID 1. Redundant disk array or mirroring

Mirroring is a traditional way to increase the reliability of a small disk array. In the simplest version, two disks are used, on which the same information is recorded, and if one of them fails, a duplicate of it remains, which continues to operate in the same mode.

Advantages:

  • ease of implementation;
  • ease of array recovery in case of failure (copying);
  • sufficiently high performance for applications with high request intensity.

Flaws:

  • high cost per unit volume - 100% redundancy;
  • low data transfer speed.

RAID 2. Fault-tolerant disk array using Hamming Code ECC.

The redundant coding used in RAID 2 is the Hamming code. The Hamming code allows single faults to be corrected and double faults to be detected. Today it is actively used for encoding data in ECC RAM and for encoding data on magnetic disks.

Here the example uses a fixed number of disks, since a general description would be cumbersome: the data word is 4 bits long, so the ECC code takes 3 bits.
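The 4-data-bit, 3-check-bit case mentioned above corresponds to the classic Hamming(7,4) code. The sketch below uses the textbook bit ordering; how a RAID 2 controller actually spreads these bits across disks is not shown here and is an assumption left out of the example.

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """[d1, d2, d3, d4] -> 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_syndrome(codeword: list[int]) -> int:
    """Return the 1-based position of a single flipped bit, or 0 if none."""
    p1, p2, d1, p3, d2, d3, d4 = codeword
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    return s1 + 2 * s2 + 4 * s3

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                       # flip one bit (position 5) to simulate a fault
print(hamming74_syndrome(word))    # 5: the faulty position, which can then be corrected
```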

Advantages:

  • fast error correction (“on the fly”);
  • very high data transfer speed for large volumes;
  • as the number of disks increases, overhead costs decrease;
  • quite simple implementation.

Flaws:

  • high cost with a small number of disks;
  • low request processing speed (not suitable for transaction-oriented systems).

RAID 3. Fault-tolerant array with parallel data transfer and parity (Parallel Transfer Disks with Parity)

Data is divided into subblocks at the byte level and written simultaneously to all disks in the array except one, which is dedicated to parity. Using RAID 3 solves the problem of the high redundancy of RAID 2: most of the check disks used in RAID level 2 are needed only to locate the failed bit, but this is unnecessary, since most controllers can detect that a disk has failed from special signals, or from the additional encoding of the information written to the disk that is used to correct random errors.

Advantages:

  • very high data transfer speed;
  • disk failure has little effect on the speed of the array;

Flaws:

  • difficult implementation;
  • low performance with high intensity requests for small data.

RAID 4. Fault-tolerant array of independent disks with shared parity disk (Independent Data disks with shared Parity disk)

Data is broken down at the block level. Each block of data is written to a separate disk and can be read separately. Parity for a group of blocks is generated on write and checked on read. RAID Level 4 improves the performance of small data transfers through parallelism, allowing more than one I/O access to be performed simultaneously. The main difference between RAID 3 and 4 is that in the latter, data striping is performed at the sector level, rather than at the bit or byte level.

Advantages:

  • very high speed of reading large volumes of data;
  • high performance at high intensity of data reading requests;
  • low overhead to implement redundancy.

Flaws:

  • very low performance when writing data;
  • low speed of reading small data with single requests;
  • asymmetry of performance regarding reading and writing.

RAID 5. Fault-tolerant array of independent disks with distributed parity (Independent Data disks with distributed parity blocks)

This level is similar to RAID 4, but unlike the previous one, parity is distributed cyclically across all disks in the array. This change improves the performance of writing small amounts of data on multitasking systems. If write operations are planned properly, it is possible to process up to N/2 blocks in parallel, where N is the number of disks in the group.
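As a rough sketch of this distributed-parity idea, the snippet below rotates the parity block through the disks, stripe by stripe. The rotation direction and block numbering are illustrative assumptions; real controllers use a variety of layouts.

```python
def raid5_stripe_layout(stripe: int, n_disks: int) -> list[str]:
    """Describe each disk's role in one stripe: 'P' for parity, 'Dk' for data block k."""
    parity_disk = (n_disks - 1 - stripe) % n_disks   # parity moves one disk per stripe
    layout, data_index = [], 0
    for disk in range(n_disks):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{stripe * (n_disks - 1) + data_index}")
            data_index += 1
    return layout

for s in range(4):
    print(f"stripe {s}:", raid5_stripe_layout(s, n_disks=4))
```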

Advantages:

  • high data recording speed;
  • fairly high data reading speed;
  • high performance at high intensity of data read/write requests;
  • low overhead to implement redundancy.

Flaws:

  • Data reading speed is lower than in RAID 4;
  • low speed of reading/writing small data with single requests;
  • quite complex implementation;
  • complex data recovery.

RAID 6. Fault-tolerant array of independent disks with two independent distributed parity schemes (Independent Data disks with two independent distributed parity schemes)

Data is partitioned at the block level, similar to RAID 5, but in addition to the previous architecture, a second scheme is used to improve fault tolerance. This architecture is double fault tolerant. However, when performing a logical write, there are actually six disk accesses, which greatly increases the processing time of one request.

Advantages:

  • high fault tolerance;
  • fairly high speed of request processing;
  • relatively low overhead for implementing redundancy.

Flaws:

  • very complex implementation;
  • complex data recovery;
  • very low data writing speed.

Modern RAID controllers allow you to combine different RAID levels. In this way, it is possible to implement systems that combine the advantages of different levels, as well as systems with a large number of disks. Typically this is a combination of level zero (striping) and some fault-tolerant level.

RAID 10. Fault-tolerant array with duplication and parallel processing

This architecture is a RAID 0 array whose segments are RAID 1 arrays. It combines very high fault tolerance and performance.

Advantages:

  • high fault tolerance;
  • high performance.

Flaws:

  • very high cost;
  • limited scaling.

RAID 30. Fault-tolerant array with parallel data transfer and increased performance.

It is a RAID 0 array, the segments of which are RAID 3 arrays. It combines fault tolerance and high performance. Typically used for applications requiring large volumes of serial data transfer.

Advantages:

  • high fault tolerance;
  • high performance.

Flaws:

  • high price;
  • limited scaling.

RAID 50: Fault-tolerant array with distributed parity and increased performance

It is a RAID 0 array, the segments of which are RAID 5 arrays. It combines fault tolerance and high performance for applications with high request intensity and high data transfer rates.

Advantages:

  • high fault tolerance;
  • high data transfer speed;
  • high speed of request processing.

Flaws:

  • high price;
  • limited scaling.

RAID 7: Fault-tolerant array optimized for performance. (Optimized Asynchrony for High I/O Rates as well as High Data Transfer Rates). RAID 7® is a registered trademark of Storage Computer Corporation (SCC)

To understand the RAID 7 architecture, let's look at its features:

  1. All data transfer requests are processed asynchronously and independently.
  2. All read/write operations are cached via the high-speed x-bus.
  3. The parity disk can be placed on any channel.
  4. The microprocessor of the array controller runs a process-oriented real-time operating system.
  5. The system has good scalability: up to 12 host interfaces and up to 48 disks.
  6. The operating system controls the communication channels.
  7. Standard SCSI disks, buses, motherboards and memory modules are used.
  8. A high-speed X-bus is used to work with internal cache memory.
  9. The parity generation procedure is integrated into the cache.
  10. Disks attached to the system can be declared as separate.
  11. An SNMP agent can be used to manage and monitor the system.

Advantages:

  • high data transfer speed and high request processing speed (1.5 - 6 times higher than other standard RAID levels);
  • high scalability of host interfaces;
  • the data writing speed increases with the number of disks in the array;
  • There is no need for additional data transmission to calculate parity.

Flaws:

  • property of one manufacturer;
  • very high cost per unit volume;
  • short warranty period;
  • cannot be serviced by the user;
  • you need to use an uninterruptible power supply to prevent data loss from the cache memory.

Let's now compare the standard levels with one another. The comparison covers the architectures listed in the table below.

RAID | Minimum disks | Disk requirement | Fault tolerance | Data transfer rate | Request processing rate | Practical use
0 | 2 | N | no | very high, up to N x single disk | very high, up to N x single disk | graphics, video
1 | 2 | 2N* | yes | read > single disk; write = single disk | read up to 2 x single disk; write = single disk | small file servers
2 | 7 | 2N | yes | ~RAID 3 | low | mainframes
3 | 3 | N+1 | yes | very high | low | graphics, video
4 | 3 | N+1 | yes | read high; write low | read ~RAID 0; write low | file servers
5 | 3 | N+1 | yes | read high; write low | read ~RAID 0; write low | database servers
6 | 4 | N+2 | highest | low | read > single disk; write low | used extremely rarely
7 | 12 | N+1 | yes | highest | highest | different types of applications

Clarifications:

  • * - the commonly used option is considered;
  • k - number of subsegments;
  • R - read;
  • W - write.

Some aspects of implementing RAID systems

Let's consider three main options for implementing RAID systems:

  • software (software-based);
  • hardware - bus-based;
  • hardware - autonomous subsystem (subsystem-based).

It is impossible to say unequivocally that any implementation is better than another. Each option for organizing an array satisfies one or another user’s needs, depending on financial capabilities, the number of users and the applications used.

Each of the above implementations is based on the execution of program code. They actually differ in where this code is executed: in the computer’s central processor (software implementation) or in a specialized processor on a RAID controller (hardware implementation).

The main advantage of software implementation is low cost. But at the same time, it has many disadvantages: low performance, load on the central processor with additional work, and increased bus traffic. Simple RAID levels 0 and 1 are usually implemented in software, since they do not require significant computation. Taking these features into account, software-based RAID systems are used in entry-level servers.

Hardware RAID implementations accordingly cost more than software ones, since they use additional hardware to perform I/O operations. At the same time, they unload or free up the central processor and system bus and, accordingly, allow for increased performance.

Bus-oriented implementations are RAID controllers that use the high-speed bus of the computer in which they are installed (lately this is usually the PCI bus). They can in turn be divided into low-end and high-end. The former usually have no SCSI chips of their own and use the so-called RAID port on a motherboard with a built-in SCSI controller. In this case the work of processing RAID code and I/O operations is shared between the processor on the RAID controller and the SCSI chips on the motherboard. The central processor is thus freed from processing additional code, and bus traffic is reduced compared with the software version. Such boards are usually inexpensive, especially if they target RAID 0 or 1 (there are also implementations of RAID 3, 5, 10, 30 and 50, but they cost more), so they are gradually displacing software implementations from the entry-level server market.

High-end bus-oriented controllers are structured somewhat differently from their low-end counterparts. They take over all functions related to I/O and RAID code execution. In addition, they are less dependent on the motherboard implementation and usually offer more capabilities (for example, the ability to attach a module that preserves the contents of the cache in the event of a motherboard failure or power loss). Such controllers are more expensive than low-end ones and are used in mid-range and high-end servers. They typically implement RAID levels 0, 1, 3, 5, 10, 30 and 50.

Since bus-oriented implementations sit directly on the computer's internal PCI bus, they are the most productive of the systems under consideration (in single-host configurations). The maximum throughput of such systems can reach 132 MB/s (32-bit PCI) or 264 MB/s (64-bit PCI) at a bus frequency of 33 MHz.

Along with the listed advantages, the bus-oriented architecture has the following disadvantages:

  • dependence on the operating system and platform;
  • limited scalability;
  • limited capabilities for organizing fault-tolerant systems.

All these disadvantages can be avoided by using autonomous subsystems. Such systems are completely self-contained externally and are, in essence, a separate computer used to organize information storage systems. Moreover, as Fibre Channel technology matures, the performance of autonomous systems will be in no way inferior to bus-oriented ones.

Typically, an external controller is placed in a separate rack and, unlike systems with a bus organization, can have a large number of input/output channels, including host channels, which makes it possible to connect several host computers to the system and organize cluster systems. In systems with a standalone controller, hot standby controllers can be implemented.

One of the disadvantages of autonomous systems is their high cost.

Taking into account the above, we note that autonomous controllers are usually used to implement high-capacity data storage and cluster systems.

A RAID array (Redundant Array of Independent Disks) is a way of combining several devices to increase the performance and/or reliability of data storage.

According to Moore's law, the number of transistors on a chip doubles roughly every two years, and performance grows year after year. This can be seen in almost every branch of computer hardware: processors gain cores and transistors while the process node shrinks, RAM gains frequency and bandwidth, and solid-state drives improve in endurance and read speed.

But ordinary hard drives (HDDs) have not advanced much over the past 10 years. The standard spindle speed was 7,200 rpm and remains so (leaving aside server HDDs at 10,000 rpm and more), and slow 5,400 rpm drives are still found in laptops. For most users the easier way to speed up a computer is to buy an SSD, but the price per gigabyte of such media is much higher than that of an ordinary HDD. How can you increase drive performance without losing much money or capacity? How can you protect your data or make it more secure? The answer to these questions is a RAID array.

Types of RAID arrays

Currently, the following types of RAID arrays exist:

RAID 0, or "striping", is an array of two or more disks built to improve overall performance. The array's capacity is the sum of its members (HDD 1 + HDD 2 = total capacity), and the read/write speed is higher because writes are split across two devices, but data safety suffers: if one of the devices fails, all information in the array is lost.

RAID 1, or "mirror", is several disks copying each other to increase reliability. Write speed stays the same, read speed increases, and reliability rises many times over (even if one device fails, the second keeps working), but the cost per gigabyte doubles (for an array of two HDDs).

RAID 2 is an array built from disks for storing information plus error-correction disks. The number of data disks is given by the formula 2^n - n - 1, where n is the number of correction (check) disks. This type is used with a large number of HDDs; the minimum acceptable number is 7, of which 4 store data and 3 store the correction code. The advantage of this type is higher performance compared to a single disk.

RAID 3 consists of n disks, where one disk stores parity blocks and the remaining n - 1 store data. Information is split into pieces smaller than a sector (it is divided byte by byte), which suits work with large files well, while the read speed for small files is very low. It is characterized by high performance but low reliability and a narrow specialization.

RAID 4 is similar to type 3, but is divided into blocks rather than bytes. This solution was able to correct the low reading speed of small files, but the writing speed remained low.

RAID 5 and 6 - instead of a separate disk for error correction, as in the previous variants, parity blocks are used that are evenly distributed across all devices. Read/write speed increases thanks to parallelized writes. The drawback of this type is the lengthy recovery of information if one of the disks fails: during the rebuild, the load on the remaining devices is very high, which reduces reliability and raises the chance that another device fails and all data in the array is lost. Type 6 improves overall reliability but reduces performance.

Combined types of RAID arrays:

RAID 01 (0+1) – Two Raid 0s are combined into Raid 1.

RAID 10 (1+0) – RAID 1 disk arrays, which are used in type 0 architecture. It is considered the most reliable data storage option, combining high reliability and performance.

You can also build an array from SSD drives. According to 3DNews testing, such a combination does not give a significant gain; it is better to buy a drive with a faster interface, such as PCI Express or eSATA.

Raid array: how to create

An array is created by connecting the drives through a RAID controller. At the moment there are three types of controllers:

  1. Software – the array is emulated in software; all calculations are performed by the CPU.
  2. Integrated – mostly found on (non-server) motherboards. A small chip on the motherboard is responsible for emulating the array, while the calculations are still performed by the CPU.
  3. Hardware – an expansion card (for desktop computers), usually with a PCI interface, that has its own memory and computing processor.

RAID hdd array: How to make it from 2 disks via IRST


Data recovery

Some data recovery options:

  1. If RAID 0 or 5 fails, the RAID Reconstructor utility can help: it collects the available information from the drives and rewrites it to another device or medium as an image of the former array. This option helps if the disks themselves are sound and the failure is a software one.
  2. For Linux systems, mdadm recovery is used (a utility for managing software Raid arrays).
  3. Hardware recovery should be performed through specialized services, because without knowledge of the controller’s operating methods, you can lose all data and it will be very difficult or even impossible to get them back.

There are many nuances that need to be taken into account when creating a Raid on your computer. Basically, most options are used in the server segment, where data stability and security is important and necessary. If you have questions or additions, you can leave them in the comments.

Have a great day!

All modern motherboards are equipped with an integrated RAID controller, and top models even have several. How much home users actually need integrated RAID controllers is a separate question. In any case, a modern motherboard gives the user the ability to create a RAID array from several disks. However, not every home user knows how to create a RAID array, which array level to choose, or, in general, has much idea of the pros and cons of using RAID arrays.
In this article, we will give brief recommendations on creating RAID arrays on home PCs and use a specific example to demonstrate how you can independently test the performance of a RAID array.

History of creation

The term "RAID array" first appeared in 1987, when the American researchers Patterson, Gibson and Katz of the University of California, Berkeley described in their article "A Case for Redundant Arrays of Inexpensive Disks (RAID)" how several low-cost hard drives can be combined into one logical device so that the capacity and performance of the system are increased, while the failure of individual drives does not lead to the failure of the whole system.

More than 20 years have passed since this article was published, but the technology of building RAID arrays has not lost its relevance today. The only thing that has changed since then is the meaning of the acronym: initially RAID arrays were not built from cheap disks at all, so the word Inexpensive was replaced with Independent, which is closer to the truth.

Operating principle

So, RAID is a redundant array of independent disks (Redundant Array of Independent Disks), whose task is to ensure fault tolerance and improve performance. Fault tolerance is achieved through redundancy: part of the disk capacity is set aside for housekeeping data and becomes unavailable to the user.

Increased performance of the disk subsystem is ensured by the simultaneous operation of several disks, and in this sense, the more disks in the array (up to a certain limit), the better.

The joint operation of disks in an array can be organized using either parallel or independent access. With parallel access, disk space is divided into blocks (strips) for recording data. Similarly, information to be written to disk is divided into the same blocks. When writing, individual blocks are written to different disks, and multiple blocks are written to different disks simultaneously, which leads to increased performance in write operations. The necessary information is also read in separate blocks simultaneously from several disks, which also increases performance in proportion to the number of disks in the array.

It should be noted that the parallel access model is implemented only if the size of the data write request is larger than the size of the block itself. Otherwise, parallel recording of several blocks is almost impossible. Let's imagine a situation where the size of an individual block is 8 KB, and the size of a request to write data is 64 KB. In this case, the source information is cut into eight blocks of 8 KB each. If you have a four-disk array, you can write four blocks, or 32 KB, at a time. Obviously, in the example considered, the write and read speeds will be four times higher than when using a single disk. This is only true for an ideal situation, but the request size is not always a multiple of the block size and the number of disks in the array.
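The arithmetic of this example can be checked with a short sketch; the block and request sizes are taken from the text above, and the helper name is arbitrary:

```python
def parallel_passes(request_kb: int, block_kb: int, n_disks: int) -> int:
    """Number of array-wide passes needed to write one request."""
    blocks = -(-request_kb // block_kb)   # ceiling division: how many stripes
    return -(-blocks // n_disks)          # how many groups of n_disks stripes

# A 64 KB request with 8 KB blocks on a four-disk array: 8 blocks, 2 passes
# of 32 KB each, i.e. roughly a four-fold speedup in the ideal case.
print(parallel_passes(64, 8, 4))  # 2
```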

If the size of the recorded data is less than the block size, then a fundamentally different model is implemented - independent access. Moreover, this model can also be used when the size of the data being written is larger than the size of one block. With independent access, all data from a single request is written to a separate disk, that is, the situation is identical to working with one disk. The advantage of the independent access model is that if several write (read) requests arrive simultaneously, they will all be executed on separate disks independently of each other. This situation is typical, for example, for servers.

In accordance with different types of access, there are different types of RAID arrays, which are usually characterized by RAID levels. In addition to the type of access, RAID levels differ in the way they accommodate and generate redundant information. Redundant information can either be placed on a dedicated disk or distributed among all disks. There are many ways to generate this information. The simplest of them is complete duplication (100 percent redundancy), or mirroring. In addition, error correction codes are used, as well as parity calculations.

RAID levels

Currently, there are several RAID levels that can be considered standardized - these are RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5 and RAID 6.

Various combinations of RAID levels are also used, which allows you to combine their advantages. Typically this is a combination of some kind of fault-tolerant level and a zero level used to improve performance (RAID 1+0, RAID 0+1, RAID 50).

Note that all modern RAID controllers support the JBOD (Just a Bunch Of Disks) mode, which is not intended for creating arrays: it provides the ability to connect individual disks to the RAID controller.

It should be noted that the RAID controllers integrated on motherboards for home PCs do not support all RAID levels. Dual-port RAID controllers only support levels 0 and 1, while RAID controllers with more ports (for example, the 6-port RAID controller integrated into the southbridge of the ICH9R/ICH10R chipset) also support levels 10 and 5.

In addition, if we talk about motherboards based on Intel chipsets, they also implement the Intel Matrix RAID function, which allows you to simultaneously create RAID matrices of several levels on several hard drives, allocating part of the disk space for each of them.

RAID 0

RAID level 0, strictly speaking, is not a redundant array and, accordingly, does not provide reliable data storage. Nevertheless, this level is actively used in cases where it is necessary to ensure high performance of the disk subsystem. When creating a RAID level 0 array, information is divided into blocks (sometimes these blocks are called stripes), which are written to separate disks, that is, a system with parallel access is created (if, of course, the block size allows it). By allowing simultaneous I/O from multiple disks, RAID 0 provides the fastest data transfer speeds and maximum disk space efficiency because no storage space is required for checksums. The implementation of this level is very simple. RAID 0 is mainly used in areas where fast transfer of large amounts of data is required.

RAID 1 (Mirrored disk)

RAID Level 1 is an array of two disks with 100 percent redundancy. That is, the data is simply completely duplicated (mirrored), due to which a very high level of reliability (as well as cost) is achieved. Note that to implement level 1, it is not necessary to first partition the disks and data into blocks. In the simplest case, two disks contain the same information and are one logical disk. If one disk fails, its functions are performed by another (which is absolutely transparent to the user). Restoring an array is performed by simple copying. In addition, this level doubles the speed of reading information, since this operation can be performed simultaneously from two disks. This type of information storage scheme is used mainly in cases where the cost of data security is much higher than the cost of implementing a storage system.

RAID 5

RAID 5 is a fault-tolerant disk array with distributed checksum storage. When writing, the data stream is divided into blocks (stripes), which are written simultaneously to all disks of the array in cyclic order.

Suppose the array contains n disks and the stripe size is d. For each group of n–1 stripes, a checksum p is computed.

Stripe d1 is written to the first disk, stripe d2 to the second, and so on up to stripe dn–1, which is written to the (n–1)-th disk. Then the checksum pn is written to the n-th disk, and the process repeats cyclically from the first disk, onto which stripe dn is written.

The writing of the (n–1) stripes and their checksum is performed simultaneously on all n disks.

The checksum is calculated using a bitwise exclusive OR (XOR) operation applied to the data blocks being written. So, if there are n hard drives and d is a data block (stripe), the checksum is calculated using the following formula:

pn = d1 ⊕ d2 ⊕ ... ⊕ dn–1.

If any disk fails, the data on it can be restored using the control data and the data remaining on the working disks.

As an illustration, consider blocks of four bits. Let there be only five disks for storing data and recording checksums. If there is a sequence of bits 1101 0011 1100 1011, divided into blocks of four bits, then to calculate the checksum it is necessary to perform the following bitwise operation:

1101 ⊕ 0011 ⊕ 1100 ⊕ 1011 = 1001.

Thus, the checksum written to the fifth disk is 1001.

If one of the disks, for example the fourth, fails, then the block d4 = 1011 becomes unavailable for reading. However, its value can easily be restored from the checksum and the values of the remaining blocks using the same exclusive OR operation:

d4 = d1 ⊕ d2 ⊕ d3 ⊕ p5.

In our example we get:

d4 = 1101 ⊕ 0011 ⊕ 1100 ⊕ 1001 = 1011.
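The same example can be replayed in code; this is only a minimal sketch, with the block values taken from the text above:

```python
blocks = [0b1101, 0b0011, 0b1100, 0b1011]        # d1..d4 from the example

p5 = blocks[0] ^ blocks[1] ^ blocks[2] ^ blocks[3]
print(f"p5 = {p5:04b}")                          # 1001, written to the fifth disk

# "Lose" d4 and rebuild it from the checksum and the surviving blocks.
d4 = p5 ^ blocks[0] ^ blocks[1] ^ blocks[2]
print(f"d4 = {d4:04b}")                          # 1011, the lost block
```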

In the case of RAID 5, all disks in the array are the same size, but the total capacity of the disk subsystem available for writing becomes exactly one disk smaller. For example, if five disks are 100 GB in size, then the actual size of the array is 400 GB because 100 GB is allocated for control information.

RAID 5 can be built on three or more hard drives. As the number of hard drives in an array increases, its redundancy decreases.

RAID 5 has an independent access architecture, which allows multiple reads or writes to be performed simultaneously.

RAID 10

RAID level 10 is a combination of levels 0 and 1. The minimum requirement for this level is four drives. In a RAID 10 array of four drives, they are combined in pairs into level 0 arrays, and both of these arrays as logical drives are combined into a level 1 array. Another approach is also possible: initially the disks are combined into mirrored arrays of level 1, and then logical drives based on these arrays - into an array of level 0.

Intel Matrix RAID

The RAID arrays of levels 5 and 1 considered above are rarely used at home, primarily because of the high cost of such solutions. Most often, a level 0 array on two disks is used in home PCs. As we have already noted, RAID level 0 does not provide reliable data storage, so end users face a choice: create a fast but unreliable RAID 0 array, or pay twice the cost of disk space for a RAID 1 array that stores data reliably but gives no significant performance benefit.

To solve this difficult problem, Intel developed Intel Matrix Storage Technology, which combines the advantages of level 0 and level 1 arrays on just two physical disks. And to emphasize that this is not simply a RAID array but an array combining both physical and logical disks, the technology's name uses the word "matrix" instead of "array".

So, what is a two-disk RAID matrix using Intel Matrix Storage technology? The basic idea is that if the system has several hard drives and a motherboard with an Intel chipset that supports Intel Matrix Storage Technology, it is possible to divide the disk space into several parts, each of which will function as a separate RAID array.

Let's look at a simple example of a RAID matrix consisting of two disks of 120 GB each. Any of the disks can be divided into two logical disks, for example 40 and 80 GB. Next, two logical drives of the same size (for example, 40 GB each) can be combined into a RAID level 1 matrix, and the remaining logical drives into a RAID level 0 matrix.

In principle, using two physical disks, it is also possible to create just one or two RAID level 0 matrices, but it is impossible to obtain only level 1 matrices. That is, if the system has only two disks, then Intel Matrix Storage technology allows you to create the following types of RAID matrices:

  • one level 0 matrix;
  • two level 0 matrices;
  • level 0 matrix and level 1 matrix.

If the system has three hard drives, the following types of RAID matrices can be created:

  • one level 0 matrix;
  • one level 5 matrix;
  • two level 0 matrices;
  • two level 5 matrices;
  • level 0 matrix and level 5 matrix.

If the system has four hard drives, then it is additionally possible to create a RAID matrix of level 10, as well as combinations of level 10 and level 0 or 5.

From theory to practice

If we talk about home computers, the most popular and sought-after RAID arrays are levels 0 and 1. The use of RAID arrays of three or more disks in home PCs is rather the exception. This is because, on the one hand, the cost of a RAID array grows in proportion to the number of disks in it, and on the other hand, for home computers the capacity of the disk array matters more than its performance and reliability.

Therefore, in the future we will consider RAID levels 0 and 1 based on only two disks. The objective of our research will be to compare the performance and functionality of RAID arrays of levels 0 and 1, created on the basis of several integrated RAID controllers, as well as to study the dependence of the speed characteristics of the RAID array on the stripe size.

The fact is that although in theory a RAID level 0 array should double read and write speed, in practice the gain is much more modest and varies from one RAID controller to another. The same is true for a RAID level 1 array: although in theory read speed should double, in practice things are not so smooth.

For our RAID controller comparison testing, we used the Gigabyte GA-EX58A-UD7 motherboard. This board is based on the Intel X58 Express chipset with the ICH10R southbridge, which has an integrated RAID controller for six SATA II ports, which supports the organization of RAID arrays of levels 0, 1, 10 and 5 with the Intel Matrix RAID function. In addition, the Gigabyte GA-EX58A-UD7 board integrates the GIGABYTE SATA2 RAID controller, which has two SATA II ports with the ability to organize RAID arrays of levels 0, 1 and JBOD.

Also on the GA-EX58A-UD7 board is an integrated SATA III controller Marvell 9128, on the basis of which two SATA III ports are implemented with the ability to organize RAID arrays of levels 0, 1 and JBOD.

Thus, the Gigabyte GA-EX58A-UD7 board has three separate RAID controllers, on the basis of which you can create RAID arrays of levels 0 and 1 and compare them with each other. Let us recall that the SATA III standard is backward compatible with the SATA II standard, therefore, based on the Marvell 9128 controller, which supports drives with the SATA III interface, you can also create RAID arrays using drives with the SATA II interface.

The testing stand had the following configuration:

  • processor - Intel Core i7-965 Extreme Edition;
  • motherboard - Gigabyte GA-EX58A-UD7;
  • BIOS version - F2a;
  • hard drives - two Western Digital WD1002FBYS drives, one Western Digital WD3200AAKS drive;
  • integrated RAID controllers:
  • ICH10R,
  • GIGABYTE SATA2,
  • Marvell 9128;
  • memory - DDR3-1066;
  • memory capacity - 3 GB (three modules of 1024 MB each);
  • memory operating mode - DDR3-1333, three-channel operating mode;
  • video card - Gigabyte GeForce GTS295;
  • power supply - Tagan 1300W.

Testing was carried out under the Microsoft Windows 7 Ultimate (32-bit) operating system. The operating system was installed on a Western Digital WD3200AAKS drive, which was connected to the port of the SATA II controller integrated into the ICH10R southbridge. The RAID array was assembled on two WD1002FBYS drives with a SATA II interface.

To measure the speed characteristics of the created RAID arrays, we used the IOmeter utility, which is the industry standard for measuring the performance of disk systems.

IOmeter utility

Since we intended this article as a kind of user guide for creating and testing RAID arrays, it would be logical to start with a description of the IOmeter (Input/Output meter) utility, which, as we have already noted, is a kind of industry standard for measuring the performance of disk systems. This utility is free and can be downloaded from http://www.iometer.org.

The IOmeter utility is a synthetic test and allows you to work with hard drives that are not partitioned into logical partitions, so you can test drives regardless of the file structure and reduce the influence of the operating system to zero.

When testing, it is possible to create a specific access model, or “pattern,” which allows you to specify the execution of specific operations by the hard drive. If you create a specific access model, you are allowed to change the following parameters:

  • size of the data transfer request;
  • random/sequential distribution (in%);
  • distribution of read/write operations (in%);
  • The number of individual I/O operations running in parallel.

The IOmeter utility does not require installation on a computer and consists of two parts: IOmeter itself and Dynamo.

IOmeter is the controlling part of the program with a user graphical interface that allows you to make all the necessary settings. Dynamo is a load generator that has no interface. Each time you run IOmeter.exe, the Dynamo.exe load generator automatically starts.

To start working with the IOmeter program, just run the IOmeter.exe file. This opens the main window of the IOmeter program (Fig. 1).

Fig. 1. Main window of the IOmeter program

It should be noted that the IOmeter utility can test not only local disk systems (DAS) but also network-attached storage (NAS). For example, it can be used to test the performance of a server's disk subsystem (a file server) using several network clients. Therefore, some of the tabs and tools in the IOmeter window relate specifically to the program's network settings. Clearly, we will not need these capabilities when testing disks and RAID arrays, so we will not explain the purpose of every tab and tool.

So, when you start the IOmeter program, a tree structure of all running load generators (Dynamo instances) will be displayed on the left side of the main window (in the Topology window). Each running Dynamo load generator instance is called a manager. Additionally, the IOmeter program is multi-threaded and each individual thread running on a Dynamo load generator instance is called a Worker. The number of running Workers always corresponds to the number of logical processor cores.

In our example, we use only one computer with a quad-core processor that supports Hyper-Threading technology, so only one manager (one instance of Dynamo) and eight (according to the number of logical processor cores) Workers are launched.

Actually, to test disks in this window there is no need to change or add anything.

If you select the computer name with the mouse in the tree of running Dynamo instances, then in the Target window, on the Disk Target tab, all disks, disk arrays and other drives (including network drives) installed on the computer will be displayed. These are the drives that IOmeter can work with. Media may be marked in yellow or blue: logical partitions are marked in yellow, and physical devices without logical partitions on them in blue. A logical partition may or may not be crossed out. The point is that before the program can work with a logical partition, the partition must first be prepared by creating on it a special file equal in size to the capacity of the entire partition. If the logical partition is crossed out, it has not yet been prepared for testing (it will be prepared automatically at the first stage of testing); if it is not crossed out, a file has already been created on it and the partition is completely ready for testing.

Note that, although working with logical partitions is supported, it is best to test drives that are not partitioned into logical partitions. Deleting a logical partition is very simple - through the Disk Management snap-in. To access it, right-click the Computer icon on the desktop and select Manage from the menu that opens. In the Computer Management window, select Storage on the left and then Disk Management. After that, all connected drives are displayed on the right side of the Computer Management window. Right-click the desired drive and select Delete Volume... to delete its logical partition. Keep in mind that when you delete a logical partition from a disk, all information on it is lost with no possibility of recovery.

In general, using the IOmeter utility you can only test blank disks or disk arrays. That is, you cannot test a disk or disk array on which the operating system is installed.

So, let's return to the description of the IOmeter utility. In the Target window, on the Disk Target tab, you select the disk (or disk array) to be tested. Next you need to open the Access Specifications tab (Fig. 2), where the testing scenario can be defined.

Fig. 2. Access Specifications tab of the IOmeter utility

The Global Access Specifications window contains a list of predefined test scripts that can be assigned to the load manager. We will not need these scripts, so they can all be selected and deleted (the Delete button serves this purpose). Then click the New button to create a new test script. In the Edit Access Specification window that opens, you can define the load scenario for a disk or RAID array.

Suppose we want to find out how the speed of sequential (linear) reading and writing depends on the size of the data transfer request block. To do this, we need to generate a series of load scripts in sequential read mode at different block sizes, and then a series of load scripts in sequential write mode at different block sizes. Block sizes are typically chosen as a series in which each member is twice the previous one, starting at 512 bytes, that is: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256 and 512 KB, and 1 MB. There is no point in making the block size larger than 1 MB for sequential operations, since at such block sizes the speed of sequential operations no longer changes.

So, let's create a load script in sequential read mode for a 512-byte block.

In the Name field of the Edit Access Specification window, enter a name for the load script, for example Sequential_Read_512. Next, in the Transfer Request Size field, set the data block size to 512 bytes. Move the Percent Random/Sequential Distribution slider (the ratio between sequential and random operations) all the way to the left so that all operations are sequential, and move the Percent Read/Write Distribution slider, which sets the ratio between read and write operations, all the way to the right so that all operations are reads only. The other parameters in the Edit Access Specification window do not need to be changed (Fig. 3).

Fig. 3. The Edit Access Specification window for creating a sequential read load script with a 512-byte data block

Click OK, and the first scenario we created will appear in the Global Access Specifications window on the Access Specifications tab of the IOmeter utility.

Scenarios for the remaining block sizes are created in the same way. To make the work easier, instead of creating each scenario from scratch with the New button, select the last scenario you created and click the Edit Copy button. The Edit Access Specification window will open again with the settings of that scenario, and it is enough to change only the name and the block size. After repeating this procedure for all the other block sizes, you can start creating the sequential write scenarios, which are made in exactly the same way, except that the Percent Read/Write Distribution slider, which sets the ratio between read and write operations, must be moved all the way to the left.

In the same way, you can create scenarios for selective reading and writing.

After all the scenarios are ready, they need to be assigned to the load manager, that is, you need to specify which scenarios Dynamo will run.

To do this, check once more that in the Topology window the computer name (that is, the load manager on the local PC) is highlighted, and not an individual Worker. This ensures that the load scenarios will be assigned to all Workers at once. Then, in the Global Access Specifications window, select all the load scenarios we created and click the Add button. All selected load scenarios will be added to the Assigned Access Specification window (Fig. 4).

Fig. 4. Assigning the created load scenarios to the load manager

After this, go to the Test Setup tab (Fig. 5), where you can set the run time of each scenario we created. To do this, set the duration of the load scenario in the Run Time group; three minutes is quite enough.

Fig. 5. Setting the run time of the load scenario

In addition, in the Test Description field you should specify a name for the whole test. This tab has many other settings, but they are not needed for our purposes.

After all the necessary settings have been made, it is worth saving the created test by clicking the button with the floppy-disk icon on the toolbar. The test is saved with the *.icf extension. Subsequently, you can reuse the created load configuration by launching the saved *.icf file rather than IOmeter.exe.

Now you can start the test itself by clicking the button with the flag. You will be asked to specify a name for the file with the test results and choose where to save it. The results are saved in a CSV file, which is easy to import into Excel; by setting a filter on the first column, you can then extract the desired test results.
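Instead of Excel, the same filtering can be done with a few lines of script. The sketch below is only an illustration: the exact column layout of the IOmeter results file depends on the program version, and both the file name and the marker value searched for in the first column are assumptions.

```python
# Hedged sketch: keep only the rows of the IOmeter results CSV whose first
# column matches a chosen marker ("ALL" here is an assumed example value).
import csv

with open("results.csv", newline="") as f:
    for row in csv.reader(f):
        if row and row[0] == "ALL":
            print(row)
```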

During testing, intermediate results can be seen on the Result Display tab, and the Access Specifications tab shows which load scenario they belong to: in the Assigned Access Specification window the running scenario is shown in green, completed scenarios in red, and scenarios not yet executed in blue.

So, we have looked at the basic techniques for working with the IOmeter utility that are required for testing individual disks or RAID arrays. Note that we have not covered all of IOmeter's capabilities; a full description is beyond the scope of this article.

Creating a RAID array based on the GIGABYTE SATA2 controller

So, we begin by creating a RAID array of two disks using the GIGABYTE SATA2 RAID controller integrated on the board. Of course, Gigabyte itself does not produce chips, so a relabeled chip from another company hides under the GIGABYTE SATA2 name. As the driver INF file reveals, it is a controller of the JMicron JMB36x series.

Access to the controller setup menu is possible at the system boot stage by pressing Ctrl+G when the corresponding prompt appears on the screen. Naturally, the operating mode of the two SATA ports belonging to the GIGABYTE SATA2 controller must first be set to RAID in the BIOS settings (otherwise access to the RAID array configurator menu will be impossible).

The setup menu of the GIGABYTE SATA2 RAID controller is quite simple. As we have already noted, the controller is dual-port and allows you to create RAID arrays of level 0 or 1. Through the controller settings menu you can delete or create a RAID array. When creating an array, you can specify its name, select the level (0 or 1), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB) and define the size of the array.

Once an array is created, no further changes to it are possible; you cannot later change, for example, its level or stripe size. To do that, you first have to delete the array (losing the data) and then create it anew. This is not unique to the GIGABYTE SATA2 controller: the inability to change the parameters of an existing RAID array is a property of all controllers and follows from the very principle of how a RAID array is implemented.

Once an array based on the GIGABYTE SATA2 controller has been created, its current information can be viewed using the GIGABYTE RAID Configurer utility, which is installed automatically along with the driver.

Creating a RAID array based on the Marvell 9128 controller

The Marvell 9128 RAID controller can only be configured through the BIOS settings of the Gigabyte GA-EX58A-UD7 board. It must be said that the controller's configurator menu is somewhat crude and can mislead inexperienced users. We will return to these minor shortcomings a little later; for now let us consider the main functionality of the Marvell 9128 controller.

So, although this controller supports SATA III drives, it is also fully compatible with SATA II drives.

The Marvell 9128 controller allows you to create a RAID array of level 0 or 1 from two disks. For a level 0 array you can set the stripe size to 32 or 64 KB and specify the name of the array. There is also an option called Gigabyte Rounding, which needs an explanation. Despite the name, which resembles the name of the board manufacturer, it has nothing to do with Gigabyte. Nor is it related in any way to a RAID level 0 array, although the controller settings allow it to be set for an array of that level; this is the first of the configurator shortcomings we mentioned. Gigabyte Rounding is meaningful only for RAID level 1: it allows two drives with slightly different capacities (for example, from different manufacturers or different models) to be combined into a RAID 1 array by specifying the allowable difference between their sizes. In the Marvell 9128 controller, Gigabyte Rounding lets this difference be set to 1 or 10 GB.

Another flaw of the Marvell 9128 configurator is that when creating a RAID level 1 array, the user can select a stripe size (32 or 64 KB), even though the concept of a stripe is not defined for RAID level 1 at all.

Creating a RAID array based on the controller integrated into the ICH10R

The RAID controller integrated into the ICH10R southbridge is the most widely used. As already noted, it has six ports and supports not only RAID 0 and RAID 1 arrays, but also RAID 5 and RAID 10.

Access to the controller setup menu is possible at the system boot stage by pressing Ctrl+I when the corresponding prompt appears on the screen. Naturally, the operating mode of this controller must first be set to RAID in the BIOS settings (otherwise access to the RAID array configurator menu will be impossible).

The RAID controller setup menu is quite simple. Through it you can delete or create a RAID array. When creating an array, you can specify its name, select the level (0, 1, 5 or 10), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB) and define the size of the array.

RAID performance comparison

To test RAID arrays using the IOmeter utility, we created sequential read, sequential write, selective read, and selective write load scenarios. The data block sizes in each load scenario were as follows: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB.

On each of the RAID controllers, we created a RAID 0 array with all allowable stripe sizes and a RAID 1 array. In addition, in order to be able to evaluate the performance gain obtained from using a RAID array, we also tested a single disk on each of the RAID controllers.

So, let's look at the results of our testing.

GIGABYTE SATA2 Controller

First of all, let's look at the results of testing RAID arrays based on the GIGABYTE SATA2 controller (Fig. 6-13). Overall, the controller turned out to be rather puzzling, and its performance was simply disappointing.

Fig. 6. Speed of sequential and selective operations for the Western Digital WD1002FBYS disk

Fig. 7. Speed of sequential and selective operations for RAID 0 with a stripe size of 128 KB (GIGABYTE SATA2 controller)

Fig. 12. Speed of sequential and selective operations for RAID 0 with a stripe size of 4 KB (GIGABYTE SATA2 controller)

Fig. 13. Speed of sequential and selective operations for RAID 1 (GIGABYTE SATA2 controller)

If you look at the speed characteristics of one disk (without a RAID array), the maximum sequential read speed is 102 MB/s, and the maximum sequential write speed is 107 MB/s.

When creating a RAID 0 array with a stripe size of 128 KB, the maximum sequential read and write speed increases to 125 MB/s, an increase of approximately 22%.

With stripe sizes of 64, 32, or 16 KB, the maximum sequential read speed is 130 MB/s, and the maximum sequential write speed is 141 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 27%, and the maximum sequential write speed increases by 31%.
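As a quick sanity check of these percentages, the arithmetic with the numbers quoted above fits in a throwaway snippet (the values are taken straight from the text):

```python
# Scaling of RAID 0 throughput relative to a single disk on the same controller.
single_read, single_write = 102.0, 107.0   # MB/s, one disk (GIGABYTE SATA2)
raid0_read, raid0_write = 130.0, 141.0     # MB/s, RAID 0, 64/32/16 KB stripe

print(f"read gain:  {raid0_read / single_read - 1:.1%}")    # 27.5%
print(f"write gain: {raid0_write / single_write - 1:.1%}")  # 31.8%
```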

Frankly, this is not much for a level 0 array; one would expect the maximum speed of sequential operations to be higher.

With a stripe size of 8 KB, the maximum speed of sequential operations (reading and writing) remains approximately the same as with a stripe size of 64, 32 or 16 KB, however, there are obvious problems with selective reading. As the data block size increases up to 128 KB, the selective read speed (as it should) increases in proportion to the data block size. However, when the data block size is more than 128 KB, the selective read speed drops to almost zero (to approximately 0.1 MB/s).

With a stripe size of 4 KB, not only the selective read speed drops when the block size is more than 128 KB, but also the sequential read speed when the block size is more than 16 KB.

Using a RAID 1 array on a GIGABYTE SATA2 controller does not significantly change the sequential read speed (compared to a single drive), but the maximum sequential write speed is reduced to 75 MB/s. Recall that for a RAID 1 array, the read speed should increase, and the write speed should not decrease compared to the read and write speed of a single disk.

Based on the results of testing the GIGABYTE SATA2 controller, only one conclusion can be drawn: it makes sense to use this controller to create RAID 0 and RAID 1 arrays only when all the other RAID controllers (Marvell 9128, ICH10R) are already occupied, although such a situation is rather hard to imagine.

Marvell 9128 controller

The Marvell 9128 controller demonstrated much higher speed characteristics compared to the GIGABYTE SATA2 controller (Fig. 14-17). In fact, the differences appear even when the controller operates with one disk. If for the GIGABYTE SATA2 controller the maximum sequential read speed is 102 MB/s and is achieved with a data block size of 128 KB, then for the Marvell 9128 controller the maximum sequential read speed is 107 MB/s and is achieved with a data block size of 16 KB.

When creating a RAID 0 array with stripe sizes of 64 and 32 KB, the maximum sequential read speed increases to 211 MB/s, and sequential write speed increases to 185 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 97%, and the maximum sequential write speed increases by 73%.

There is no significant difference in the speed performance of a RAID 0 array with a stripe size of 32 and 64 KB, however, the use of a 32 KB stripe is more preferable, since in this case the speed of sequential operations with a block size of less than 128 KB will be slightly higher.

When creating a RAID 1 array on a Marvell 9128 controller, the maximum sequential operation speed remains virtually unchanged compared to a single disk. So, if for a single disk the maximum speed of sequential operations is 107 MB/s, then for RAID 1 it is 105 MB/s. Also note that for RAID 1, selective read performance degrades slightly.

In general, it should be noted that the Marvell 9128 controller has good speed characteristics and can be used both to create RAID arrays and to connect single disks to it.

ICH10R controller

The RAID controller built into the ICH10R turned out to be the fastest of all those we tested (Fig. 18-25). When working with a single drive (without creating a RAID array), its performance is virtually the same as that of the Marvell 9128 controller: the maximum sequential read and write speed is 107 MB/s and is achieved with a data block size of 16 KB.

Fig. 18. Speed of sequential and selective operations for the Western Digital WD1002FBYS disk (ICH10R controller)

As for the RAID 0 array on the ICH10R controller, the maximum sequential read and write speed does not depend on the stripe size and is 212 MB/s; only the data block size at which this maximum is reached depends on the stripe size. The test results show that for RAID 0 on the ICH10R controller it is optimal to use a 64 KB stripe: in this case the maximum sequential read and write speed is reached at a data block size of only 16 KB.

So, to summarize, we once again emphasize that the RAID controller built into the ICH10R significantly exceeds all other integrated RAID controllers in performance. And given that it also has greater functionality, it is optimal to use this particular controller and simply forget about the existence of all the others (unless, of course, the system uses SATA III drives).

© Andrey Egorov, 2005, 2006. TIM Group of Companies.

Forum visitors ask us: "Which RAID level is the most reliable?" Everyone knows that the most common level is RAID 5, but it has serious drawbacks that are not obvious to non-specialists.

RAID 0, RAID 1, RAID 5, RAID 6, RAID 10, or what are RAID levels?

In this article, I will try to characterize the most popular RAID levels, and then formulate recommendations for using these levels. To illustrate the article, I created a diagram in which I placed these levels in the three-dimensional space of reliability, performance and cost efficiency.

JBOD (Just a Bunch of Disks) is a simple spanning of hard drives that is not formally a RAID level. A JBOD volume can be an array of a single disk or a concatenation of several disks. The RAID controller does not need to perform any calculations to operate such a volume. In our diagram, the JBOD drive serves as the unit, or starting point: its reliability, performance and cost are the same as those of a single hard drive.

RAID 0 ("Striping") has no redundancy and distributes information across all disks of the array at once in the form of small blocks ("stripes"). Performance increases significantly as a result, but reliability suffers. As with JBOD, we get 100% of the disk capacity for our money.

Let me explain why the reliability of data storage on any composite volume decreases: if any of the hard drives included in it fails, all information is completely and irretrievably lost. According to probability theory, the reliability of a RAID 0 volume is the product of the reliabilities of its constituent disks, each of which is less than one, so the overall reliability is obviously lower than the reliability of any single disk.
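To put a number on this, here is a tiny illustration; the 0.95 per-disk probability of surviving some period is a made-up figure for the example.

```python
# Reliability of a striped (RAID 0 / spanned) volume under independent failures:
# the product of the per-disk reliabilities, so it falls as disks are added.
def raid0_reliability(per_disk: float, n_disks: int) -> float:
    return per_disk ** n_disks

for n in (1, 2, 4, 8):
    print(n, round(raid0_reliability(0.95, n), 3))
# 1 0.95, 2 0.903, 4 0.815, 8 0.663 -- more disks, lower reliability
```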

RAID 1 ("Mirroring") is a good level. It protects against failure of half of the available hardware (in the general case, one of two hard drives), provides an acceptable write speed, and gains read speed through parallelization of requests. The drawback is that you pay the cost of two hard drives to get the usable capacity of one.

Initially it is assumed that a hard drive is a reliable thing, so the probability of both disks failing at once is (by the formula) the product of the individual probabilities, i.e. orders of magnitude lower. Unfortunately, real life is not theory: the two hard drives come from the same batch and operate under the same conditions, and when one disk fails, the load on the remaining one increases. In practice, therefore, if one of the disks fails, urgent measures must be taken to restore redundancy, and for this it is recommended to use hot spare (HotSpare) disks with any RAID level except zero. The advantage of this approach is constant reliability; the disadvantage is even greater cost (the price of three hard drives to store the volume of one disk).
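The "product of probabilities" argument is easy to see with numbers; the 3% per-disk failure probability below is invented for the example, and the calculation assumes the failures are independent, which, as just noted, is exactly what real life tends to break.

```python
# Probability of losing a two-disk mirror under the (optimistic) assumption of
# independent failures: both disks must fail during the same period.
p_single = 0.03                 # assumed per-disk failure probability
p_mirror_loss = p_single ** 2   # 0.0009 -- orders of magnitude smaller
print(p_mirror_loss)
# Correlated failures (same batch, rebuild stress) are why a HotSpare disk
# and prompt rebuilds matter in practice.
```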

A mirror on many disks is RAID 10. With this level, mirrored pairs of disks are arranged in a "chain", so the resulting volume can exceed the capacity of a single hard drive. The advantages and disadvantages are the same as for RAID 1. As in other cases, it is recommended to include HotSpare disks in the array at the rate of one spare for every five working disks.

RAID 5 is indeed the most popular level, primarily because of its cost efficiency. By sacrificing the capacity of just one disk of the array for redundancy, we gain protection against failure of any single hard drive of the volume. Writing to a RAID 5 volume requires additional resources, since extra calculations are needed, but reading (compared to a single hard drive) brings a gain, because data streams from several drives of the array are read in parallel.

The disadvantages of RAID 5 appear when one of the disks fails: the entire volume goes into critical mode, every write and read is accompanied by additional manipulations, performance drops sharply, and the disks start to heat up. If immediate action is not taken, the whole volume may be lost. Therefore (see above), a Hot Spare disk should definitely be used with a RAID 5 volume.

In addition to the basic levels RAID 0 - RAID 5 described in the standard, there are combined levels RAID 10, RAID 30, RAID 50, RAID 15, which different manufacturers interpret differently.

The essence of such combinations is briefly as follows: RAID 10 is a combination of "one" and "zero" (see above), RAID 50 is level 5 volumes combined at level "0", RAID 15 is a "mirror" of "fives", and so on.

Combined levels thus inherit the advantages (and disadvantages) of their "parents". The appearance of a "zero" in RAID 50 adds nothing to its reliability but has a positive effect on performance. RAID 15 is probably very reliable, but it is not the fastest and, moreover, extremely uneconomical: the usable capacity of the volume is less than half the capacity of the original disk array.

RAID 6 differs from RAID 5 in that each row of data (stripe) has not one but two checksum blocks. The checksums are "multidimensional", i.e. independent of each other, so even the failure of two disks in the array leaves the original data recoverable. Calculating checksums by the Reed-Solomon method requires more intensive computation than RAID 5, so the sixth level used to be almost never employed. Now it is supported by many products, since they have begun to include specialized chips that perform all the necessary mathematics.

According to some studies, rebuilding after a single disk failure on a RAID 5 volume composed of large SATA disks (400 and 500 GB) ends in data loss in 5% of cases. In other words, in one case out of twenty, the second disk may fail while a RAID 5 array is being rebuilt onto a Hot Spare disk... Hence the advice of seasoned RAID practitioners: 1) always make backups; 2) use RAID 6!

Recently the new levels RAID 1E, RAID 5E and RAID 5EE have appeared. The letter "E" in the name stands for Enhanced.

RAID level-1 Enhanced (RAID level-1E) combines mirroring and data striping. This mixture of levels 0 and 1 is organized as follows: the data in a row is distributed exactly as in RAID 0, so the data row itself has no redundancy, while the next row of blocks repeats the previous one with a shift of one block. Thus, as in standard RAID 1, every data block has a mirror copy on another disk, so the usable capacity of the array equals half the total capacity of its hard drives. RAID 1E requires three or more drives.
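To visualize the layout just described, here is a small sketch that prints which logical block sits on which disk. The direction of the one-block shift is an assumption for illustration; real controllers may shift the other way.

```python
# RAID 1E layout sketch: a data row striped like RAID 0, followed by a mirror
# row holding the same blocks shifted by one disk.
def raid1e_layout(n_disks: int, n_blocks: int):
    rows = []
    for start in range(0, n_blocks, n_disks):
        data = [f"D{start + i}" for i in range(n_disks)]
        rows.append(data)                    # data row (no redundancy by itself)
        rows.append(data[-1:] + data[:-1])   # mirror row, shifted by one disk
    return rows

for row in raid1e_layout(3, 6):
    print(row)
# ['D0', 'D1', 'D2']
# ['D2', 'D0', 'D1']
# ['D3', 'D4', 'D5']
# ['D5', 'D3', 'D4']
```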

I really like RAID 1E. For a powerful graphics workstation or even a home computer it is the best choice, combining the advantages of levels zero and one: excellent speed and high reliability.

Now let's move on to RAID level-5 Enhanced (RAID level-5E). This is the same as RAID 5, only with a spare drive built into the array. The integration is done as follows: on every disk of the array, 1/N of the space is left free and is used as a hot spare if one of the disks fails. Thanks to this, RAID 5E demonstrates better performance along with reliability, since reading and writing are parallelized across a larger number of drives and the spare drive does not sit idle, as it does in RAID 5. Obviously, the spare disk included in the volume cannot be shared with other volumes (it is dedicated rather than shared). A RAID 5E volume is built on at least four physical disks, and the usable capacity of the logical volume is calculated by the formula N-2.

RAID level-5E Enhanced (RAID level-5EE) is similar to RAID level-5E, but it allocates the spare capacity more efficiently and therefore recovers faster. Like RAID 5E, this level distributes data blocks and checksums in rows, but it also distributes the free blocks of the spare drive instead of simply reserving part of the disk space for them, which reduces the time needed to rebuild the integrity of a RAID 5EE volume. The spare capacity included in the volume cannot be shared with other volumes, as in the previous case. A RAID 5EE volume is built on at least four physical disks, and the usable capacity of the logical volume is calculated by the formula N-2.
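Summarizing the capacity arithmetic mentioned for these levels, here is a small helper that follows the formulas given in the text; it is only a sketch, not a universal rule for every controller.

```python
# Usable capacity, expressed as "how many disks' worth of data", for an
# array of n disks, following the formulas quoted in the article.
def usable_disks(level: str, n: int) -> float:
    return {
        "RAID0":   n,        # no redundancy
        "RAID1":   n / 2,    # mirror: half the raw capacity
        "RAID10":  n / 2,    # striped mirrors
        "RAID5":   n - 1,    # one disk's worth of parity
        "RAID6":   n - 2,    # two checksum blocks per stripe
        "RAID5E":  n - 2,    # parity plus built-in spare
        "RAID5EE": n - 2,    # parity plus distributed spare
    }[level]

print(usable_disks("RAID5", 8))    # 7
print(usable_disks("RAID5EE", 4))  # 2
```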

Oddly enough, I could not find any mention of a RAID 6E level on the Internet: so far no manufacturer offers or has even announced such a level. Yet a RAID 6E (or RAID 6EE?) level could be built on the same principle as the previous one. A HotSpare disk must accompany any RAID volume, including RAID 6. Of course, we will not lose information if one or two disks fail, but it is extremely important to start regenerating the integrity of the array as early as possible in order to bring the system out of critical mode quickly. Since the need for a Hot Spare disk is beyond doubt, it would be logical to go further and "spread" it over the volume, as is done in RAID 5EE, to gain the benefits of using a larger number of disks (better read/write speed and faster rebuilding).

RAID levels in “numbers”.

I have collected some important parameters of almost all RAID levels in a table so that you can compare them with each other and better understand their essence.

[Table: for each RAID level, the following parameters are compared: redundancy, use of disk capacity, read performance, write performance, built-in spare disk, minimum number of disks, maximum number of disks.]

All “mirror” levels are RAID 1, 1+0, 10, 1E, 1E0.

Let's try once more to understand thoroughly how these levels differ.

RAID 1.
This is a classic “mirror”. Two (and only two!) hard drives work as one, being a complete copy of each other. Failure of either of these two drives does not result in loss of your data, as the controller continues to operate on the remaining drive. RAID1 in numbers: 2x redundancy, 2x reliability, 2x cost. Write performance is equivalent to that of a single hard drive. Read performance is higher because the controller can distribute read operations between two disks.

RAID 10.
The essence of this level is that the disks of the array are combined in pairs into “mirrors” (RAID 1), and then all these mirror pairs, in turn, are combined into a common striped array (RAID 0). That is why it is sometimes referred to as RAID 1+0. An important point is that in RAID 10 you can only combine an even number of disks (minimum 4, maximum 16). Advantages: reliability is inherited from the “mirror”, performance for both reading and writing is inherited from “zero”.

RAID 1E.
The letter "E" in the name means "Enhanced", i.e. "improved". The improvement works as follows: the data is striped in blocks across all disks of the array, and then striped again with a shift of one disk. RAID 1E can combine from three to 16 disks. Reliability matches that of the "ten", and performance is slightly better thanks to the greater interleaving.

RAID 1E0.
This level is implemented as follows: a "zero" array is created from RAID 1E arrays. Therefore the total number of disks must be a multiple of three: minimum three, maximum sixty! In this case we are unlikely to gain speed, and the complexity of the implementation may adversely affect reliability. The main advantage is the ability to combine a very large number of disks (up to 60) into one array.

What all RAID 1X levels have in common is their redundancy: for the sake of reliability, exactly 50% of the total capacity of the array's disks is sacrificed.

RAID is an abbreviation that stands for Redundant Array of Independent Disks (previously, the word Inexpensive was sometimes used instead of Independent). The concept of a structure of several disks combined into a group that provides fault tolerance was born in 1987 in the seminal work of Patterson, Gibson and Katz.

Original RAID types

RAID-0
If RAID stands for fault tolerance (Redundant...), then RAID-0 is "zero fault tolerance", the absence of it. The RAID-0 structure is a "striped disk array": data blocks are written in turn to all disks of the array, in order. This increases performance, ideally by as many times as there are disks in the array, since writing is parallelized across several devices.
However, reliability decreases by the same factor, since the data is lost if any disk of the array fails.

RAID-1
This is the so-called “mirror”. Write operations are performed on two disks in parallel. The reliability of such an array is higher than that of a single disk, but performance increases slightly (or does not increase at all).

RAID-10
An attempt to combine the advantages of the two RAID types and rid them of their inherent disadvantages. If we take RAID-0 groups with their increased performance and give each of them (or the entire array) "mirror" disks to protect the data from loss due to a failure, we get a fault-tolerant array whose performance is increased by striping.
Today this is one of the most popular RAID types "in the wild".
The disadvantage: we pay for all the above advantages with half of the total capacity of the disks included in the array.

RAID-2
Remained a purely theoretical option. It is an array in which the data is encoded with an error-correcting Hamming code, whose redundancy allows individual faulty fragments to be restored. Incidentally, various modifications of the Hamming code, as well as its successors, are used when reading data from the magnetic heads of hard drives and from optical CD/DVD readers.

RAID-3 and 4
A "creative development" of the idea of protecting data with a redundant code. The Hamming code is indispensable for a "permanently unreliable" stream saturated with continuous, weakly predictable errors, such as a noisy over-the-air communication channel. With hard drives, however, the main problem is not read errors (we assume that a working drive returns data exactly as we wrote it), but the failure of the entire drive.
For such conditions, you can combine a striping scheme (RAID-0) with protection against the failure of one of the disks: the recorded information is supplemented with redundancy data that allows the lost part to be reconstructed, and an additional disk is allocated for this.
If we lose any of the data disks, we can restore the data stored on it by simple mathematical operations on the redundancy data; if the disk with the redundancy data fails, we still have the data readable from the RAID-0-like disk array.
The RAID-3 and RAID-4 variants differ in that the former interleaves individual bytes, while the latter interleaves groups of bytes ("blocks").
The main disadvantage of both schemes is the extremely low write speed, since every write operation triggers an update of the "checksum", the redundancy block for the written information. It is clear that, despite the striped structure, the performance of a RAID-3 or RAID-4 array is limited by the performance of a single disk, the one holding the "redundancy block".
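The "simple mathematical operations" mentioned above come down to XOR parity. A minimal sketch of the principle: the parity block is the XOR of the data blocks of a stripe, so any single lost block can be rebuilt by XOR-ing everything that is left.

```python
# XOR parity behind RAID-3/4/5: compute a parity block for one stripe and
# rebuild a "failed" data block from the survivors plus the parity.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks of one stripe (3 data disks)
parity = xor_blocks(data)

# Disk holding data[1] "fails": rebuild its block from the other blocks.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```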

RAID-5
An attempt to get around this limitation gave rise to the next RAID type, currently the most widespread along with RAID-10. If writing the "redundancy block" to one disk limits the whole array, let us spread it across the disks of the array as well, without allocating a dedicated disk for this information; the redundancy update operations are then distributed across all disks of the array. That is, as with RAID-3(4), to store the data of N disks we take N + 1 disks, but unlike types 3 and 4, that extra disk is also used to store data mixed with redundancy data, just like the other N.
Flaws? How could there not be any? The problem of slow writes was partly solved, but not completely: writing to a RAID-5 array is still slower than writing to a RAID-10 array. But RAID-5 is more "cost-effective": in RAID-10 we pay for fault tolerance with exactly half of the disks, whereas in RAID-5 it is just one disk.

However, the write speed decreases as the number of disks in the array grows (unlike RAID-0, where it only increases). This is because, when writing a data block, the array has to recalculate the redundancy block, for which it reads the remaining "horizontal" blocks and recomputes the redundancy block from their data. That is, for one write operation an array of 8 disks (7 data disks + 1 additional) performs 6 reads into the cache (the remaining data blocks of the stripe, needed to compute the redundancy block), calculates the redundancy block, and performs 2 writes (the block of written data and the rewritten redundancy block). In modern systems, the problem is partly alleviated by caching, but nevertheless, lengthening a RAID-5 group, while giving a proportional increase in read speed, also brings a corresponding decrease in write speed.
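Counting the operations described in this paragraph for an arbitrary group size is straightforward; the sketch below only does the bookkeeping of the scheme as described in the text (real controllers also use read-modify-write of the old data and parity, plus caching, to cut the cost).

```python
# I/O operations caused by a single small write on an N-disk RAID-5,
# following the "read the rest of the stripe, recompute parity" scheme above.
def raid5_small_write_ios(n_disks: int) -> dict:
    return {
        "reads": n_disks - 2,   # the other data blocks of the stripe
        "writes": 2,            # the new data block and the updated parity block
    }

print(raid5_small_write_ios(8))   # {'reads': 6, 'writes': 2} -- the 8-disk example
```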
The situation with decreased performance when writing to RAID-5 sometimes gives rise to interesting extremism, for example, http://www.baarf.com/ ;)

However, since RAID-5 is the most efficient RAID structure in terms of disk consumption per “linear megabyte,” it is widely used where the reduction in write speed is not a decisive parameter, for example, for long-term data storage or for data that is primarily read.
Separately, it should be mentioned that expanding a RAID-5 disk array by adding an additional disk causes a complete recalculation of the entire RAID, which can take hours, and in some cases, days, during which the performance of the array drops catastrophically.

RAID-6
Further development of the RAID-5 idea. If we calculate additional redundancy according to a law different from that used in RAID-5, then we can maintain access to data if two disks of the array fail.
The price for this is an additional disk for the data of the second "redundancy block": to store data equal to the capacity of N disks, we need N + 2 disks. The "mathematics" of computing the redundancy blocks becomes more complicated, which reduces write speed even further compared to RAID-5, but reliability increases, in some cases even beyond the level of RAID-10. It is easy to see that RAID-10 can also survive the failure of two disks of the array, but only if the failed disks do not form a mirrored pair, and the probability of exactly that situation cannot be discounted.
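That last remark is easy to quantify: enumerate all two-disk failures in a RAID-10 and count how many hit both halves of the same mirror. The sketch assumes adjacent disks form the mirrored pairs; RAID-6, by construction, survives any two-disk failure.

```python
# Fraction of two-disk failures that destroy a RAID-10 volume built from
# mirrored pairs (disks 0-1, 2-3, ... are assumed to be the pairs).
from itertools import combinations

def raid10_double_failure_loss(n_disks: int) -> float:
    pairs = {(i, i + 1) for i in range(0, n_disks, 2)}
    failures = list(combinations(range(n_disks), 2))
    fatal = sum(1 for f in failures if f in pairs)
    return fatal / len(failures)

print(raid10_double_failure_loss(8))   # 4 fatal out of 28 combinations, ~14%
```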

Further increases in the RAID type numbers come from "hybridization": this is how RAID-0+1 (which became the RAID-10 already discussed) appeared, along with all sorts of chimerical RAID-51 and so on.
Fortunately, most of them are not found in the wild, usually remaining a figment of the imagination (with the exception of the RAID-10 described above).