RAID array of hard drives. Practical tips for creating RAID arrays on home PCs

So, today we'll talk about RAID arrays built from hard disks.

As you know, hard disks have a certain safety margin, after which they fail, as well as characteristics that affect performance.

As a result, many of you have probably heard, one way or another, about RAID arrays that can be built from ordinary hard disks in order to speed up those disks and the computer as a whole, or to make data storage more reliable.

You surely also know (and if you don't, it doesn't matter) that these arrays have different level numbers (0, 1, 2, 3, 4, etc.) and perform quite different functions. As you have already guessed, these RAID arrays are exactly what I want to tell you about in this article. More precisely, I'm already telling you ;)

Let's go.

What is RAID and why is it needed?

RAID is a disk array (that is, a set or, if you like, a bundle) of several devices, namely hard drives. As I said above, such an array serves to increase the reliability of data storage and/or the speed of reading and writing information.

Which of these the bundle of disks actually does, speeding up work or protecting data, depends on you, or more precisely on the RAID configuration you choose. The different configurations are denoted by different numbers (0, 1, 2, 3, 4 and so on) and, accordingly, perform different functions.

For example, if you build level 0 (the levels 0, 1, 2, 3, etc. are described below), you will get a noticeable increase in performance. In general, the HDD is nowadays the bottleneck in system speed.

Why did this happen?

Hard drives are growing only in capacity: their spindle speed (with the exception of rare models such as the Raptor) has been frozen at around 7200 rpm for quite some time, the cache is not really growing either, and the architecture remains almost the same.

In general, in terms of performance, disks are stagnating (only the still-developing SSDs can save the situation), yet they play a significant role in the operation of the system and, in places, of full-fledged applications.

If you build RAID 1, you will lose a little performance, but you will get a tangible guarantee of the safety of your data, because it will be fully duplicated: even if one disk fails, everything will still be present in full on the second, without any loss.

In general, I repeat, RAID arrays will be useful to everyone. I would even say that they are a must :)

What is RAID in the physical sense?

Physically, a RAID array consists of two to n hard drives connected to a motherboard that supports RAID (or to a dedicated controller, which is less common because controllers are expensive for the average user; they are usually used in servers for their increased reliability and performance). To the eye, nothing changes inside the system unit: there are no extra cables and no connections of the disks to each other or to anything else.

In general, the hardware is almost the same as always, and only the software approach changes: by selecting the RAID type, it determines exactly how the connected disks should work.

In software, no special quirks appear after creating a RAID array either. In fact, the only difference in working with an array is a small setup utility that actually organizes the array (see below), plus a driver. Otherwise EVERYTHING is absolutely the same: in "My Computer" you see the same C, D and other drives, the same folders and files. To the eye, the system looks completely identical in software too.

Installing the array is not difficult: take a motherboard that supports RAID, take two completely identical disks (this is important!), identical both in characteristics (size, cache, interface, etc.) and in manufacturer and model, and connect them to the motherboard. Then turn on the computer, go into the BIOS and set the SATA Configuration parameter to RAID.

After this, during boot (usually before Windows starts loading), a panel appears showing which disks are in the array and which are outside it; there you need to press Ctrl+I to configure the array (add disks to it, remove them, and so on). That's all, really. After that come the other joys of life; that is, again, everything is as always.

Important note to remember

When a RAID array is created or deleted, all information is inevitably deleted from the disks (this does not seem to apply to RAID 1, but that is not certain), so casually experimenting by creating and deleting various configurations is clearly not worth it. Therefore, before creating an array, first save all the information you need (if there is any), and only then experiment.

As for the configurations: as I already said, there are several types of RAID arrays (the main ones, at least, are RAID 1, RAID 2, RAID 3, RAID 4, RAID 5 and RAID 6). To begin with, I will talk about the two that are the most understandable and popular among ordinary users:

  • RAID 0: a disk array for increasing read/write speed.
  • RAID 1: a mirrored disk array.

And at the end of the article I’ll quickly go over the others.

RAID 0 - what is it and what is it used for?

So, RAID 0 (also known as striping) uses two to four hard drives (rarely more) that process information jointly, which increases performance. To make it clear: one person carrying bags takes longer and has a harder time than four people doing it (the bags themselves stay the same; only the forces handling them change). In software terms, information on an array of this type is split into data blocks that are written to the two (or more) disks in turn.

One block of data goes to one disk, the next block to another, and so on. This significantly increases performance (the number of disks determines how large the gain is, i.e. four disks will run faster than two), but the safety of the data on the entire array suffers. If any of the hard drives in such an array fails, all information is almost completely and irretrievably lost.

Why? Each file consists of a certain number of bytes, each of which carries information. But in a RAID 0 array the bytes of one file can be spread across several disks. Accordingly, if one of the disks "dies", an arbitrary number of the file's bytes will be lost and it will simply be impossible to recover it. And there is more than one file.
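The block-by-block placement described above can be sketched in a few lines of Python. This is only an illustration: the tiny 4-byte block size, the function names stripe and unstripe, and the in-memory lists standing in for disks are all assumptions made for the example, not the behavior of any particular controller.

```python
def stripe(data: bytes, num_disks: int, block_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them out to the disks in turn."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one block from each disk in turn."""
    result = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                result.append(disk[row])
    return b"".join(result)


disks = stripe(b"ABCDEFGHIJKLMNOP", num_disks=2)
restored = unstripe(disks)            # all disks present: the file is intact

# If disk 1 dies, only every other block of the file is left on disk 0.
surviving = b"".join(disks[0])
print(restored, surviving)
```

With both disks present the file reads back whole; with one disk gone, what remains is a file with regular holes in it, which is why a single failure ruins everything stored on the array.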

In general, when using such an array, it is strongly recommended to make regular backups of valuable information to external media. The array really does provide a noticeable speed gain. I'm telling you this from my own experience, because I have had this happiness installed at home for years.

RAID 1 - what is it and what is it used for?

And what about RAID 1 (mirroring)? I'll start with the drawback. Unlike RAID 0, you effectively "lose" the capacity of the second hard drive: it is used to hold a complete (byte-for-byte) copy of the first drive, whereas in RAID 0 that space is fully available.

The advantage, as you have already understood, is high reliability: everything keeps working (and all the data survives when one of the devices fails) as long as at least one disk is functioning. Even if you physically destroy one disk, you will not lose a single byte of information, because the second is an exact copy of the first and takes over when it fails. This type of array is often used on servers because of the exceptional survivability of the data, which is what matters there.
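As a rough illustration of the mirroring idea, here is a toy Python model. The Mirror class, its method names and the dictionaries standing in for disks are invented for this sketch; a real array does this at the controller or driver level.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy disks, reads use any healthy disk."""

    def __init__(self, num_disks: int = 2):
        self.disks = [dict() for _ in range(num_disks)]
        self.failed = set()

    def write(self, block_no: int, data: bytes) -> None:
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block_no] = data

    def fail(self, disk_no: int) -> None:
        """Simulate a dead disk: mark it failed and drop its contents."""
        self.failed.add(disk_no)
        self.disks[disk_no].clear()

    def read(self, block_no: int) -> bytes:
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block_no]
        raise IOError("all disks in the mirror have failed")


mirror = Mirror()
mirror.write(0, b"important")
mirror.fail(0)                 # the first disk is destroyed
recovered = mirror.read(0)     # the copy on the second disk is still there
```

The price is visible in the model as well: two disks hold exactly one disk's worth of data.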

With this approach, performance is sacrificed; by my personal impression it is even lower than with a single disk without any RAID. However, for some, reliability is much more important than performance.

RAID 2, 3, 4, 5, 6 - what are they and what are they used for?

These arrays are described here only as far as I am able, i.e. purely for reference, and in condensed form at that (in fact, only RAID 2 is described). Why? Mainly because of the low popularity of these arrays among average (and, in general, any other) users and, consequently, my limited experience with them.

RAID 2 is reserved for arrays that use a Hamming code (I wasn't interested in what that is, so I won't explain it). The principle of operation is roughly this: data is written to the data disks in the same way as in RAID 0, that is, split into small blocks across all the disks involved in storing information.

The remaining disks, specially set aside for the purpose, store error correction codes that can be used to restore information if a hard drive fails. So in arrays of this type the disks are divided into two groups: one for data and one for error correction codes.

For example, you have two disks that provide space for the system and files, and two more that are entirely dedicated to correction data in case the first two fail. In essence, this is something like RAID 0, only with the ability to at least somehow save information if one of the hard drives fails. It is extremely expensive: four disks instead of two, with a very questionable gain in safety.
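The Hamming code is left unexplained above, so here is a minimal Python sketch of the classic Hamming(7,4) code, purely as an illustration. It protects four data bits with three check bits, so it corresponds to a seven-drive layout rather than the four-drive example above. The function names are made up for this sketch, and a real RAID 2 array works on whole drives, not on a Python list of bits.

```python
DATA_POSITIONS = (3, 5, 6, 7)     # 1-based positions that carry data bits
CHECK_POSITIONS = (1, 2, 4)       # powers of two carry the check bits


def _syndrome(code: list[int]) -> int:
    """XOR of the 1-based positions of all set bits (index 0 is unused)."""
    result = 0
    for position in range(1, 8):
        if code[position]:
            result ^= position
    return result


def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Turn 4 data bits into a 7-bit codeword whose syndrome is zero."""
    code = [0] * 8
    for position, bit in zip(DATA_POSITIONS, data_bits):
        code[position] = bit
    syndrome = _syndrome(code)
    for position in CHECK_POSITIONS:
        code[position] = 1 if syndrome & position else 0
    return code[1:]


def hamming74_decode(codeword: list[int]) -> list[int]:
    """Fix at most one flipped bit, then return the 4 data bits."""
    code = [0] + list(codeword)
    syndrome = _syndrome(code)
    if syndrome:                   # a non-zero syndrome is the position of the bad bit
        code[syndrome] ^= 1
    return [code[position] for position in DATA_POSITIONS]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                    # corrupt one of the seven bits
repaired = hamming74_decode(damaged)
```

Any single damaged bit out of the seven can be located and flipped back, which is the property RAID 2 relies on.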

RAID 3, 4, 5, 6: strange as it may sound on the pages of this site, try reading about them on Wikipedia. The fact is that I have encountered these arrays extremely rarely (except that the fifth came my way more often than the others), I cannot describe how they work in accessible words, and I absolutely do not want to reprint an article from that resource, if only because of the infuriating wording in it, which even I can barely understand.

Which RAID should you choose?

If you play games, often copy music and movies, or install resource-intensive programs, then you will definitely find RAID 0 useful. But be careful when choosing hard drives, since their quality is especially important in this case, or be sure to make backups to external media.

If you work with valuable information whose loss would be tantamount to death, then you definitely need RAID 1: it is extremely difficult to lose information with it.

I repeat that it is highly desirable for the disks in a RAID array to be identical. Size, brand, series, cache size: everything should preferably be the same.

Afterword

That's how things are.

By the way, I described how to assemble this miracle in the article "How to create a RAID array using standard methods", and a couple of parameters in the article "RAID 0 of two SSDs: practical tests with Read Ahead and Read Cache". Use the search.

I sincerely hope that this article will be useful to you and that you will definitely build yourself a RAID array of one type or another. Believe me, it's worth it.

If you have questions about creating and configuring arrays, you can ask me in the comments and I'll try to help (if instructions for your motherboard are available online). I will also be glad to receive any additions, wishes, thoughts and all that stuff.

If you are interested in this article, then you have probably encountered or expect to soon encounter one of the following problems on your computer:

- there is clearly not enough physical capacity of the hard drive as a single logical drive. Most often this problem occurs when working with large files (video, graphics, databases);
- the hard drive's performance is clearly not enough. Most often, this problem occurs when working with non-linear video editing systems or when a large number of users simultaneously access files on the hard drive;
- the reliability of the hard drive is clearly lacking. Most often, this problem arises when it is necessary to work with data that must never be lost or that must always be available to the user. Sad experience shows that even the most reliable equipment sometimes breaks down and, as a rule, at the most inopportune moment.
Creating a RAID system on your computer can solve these and some other problems.

What is "RAID"?

In 1987, Patterson, Gibson, and Katz of the University of California, Berkeley, published "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This article described different types of disk arrays, denoted by the abbreviation RAID: Redundant Array of Independent (or Inexpensive) Disks. RAID is based on the following idea: by combining several small and/or cheap disk drives into an array, you can get a system that is superior in capacity, speed and reliability to the most expensive disk drives. On top of that, from the computer's point of view, such a system looks like one single disk drive.
It is known that the mean time between failures of a drive array is equal to the mean time between failures of a single drive divided by the number of drives in the array. As a result, the array's mean time between failures is too short for many applications. However, a disk array can be made tolerant of the failure of a single drive in several ways.
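The rule of thumb above is simple arithmetic, and a short Python sketch makes the effect concrete. The figure of 500,000 hours for a single drive is a hypothetical number chosen for the example, not a value from the article.

```python
def array_mtbf(single_drive_mtbf_hours: float, num_drives: int) -> float:
    """MTBF of a non-redundant array: single-drive MTBF divided by the drive count."""
    return single_drive_mtbf_hours / num_drives


# Hypothetical drive rated at 500,000 hours between failures.
for drives in (1, 2, 4, 8):
    print(drives, "drive(s):", array_mtbf(500_000, drives), "hours")
```

Under this assumption, four such drives in one array would be expected to see a failure four times as often as a single drive, which is why redundancy matters as arrays grow.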

In the above article, five types (levels) of disk arrays were defined: RAID-1, RAID-2, ..., RAID-5. Each type provided fault tolerance as well as different advantages over a single drive. Along with these five types, the RAID-0 disk array, which is NOT redundant, has also gained popularity.

What RAID levels are there and which one should you choose?

RAID-0. Typically defined as a non-redundant group of disk drives without parity. RAID-0 is sometimes called "Striping" because of the way information is placed on the drives included in the array.

Since RAID-0 does not have redundancy, failure of one drive leads to failure of the entire array. On the other hand, RAID-0 provides maximum data transfer speed and efficient use of disk drive space. Because RAID-0 does not require complex math or logic calculations, its implementation costs are minimal.

Scope of application: audio and video applications requiring high-speed continuous data transfer, which cannot be provided by a single drive. For example, research conducted by Mylex to determine the optimal disk system configuration for a non-linear video editing station shows that, compared to one drive, a RAID-0 array of two drives gives a 96% increase in write/read speed, and an array of three drives a 143% increase (according to the Miro VIDEO EXPERT Benchmark test).
The minimum number of drives in a "RAID-0" array is 2.

RAID-1. Better known as "Mirroring", this is a pair of drives that contain the same information and make up one logical drive.

Recording is performed on both drives in each pair. However, drives in a pair can perform simultaneous read operations. Thus, "mirroring" can double the read speed, but the write speed remains unchanged. RAID-1 has 100% redundancy and a failure of one drive does not lead to a failure of the entire array - the controller simply switches read/write operations to the remaining drive.
RAID-1 provides the highest speed among all types of redundant arrays (RAID-1 to RAID-5), especially in a multi-user environment, but the worst use of disk space. Because RAID-1 does not require complex mathematical or logical calculations, its implementation costs are minimal.
The minimum number of drives in a "RAID-1" array is 2.
To increase write speed and ensure reliable data storage, several RAID-1 arrays can, in turn, be combined into RAID-0. This configuration is called "two-level" RAID or RAID-10 (RAID 1+0, often loosely labeled RAID 0+1).

The minimum number of drives in a "RAID-10" array is 4.
Scope of application: cheap arrays in which the main thing is reliability of data storage.
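The two-level arrangement just described (mirrored pairs combined by striping) can be sketched as a mapping from a logical block number to the physical drives that receive it. The function name raid10_targets and the drive numbering (pair k consists of drives 2k and 2k+1) are assumptions made for this illustration.

```python
def raid10_targets(block_no: int, num_pairs: int) -> tuple[int, int]:
    """Physical drives that receive a logical block: both members of one mirrored pair."""
    pair = block_no % num_pairs          # RAID-0 step: pick a pair in turn
    return (2 * pair, 2 * pair + 1)      # RAID-1 step: write to both drives of the pair


# Four drives arranged as two mirrored pairs.
placement = [raid10_targets(block, num_pairs=2) for block in range(4)]
print(placement)   # [(0, 1), (2, 3), (0, 1), (2, 3)]
```

Consecutive blocks alternate between pairs, which gives the striping speed-up, while each block still exists on two drives.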

RAID-2. Distributes data into sector-sized stripes across a group of disk drives. Some drives are dedicated to ECC (Error Correction Code) storage. Since most drives store ECC codes on a per-sector basis by default, RAID-2 does not offer much benefit over RAID-3 and is therefore not used in practice.

RAID-3. As in the case of RAID-2, data is distributed over stripes of one sector in size, and one of the array drives is allocated to store parity information.

RAID-3 relies on the ECC codes stored in each sector to detect errors. If one of the drives fails, the information stored on it can be restored by computing the exclusive OR (XOR) of the information on the remaining drives. Each record is typically distributed across all drives, so this type of array is good for applications that load the disk subsystem heavily. Because each I/O operation accesses all the disk drives in the array, RAID-3 cannot perform multiple operations simultaneously. Therefore, RAID-3 is good for single-user, single-tasking environments with long records. Working with short records requires synchronizing the rotation of the disk drives, since otherwise a drop in transfer speed is inevitable. It is rarely used because it is inferior to RAID-5 in terms of disk space usage. Implementation requires significant costs.
The minimum number of disk drives in a "RAID-3" array is 3.
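The XOR recovery mentioned above can be shown with a small Python sketch. The three-byte "drives" and the helper name xor_blocks are illustrative assumptions; the point is only that XORing the surviving drives with the parity drive reproduces the missing one.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of several equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives and one parity drive (tiny blocks for readability).
data_drives = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data_drives)

# Drive 1 fails: XOR the remaining data drives with the parity drive.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
```

Here rebuilt equals the contents of the failed drive, because XORing a value in twice cancels it out and leaves only the missing block.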

RAID-4. RAID-4 is identical to RAID-3 except that the stripe size is much larger than one sector. In this case, reads are performed from a single drive (not counting the drive that stores parity information), so multiple read operations can be performed simultaneously. However, since each write operation must update the contents of the parity drive, it is not possible to perform multiple write operations simultaneously. This type of array does not have any noticeable advantages over a RAID-5 array.

RAID-5. This type of array is sometimes called a "rotating parity array". It successfully overcomes the inherent disadvantage of RAID-4, the inability to perform multiple write operations simultaneously. Like RAID-4, this array uses large stripes, but, unlike RAID-4, parity information is stored not on one drive but on all drives in turn.

Write operations access one drive with data and another drive with parity information. Since the parity information for different stripes is stored on different drives, multiple simultaneous writes are possible unless the data stripes or the parity stripes involved are on the same drive. The more drives in the array, the less often the locations of the data and parity stripes coincide.
Scope of application: reliable large-volume arrays. Implementation requires significant costs.
The minimum number of drives in a "RAID-5" array is 3.
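To visualize the rotating parity, here is a small Python sketch. The particular rotation (parity starts on the last drive and moves one drive to the left with each stripe) is only one possible scheme and is an assumption of this example, as are the function names. The check treats two writes as able to run together when they share no drive at all.

```python
def parity_drive(stripe_no: int, num_drives: int) -> int:
    """Index of the drive holding parity for this stripe (one possible rotation)."""
    return (num_drives - 1 - stripe_no) % num_drives


def layout(num_stripes: int, num_drives: int) -> list[list[str]]:
    """One row per stripe: 'P' marks the parity drive, 'D' a data drive."""
    return [
        ["P" if drive == parity_drive(stripe, num_drives) else "D"
         for drive in range(num_drives)]
        for stripe in range(num_stripes)
    ]


def drives_touched(stripe_no: int, data_drive: int, num_drives: int) -> set[int]:
    """A small write touches its data drive and the parity drive of its stripe."""
    return {data_drive, parity_drive(stripe_no, num_drives)}


def can_write_together(write_a, write_b, num_drives: int) -> bool:
    """Each write is (stripe_no, data_drive); they can overlap if no drive is shared."""
    return drives_touched(*write_a, num_drives).isdisjoint(
        drives_touched(*write_b, num_drives))


for row in layout(4, 4):
    print(" ".join(row))
```

With four drives, a write to stripe 0 on drive 0 and a write to stripe 1 on drive 1 use different data and parity drives, so can_write_together((0, 0), (1, 1), 4) is true; in RAID-4 both would queue for the single parity drive.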

RAID-1 or RAID-5?
RAID-5 uses disk space more economically than RAID-1, since for redundancy it stores not a "copy" of the information but parity (check) data. As a result, RAID-5 can combine any number of drives, and only one drive's worth of capacity will hold redundant information.
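The difference in disk space usage is easy to put into numbers. The sketch below assumes identical drives; the function name and the 1000 GB drive size are illustrative choices, not figures from the article.

```python
def usable_capacity_gb(level: str, num_drives: int, drive_size_gb: float) -> float:
    """Usable space for identical drives: a mirror keeps one copy, RAID-5 gives up one drive."""
    if level == "RAID-1":
        if num_drives != 2:
            raise ValueError("this sketch models RAID-1 as a single pair of drives")
        return drive_size_gb
    if level == "RAID-5":
        if num_drives < 3:
            raise ValueError("RAID-5 needs at least 3 drives")
        return (num_drives - 1) * drive_size_gb
    raise ValueError("only RAID-1 and RAID-5 are modeled here")


mirror_gb = usable_capacity_gb("RAID-1", 2, 1000)   # 1000 of 2000 GB, i.e. 50%
raid5_gb = usable_capacity_gb("RAID-5", 4, 1000)    # 3000 of 4000 GB, i.e. 75%
```

The more drives a RAID-5 array has, the smaller the share of space spent on redundancy, while a mirror always spends half.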
But higher disk space efficiency comes at the expense of lower data transfer rates. When writing information to RAID-5, the parity information must be updated each time. To do this, you need to determine which parity bits have changed. First, the old information to be updated is read. This information is then XORed with the new information. The result of this operation is a bit mask in which each bit equal to 1 means that the value at the corresponding position in the parity information must be inverted. The old parity information is then read and XORed with this mask, and the updated parity is written to the appropriate location along with the new data. Therefore, for each program request to write information, RAID-5 performs two reads, two writes, and two XOR operations.
There is a cost to using disk space more efficiently (storing a parity block instead of a copy of the data): additional time is required to generate and write parity information. This means that the write speed on RAID-5 is lower than on RAID-1 by a ratio of 3:5 or even 1:3 (i.e., the write speed on RAID-5 is 3/5 to 1/3 of the write speed of RAID-1). Because of this, there is little point in implementing RAID-5 in software. Nor can such arrays be recommended where write speed is critical.
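The parity update described in this section (two reads, two XORs, two writes) can be sketched as follows. The one-byte blocks and the function names are assumptions made for the illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Parity after overwriting one data block, without reading the other drives."""
    change_mask = xor_bytes(old_data, new_data)    # XOR 1: which bits changed
    return xor_bytes(old_parity, change_mask)      # XOR 2: flip those bits in the parity


# A stripe of three data blocks plus parity.
d0, d1, d2 = b"\x0f", b"\xf0", b"\x3c"
parity = xor_bytes(xor_bytes(d0, d1), d2)

# Overwrite d1: read old d1 and old parity, then write new d1 and new parity.
new_d1 = b"\x55"
new_parity = updated_parity(d1, parity, new_d1)

# Cross-check against recomputing parity from every data block.
recomputed = xor_bytes(xor_bytes(d0, new_d1), d2)
```

The shortcut gives the same parity as a full recomputation, but it still costs an extra read and an extra write per request, which is where the write penalty compared with RAID-1 comes from.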

Which RAID implementation method should you choose - software or hardware?

After reading the descriptions of the different RAID levels, you will notice that nowhere is there any mention of specific hardware requirements for implementing RAID. From this one might conclude that all that is needed to implement RAID is to connect the required number of drives to the controller already in the computer and install special software. This is true, but not entirely!
Indeed, RAID can be implemented in software. An example is Microsoft Windows NT 4.0 Server, which allows software implementation of RAID-0, -1 and even RAID-5 (Microsoft Windows NT 4.0 Workstation provides only RAID-0 and RAID-1). However, this solution should be regarded as extremely simplified, and it does not allow the capabilities of a RAID array to be fully realized. Suffice it to note that with software RAID the entire burden of placing information on the disk drives, calculating check codes, etc. falls on the CPU, which naturally does not improve the performance or reliability of the system. For the same reasons, there are practically no service functions, and all operations such as replacing a faulty drive, adding a new drive or changing the RAID level are carried out with complete loss of data and with all other operations prohibited. The only advantage of software RAID is its minimal cost.
Hardware implementation of RAID with a specialized controller has the following advantages:
- a specialized controller frees the central processor from basic RAID operations, and the higher the complexity of the RAID level, the more noticeable the controller's benefit;
- controllers, as a rule, come with drivers that allow you to create RAID under almost any popular OS;
- the controller's built-in BIOS and the management programs that come with it allow the system administrator to easily connect, disconnect or replace drives included in RAID, create multiple RAID arrays, even of different levels, monitor the status of the disk array, etc. With "advanced" controllers, these operations can be performed "on the fly", i.e. without turning off the system unit. Many operations can be performed in the "background", i.e. without interrupting current work, and even remotely, i.e. from any workplace (given access, of course);
- controllers can be equipped with buffer memory ("cache"), which stores the last few blocks of data and, with frequent access to the same files, can significantly increase the performance of the disk system.
The disadvantage of hardware RAID implementation is the relatively high cost of RAID controllers. However, on the one hand, you have to pay for everything (reliability, speed, service). On the other hand, with the recent development of microprocessor technology, the cost of RAID controllers (especially entry-level models) has begun to fall sharply and has become comparable to the cost of ordinary disk controllers, which makes it possible to install RAID systems not only in expensive mainframes, but also in entry-level servers and even workstations.

How to choose a RAID controller model?

There are several types of RAID controllers depending on their functionality, design and cost:
1. Drive controllers with RAID functionality.
In essence, this is an ordinary disk controller, which, thanks to special BIOS firmware, allows you to combine disk drives into a RAID array, usually of level 0, 1 or 0+1.

An example is the Mylex KT930RF (KT950RF) Ultra (Ultra Wide) SCSI controller.
Externally, this controller is no different from an ordinary SCSI controller. All the "specialization" is in the BIOS, which is divided into two parts: "SCSI Configuration" and "RAID Configuration". Despite its low cost (less than $200), this controller has a good set of functions:

- combining up to 8 drives into RAID 0, 1 or 0+1;
- Hot Spare support for on-the-fly replacement of a failed disk drive;
- the ability to automatically (without operator intervention) replace a faulty drive;
- automatic checking of data integrity and identity (for RAID-1);
- password protection for access to the BIOS;
- RAIDPlus program that provides information about the state of drives in RAID;
- drivers for DOS, Windows 95, NT 3.5x, 4.0

Today we will find out what a RAID array is and what role these arrays play in the life of hard drives.

Hard drives themselves play a fairly important role in a computer, since we run the system from them and store a lot of information on them.

Time passes, and any hard drive can fail.

I hope that many of you have heard about so-called RAID arrays, which allow you not only to speed up hard drives but also, if something happens, to save important data from disappearing, perhaps forever.

Also, these arrays have level numbers, by which they are distinguished. Each performs different functions. For example, there are RAID 0, 1, 2, 3, 4, 5, etc. Today we will talk about these arrays, and later I will write an article on how to use some of them.

What is a RAID array?

RAID is a technology that allows you to combine several devices, namely hard drives, into something like a bundle. In this way we increase the reliability of data storage and the read/write speed, or perhaps only one of the two.

So whether you want to speed up your disks or simply protect your information is up to you. More precisely, it depends on the RAID configuration you choose; these configurations are denoted by numbers 1, 2, 3...

RAID arrays are a very useful feature, and I recommend them to everyone. For example, if you use configuration 0, you will see an increase in hard disk speed; after all, the hard disk is almost the slowest device in the system.

If you ask why, I think everything is clear. Processors become more powerful every year: they get higher frequencies, more cores, and much more. The same goes for other components. But hard drives so far are growing only in capacity, while the spindle speed remains the same 7200 rpm. Of course, there are also rarer models. So far the situation has been saved by SSDs, which speed up the system several times.

Let's say you decide to build RAID 1. In this case you get a strong guarantee that your data is protected, since it is duplicated on another device (disk); if one hard drive fails, all the information remains on the other.

As you can see from the examples, RAID arrays are very important and useful, and they are worth using.

So, physically a RAID array is a combination of two hard drives, or perhaps three or four, connected to the motherboard. By the way, the motherboard must also support the creation of RAID arrays. The hard disks are connected in the standard way, and the array itself is created at the software level.

Once the array has been created in software, nothing much changes to the eye: you only do some work in the BIOS, and everything else stays as it was. That is, when you look into My Computer, you will see the same connected drives.

To create an array you don't need much: a motherboard with RAID support and two identical hard drives (this is important). They should be the same not only in size but also in cache, interface, etc. It is desirable that the manufacturer be the same too. Now turn on the computer, go into the BIOS, find the SATA Configuration parameter and set it to RAID. After the computer restarts, a window should appear showing information about the disks and arrays. There, press CTRL+I to start setting up the array, that is, adding disks to it or removing them.

How many of these raids are there? There are several of them, namely RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6. I will talk in more detail about only two of them.

  1. RAID 0: allows you to create a disk array in order to increase the read/write speed.
  2. RAID 1: allows you to create mirrored disk arrays to protect data.

RAID 0, what is it?

A RAID 0 array, also called "striping", uses 2 to 4 hard drives, rarely more. Working together, they improve performance. Data in such an array is divided into blocks, which are then written to several disks in turn.

Performance increases because one block of data is written to one disk, the next block to another disk, and so on. I think it is clear that 4 disks will increase performance more than two. As for data safety, it suffers across the entire array: if one of the disks fails, in most cases all information will be lost forever.

The fact is that in a RAID 0 array information is spread over all the disks, that is, the bytes of a single file are located on several disks. Therefore, if one disk fails, part of the data is lost along with it, and recovery is impossible.

It follows that you need to make regular backups to external media.

RAID 1, what is it?

A RAID 1 array is also called mirroring. As for the disadvantage, in RAID 1 the capacity of one of the hard drives is, as it were, "unavailable" to you, because it is used to duplicate the first disk. In RAID 0 this space is available.

The advantage, as you have probably guessed, is that the array provides high data reliability: if one disk fails, all the data remains on the second. Failure of two disks at once is unlikely. Such an array is often used on servers, but nothing prevents it from being used on ordinary computers.

If you choose RAID 1, be aware that performance will drop, but if your data is important to you, use this approach.

RAID 2-6, what is it?

Now I will briefly describe the remaining arrays, for general knowledge, so to speak, since they are not as popular as the first two.

RAID 2 is intended for arrays that use a Hamming code (I wasn't interested in what kind of code that is). The principle of operation is roughly the same as in RAID 0: information is also divided into blocks and written to the disks in turn. The remaining disks are used to store error correction codes, with which data can be recovered if one of the disks fails.

True, for this array it is better to use 4 disks, which is quite expensive, and, as it turns out, with that many disks the performance gain is rather questionable.

RAID 3, 4, 5, 6: I won't write about these arrays here, because the necessary information is already on Wikipedia; if you want to learn about them, read it there.

Which RAID array to choose?

Let's say you often install various programs and games and copy a lot of music or movies; then RAID 0 is recommended for you. Be careful when choosing hard drives: they must be very reliable so that you do not lose information. Be sure to back up your data.

Do you have important information that must stay safe and sound? Then RAID 1 comes to the rescue. When choosing hard drives, make sure their characteristics are identical here as well.

Conclusion

So we have gone over what is new information about RAID arrays for some and old for others. I hope you find it useful. Soon I will write about how to create these arrays.

A RAID array (Redundant Array of Independent Disks) is a combination of several devices that increases the performance and/or reliability of data storage.

According to Moore's law, performance keeps growing (namely, the number of transistors on a chip doubles every 2 years). This can be seen in almost every area of computer hardware. Processors gain cores and transistors while the manufacturing process shrinks, RAM gains frequency and throughput, and solid-state drive memory gains wear resistance and read speed.

But ordinary hard drives (HDDs) have not advanced much over the past 10 years. The standard speed was 7200 rpm and remains so (not counting server HDDs at 10,000 rpm or more). Slow 5400 rpm drives are still found in laptops. For most users, the most convenient way to increase computer performance is to buy an SSD, but the price per gigabyte of such media is significantly higher than that of an ordinary HDD. How can you improve drive performance without a serious loss of money and capacity? How can you protect your data or make it more secure? There is an answer to these questions: a RAID array.

Types of RAID arrays

At the moment the following types of RAID arrays exist:

RAID 0 or "Striping" - an array of two or more disks that improves overall performance. The array's capacity is the sum of its members (HDD 1 + HDD 2 = total capacity), and the read/write speed is higher (because writes are split across 2 devices), but reliability suffers. If one of the devices fails, all information in the array is lost.

RAID 1 or "Mirror" - several disks that copy each other to increase reliability. Write speed stays at the same level, read speed increases, and reliability increases many times over (even if one device fails, the second keeps working), but the cost of 1 gigabyte of storage doubles (if you build the array from two HDDs).

RAID 2 - an array made up of disks that store information and disks that store error correction codes. The number of HDDs for storing information is calculated using the formula "2^n-n-1", where n is the number of correction HDDs. This type is used with large numbers of HDDs; the minimum reasonable number is 7, of which 4 store information and 3 store error correction codes. The advantage of this type is higher performance compared with a single disk.
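
The formula above is easy to check with a few lines of Python. This is only a sketch of the arithmetic; the function name raid2_data_disks is made up for illustration and is not part of any RAID tool.

```python
def raid2_data_disks(correction_disks: int) -> int:
    """Number of data disks covered by n correction disks: 2^n - n - 1."""
    if correction_disks < 2:
        raise ValueError("the formula needs at least 2 correction disks")
    return 2 ** correction_disks - correction_disks - 1


# Start from n = 3, the minimum configuration mentioned in the text.
for n in range(3, 6):
    data = raid2_data_disks(n)
    print(f"{n} correction disks -> {data} data disks, {n + data} disks in total")
```

For n = 3 this gives the 4 + 3 = 7 disks mentioned above, and the total grows quickly after that, which is one reason the scheme only makes sense with many drives.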

RAID 3 - consists of n disks, of which "n-1" store information and one stores parity blocks. Information is split into pieces smaller than a sector (into bytes), which suits work with large files well, but the read speed for small files is very low. It offers high performance, but with low reliability and a narrow specialization.

RAID 4 - similar to type 3, but data is divided into blocks rather than bytes. This fixed the low read speed for small files, but write speed remained low.

RAID 5 and 6 - instead of a separate disk for error correction, as in the previous types, parity blocks are distributed evenly across all devices. This increases read/write speed because operations are parallelized. The drawback of this type is the long recovery time after one of the disks fails. During recovery the other devices are under very heavy load, which reduces reliability and raises the risk that another device fails and all the array's data is lost. Type 6 improves overall reliability but reduces performance.
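
To see why one lost disk can be rebuilt, here is a minimal sketch of XOR parity, the idea behind single-parity arrays such as RAID 5. The block contents and the helper name xor_blocks are invented for the example; real controllers work on much larger blocks, and RAID 6 adds a second checksum computed with a different algorithm that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of several equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, each living on its own disk.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# The disk holding the second block dies.
# XOR of everything that survived (data + parity) gives the lost block back.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

The rebuild has to read every surviving disk in full, which matches the heavy load during recovery described above.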

Combined types of RAID arrays:

RAID 01 (0+1) - two RAID 0 arrays combined into a RAID 1.

RAID 10 (1+0) - RAID 1 arrays that are combined in a type 0 architecture. It is considered the most reliable storage option, combining high reliability with performance.
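
The difference between the two nestings shows up when two disks fail at once. The sketch below counts, for a four-disk array, how many of the six possible two-disk failures each layout survives. It is a simplified model with assumed rules (a stripe with any failed member counts as completely lost), and the disk names are arbitrary.

```python
from itertools import combinations

DISKS = ["A", "B", "C", "D"]
GROUPS = [{"A", "B"}, {"C", "D"}]  # the two inner arrays


def raid10_survives(failed):
    """RAID 10: each group is a mirror and needs at least one live disk."""
    return all(group - failed for group in GROUPS)


def raid01_survives(failed):
    """RAID 01: each group is a stripe; one stripe must be fully intact."""
    return any(not (group & failed) for group in GROUPS)


def survival_count(survives):
    """How many two-disk failure combinations the layout survives."""
    return sum(survives(set(pair)) for pair in combinations(DISKS, 2))


print("RAID 10 survives", survival_count(raid10_survives), "of 6 double failures")
print("RAID 01 survives", survival_count(raid01_survives), "of 6 double failures")
```

Under these assumptions RAID 10 comes out ahead, which is in line with its reputation as the more reliable of the two.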

You can also create an array from SSDs. According to testing by 3DNews, such a combination does not give a significant gain. It is better to buy a drive with a faster interface, such as PCI Express or eSATA.

RAID array: how to create one

An array is created by connecting the disks through a special RAID controller. At the moment there are 3 types of controllers:

  1. Software - the array is emulated in software, and all calculations are performed by the CPU.
  2. Integrated - found mainly on motherboards (outside the server segment). A small chip on the motherboard is responsible for emulating the array, while the calculations are performed by the CPU.
  3. Hardware - an expansion card (for desktop computers), usually with a PCI interface, with its own memory and processor.

RAID HDD array: how to make one from 2 disks via IRST


Data recovery

Some data recovery options:

  1. If a RAID 0 or 5 array fails, the RAID Reconstructor utility can help: it gathers the information available on the drives and writes it to another device or medium as an image of the former array. This option helps if the disks themselves are in working order and the fault is in software.
  2. On Linux systems, recovery is done with mdadm (a utility for managing software RAID arrays).
  3. Recovery of hardware arrays should be left to specialized services, because without knowing how the controller works you can lose all the data, and getting it back will be very difficult or even impossible.
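
Before attempting any recovery of a Linux software array, it helps to know which arrays are actually degraded. The kernel reports this in /proc/mdstat, where a status such as [3/2] means that 3 devices are expected and 2 are active. The sketch below parses such text; the sample is made up for illustration, the function name degraded_arrays is hypothetical, and real output may contain additional lines (for example, rebuild progress) that this simple parser ignores.

```python
import re

# Illustrative sample only; on a real system read the text from /proc/mdstat.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdd1[1] sdc1[0]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays with fewer active devices than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and status:
            expected, active = map(int, status.groups())
            if active < expected:
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live machine the same function could be fed the contents of /proc/mdstat; the actual repair is still done with mdadm itself.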

There are many nuances to take into account when creating a RAID on your computer. Most of the options are used mainly in the server segment, where data stability and safety are important and necessary. If you have questions or additions, you can leave them in the comments.

Have a great day!

Today we will talk about RAID arrays. Let's figure out what they are, why we need them, what kinds exist and how to use all this magnificence in practice.

So, in order: what is a RAID array, or simply RAID? The abbreviation stands for "Redundant Array of Independent Disks". To put it simply, a RAID array is a collection of physical disks combined into one logical disk.

Usually it is the other way around: one physical disk is installed in the system unit, and we divide it into several logical ones. Here the situation is reversed: several hard drives are first combined into one, and the operating system then perceives them as a single disk. I.e. the OS firmly believes that it physically has only one disk.

RAID arrays can be hardware or software.

Hardware RAID arrays are created before the OS boots, using special utilities built into the RAID controller - something like a BIOS. Because the array already exists at the OS installation stage, the installer "sees" a single disk.

Software RAID arrays are created by means of the OS. I.e. during boot the operating system "understands" that it has several physical disks, and only after the OS starts are the disks combined into arrays in software. As a rule, the operating system itself is not located on the RAID array, since it is installed before the array is created.

"Why is all this needed?" you ask. The answer: to increase the speed of reading/writing data and/or to increase fault tolerance and safety.

"How can a RAID array increase speed or protect data?" To answer this question, let's consider the main types of RAID arrays, how they are formed and what they give as a result.

RAID-0. Also called "Stripe". Two or more hard drives are combined into one by joining them sequentially and summing their capacities. I.e. if we take two 500GB disks and build a RAID-0 from them, the operating system will perceive this as one 1-terabyte disk. The read/write speed of this array will be up to twice that of a single disk: for example, if a database is physically spread across the two disks in this way, one user can read data from one disk while another user writes to the other disk at the same time. If the database sits on a single disk, the hard disk executes the read/write requests of different users one after another. RAID-0 allows reading/writing in parallel. As a consequence, the more disks in a RAID-0 array, the faster the array works. In theory the dependence is directly proportional: the speed increases N times, where N is the number of disks in the array.
A RAID-0 array has only one drawback, but it outweighs all the advantages of using it: a complete lack of fault tolerance. If one of the array's physical disks dies, the entire array dies. There is an old joke about this: "What does the '0' in the name RAID-0 mean? The amount of information restored after the death of the array!"
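
How the data ends up on different disks can be shown with a tiny sketch. It assumes the simplest possible scheme, where logical blocks are handed out to the disks in turn; the function name locate_block is made up for the example, and real arrays work with configurable stripe sizes.

```python
def locate_block(logical_block: int, disk_count: int):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % disk_count, logical_block // disk_count


# Six consecutive logical blocks on a two-disk array.
for block in range(6):
    disk, offset = locate_block(block, disk_count=2)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Neighbouring blocks land on different disks, so a long read or write keeps both drives busy at once. It also shows why losing one disk ruins everything: every file larger than one block has pieces on each drive.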

RAID-1. Also called "Mirror". Two or more hard drives are combined into one by joining them in parallel. I.e. if we take two 500GB disks and build a RAID-1 from them, the operating system will perceive this as one 500GB disk. The read/write speed of this array will be the same as that of a single disk, since information is read from/written to both disks simultaneously. RAID-1 gives no gain in speed but provides greater fault tolerance, since if one of the hard drives dies there is always a complete duplicate of the information on the second drive. Remember that fault tolerance covers only the death of one of the array's disks. If data is deleted deliberately, it is deleted from all disks of the array simultaneously!
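
The last point, that a mirror is not a backup, is easy to demonstrate with a toy model. The Mirror class below is purely illustrative (dictionaries stand in for disks, and the file names are invented); it only shows that writes and deletions go to every member, while a disk failure leaves the remaining copy usable.

```python
class Mirror:
    """Toy RAID-1: every write and delete is applied to all member disks."""

    def __init__(self, disk_count=2):
        self.disks = [dict() for _ in range(disk_count)]

    def write(self, name, data):
        for disk in self.disks:
            disk[name] = data

    def delete(self, name):
        for disk in self.disks:
            disk.pop(name, None)

    def fail_disk(self, index):
        """Simulate the death of one physical disk."""
        self.disks.pop(index)

    def read(self, name):
        # Any surviving disk holds a full copy of the data.
        return self.disks[0].get(name) if self.disks else None


mirror = Mirror()
mirror.write("report.txt", b"quarterly numbers")
mirror.write("photo.jpg", b"holiday picture")
mirror.delete("photo.jpg")   # the deletion is mirrored as well
mirror.fail_disk(0)          # one disk dies

print(mirror.read("report.txt"))  # still readable from the second disk
print(mirror.read("photo.jpg"))   # gone from every copy
```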

RAID-5. A safer variant of RAID-0. The capacity of the array is calculated using the formula (N - 1) * DiskSize, where N is the number of disks in the array and DiskSize is the size of each disk. I.e. when creating a RAID-5 from three 500GB disks, we get an array of 1 terabyte. The essence of a RAID-5 array is that the disks are striped as in RAID-0, while the capacity of one disk is set aside for the so-called "checksum" (parity) - service information used to restore one of the array's disks if it dies. In practice the checksum blocks are spread across all the disks rather than kept on one dedicated disk. The write speed of a RAID-5 array is somewhat lower, since time is spent calculating and writing the checksum, but the read speed is the same as in RAID-0.
If one of the disks of a RAID-5 array dies, the read/write speed drops sharply, since all operations are accompanied by additional manipulations. In effect RAID-5 turns into RAID-0, and if the array is not restored in time there is a significant risk of losing the data completely.
A RAID-5 array can use a so-called Spare disk, i.e. a spare. During stable operation of the array this disk is idle and not used. However, in a critical situation the restoration of the array starts automatically: the information from the damaged disk is rebuilt onto the spare using the checksums and the data on the surviving disks.
RAID-5 is built from at least three disks and protects against single failures. If different failures appear on different drives at the same time, RAID-5 will not save you.
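
In common RAID-5 implementations the checksum (parity) block does not sit on one fixed disk but moves from disk to disk with every stripe, so that no single drive becomes a bottleneck. The sketch below prints one possible rotation; the exact order differs between controllers, and the function name raid5_layout is invented for the example.

```python
def raid5_layout(disk_count: int, stripes: int):
    """Return one row per stripe; 'P' marks the parity block, 'Dn' data blocks."""
    rows = []
    block = 0
    for stripe in range(stripes):
        # Parity starts on the last disk and shifts one disk left per stripe.
        parity_disk = (disk_count - 1 - stripe) % disk_count
        row = []
        for disk in range(disk_count):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(disk_count=3, stripes=3):
    print(" ".join(f"{cell:>3}" for cell in row))
```

Each row holds N - 1 data blocks and one parity block, which is where the (N - 1) * DiskSize capacity formula comes from.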

RAID-6 - an improved version of RAID-5. The essence is the same, except that the capacity of two disks rather than one is used for checksums, and the checksums are calculated using different algorithms, which significantly increases the fault tolerance of the whole array. RAID-6 is assembled from at least four disks. The formula for the capacity of the array is (N - 2) * DiskSize, where N is the number of disks in the array and DiskSize is the size of each disk. I.e. when creating a RAID-6 from five 500GB disks, we get an array of 1.5 terabytes.
The write speed of RAID-6 is lower than that of RAID-5 by about 10-15%, because of the additional time spent calculating and writing checksums.
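
The capacity rules described so far fit into one small helper. This is just the arithmetic from the text, assuming all disks are the same size; the function name usable_capacity_gb and the way the levels are passed in are choices made for this example.

```python
def usable_capacity_gb(level: str, disk_count: int, disk_size_gb: float) -> float:
    """Usable capacity of an array of identical disks for RAID 0, 1, 5 and 6."""
    if level == "0":
        minimum, usable_disks = 2, disk_count        # capacities are summed
    elif level == "1":
        minimum, usable_disks = 2, 1                 # every disk holds the same copy
    elif level == "5":
        minimum, usable_disks = 3, disk_count - 1    # (N - 1) * DiskSize
    elif level == "6":
        minimum, usable_disks = 4, disk_count - 2    # (N - 2) * DiskSize
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} disks")
    return usable_disks * disk_size_gb


# The examples used in the text, all with 500GB disks.
for level, disks in [("0", 2), ("1", 2), ("5", 3), ("6", 5)]:
    print(f"RAID-{level}, {disks} x 500GB -> {usable_capacity_gb(level, disks, 500)} GB")
```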

RAID-10 - also written as RAID 1+0 (the closely related RAID 0+1 nests the same two levels the other way round). It is a symbiosis of RAID-0 and RAID-1. The array is built from at least four disks: the disks are paired into RAID-1 mirrors for fault tolerance, and the mirrors are striped together as RAID-0 to increase read/write speed. Thus RAID-10 combines the advantages of the first two types: it is fast and fault-tolerant.

RAID-50 - like RAID-10, it is a symbiosis, this time of RAID-0 and RAID-5: in effect a RAID-0 is built, only its constituent elements are not independent hard drives but RAID-5 arrays. Thus RAID-50 gives very good read/write speed and retains the resilience and reliability of RAID-5.

RAID-60 - the same idea: in effect we have a RAID-0 assembled from several RAID-6 arrays.

There are also other combined arrays, RAID 5+1 and RAID 6+1 - they resemble RAID-50 and RAID-60, with the only difference that the RAID-5 or RAID-6 arrays are joined not into a RAID-0 stripe but into a RAID-1 mirror.

As you can see, the combined RAID arrays - RAID-10, RAID-50, RAID-60 and the RAID X+1 variants - are direct descendants of the basic array types RAID-0, RAID-1, RAID-5 and RAID-6, and serve only to increase either read/write speed or fault tolerance, while carrying the functionality of their basic, parent types.

If we move on to practice and talk about using particular RAID arrays in real life, the logic is quite simple:

RAID-0 in its pure form we do not use at all;

RAID-1 we use where read/write speed is not particularly important but fault tolerance is - for example, RAID-1 is a good place to install operating systems. In this case nothing but the OS accesses the disks, the speed of the hard disks themselves is quite sufficient, and fault tolerance is ensured;

RAID-5 we install where speed and fault tolerance are needed but there is not enough money to buy more hard drives, or where arrays must be restored after damage without stopping work - Spare drives help here. A common application of RAID-5 is data storage;

RAID-6 is used where it is simply scary, or where there is a real threat of several disks in the array dying at once. In practice it is quite rare, mainly among the paranoid;

RAID-10 is used where work must be both fast and reliable. The main areas of use for RAID-10 are file servers and database servers.

Again, simplifying further, we conclude that where there is no heavy, high-volume work with files, RAID-1 is quite enough: operating system, AD, TS, mail, proxy, etc. Where serious work with files is required: RAID-5 or RAID-10.

The ideal solution for a database server seems to be a machine with six physical disks, two of which are combined into a RAID-1 mirror with the OS installed on it, while the remaining four are combined into RAID-10 for fast and reliable work with data.

If, after reading all of the above, you decide to install RAID arrays on your servers but do not know how to do it or where to start - contact us! We will help you choose the necessary equipment and will also carry out the installation work to implement RAID arrays.