
RAID and speed. An example of building an array based on ASUS

Enthusiasts have an unwritten rule: the Western Digital WD1500 Raptor hard drive is the ideal desktop model if you need maximum performance. But not every user can follow this path, since spending $240 on a hard drive with a capacity of only 150 GB is not a very attractive proposition. Is the Raptor still the best choice? Its price hasn't changed for many months, and today you can easily buy a pair of 400GB drives for that kind of money. Isn't it time to compare the performance of modern RAID arrays with the Raptor?

Enthusiasts are familiar with the Raptor hard drives, as the Raptor is the only 3.5" desktop hard drive that spins at 10,000 rpm. Most drives in this market sector run at 7,200 rpm; only expensive server hard drives rotate faster. The 36GB and 74GB WD Raptor hard drives were introduced three years ago. About a year ago, Western Digital brought the Raptor X to market, which provides even higher performance; there are also models with a transparent cover that lets you look inside the hard drive.

Western Digital Raptor hard drives have outperformed all other 3.5" Serial ATA desktop drives since their release, although they were initially positioned for low-cost servers.

A spindle speed of 10,000 rpm offers two significant advantages. First, the data transfer rate increases markedly: the maximum sequential read speed may not be very impressive, but the minimum speed is far superior to any 7,200 rpm hard drive. In addition, a 10,000 rpm hard drive has lower rotational latency, which means it takes less time for the drive to retrieve data after the read/write heads are positioned.
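The latency advantage is easy to quantify: on average the platter must make half a revolution before the requested sector passes under the heads. A quick sketch of the arithmetic (the formula is standard; the spindle speeds are the ones discussed here):

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: time for half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm  # one full revolution
    return ms_per_revolution / 2

print(avg_rotational_latency_ms(7_200))   # ~4.17 ms
print(avg_rotational_latency_ms(10_000))  # 3.0 ms
```

So the 10,000 rpm spindle shaves more than a millisecond off every random access before seek time is even counted.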

The main disadvantage of the WD Raptor is the price: about $240 for the 150GB model. Among other disadvantages, we note a higher (though not critical) noise level and higher heat dissipation. However, enthusiasts can easily put up with such shortcomings if this hard drive delivers higher storage subsystem performance.

If you calculate the cost of storing a gigabyte of data, the Raptor no longer looks so attractive. For $240, you can get a pair of 400 GB hard drives, and the $300 mark for the 750 GB Seagate Barracuda 7200.10 is not far away. If you look at the low-cost segment, you can grab a pair of 160GB 7,200 rpm hard drives for $50 each, which will provide the same storage capacity as the Raptor at less than half the price. Therefore, today even enthusiasts often ask themselves: is it worth taking a WD Raptor, or is it better to choose a RAID 0 configuration of two 7,200 rpm hard drives?
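The cost-per-gigabyte comparison is trivial arithmetic; a quick sketch using the prices quoted in this article:

```python
# (price in USD, capacity in GB) - figures taken from this article
drives = {
    "WD Raptor 150GB (10,000 rpm)": (240, 150),
    "WD4000KD 400GB (7,200 rpm)": (130, 400),
    "160GB (7,200 rpm)": (50, 160),
}

for name, (price, gb) in drives.items():
    print(f"{name}: {price / gb:.3f} $/GB")
```

The printed figures make the per-gigabyte gap between the Raptor and the 7,200 rpm drives obvious.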

A RAID 0 array does not reduce access time, but it practically doubles sequential read speed, since the data is split between two hard drives. The disadvantage is an increased risk of data loss: if one hard drive fails, the entire array is lost (although RAID data recovery services do exist today). Many onboard controllers on high-end motherboards support RAID modes that are easy to set up.

Fast or sane hard drive?

Configuration | Performance | Capacity | Data storage security | Price
One drive (7,200 rpm) | Good | Adequate to excellent | Sufficient * | Low to high: $50 to $300
150GB WD Raptor (10,000 rpm) | Excellent | Sufficient | Sufficient * | High: $240+
2x 160 GB (7,200 rpm) | Very good to excellent | Good to excellent | Insufficient * | Low to high: $50 per drive
2x 150 GB WD Raptor (10,000 rpm) | Excellent | Good | Insufficient * | High to very high: $240 per drive

* It should be remembered that any hard drive will fail sooner or later. The technology is based on mechanical components, and their lifetime is limited. Manufacturers specify a Mean Time Between Failures (MTBF) for hard drives. If you set up a RAID 0 array on two 7,200 rpm hard disks, the risk of data loss doubles, because if one hard drive fails, you lose the entire RAID 0 array. Therefore, regularly back up important data and create an image of the operating system.
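The doubled risk follows from elementary probability: a RAID 0 array survives only while every member drive survives. A small sketch (the 2% annual failure probability is an illustrative assumption, not a figure from this article):

```python
def raid0_loss_probability(p_drive: float, n_drives: int) -> float:
    """Probability of losing a RAID 0 array: any single drive failure destroys it."""
    return 1 - (1 - p_drive) ** n_drives

p = 0.02  # assumed annual failure probability of a single drive
print(raid0_loss_probability(p, 1))  # ~0.02
print(raid0_loss_probability(p, 2))  # ~0.0396, roughly double
```

For small per-drive probabilities, the array risk is close to n times the single-drive risk, which is where "the risk doubles" comes from.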

Today you can buy 40-80 GB hard drives for next to nothing, and if you have no special capacity requirements, such a volume may well be enough even today. However, we recommend hard drives priced at $50-$70, as you can easily get models with capacities from 120GB to 200GB. 250 and 320 GB models have already begun to appear in online stores for less than $100. For the money you would spend on a 10,000 rpm WD Raptor, you can easily get 800GB to 1TB of capacity on 7,200 rpm hard drives.

If you do not need such a high capacity, you can make do with entry-level 7,200 rpm hard drives. Two WD1600AAJS drives from Western Digital cost $55 each, and you can easily get 320GB in RAID 0: half the money for twice the capacity. How justified are these savings? Let's figure it out.

7,200 or 10,000 rpm? RAID 0 or Raptor?

We decided to test different hard drive configurations: a single WD Raptor WD1500ADFD, a single WD4000KD, a pair of Raptors in RAID 0, and a pair of WD4000KDs in RAID 0. We chose 400GB 7,200 rpm WD hard drives because two of them cost roughly the same as one Raptor. Let's see how well a "budget" RAID array compares to a single Raptor.

The WD4000KD is equipped with a 16 MB cache and has a Serial ATA/150 interface. The main differences compared to the 10,000 rpm WD Raptor lie in performance and capacity. Per gigabyte of storage, the Raptor costs about six times as much as the 400GB WD4000KD. The benchmarks will show how big the performance differences are. At the time of publication, the price of the WD4000KD Caviar was $130.

The Raptor is the undisputed performance champion in the desktop PC market, but it is also the most expensive hard drive. The WD1500 Raptor uses the Serial ATA/150 interface, which is still sufficient: looking at the benchmark results, no other hard drive can beat the Raptor, even with a 300 MB/s SATA interface. In general, SATA interface speed should not influence the purchasing decision. At the time of publication, the price of the WD1500ADFD Raptor was $240.

This configuration is meant to battle the WD1500 Raptor. Will two WD4000KD hard drives in RAID 0 beat the Raptor?

This scenario is the most expensive in our testing, as it requires two WD Raptor hard drives, but it is very interesting nonetheless. Two 10,000 RPM Raptor hard drives in a RAID 0 array should literally rip everyone apart.

RAID 0

Performance

In theory, RAID 0 is an ideal solution for increasing performance, because the sequential data transfer rate scales almost linearly with the number of hard drives in the array. Files are distributed block by block across all hard drives; that is, the RAID controller writes data almost simultaneously to several hard drives. The transfer-rate gains of RAID 0 are noticeable in almost all scenarios, although access times are not reduced. In real tests, access times in RAID 0 arrays even increase, albeit very slightly, by about half a millisecond.
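The block-by-block distribution described above can be sketched in a few lines (a toy model: real controllers work with sectors and hardware queues, and the 2-byte stripe here is only for readability):

```python
def stripe(data: bytes, n_drives: int, stripe_size: int) -> list[bytearray]:
    """Distribute data round-robin, one stripe block at a time, across the drives."""
    drives = [bytearray() for _ in range(n_drives)]
    for i in range(0, len(data), stripe_size):
        drives[(i // stripe_size) % n_drives] += data[i:i + stripe_size]
    return drives

d0, d1 = stripe(b"ABCDEFGH", n_drives=2, stripe_size=2)
print(bytes(d0), bytes(d1))  # b'ABEF' b'CDGH'
```

Since consecutive stripe blocks land on different drives, a long sequential read can pull from both drives at once, which is why throughput scales while access time does not improve.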

If you build a RAID configuration on multiple hard drives, the storage controller can become the bottleneck. The conventional PCI bus can transfer a maximum of 133 MB/s, which two modern hard drives can easily saturate. The Serial ATA controllers built into chipsets generally offer higher bandwidth, so they do not limit the performance of RAID arrays.

We got up to 350 MB/s from four 10,000 rpm WD Raptor hard drives on chipsets with Intel ICH7 and ICH8 south bridges: an excellent result, very close to the combined bandwidth of four separate hard drives. At the same time, the nVidia nForce 680 chipset showed a maximum of 110 MB/s, alas. It seems that not every integrated RAID controller is capable of delivering high RAID array performance, even when it is technically possible.

Comparison of RAID modes

It should be noted that RAID 0 does not really live up to the idea behind RAID, which stands for Redundant Array of Independent/Inexpensive Disks. Redundancy means storing data in at least two places so that it survives even if one hard drive fails. This is the case, for example, with a RAID 1 array, in which all data is mirrored on a second hard disk; if one of the hard drives "dies", you will only learn about it from the RAID controller's messages. RAID 5 is much more sophisticated and is aimed at the professional sector. It works like a RAID 0 array, striping data across all hard drives, but adds redundancy information to the data. The net capacity of a RAID 5 array is therefore equal to the total capacity of all hard drives minus one. The redundancy information is not written to a single dedicated hard disk (as in RAID 3) but is distributed across all drives, so that reading or writing it does not create a bottleneck on one HDD. Understandably, a RAID 5 array requires no fewer than three hard drives.
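The redundancy information in RAID 5 is simple XOR parity: the parity block is the XOR of the data blocks in each stripe, so any one lost block can be recomputed from the others. A minimal sketch:

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR same-sized blocks byte by byte - the parity used by RAID 5 (and RAID 3)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

d0, d1, d2 = b"\x0f\x0f", b"\xf0\x01", b"\x33\x33"  # three data stripes
parity = xor_blocks(d0, d1, d2)                      # stored on the remaining drive

# if the drive holding d1 dies, its stripe is rebuilt from the survivors
assert xor_blocks(d0, d2, parity) == d1
```

This also shows why the net capacity is "all drives minus one": exactly one block's worth of parity is stored per stripe, just rotated across the drives.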

Risks and side effects

The main danger for a RAID 0 array is the failure of any single hard drive, since the entire array is then lost. That is why the more disks in a RAID 0 array, the higher the risk of losing information: with three hard drives, the probability of data loss is three times higher than with one drive. This is why RAID 0 cannot be considered a good option for users who need a reliable system and who cannot afford a minute of downtime.

Even if you buy a powerful and expensive standalone RAID controller, you still depend on that specific hardware. Two different controllers may both support RAID 5, but their specific implementations can differ greatly.



Intel Matrix RAID: Multiple RAID arrays can be created on the same set of hard drives.

If the RAID controller is smart enough, it can allow two or more RAID arrays to be created on one set of hard drives. Most RAID controllers can manage multiple RAID arrays, but usually each array requires its own set of hard drives. That is what makes the Intel ICH7-R and ICH8-R south bridges so interesting: they support the Intel Matrix RAID function.

A typical implementation would be two RAID arrays on two hard drives. The first third of the capacity of the two hard drives can be allocated to a fast RAID 0 array for the operating system, and the remainder can be allocated to a RAID 1 array for storing important data. If one of the hard drives fails, the operating system will be lost, but important data that is mirrored to the second hard drive will be preserved thanks to RAID 1. By the way, after installing Windows, you can create an image of the operating system and store it on a reliable RAID 1 array. Then, if the hard drive fails, the OS can be quickly restored.
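The capacity arithmetic of such a split is straightforward; a sketch for the two-drive layout described above (the one-third split is just the example proportion):

```python
drive_gb = 400       # two identical drives, as in the WD4000KD pair tested here
raid0_share = 1 / 3  # first third of each drive goes to the fast RAID 0 volume

raid0_capacity = 2 * drive_gb * raid0_share    # striped: capacities add up
raid1_capacity = drive_gb * (1 - raid0_share)  # mirrored: one copy's worth

print(round(raid0_capacity), round(raid1_capacity))  # 267 267
```

With this split, the striped OS volume and the mirrored data volume happen to come out the same size; only the mirrored one survives a single-drive failure.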

Please be aware that many RAID arrays require a RAID driver (such as Intel Matrix Storage Manager) to be installed, which can cause problems during system boot and recovery. Any bootable disk that you will use for recovery will need RAID drivers. Therefore, save the driver diskette for such a case.

Test configuration

Configuration for low-level tests

Processors: 2x Intel Xeon (Nocona core), 3.6 GHz, FSB800, 1 MB L2 cache
Platform: Asus NCL-DS (Socket 604), Intel E7520 chipset, BIOS 1005
Memory: Corsair CM72DD512AR-400 (DDR2-400 ECC, reg.), 2x 512 MB, CL3-3-3-10 latency
System hard drive: Western Digital Caviar WD1200JB, 120 GB, 7,200 rpm, 8 MB cache, UltraATA/100
Storage controllers: Intel 82801EB UltraATA/100 Controller (ICH5); Silicon Image Sil3124, PCI-X
Network: Integrated Broadcom BCM5721 Gigabit Ethernet Controller
Video card: Integrated ATi RageXL, 8 MB
Tests and settings
Performance tests: c't h2benchw 3.6; PCMark05 V1.01
I/O tests: IOMeter 2003.05.10 (Fileserver, Webserver, Database and Workstation benchmarks)
System software
OS: Microsoft Windows Server 2003 Enterprise Edition, Service Pack 1
Platform driver: Intel Chipset Installation Utility 7.0.0.1025
Graphics driver: Default Windows Graphics Driver

Configuration for SYSmark2004 SE

System hardware
CPU: Intel Core 2 Extreme X6800 (Conroe 65 nm, 2.93 GHz, 4 MB L2 cache)
Motherboard: Gigabyte GA-965P-DQ6 2.0, chipset: Intel 965P, BIOS: F9
General hardware
Memory: 2x 1024 MB DDR2-1111 (CL 4.0-4-4-12), Corsair CM2X1024-8888C4D XMS6403v1.1
Video card: HIS X1900XTX IceQ3, GPU: ATi Radeon X1900 XTX (650 MHz), memory: 512 MB GDDR3 (1550 MHz)
Hard disk I: 150 GB, 10,000 rpm, 8 MB cache, SATA/150, Western Digital WD1500ADFD
Hard disk II: 400 GB, 7,200 rpm, 16 MB cache, SATA/300, Western Digital WD4000KD
DVD-ROM: Gigabyte GO-D1600C (16x)
Software
ATi drivers: Catalyst Suite 7.1
Intel chipset drivers: Software Installation Utility 8.1.1.1010
Intel RAID drivers: Matrix Storage Manager 6.2.1.1002
DirectX: 9.0c (4.09.0000.0904)
OS: Windows XP, Build 2600, SP2
Tests and settings
SYSmark Version 2004 Second Edition, Official Run

Now we come to the battle between the current 150GB WD Raptor and the 400GB WD4000KD drives in RAID 0. The result was amazing. While the WD Raptor remains undoubtedly the fastest Serial ATA desktop hard drive, RAID 0 comes out on top in most benchmarks, apart from access times and I/O performance. The cost of storing a gigabyte of data on the Raptor is the most questionable point, since you can buy three times the capacity in 7,200 rpm hard drives for half the price; per gigabyte, the Raptor thus costs about six times as much. However, if you are concerned about data integrity, think twice before choosing a RAID 0 array of two cheap 7,200 rpm hard drives over the WD Raptor.

In the coming months, the price of 500GB hard drives will drop below $100, while demand for space to store high-definition video, music, and photos keeps growing. Finally, the recording density of hard disk platters continues to increase, so ever faster 7,200 rpm models keep appearing. In the long term, the attractiveness of the Raptor will decline.

We think Western Digital should change the pricing of the Raptor lineup, as its performance gains come at the cost of big compromises in capacity, and not everyone will find such compromises justified. We would love to see an updated 300GB Raptor, which could also be a hybrid drive with flash memory for Windows Vista.

A year ago, Seagate released the revolutionary Barracuda ATA IV drive: the first hard drive with 40GB platters and a spindle speed of 7,200 rpm. One of the innovations Seagate brought to the IDE hard drive market was the quiet fluid dynamic bearing motor, which made Seagate hard drives even more attractive. The Barracuda ATA IV won numerous awards from test labs, magazines, websites, and so on. The awards were well deserved, as the disk really did combine high reliability, quiet operation and a reasonable price.
But after a while, rival firms released their own 40GB platters, and the Barracuda ATA IV gradually disappeared from the front pages of magazines and the news feeds of online publications.
However, a little later, a wave of rumors arose in the Internet community and began to roll on, gaining momentum: "The Barracuda ATA IV does not work in RAID0!"
At first I was dumbfounded. I could not understand how this hard drive could fail to work in RAID0... All my previous experience with hard drives (modest, of course, but still...) told me this was impossible, yet the number of "victims" kept growing, and their voices sounded louder and louder. Finally, my nerves gave out, and I decided to find out how much truth there was in these claims and how much fiction.

Test system and testing methodology

The tests involved six 40GB Barracuda ATA IV hard drives with three firmware versions: 3.10, 3.19 and 3.75. As you might guess, six hard drives are three pairs of hard drives with the same firmware.
For tests of hard drives, we used the widespread (especially as an integrated controller on motherboards) Promise FastTRAK100 TX2 controller.
The hard drives were tested in two stages. First, three hard drives with different firmware were tested on the FastTRAK100 TX2 controller in SPAN mode, i.e. as single disks. Then not single hard drives were tested, but arrays of two hard drives combined in RAID0. The stripe block size was set to 64KB when creating arrays. For tests in WinBench, arrays were partitioned in FAT32 and NTFS with one partition with the default cluster size.
Each test was run four times and the results were averaged. The drives were not allowed to cool down between runs.
The following tests were used:

WinBench 99 1.2
IOMeter 1999.10.20

To compare the speed of the controller with various types of RAID arrays in the IOMeter test, we used the new StorageReview patterns announced in the third edition of their hard drive testing methodology.

StorageReview patterns

Block size | File Server (80% Read, 100% Random) | Web Server (100% Read, 100% Random)
512b | 10% | 22%
1KB | 5% | 15%
2KB | 5% | 8%
4KB | 60% | 23%
8KB | 2% | 15%
16KB | 4% | 2%
32KB | 4% | 6%
64KB | 10% | 7%
128KB | 0% | 1%
512KB | 0% | 1%

These patterns are designed to measure the performance of a disk subsystem under a load typical of file & web servers.

Based on the research conducted by Storagereview on the nature of the load on the disk subsystem when working with ordinary Windows applications, our author, Sergey Romanov aka GReY, created a pattern for the IOMeter test (the pattern was created using the averaged IPEAK statistics given on StorageReview for Office, Hi-End and Bootup):

Workstation pattern

Transfer Size | % of Access Specification | % Reads | % Random
512B | 1 | 0 | 100
1KB | 2 | 0 | 100
2KB | 1 | 0 | 100
4KB | 50 | 60 | 80
8KB | 4 | 50 | 100
16KB | 6 | 50 | 100
20KB | 2 | 50 | 100
24KB | 2 | 50 | 100
28KB | 1 | 50 | 100
32KB | 13 | 70 | 70
48KB | 1 | 50 | 100
52KB | 1 | 50 | 100
64KB | 14 | 80 | 60
64KB + 512B | 2 | 50 | 100

We will be guided by this pattern to assess the attractiveness of hard drives and RAID controllers for an ordinary Windows user.

Additionally, we compared the speed of the controller with different types of RAID arrays at a varying ratio of read to write operations. A pattern was created using 100% random 8KB data blocks, with the read/write ratio varying from 100/0 to 0/100 in steps of 10.
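That sweep is easy to express programmatically; a sketch generating the same series of mixes (plain Python, not actual IOMeter configuration syntax):

```python
BLOCK_SIZE = 8 * 1024  # 100% random 8KB blocks, as in the pattern above

# (reads %, writes %) from 100/0 down to 0/100 in steps of 10
mixes = [(reads, 100 - reads) for reads in range(100, -1, -10)]
print(mixes)  # [(100, 0), (90, 10), ..., (0, 100)]
```

Eleven mixes in total, so each array type is measured across the full spectrum from pure reads to pure writes.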

And finally, the ability of the controllers to work with variable-size Sequential requests for reading and writing in various types of RAID arrays was tested.

Test system

motherboard: Asustek P3B-F;
processor: Intel P3 600E;
memory: 2x 128MB SDRAM Hyundai PC100 ECC;
hard drive: IBM DPTA 372050;
video card: Matrox Millennium 4MB;
controller: Promise FastTRAK100 TX2, BIOS v2.00.0.24 / drivers v2.00.0.25;
operating system: Windows 2000 Pro SP2.

Winbench99

As we remember, in our last testing of IDE hard drives, the Barracuda ATA IV showed very good results in Winbench99. It is interesting how the results of Barracuda ATA IV in Winbench99 depend on the firmware version of the hard drive, and also how the results in Winbench99 grow when Barracuda ATA IV hard drives are combined into a RAID0 array.


If we compare the results of a single hard drive and a RAID0 array, then, regardless of the firmware version, the RAID0 array is faster when working with large files (for example, in SoundForge). When dealing with small files, RAID0 has no advantage.


The picture as a whole can be judged from the integral tests: the speed of the hard drives and RAID0 arrays depends only slightly on the firmware version. Among single hard drives, the best results were shown by the drive with firmware 3.10, while among RAID0 arrays, the drives with firmware 3.19 showed the highest speed.
The results are slightly different under NTFS:



As you can see, for RAID0 the fastest firmware is 3.19, and among single disks the best results were shown by hard drives with firmware 3.19 and 3.75.
In general, we must admit that the benefit of RAID0 on the Barracuda ATA IV in Winbench99 is quite low.
To conclude this section, I suggest you look at the linear read graphs of single Barracuda ATA IV hard drives (with different firmware versions) and of the RAID0 arrays:

IOMeter: Sequential Read / Write

We'll start the low-level tests of the Barracuda ATA IV and of RAID0 arrays built on this drive with sequential read tests. The essence of the test is that requests to read data blocks with sequentially increasing addresses are sent to the hard drive (array). The size of the data block requested by one command varies from 512 bytes to 1MB. The command queue depth is set to four requests.


Let's compare the speed of single hard drives:


It turns out that the speed of Barracuda ATA IV hard drives on read requests of different block sizes depends heavily on the firmware version! The disk with the "oldest" firmware in our tests showed twice the speed of the disk with firmware 3.19 when working with 8KB blocks! The newest firmware, 3.75, unexpectedly showed middling results.


But if we look at the results of the hard drives in RAID0, we see that the best speed is shown by the pair with firmware 3.75!
Note that when working with small blocks (up to 1KB), there is no increase in speed compared to single hard drives. But once the requested data block reaches 64KB, all the arrays hit their stride - so size does matter after all?
Let's see what happens when writing:



When writing sequentially, hard drives with different firmware versions demonstrate approximately equal speed, but it did not escape our notice that the best results were shown by the drive with firmware 3.10, and the worst by the drive with firmware 3.75.


When comparing the write speed of the RAID0 arrays, it turns out that firmware 3.10 does a better job at writing. The array of disks with firmware 3.75 also looks good, but only when working with data blocks larger than 16KB. The worst performer in this mode is the array of disks with firmware 3.19.
So, synthetic tests have shown that doubling the read speed on an array of two Barracuda ATA IV drives is quite possible. However, this requires two conditions: a large request size (FAT32 with a 32KB cluster?) and a high request intensity.

Having finished the previous phrase, I thought: the effect of request intensity still needs to be checked. As one cartoon character put it, "I don't begrudge my head when there's work to be done!" :)
So, the next two diagrams show the dependence of the transfer from / to the RAID0 array on the size of the data block under five load options (1,4,16,64,256 requests).


As you can see, when reading, the depth of the command queue greatly affects the data transfer rate. Under linear load (1 outstanding request) the speed of a RAID0 array barely reaches 60MB/s, but at queue = 4 it already reaches its maximum (almost 80MB/s). With a further increase in queue depth, we see faster work with small blocks, but the transfer rate never exceeds 80MB/s.


When writing, the data transfer rate also depends on the command queue depth, though not as strongly as when reading.
Pay attention to the dependence of speed on block size at queue = 1. See the break in the graph at a block size of 64KB? Once a data block becomes larger than 64KB (which is the stripe block size), the controller driver splits it into "subrequests" of 64KB each (also the maximum addressable data block for ATA100) and hands them to the hard drives. The bigger the original block, the more subrequests arise when it is split. Each hard drive then has its own queue of commands, and both hard drives stay busy the whole time.
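The splitting described above can be sketched as follows (a simplified model: the real driver also routes each subrequest to the correct member drive):

```python
STRIPE = 64 * 1024  # stripe block size used when the arrays were created

def split_request(offset: int, length: int, stripe: int = STRIPE):
    """Split one large I/O request into per-stripe-block 'subrequests'."""
    subrequests = []
    while length > 0:
        chunk = min(stripe - offset % stripe, length)  # stay inside one stripe block
        subrequests.append((offset, chunk))
        offset += chunk
        length -= chunk
    return subrequests

# a 256KB request becomes four 64KB subrequests, queued to alternating drives
print(split_request(0, 256 * 1024))
# [(0, 65536), (65536, 65536), (131072, 65536), (196608, 65536)]
```

With queue = 1 at the application level, only requests larger than the stripe block produce enough subrequests to keep both drives busy at once.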

So, we found out that Seagate Barracuda ATA IV drives in RAID0 can fully realize their speed potential. However, this requires one of two conditions: a large single request size (>64KB) or a high intensity of smaller requests.
Unfortunately, these conditions are rarely met when working in real applications under Windows... Accordingly, the speed of a RAID0 array on the Barracuda ATA IV under streaming requests turns out to be low.

IOMeter: Database

Using this test, we examine how the Barracuda ATA IV drives, and RAID0 arrays built on them, handle lazy writes.
The results of the Database pattern are summarized in the table:


For the convenience of analysis, three diagrams are built, on each of which you will find graphs of the dependence of the speed of processing requests by single hard drives and RAID0 arrays on the share of write operations.




Well, everything is logical: the greater the depth of the command queue, the more chances the controller has to load both hard drives in the array evenly.
As it turned out, Barracuda ATA IV hard drives in a RAID0 array handle this load (random read and write requests) quite normally. It is clearly seen that the speed of RAID0 arrays on Barracuda ATA IV drives, with any firmware version, turned out to be higher than that of single drives at all loads above linear (queue = 1). ;)
However, the differences between drives with different firmware, and between the RAID0 arrays built from them, are obvious. Disks with firmware 3.19 react slightly better to an increasing share of write operations, while disks with firmware 3.75, on the contrary, lag slightly behind.

IOMeter: Workstation

First, as usual, the results are in tabular form:


And, of course, in the form of a diagram:


In the Workstation pattern, which emulates work in typical Windows applications under NTFS5, the RAID0 array of Barracuda ATA IV drives is faster than a single drive under any load.
The difference in speed between single disks with different firmware is vanishingly small, but in RAID0 a slight effect of the firmware version on speed is observed. Since the difference between the arrays is 1-2 percent, we can assume that all firmware versions cope with the Workstation pattern equally well. Although, in all fairness, it should be noted that the drives with firmware 3.75 lag a little.

IOMeter: Fileserver & Webserver

The results of operation in these patterns of Barracuda ATA IV drives as part of RAID0 arrays will let everyone who bought these drives for servers breathe a sigh of relief. :)




As you can see, the RAID0 array of Barracuda ATA IV drives provides a decent performance boost over a single drive. Of course, the results could have been better, but there are none of the problems we saw with sequential requests.
The reliability of the Barracuda ATA IV drives is at a high level, and considering the above, these drives can be considered a good choice for a server with a low load on the disk subsystem.

FC-Test

As you remember, one of the purposes for which FC-Test was created was testing hard drives in tasks "close to real". :)
In a recent comparison of 12 disks, FC-Test revealed very unusual behavior in hard drives that seemed to have been studied inside and out, so I was very curious to apply FC-Test to RAID controllers.
In order for this testing to have practical value, we will compare the speed of three RAID0 arrays, each with a capacity of 80GB, with the speed of a single hard drive of the same capacity.
For those who are not yet familiar with the description of the FC-Test program and the methods for testing hard drives using this test, I suggest that you familiarize yourself with the corresponding article, and everyone else will easily understand what we will talk about next.

The tests were carried out under NTFS (the results in FAT32 were also recorded, but since they do not fundamentally differ from the results obtained under NTFS, I decided not to include them in the review in order to reduce its size) in the following patterns:

Patterns for FC-Test

Pattern | Total files | Volume, MB
Install | 414 | 575
ISO | 3 | 1600
MP3 | 271 | 990
Programs | 8504 | 1380
Windows | 9006 | 1060

Maybe this set of tests is a little too big to find out how Barracuda ATA IV drives behave when working in a RAID0 array (one "ISO" pattern would be enough for this), but we are still trying to compare drives with different firmware versions ...
The diagrams show the speed (in MB/s) of the RAID0 arrays and of a single Seagate Barracuda ATA IV drive (80GB, firmware 3.19) in four modes.
When reading, the speed of the RAID0 arrays on disks with any firmware version is always lower (sometimes significantly!) than the speed of a single disk.

There is a difference in speed between RAID0 arrays on hard drives with different firmware versions, but it's hard to find any regularity. :)
However, in my opinion, there is one: RAID0 on drives with firmware 3.75 copes with reading a set of files better than the others. But, unfortunately, it is still slower than a single disk.
Why is an array of two disks slower than a single disk when reading, that is, where a RAID0 array should show maximum superiority over a single disk?
A study of the disk subsystem load using Performance Monitor showed that when reading files, the maximum command queue depth was only 4 requests, and the average was even lower - 1.4 requests! We all remember how RAID0 arrays on the Barracuda ATA IV behave in read mode under low load, and that is exactly the mode we end up in here!

Conclusions

Based on the results of the testing, the following conclusions can be drawn:

Barracuda ATA IV hard drives "work" in RAID0 configurations, whatever the "knowledgeable sources" say

Barracuda ATA IV hard drives cope with "server" patterns quite successfully

Barracuda ATA IV hard drives in RAID0 arrays do not work well with requests to read and write small blocks with low request intensity (low command queue depth).

All tested firmware versions have the above drawback.

So, is it possible to fix the drawbacks found in the Barracuda ATA IV drives while keeping their advantages? Apparently, yes. In my opinion, the problem of the Barracuda ATA IV's low speed in RAID0 arrays is insufficient optimization of the read-ahead algorithms. This problem can be solved, and I think Seagate will do it. Judging by the results of drives with different firmware versions, Seagate's programmers are slowly but surely approaching a solution: the linear read results of drives with newer firmware are higher than those of drives with older firmware.
Can the problem be solved by changing the RAID settings or changing the cluster size (increasing it)? Maybe, but alas, I didn’t succeed ...

To flash or not - that is the question

A number of firmware images for Barracuda ATA IV hard drives are circulating on the Web, and many of our readers could not resist the temptation to personally contribute to the entropy of the universe...
I hope that after reading this article, the number of those who want to risk their hard drive and the data on it for the sake of a small increase in performance will decrease.

Frequently Asked Questions (FAQ)

Server selection tactics.

Fault-tolerant server.

What is RAID?

RAID levels

The server is the solution to the problem. The essence of the concept.

A server (from the English "to serve") is, in information technology, a software component of a computing system that performs service functions at the request of a client, giving it access to certain resources.

So, the main task of a server is to execute requests from clients or programs. It follows that a server is a purely utilitarian thing, designed to perform a specific task. Executing, or solving, tasks is a server's defining property. That is why the task is defined first, and only then is a server matched to it.

Balanced server. How to find the optimal balance?

A balanced server is the goal pursued by the integrator or vendor and the customer alike. The customer first of all needs a server that meets his requirements, which in turn are determined by the task it will solve. It is in our interest to provide the customer with a server that best suits those requirements. Such cooperation is mutually beneficial: the client gets exactly the server he needs, without overpaying for unnecessary features and without wasting money on what will never be used, and we get a satisfied customer and a reputation.

The task of selecting this optimal server is not trivial. Many factors must be taken into account that the customer does not even know about. A typical example is a customer's inadequate assessment of the scale of a task, or a request for a server with a specific configuration instead of a description of the task the server will solve. STSS specialists face a variety of tasks every day, and the company has accumulated serious experience in building servers; that is why choosing a server configuration is a job for a professional, as is its implementation, i.e. the production itself.

Server selection tactics.

The tactic of choosing a server is, first of all, to find out the tasks that the server will have to solve, what performance margin is required, and the scalability. Next, the requirements for fault tolerance are clarified and, finally, the estimated budget. If the tasks clearly exceed the allocated budget, then, if possible, adjustments to the budget or tasks are made. It is important to offer a scalable solution for the growing needs of the client. This allows you to solve the problem with minimal initial and subsequent investments, reducing TCO (Total Cost of Ownership - total cost of ownership).

Adhering to the above tactics, combined with the professionalism of our engineers and managers, the client receives exactly the solution he needs.

Fault-tolerant server.

Typically, the server serves many users. Therefore, the server, ideally, should always be in working order in order to fulfill this or that request. If your home computer stops working, then in the end only you will suffer from this. If the server stops working, then many clients will suffer, which can result in disproportionate losses compared to the cost of the server itself.

The concepts of "reliability" and "fault tolerance" are often confused.

Reliability is, first of all, a property of a product that characterizes its ability to work as long as possible without failures. That is, it is a characteristic of the quality of the product itself, its components, and so on.

Fault tolerance is, as the word itself suggests, the ability to resist failures. In other words, it is the ability to remain operational when some system component fails. Currently, fault tolerance is achieved through redundancy or duplication of critical or most vulnerable system components.

Server downtime. Reduction methods.

Ways to increase server fault tolerance, and consequently reduce downtime, include elements such as RAID arrays (duplicated hard drives), redundant power supplies, a redundant cooling system and, in some cases, redundancy in the memory subsystem (so-called memory mirroring).

If the fault tolerance of the system needs to be increased further, one speaks of building HA clusters (High Availability Clusters). An HA cluster is a fully duplicated system of servers, storage, switching and power supply. Such a system has one of the highest availability ratings, measured as downtime per year, or as the ratio of uptime to total time expressed as a percentage. In addition, such a system allows repair and maintenance work without stopping the service, which also significantly increases overall availability.

For comparison, the readiness indicators of various computers:

    regular PC - ~90% availability, or 36.5 days of downtime per year.

    entry-level server - ~96%, or 14.6 days of downtime.

    fault-tolerant server - ~98%, or 7.3 days of downtime.

    high-availability cluster - 99.99%, or about 53 minutes of downtime per year.
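The availability figures above map onto annual downtime with simple arithmetic; a short sketch (the function name is ours) reproduces the table:

```python
def downtime_per_year_minutes(availability_pct):
    """Convert an availability percentage into minutes of downtime per year."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (1 - availability_pct / 100)

# Reproduce the figures from the table above:
for name, avail in [("regular PC", 90.0),
                    ("entry-level server", 96.0),
                    ("fault-tolerant server", 98.0),
                    ("HA cluster", 99.99)]:
    m = downtime_per_year_minutes(avail)
    print(f"{name}: {m:.0f} min ≈ {m / 1440:.1f} days of downtime per year")
```

For the HA cluster this gives about 53 minutes, matching the figure quoted.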

I can build computers and believe I can build a server! Why should I buy a server from you?

This question is often asked by our clients when they are trying to calculate the cost of a server from the cost of components. Indeed, the cost of the components is lower than the cost of the server, otherwise we would simply work at a loss. But let's try to figure out what the client "overpays" for.

First, the task of physically assembling a server is not as trivial as it might seem. Servers use elements similar to those of a regular PC: a case, power supply, motherboard, processors, memory, hard drives, etc.

At first glance everything is simple: buy the necessary components and put together a server! This is where the first unpleasant surprises await such an "assembler".

For example, a server motherboard only works with dedicated server memory - and not with just any, but with validated modules, i.e. those explicitly listed by the board's manufacturer. Nor will just any case do; there are even more pitfalls here, since the server motherboard format usually differs from the usual ATX. Power is also specific. The point is that a server is an active consumer of +12 V current: the processors' voltage regulators (VRM - Voltage Regulator Module) run from this rail, and each processor can draw a huge current. Now imagine there are not one but two of them, each dissipating (i.e. consuming) 100 W - 200 W for the processors alone. Even assuming VRM efficiency close to 100%, the processors alone draw 200 / 12 = 16.7 A from the +12 V rail. Look at desktop power supplies: they usually specify 13-15 A on the +12 V rail, and besides the processors a server has disks, the motherboard itself, memory, and so on. Therefore a server needs a specialized server power supply which, in addition to being reliable, can deliver the required current at +12 V - for modern server power supplies, roughly 30 to 80 amperes.
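The +12 V arithmetic above is easy to generalize; a hedged sketch (function name is ours, with the text's ideal-VRM assumption as the default):

```python
def amps_on_12v_rail(cpu_count, cpu_watts, vrm_efficiency=1.0):
    """Current the CPU voltage regulators draw from the +12 V rail."""
    return cpu_count * cpu_watts / vrm_efficiency / 12.0

# Two 100 W processors with (idealized) 100%-efficient VRMs:
print(amps_on_12v_rail(2, 100))       # ≈ 16.7 A, above a typical 13-15 A desktop rail
# A more realistic 90% VRM efficiency pushes the figure higher still:
print(amps_on_12v_rail(2, 100, 0.9))  # ≈ 18.5 A
```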

This illustrative - and, unfortunately, far from the only - example clearly shows the problems of an unskilled approach to building a server.

Secondly, warranty service and technical support must be provided. The customer is rarely able to resolve problems quickly at a high technical level on his own, which leads to server downtime and results in losses for the company incommensurably greater than the possible savings.

STSS has advantages that allow you to get the most suitable solution (server) along with highly qualified technical and warranty support.

Own production, many years of experience, continuous research are the key to the quality of products and the professionalism of employees.

What is RAID?

The abbreviation RAID originally stood for "Redundant Arrays of Inexpensive Disks", since such arrays were much cheaper than the large, expensive disks then in use ("inexpensive disks" meant disks for personal computers). This is how RAID was presented by its researchers Patterson (David A. Patterson), Gibson (Garth A. Gibson) and Katz (Randy H. Katz) in 1987. Over time, RAID came to be deciphered as "Redundant Array of Independent Disks". RAID serves to improve the reliability of data storage and/or to increase the speed of reading/writing information.

Berkeley introduced the following RAID levels, which have been adopted as the de facto standard:

    RAID 0 is defined as a non-fault-tolerant disk array (striping).

    RAID 1 is defined as a mirrored disk array.

    RAID 2 is reserved for arrays that use Hamming codes; it is not used at present.

    RAID 3, 4 and 5 use parity to protect data from single failures. Of these, mainly only RAID 5 is used today.

    RAID 6 uses parity to protect data from double failures.

RAID levels

RAID 0 scheme.

RAID 0 ("Striping") is a disk array of two or more HDDs with no redundancy. Information is divided into data blocks (Ai) and written to the disks in turn.

This significantly increases performance (how large the multiple is depends on the number of disks), but the reliability of the entire array suffers: if any of the hard drives in a RAID 0 fails, all the information is completely and irretrievably lost. By probability theory, the reliability of a RAID 0 array equals the product of the reliabilities of its constituent disks, each of which is less than one, so the aggregate reliability is necessarily lower than that of any single drive.
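The probability-theory claim above is a one-line product; a quick sketch (function name is ours):

```python
from functools import reduce

def raid0_reliability(disk_reliabilities):
    """RAID 0 survives only if every disk survives: multiply the reliabilities."""
    return reduce(lambda a, b: a * b, disk_reliabilities, 1.0)

# Two disks, each with a 0.98 probability of surviving the year:
print(raid0_reliability([0.98, 0.98]))        # ≈ 0.9604, below either disk alone
print(raid0_reliability([0.98, 0.98, 0.98]))  # ≈ 0.9412, and it keeps dropping with more disks
```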

RAID 1 scheme.

RAID 1 (Mirroring) has protection against failure of half of the available hardware (in particular, one of the two hard drives), provides acceptable write speed and gains in read speed due to parallelization of requests. The disadvantage is that you have to pay the cost of two hard drives to get the usable space of one hard drive.

Initially, it is assumed that a hard disk is a reliable device. Accordingly, the probability of both disks failing at once equals (by the product formula) the product of their individual failure probabilities, i.e. it is orders of magnitude lower. Unfortunately, this theoretical model does not fully reflect real life: usually the two drives come from the same batch and work under the same conditions, and when one of them fails, the load on the survivor increases. Therefore, in practice, if one of the drives fails, urgent measures must be taken to restore redundancy. For this, at any RAID level (except zero), we recommend using HotSpare hot-spare disks. The advantage of this approach is constant reliability; the disadvantage is even greater cost (the cost of three hard drives to store the volume of one).
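The "product of probabilities" mentioned above works the other way round for a mirror: data is lost only if both disks fail. A sketch under the (optimistic) independence assumption that the paragraph criticizes (function name is ours):

```python
def raid1_data_loss_prob(disk_failure_prob, copies=2):
    """RAID 1 loses data only when every mirrored copy fails (independence assumed)."""
    return disk_failure_prob ** copies

# A disk with a 2% annual failure probability:
print(raid1_data_loss_prob(0.02))  # ≈ 0.0004, i.e. 50x better than a single disk
```

In real life correlated failures (same batch, same conditions, rebuild stress) make the true risk higher than this idealized figure.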

RAID 5 layout.

The most popular of the levels, primarily due to its economy. By sacrificing the capacity of just one disk from the array for the sake of redundancy, we get protection against failure of any of the volume's hard drives. Writing information to a RAID 5 volume costs additional resources, since additional calculations are required, but when reading (in comparison with a separate hard drive) there is a gain, because the data streams from several drives of the array are parallelized.

The disadvantages of RAID 5 appear when one of the disks fails: the entire volume goes into critical mode, all read and write operations are accompanied by additional manipulations, and performance drops sharply. If a second disk fails in this state - including during the rebuild triggered by the first failure - all the information is lost. It is therefore highly desirable to use a HotSpare disk with a RAID 5 volume. RAID 6, by contrast, survives the failure of two drives.

RAID 6 is similar to RAID 5 but offers a higher degree of reliability: the capacity of two disks is allocated to checksums, and the two sums are computed by different algorithms. It requires a more powerful controller processor because of the trickier mathematics. The array remains operational after the failure of two disks.

Combined levels of RAID 0 + 1, RAID 10, RAID 50, RAID 60

In addition to the basic levels RAID 0 - RAID 5 described in the standard, there are combined levels RAID 10, RAID 0+1, RAID 30, RAID 50 and RAID 60, which various manufacturers each interpret in their own way.

The essence of such combinations is as follows.

RAID 0+1 is a RAID-1 array made of two RAID-0 arrays. Such an array is often found on so-called Host RAID controllers. With four drives, its reliability and performance are on a par with a 4-drive RAID-10.

RAID 10 is a RAID-0 array made of RAID-1 arrays. It increases performance like RAID-0 while offering higher reliability than RAID-5. In theory it can survive the failure of up to half the drives; it is guaranteed to withstand a single disk failure. A further advantage is that it makes no demands on the RAID controller's computing power; the disadvantage is the loss of half the total disk capacity.

RAID 50 is a grouping of RAID-5 volumes into a RAID-0. This solution is used when a large-capacity array must be built from a large number of disks. The point is that the more disks in a RAID-5 array, the greater the load on the controller from checksum calculation, and the higher the likelihood of two or more disks failing at once, which inevitably destroys all the information. The situation is aggravated by the fact that after one disk fails, the time to rebuild the array onto a spare disk (HotSpare) - during which the array remains unprotected - grows in proportion to the number of disks. RAID 50 addresses this: by reducing the number of disks per RAID-5 volume, we shorten the rebuild time after a failure, and this combined level survives the failure of more than one disk, provided they are in different RAID-5 volumes.

RAID 60- Similar to RAID-50, only RAID-6 volumes are used as basic building blocks.

Which RAID Level is Faster and Why?

By far the fastest RAID level is RAID-0: theoretically, its performance is a multiple of the combined performance of all the disks in the array. But this level is completely unreliable, which limits its use in servers.

Fault-tolerant arrays (RAID-1, RAID-10, RAID-5 and RAID-6) have different performance under different load, as well as the specific cost of storing a megabyte of information on them.

Let's take a look at the performance of various RAID levels.

RAID-1 is the simplest array to implement. Its disadvantage is that the usable capacity is half the total disk capacity. However, this drawback is more than compensated by the low cost of implementing such an array in a server, since most modern interface adapters (integrated on motherboards, i.e. essentially "free") can build RAID-0 and RAID-1 arrays. Such arrays require no resource-intensive computation, so they are easy to implement and, as a result, cheap.

The read performance of RAID-1 is theoretically twice that of a single disk; write performance equals that of a single disk. Considering the low cost of both the disks and the controller, this array can be recommended for lightly loaded servers.

RAID-5 involves calculating checksums during writes, which either imposes an additional load on the server's CPU or requires a hardware RAID controller, whose cost is usually that of 3-4 hard drives. In some cases it is possible to forgo RAID-5 in favor of RAID-1 while keeping or even increasing capacity. For example, suppose you need to build a 500 GB array. This can be done in two ways:

1. Buy a RAID controller and 3 disks of 250 GB each, which, when creating a RAID-5, will give a useful capacity of 500 GB

2. Use the RAID-1 built-in on the motherboard, buy two 500GB disks and combine them into RAID-1 to obtain the same 500GB usable array capacity.

The cost of the second solution, taking disk prices into account, may turn out to be less than half that of the first. And RAID-5 has no performance advantage here: our research has shown that a three-disk RAID-5 performs much the same as a two-disk RAID-1.

However, if the number of disks is increased in RAID-5, then its read performance grows almost linearly, which makes it possible to use this type of array in tasks where read operations predominate.

RAID-6 is advisable when the number of disks in the array is large and, accordingly, the probability of more than one disk failing at once is high. RAID-6 places higher demands on hardware than RAID-5, which generally reduces performance.

Modern RAID controllers have powerful computing resources that allow you to move from RAID-5 to RAID-6 without any visible performance loss.

RAID-10 combines high reliability with RAID-0 performance. Performance scales as in RAID-0, with the difference that the elements of the array are two-disk RAID-1 "sets". The array has good write and read performance, which lets us call it "universal". Its disadvantage, however, is the large loss of original disk capacity (50%), which makes it uneconomical for bulk sequential storage.

There are plenty of articles on the Internet describing RAID; this one, for example, covers everything in great detail. But as usual there is never enough time to read it all, so something short is needed to understand whether RAID is necessary at all, and which level is best for working with a DBMS (InterBase, Firebird or anything else - it does not really matter). Just such a text is before your eyes.

To a first approximation, RAID is the combining of disks into one array. SATA, SAS, SCSI, SSD - it doesn't matter. Moreover, almost every normal motherboard now supports SATA RAID. Let's go through the list of RAID types and what they are for. (Note right away that a RAID should combine identical disks. Combining disks from different manufacturers, of different types or of different sizes is self-indulgence for someone sitting at a home computer.)

RAID 0 (Stripe)

Roughly speaking, it is a sequential combination of two (or more) physical disks into one "physical" disk. It is only suitable for organizing huge disk spaces, for example for video editing. There is no point keeping databases on such disks - indeed, if your database is 50 GB, why buy two 40 GB disks rather than one 80 GB disk? Worst of all, in RAID 0 the failure of any one drive makes the whole array inoperable, because data is written alternately to both drives, and RAID 0 has no means of recovery after a failure.

Of course, RAID 0 offers faster performance because of the read / write interleaving.

RAID 0 is often used to accommodate temporary files.
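The alternating writes described above can be sketched in a few lines (a toy model, not a driver; names are ours):

```python
def stripe(data, n_disks, block_size):
    """Cut data into blocks and deal them round-robin across disks (RAID 0)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks

print(stripe(b"ABCDEFGH", n_disks=2, block_size=2))
# [[b'AB', b'EF'], [b'CD', b'GH']] - lose either disk and the data is unrecoverable
```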

RAID 1 (Mirror)

Disk mirroring. If Shadow in IB/FB is software mirroring (see Operations Guide.pdf), then RAID 1 is hardware mirroring, nothing more. Avoid software mirroring done by the OS or third-party software: you need either "hardware" RAID 1 or shadow.

In the event of a failure, carefully check which drive has failed. The most common cause of data loss on RAID 1 is incorrect recovery actions - the wrong drive being designated as the "intact" one.

As for performance, the gain on writes is zero; on reads it can be up to 1.5x, since reads can be performed "in parallel" (alternately from the two disks). For databases the speedup is small, whereas parallel access to different (!) parts (files) of the disk definitely does speed things up.

RAID 1 + 0

By RAID 1+0 we mean the variant of RAID 10 in which two RAID 1s are combined into a RAID 0. The variant in which two RAID 0s are combined into a RAID 1 is called RAID 0+1, and "from the outside" it looks like the same RAID 10.

RAID 2-3-4

These RAID levels are rare because they use Hamming codes, byte-level splitting with checksums, and so on. The general summary: they provide only reliability, with zero performance gain, and sometimes even a loss.

RAID 5

It requires at least 3 disks. Parity data is distributed across all the disks in the array.

It is usually said that "RAID 5 uses independent disk access, so requests to different disks can be executed in parallel." Bear in mind that this, of course, refers to parallel I/O requests. If requests arrive sequentially (as in SuperServer), you will of course not get the parallel-access benefit of RAID 5. RAID 5 will, of course, give a performance gain if the operating system and other applications work with the array as well (for example, if it holds virtual memory, TEMP, etc.).
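The parity RAID 5 distributes across its disks is a byte-wise XOR; a minimal sketch (names are ours) of why one lost disk is recoverable:

```python
def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks - the checksum RAID 5 stores."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

d0, d1 = b"\x0f\xf0", b"\x33\x33"
p = xor_parity([d0, d1])
# If d1 dies, XORing the survivors with the parity rebuilds it:
assert xor_parity([d0, p]) == d1
```

This is also why writes are expensive: every write must update the parity block as well as the data block.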

In general, RAID 5 used to be the most commonly used disk array for DBMS work. Now such an array can be built on SATA disks, which turns out much cheaper than on SCSI. Prices and controllers can be found in the articles below.
Note the capacity of the purchased disks as well: in one of the articles mentioned, a RAID 5 is assembled from four 34 GB disks, giving a "disk" of 103 GB.

Testing five SATA controllers RAID - http://www.thg.ru/storage/20051102/index.html.

Adaptec SATA RAID 21610SA on RAID 5 - http://www.ixbt.com/storage/adaptec21610raid5.shtml.

Why RAID 5 is Bad - https://geektimes.ru/post/78311/

Attention! When buying disks for RAID 5, people usually take the minimum of three (most likely because of the price). If after some time one of the disks fails, you may find it impossible to buy an identical one (discontinued, temporarily out of stock, etc.). A more interesting idea therefore is to buy four disks, build the RAID 5 from three, and connect the fourth as a spare (for backups, other files and other needs).

The capacity of a RAID 5 array is calculated by the formula (n-1) * hddsize, where n is the number of disks in the array and hddsize is the size of one disk. For example, for an array of four 80 GB disks, the total capacity will be 240 GB.
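The formula above in code form (function name is ours):

```python
def raid5_capacity(n_disks, disk_size_gb):
    """Usable RAID 5 capacity: (n - 1) * hddsize, one disk's worth goes to parity."""
    return (n_disks - 1) * disk_size_gb

print(raid5_capacity(4, 80))   # 240 GB, the example from the text
print(raid5_capacity(3, 250))  # 500 GB, the three-disk array discussed earlier
```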

Much has been written about the "unsuitability" of RAID 5 for databases. At the very least, bear in mind that to get good RAID 5 performance you need a dedicated controller, not whatever is on the motherboard by default.

Article RAID-5 must die. And more about data loss on RAID5.

Note. As of 05.09.2005 the cost of SATA Hitachi disk 80Gb is $ 60.

RAID 10, 50

Then there are combinations of the listed options. For example, RAID 10 is RAID 0 + RAID 1. RAID 50 is RAID 5 + RAID 0.

Interestingly, RAID 0+1 turns out to be less reliable than RAID 5. Our database repair service has seen the failure of one disk in a RAID 0 (3 disks) + RAID 1 (3 more of the same) system in which the RAID 1 was unable to bring the spare disk online. The database was damaged beyond any chance of repair.

RAID 0 + 1 requires 4 drives, and RAID 5 requires 3. Think about it.

RAID 6

Unlike RAID 5, which uses parity to protect data from single failures, RAID 6 uses a second, independently computed parity to protect against double failures. Accordingly, it needs a more powerful processor than RAID 5, and at least 4 disks rather than 3, since the capacity of two disks is consumed by parity.
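By analogy with the RAID 5 formula given earlier, the usable capacity of RAID 6 loses two disks' worth to parity (a sketch; function name and the 4-disk minimum of the common definition are our assumptions):

```python
def raid6_capacity(n_disks, disk_size_gb):
    """Usable RAID 6 capacity: two disks' worth goes to the two parity sets."""
    if n_disks < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    return (n_disks - 2) * disk_size_gb

print(raid6_capacity(5, 80))  # 240 GB from five 80 GB disks
```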

The scenario it protects against - two disks failing at the same time - is, however, quite rare.

As for RAID 6 performance, I have not seen data (nor looked for it), but it may well be that, because of the extra checking, it ends up at about the level of RAID 5.

Rebuild time

Any RAID array that remains operational after a single disk failure has a characteristic called rebuild time. When you replace a dead disk with a new one, the controller must integrate the new disk into the array, and this takes a certain amount of time.

While a new disk is being "plugged in" - for RAID 5, for example - the controller may be allowed to keep serving the array. But the array's speed in this state will be very low, if only because even a "linear" fill of the new disk keeps "distracting" the controller and the disk heads to synchronize it with the rest of the array.

The rebuild time in normal mode depends directly on disk size. For example, a Sun StorEdge 3510 FC Array of 2 terabytes rebuilds in exclusive mode in about 4.5 hours (at a hardware price of about $40,000). So when designing an array and planning disaster recovery, the first thing to think about is rebuild time. If your database and backups occupy no more than 50 GB and grow by 1-2 GB a year, there is hardly any sense in building an array of 500 GB disks: 250 GB disks will be enough, and even a RAID 5 of them will give at least 500 GB of space - room not only for the database but for movies as well. And the rebuild time for 250 GB disks will be roughly half that for 500 GB disks.
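The "rebuild time grows with disk size" rule of thumb can be made concrete; a rough sketch (the 100 GB/hour default is a hypothetical rate, not a measured one):

```python
def rebuild_hours(disk_size_gb, rate_gb_per_hour=100):
    """Rough estimate: the whole replacement disk must be rewritten.

    rate_gb_per_hour is an assumed figure; real rates depend on the
    controller, the load on the array, and the disks themselves.
    """
    return disk_size_gb / rate_gb_per_hour

print(rebuild_hours(250))  # 2.5 h
print(rebuild_hours(500))  # 5.0 h - about twice as long, as argued above
```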

Summary

It turns out that the most sensible choice is either RAID 1 or RAID 5. The most common mistake, however - one almost everyone makes - is to use one RAID "for everything": build the array, pile everything onto it, and ... at best get reliability, but no performance improvement.

Also, the write cache is often left disabled, as a result of which writing to the RAID is slower than to a single disk. For most controllers this option is off by default: it is considered that enabling it calls for at least a battery on the RAID controller, as well as a UPS.

The old article hddspeed.htmLINK (and doc_calford_1.htmLINK) shows how you can get a significant performance gain by using several physical disks, even for IDE. Accordingly, if you build a RAID, put the database on it and keep the rest (temp, OS, virtual memory) on other hard disks. After all, a RAID is itself still one "disk", even if a more reliable and faster one.

This advice is now considered deprecated. All of the above does have a right to exist on RAID 5. However, before such placement you need to find out how you will back up and restore the operating system, how long that will take, how long it will take to replace a "dead" disk, whether a replacement disk is (or will be) at hand, and so on - that is, you need to know in advance the answers to the most elementary questions in case of a system failure.

I still advise keeping the operating system on a separate SATA disk or, if you prefer, on two SATA disks combined into RAID 1. In any case, when placing the operating system on a RAID, you should plan your actions for the case when the motherboard suddenly fails: moving the disks of a RAID array to another motherboard (chipset, RAID controller) is sometimes impossible because of incompatible default RAID parameters.

Base placement, shadow and backup

Despite all the advantages of RAID, it is strongly discouraged, for example, to make backups to the same logical disk. Not only does this hurt performance, it can also lead to running out of free space (on large databases): depending on the data, the backup file can be comparable to the database in size, or even larger. Backing up to the same physical disk is just about acceptable, though the best option is a backup to a separate hard drive.

The explanation is very simple. Backup is reading data from a database file and writing to a backup file. If all this happens physically on one disk (even RAID 0 or RAID 1), then the performance will be worse than if it reads from one disk and writes to another. An even greater gain from this separation is when the backup is done while users are working with the database.

The same goes for shadow: there is no point in placing the shadow, for example, on the same RAID 1 as the database, even on a different logical drive. With a shadow present, the server writes data pages both to the database file and to the shadow file; that is, two write operations are performed instead of one. When the database and shadow are separated onto different physical disks, write performance will be determined by the slowest drive.

When building a file server or a productive workstation, one often faces the problem of choosing the disk subsystem configuration. Modern motherboards, even budget ones, offer the ability to create RAID arrays of all popular levels; software RAID implementations should not be forgotten either. Which option is more reliable and faster? We decided to conduct our own tests.

Test bench

As a rule, in small and medium-sized businesses the role of file servers, department-level servers, etc. is played by an ordinary PC assembled from ordinary, budget components. The purpose of our testing was to study the performance of a disk subsystem built with a chipset RAID controller and compare it with software RAID implementations (using OS facilities). The reason for the testing was the lack of publicly available objective tests of budget RAID, as well as the large number of "myths and legends" on the subject. We did not select hardware specially but used what was at hand: several ordinary PCs intended for an upcoming deployment, one of which served as the test bench.

PC configuration:

  • Motherboard: ASUS M4N68T-M SocketAM3
  • Processor: CPU AMD ATHLON II X2 245 (ADX245O) 2.9 GHz / 2Mb / 4000 MHz Socket AM3
  • RAM: 2 x Kingston ValueRAM DDR-III DIMM 1Gb
  • Hard drives: HDD 320 Gb SATA-II 300 Western Digital Caviar Blue 7200rpm 16Mb
  • Operating system: Windows Server 2008 SP2 (32-bit)
  • File System: NTFS

The disk subsystem was configured as follows: an operating system was installed on one disk, a RAID array was assembled from two or three others.

Testing technique

We chose Intel NAS Performance Toolkit as the test software; the current package offers a set of tests that evaluate disk-subsystem performance on the main typical tasks. Each test was run five times and the final result is the mean. We took the performance of a single hard drive as the baseline.

We tested RAID 0, RAID 1 and RAID 5 arrays; RAID 5 was tested both in normal mode and in degraded mode, with one drive removed. Why test only this array in degraded mode? The answer is simple: RAID 0 has no such mode - if any disk fails, the array is destroyed - and the single remaining disk of a RAID 1 is no different from a standalone disk.

We tested both hardware and software implementations. Initially we also measured average CPU load, since there is an opinion that software RAID heavily loads the processor. However, we decided not to include this measurement in the results: the load turned out to be roughly equal, about 37-40% for a single disk, RAID 0 and RAID 1, and 40-45% for RAID 5.

File operations

The classic operations for any drive are reads and writes. Intel NASPT evaluates them in four tests: copying a 247 MB file to and from the drive, and doing the same with 44 folders containing 2,833 files totaling 1.2 GB.

Reading / writing files

Looking at the results of the reference disk, we see that the write speed is almost twice (89%) higher than the read speed. This is due to peculiarities of the file system, and the fact should be kept in mind. RAID 0 (the striped array), regardless of implementation, was 70% faster than a single disk, while the speed of RAID 1 (mirror) was completely identical to it.

RAID 5 deserves a separate discussion: its write speed is unacceptably low, a slowdown of up to 70%, while its read speed is on par with the fast RAID 0. Perhaps this is due to a lack of computing resources and imperfect algorithms, since writes spend extra resources computing checksums. When one of the disks fails, the write speed drops further; the decline is less pronounced for the hardware solution (15%) than for the software one (40%). Read speed in this state drops significantly, to the level of a single disk.

Read / write folders

Everyone who has tried to copy a scattering of small files knows it is better to pack them into an archive first - it goes much faster. Our tests only confirm this rule of thumb: reading a scattering of small files and folders is almost 60% slower than reading one large file, and the write speed is also slightly (10%) lower.

RAID 0 gives a much smaller advantage on writes (30-40%), and on reads the difference is generally negligible. As expected, RAID 1 brings no surprises, running neck and neck with a single disk.

RAID5 fares much better on small files, but still trails a single disk by an average of 35%. Read speed is no different from the other configurations; we tend to believe that here the hard drive's random access time is the limiting factor. But when we removed one disk from the array, we got a very unexpected result that made us double-check it several times, including on another hard drive model (500 GB Seagate/Maxtor Barracuda 7200.12 / DiamondMax 23 <3500418AS>, 7,200 rpm, 16 MB): the write speed of the hardware array dropped sharply (almost threefold), while the write speed of software RAID5, on the contrary, increased, possibly due to the algorithm of the software implementation. We prefer to leave this "phenomenon" without comment.

Working with applications

The following tests reflect disk-subsystem performance with various kinds of applications, primarily office ones. The first test (Content Creation) models using the disk to store and work with data: the user creates, opens and saves documents without much activity. The most intensive test is Office Productivity, which simulates active work with documents and searching for information on the Internet (the browser cache is flushed to the drive), touching a total of 616 files in 45 directories with a volume of 572 MB. The last test, working with a photo album (mostly viewing), is more typical of home use and includes 1.2 GB of photos (169 files, 11 directories).

Work with documents

When working with single files, RAID0 is predictably almost twice as fast as RAID1 and a single hard disk (the Content Creation test). However, under active workloads it loses all its advantages: in the Office Productivity test, RAID0, RAID1 and a single disk show the same results.

RAID5 is the obvious outsider in these tests. Its performance on single files is extremely low, with the hardware implementation showing a much better (but still extremely low) result. Under active office work the results improve considerably, but remain lower than those of a single disk and the simpler arrays.

Working with photos

In this mode all arrays showed approximately the same result, comparable to the performance of a single disk. RAID5 was slightly slower, but in this case the lag is unlikely to be noticeable to the naked eye.

Multimedia

Finally, the multimedia tests, which we divided into two parts: playback and recording. In the first, HD video is played from the drive in one, two and four simultaneous streams. The second covers recording, and simultaneous recording and playback of two files. This test applies to more than just video, since it characterizes linear read/write behavior on a disk array in general.

Playback

RAID0

This type of array is the clear leader when working with large files and multimedia. In most cases it delivers a significant advantage (about 70%) over a single disk. However, it has one significant drawback: extremely low fault tolerance. If one disk fails, the entire array is lost. It offers no particular advantage in office applications or when working with photographs.

Where is RAID0 applicable? First of all, on workstations that by the nature of their tasks work with large files, such as video editing. If fault tolerance is required, use RAID10 or RAID0+1 (a stripe across two mirrors, or a mirror of two stripes). These levels combine the speed of RAID0 with the reliability of RAID1; the disadvantage is significant overhead, since only half of the combined disk capacity is available for storage.
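Why striping speeds up large transfers can be shown with a minimal sketch (again a simplification of a real controller): data is cut into fixed-size chunks dealt round-robin across the member disks, so a sequential transfer keeps all disks busy in parallel.

```python
CHUNK = 4  # stripe unit size in bytes (real arrays use 16-128 KB)

def stripe(data: bytes, n_disks: int):
    """Deal fixed-size chunks round-robin across n_disks 'disks'."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), CHUNK):
        disks[(i // CHUNK) % n_disks] += data[i:i + CHUNK]
    return disks

def unstripe(disks, total_len):
    """Reassemble the original byte stream from the stripes."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < total_len:
        out += disks[d][offsets[d]:offsets[d] + CHUNK]
        offsets[d] += CHUNK
        d = (d + 1) % len(disks)
    return bytes(out)

payload = b"0123456789ABCDEF!"
disks = stripe(payload, 2)
assert unstripe(disks, len(payload)) == payload
```

The sketch also makes the fragility obvious: every disk holds an interleaved fraction of every file, so losing any one disk destroys the whole array.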

RAID1

The "mirror" has no speed advantage over a single disk; its main task is to provide fault tolerance. It is recommended for office work and small files, i.e. for tasks where the gap to faster arrays is not so large. It is also a reasonable fit for 1C:Enterprise 7.7 in file mode, whose disk-access pattern is a cross between Office Productivity and directory copying to/from a NAS. For more demanding workloads it is not recommended; look at RAID10 and RAID0+1 instead.

RAID5

We would not recommend this kind of array in budget systems: on write operations RAID5 is significantly inferior even to a single hard disk. The one area where it is justified is media servers for storing multimedia data, where reading is the dominant mode. Here its strengths come into play: high read speed (at the RAID0 level) and lower fault-tolerance overhead (1/3 of capacity in our three-disk array), which pays off when building storage of significant volume. Remember, though, that writing to the array causes a sharp drop in performance, so uploading new data to such media servers is best done during off-peak hours.
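The overhead comparison reduces to quick arithmetic (disk counts and sizes below are illustrative): mirrored levels always give up half the raw capacity, while RAID5 gives up one disk's worth, i.e. 1/n of the array.

```python
def usable_capacity(level: str, n_disks: int, disk_gb: float) -> float:
    """Usable capacity in GB for common RAID levels built from n identical disks."""
    raw = n_disks * disk_gb
    if level == "RAID0":
        return raw                              # no redundancy, full capacity
    if level in ("RAID1", "RAID10", "RAID0+1"):
        return raw / 2                          # everything is stored twice
    if level == "RAID5":
        return raw * (n_disks - 1) / n_disks    # one disk's worth of parity
    raise ValueError(f"unknown RAID level: {level}")

# Three 400 GB disks, as in the RAID5 configuration discussed above:
print(usable_capacity("RAID5", 3, 400))   # 800 GB usable, 1/3 lost to parity
```

With more disks the RAID5 overhead shrinks further (1/4 with four disks, 1/5 with five), which is exactly why it suits large read-mostly storage.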

Hardware or software?

The test results did not reveal notable advantages or disadvantages for either implementation, except RAID5, where the hardware option produced higher results in a number of cases. The choice should therefore be based on other factors, such as compatibility and portability.

Hardware RAID is implemented by the chipset's south bridge (or a separate controller) and requires OS support, or loading drivers at installation time. For the same reason it is often impossible to use disk and system utilities that boot from their own media: if their loader lacks support for the RAID controller, the software simply will not see your array.

The second drawback is vendor lock-in: if you decide to change platforms or choose a motherboard with a different chipset, you will have to copy your data to external media (which can itself be problematic) and rebuild the array. Worse, if the motherboard fails unexpectedly, you will have to hunt down a similar model just to regain access to your data.

Software RAID is supported at the OS level and is therefore largely free of these shortcomings: the array is easily assembled and easily moved between hardware platforms, and in the event of a hardware failure the data can be accessed on another PC running a compatible version of Windows (lower editions do not support dynamic disks).
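Assembling such an array in Windows comes down to converting the disks to dynamic and creating a volume of the desired type; a sketch of a diskpart script (disk numbers are illustrative, conversion should only be done on disks whose data is backed up):

```shell
REM diskpart script - run with: diskpart /s raid.txt
REM Convert two data disks to dynamic (required for software RAID).
select disk 1
convert dynamic
select disk 2
convert dynamic
REM Create a striped (RAID0-style) volume spanning both disks.
create volume stripe disk=1,2
```

Mirrors are built similarly by adding a disk to a simple volume, and RAID5 volumes are available only in the server editions of Windows.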

Among the shortcomings, note that Windows cannot be installed on software RAID0 or RAID5 volumes, because installing Windows onto a dynamic volume is possible only if that volume was converted from a basic boot or system volume.
