Archives

RAID article in English

Written on 2005-12-04 00:00:00 by cpina

This article is a benchmark of RAID 0 Striped, RAID 0 Linear and RAID 1 Mirroring. Which is faster? By how much? These and other questions are answered here.

------------------

The comparison of RAID systems, published a few days ago in Catalan and Spanish, is now also available in English.

Our twinned colleagues at Scarborough LUG helped us review the translation.
 

Index



  • Introduction
  • A short review of RAID
  • Benchmarks and methodology
  • Results
  • Conclusions
  • Annex



    Main parts: Results and conclusions.

    Introduction


    For a long time I had wanted to set up a mirrored RAID under Linux on some computers. A few days ago, I decided to look for more information.
    Obviously, the first thing to decide was whether to use Hardware RAID or Software RAID. What are the advantages and drawbacks of each? The main ones are:
    Hardware RAID:

  • Theoretically it's faster
  • It is said to fail less often, so it is more trustworthy


    Software RAID:

  • More flexibility when combining devices (different hard disks can be used without problems, because it is normally done at partition level)
  • Cheaper: it's not necessary to buy a new device
  • It doesn't depend on a RAID card, which can break
  • It has a reputation for being slow


    Despite this reputation, my questions were: is a RAID 1 system too slow? Is it slower than having no RAID at all? In fact, some people say it is the other way round: that Software RAID 1 is faster than a single hard disk drive. So who is right?

    For that reason I am going to ask the following question: "Is Software RAID slower than a single hard disk?". We will try to repeat the benchmark when a RAID card is available.
    The only reference we found is Chapter 9, Performance, of the Software RAID Howto, but it is not very complete.

    A short review of RAID


    This article is not going to explain either how to configure RAID or how it works internally. We will, however, review it briefly, just for reference.

    RAID is a management system for hard disks (a RAID can even be built from a pendrive and a file inside a hard disk, among many other combinations). Software RAID allows many possible configurations, such as:


  • RAID 0 Linear Mode: a 20 GB partition plus another 20 GB partition become a single 40 GB one, as if placed side by side
  • RAID 0 Striped: the capacity is the same as in linear mode, 20 GB + 20 GB = 40 GB, but in this case a 10 MB file will probably have 5 MB in one partition and 5 MB in the other. Theoretically it has excellent file access speeds
  • RAID 1 Mirroring: 20 GB + 20 GB = 20 GB, but if one hard disk fails it doesn't matter: the array keeps working properly and informs us that we have to replace the disk. You can add more hard disks, 20 GB + 20 GB + 20 GB, in case two of them fail
  • RAID 5: 20 GB + 20 GB + 20 GB = 40 GB; one hard disk can fail (not two). This system has not yet been benchmarked
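
    The arithmetic in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every member partition has the same size (as in the 20 GB examples), metadata overhead is ignored, and the 64 KiB chunk size and the function names are hypothetical choices, not values taken from the benchmark.

```python
def usable_capacity_gb(mode, member_gb, members):
    """Usable size of an array built from `members` equal partitions.

    Simplified model: same-sized members, no metadata overhead.
    """
    if mode in ("linear", "striped"):
        return member_gb * members          # all the space is usable
    if mode == "raid1":
        return member_gb                    # every member holds a full copy
    if mode == "raid5":
        return member_gb * (members - 1)    # one member's worth of parity
    raise ValueError(f"unknown mode: {mode}")


def bytes_per_member(file_bytes, members, chunk_bytes):
    """Spread a contiguous file over the members chunk by chunk (striping).

    Assumes the file starts on a chunk boundary of the first member.
    """
    totals = [0] * members
    offset = 0
    chunk_index = 0
    while offset < file_bytes:
        piece = min(chunk_bytes, file_bytes - offset)
        totals[chunk_index % members] += piece   # round robin over members
        offset += piece
        chunk_index += 1
    return totals


for mode, members in (("linear", 2), ("striped", 2), ("raid1", 2), ("raid5", 3)):
    print(mode, usable_capacity_gb(mode, 20, members), "GB")

# A 10 MiB file on a two-member stripe with an assumed 64 KiB chunk size.
print(bytes_per_member(10 * 1024 * 1024, 2, 64 * 1024))
```

    With two members the 10 MiB file ends up split evenly, which is the "5 MB in one partition and 5 MB in the other" case described for RAID 0 Striped.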


    There are many reasons to choose a RAID system. We can use it to protect our data (RAID 1 and RAID 5) or just to have a single volume with a lot of capacity (RAID 0).

    Benchmarks and Methodology


    The Benchmarks


    The following benchmarks are reasonably realistic and varied. Each reader should therefore study the part most relevant to them and draw their own conclusions.

    The first benchmark was run with bonnie++. This program performs a series of operations in the given directory. First it simulates a database workload: writing and reading blocks and characters, sequentially and at random. Afterwards it creates, reads and deletes a lot of files, as Postfix or Squid do. Bonnie reports a result for every section (for example in KB/s), but this benchmark only takes the total time of the process. An annex with that information may be added in the future.

    Then we run dd from /dev/zero to a file on the RAID device, to find out the write speed of the RAID.
    Next we do the same the other way round: we read back the file that was written from /dev/zero and send it to /dev/null, to see how efficient reads are.

    The next step is a cat /no_raid/linux-2.6.13.tar > /dev/null, to get the file into the buffers and cache, followed by copying it to the RAID (writing again).

    Then we unpack it with tar -xf linux.tar inside the RAID (this is almost entirely writing, because the tarball has just been copied and the system has enough RAM to hold the file).

    Finally a cp -rf linux-2.6.13 test/ is done, so the whole kernel tree, almost 300 MB, is copied.

    SUMMARY

  • Bonnie++ (varied operations)
  • dd /dev/zero file: writing
  • cp file /dev/null: reading
  • cat /noraid/linux.tar > /dev/null; cp /noraid/linux.tar /raid/linux.tar: a small write. This file is too small to draw conclusions from (cat is done to load the file into RAM)
  • tar -xf linux.tar: almost 300 MB. Reading/writing
  • cp -rf linux/ directory: 300 MB copied. Reading/writing
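
    As a rough sketch, the sequence summarised above could be written down as a list of labelled shell commands. The original script is not published, so the following is an assumption throughout: the function name, the zero.img file name and the dd options bs=1M count=7000 (chosen to match the "7000M" label in the results) are hypothetical, and the commands are only built here, not executed.

```python
def benchmark_steps(raid_dir, noraid_dir, size_mb=7000):
    """Return (label, shell command) pairs for one benchmark pass.

    Labels follow the row names of the results tables. File names and
    dd options are assumptions, not taken from the original script.
    """
    big_file = f"{raid_dir}/zero.img"        # hypothetical file name
    tarball = f"{noraid_dir}/linux.tar"
    return [
        ("Bonnie", f"bonnie++ -d {raid_dir}"),
        ("dd null 7000M", f"dd if=/dev/zero of={big_file} bs=1M count={size_mb}"),
        ("cp 7000M null", f"cp {big_file} /dev/null"),
        ("cp linux.tar to RAID",
         f"cat {tarball} > /dev/null; cp {tarball} {raid_dir}/linux.tar"),
        ("tar -xf linux.tar", f"cd {raid_dir} && tar -xf linux.tar"),
        ("cp -rf linux test", f"cd {raid_dir} && cp -rf linux-2.6.13 test/"),
    ]


for label, command in benchmark_steps("/raid", "/noraid"):
    print(f"{label:22s}{command}")
```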

    It's worth mentioning that the "real time" measurement has been used in every test. In none of the tests does the relationship between "real time", "user time" and "system time" change significantly between a single hard disk and RAID. Software RAID must use more CPU time, because it is done by the kernel without hardware support (it is not real hardware RAID), but this has been neither critical nor limiting in our benchmarks. It only matters when synchronizing the RAID 1 mirror (normally only the first time), and even that should not be critical for today's CPUs.
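
    For readers who want to collect the same three figures from a script, here is a minimal sketch for a Unix system. It is not the script used for this article: the time_command helper is hypothetical, and the command it runs below is a harmless stand-in rather than one of the benchmark steps.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, system) seconds for the child.

    "real" is wall-clock time; user and system are the CPU times that
    the operating system accounts to the finished child process.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system


if __name__ == "__main__":
    # Stand-in workload; a real run would pass a benchmark command instead.
    real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"real {real:.2f}s  user {user:.2f}s  sys {system:.2f}s")
```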

    Concurrency


    Here is the main point of the article. What happens when more than one process accesses the RAID? Is it slower or faster than a single hard disk? And if so, with which RAID modes?

    We have run the benchmarks with one, two, three and four concurrent processes. This may be less useful for domestic environments, but it is the usual situation on servers (several databases accessed simultaneously, file servers, web servers, etc.). The results are very interesting, mainly in how speed relates to concurrency for RAID 0, RAID 1, etc.
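
    One way to launch several processes at once and time the whole group is sketched below. This is an assumption about how such a run could be organised, not the original script: run_concurrently is a hypothetical helper and the command is a stand-in. In a real run each process would presumably need its own target file or directory so that the copies do not overwrite each other.

```python
import subprocess
import sys
import time


def run_concurrently(argv, processes):
    """Start `processes` copies of argv together and wait for all of them.

    Returns (elapsed wall-clock seconds, list of exit codes).
    """
    start = time.perf_counter()
    children = [subprocess.Popen(argv) for _ in range(processes)]
    codes = [child.wait() for child in children]
    return time.perf_counter() - start, codes


if __name__ == "__main__":
    stand_in = [sys.executable, "-c", "sum(range(10**6))"]
    for count in (1, 2, 3, 4):
        elapsed, codes = run_concurrently(stand_in, count)
        print(f"{count} process(es): {elapsed:.2f}s, exit codes {codes}")
```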

    Repeating the benchmarks


    We repeated the aforementioned benchmarks 4 times (except those with 4 processes, which were repeated 3 times) and took the average of the results. The benchmarks were automated with a script, so exactly the same steps were run in every case.
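
    Averaging times written as mm:ss can be sketched as follows. The individual runs are not published, so the example applies the same arithmetic to two published figures (the single-process Bonnie times of the two single disks) purely as an illustration; the helper names are hypothetical.

```python
def to_seconds(clock):
    """Convert 'mm:ss' or 'hh:mm:ss' to whole seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_clock(seconds):
    """Format a number of seconds as 'mm:ss', rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def average_clock(clocks):
    """Arithmetic mean of several 'mm:ss' times, returned as 'mm:ss'."""
    return to_clock(sum(to_seconds(c) for c in clocks) / len(clocks))


print(average_clock(["07:08", "07:06"]))  # prints 07:07
```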

    Results


    Tables with times


    The total time isn't very relevant: by copying different amounts of data we could turn the best result into the worst. Think about which operations you normally do and look for the benchmark that fits you best.



    Summary table: times

    1 process (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                07:08          07:06          07:07          07:03          06:31           07:35
    dd null 7000M         02:39          02:39          02:39          02:41          02:05           04:01
    cp 7000M null         03:15          03:19          03:17          03:18          02:38           03:32
    cp linux.tar to RAID  00:04          00:05          00:04          00:04          00:05           00:05
    tar -xf linux.tar     00:29          00:27          00:28          00:25          00:25           00:31
    cp -rf linux test     00:58          00:57          00:57          00:49          00:50           01:01
    Total time            00:14:32       00:14:31       00:14:32       00:14:19       00:12:34        00:16:44
    % slower              0.06%          -0.06%         0.00%          -1.46%         -13.56%         15.17%

    2 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                15:29          15:57          15:43          13:39          13:52           15:02
    dd null 7000M         05:34          05:41          05:38          05:37          04:01           07:57
    cp 7000M null         06:00          07:54          06:57          07:12          06:33           05:39
    cp linux.tar to RAID  00:09          00:17          00:13          00:14          00:12           00:20
    tar -xf linux.tar     02:26          01:15          01:50          00:56          00:56           01:09
    cp -rf linux test     03:04          02:45          02:55          02:00          02:08           02:07
    Total time            00:32:41       00:33:49       00:33:15       00:29:38       00:27:42        00:32:13
    % slower              -1.70%         1.70%          0.00%          -10.87%        -16.70%         -3.11%

    3 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                23:52          24:58          24:25          20:54          20:42           22:10
    dd null 7000M         08:24          08:52          08:38          08:45          06:01           11:51
    cp 7000M null         09:30          12:56          11:13          12:38          09:25           08:12
    cp linux.tar to RAID  00:14          00:23          00:19          00:23          00:20           00:27
    tar -xf linux.tar     01:49          01:42          01:45          01:25          01:22           01:44
    cp -rf linux test     05:15          04:53          05:04          03:12          03:47           03:18
    Total time            00:49:04       00:53:45       00:51:25       00:47:18       00:41:36        00:47:43
    % slower              -4.55%         4.55%          0.00%          -7.99%         -19.08%         -7.19%

    4 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                32:44          34:07          33:25          28:02          27:53           29:54
    dd null 7000M         11:11          11:56          11:33          11:49          08:04           15:47
    cp 7000M null         12:47          17:35          15:11          17:23          12:36           10:02
    cp linux.tar to RAID  00:21          00:34          00:28          00:32          00:27           00:40
    tar -xf linux.tar     02:28          02:08          02:18          01:51          01:46           02:18
    cp -rf linux test     05:56          05:36          05:46          03:11          04:16           03:50
    Total time            01:05:26       01:11:55       01:08:41       01:02:48       00:55:02        01:02:31
    % slower              -4.72%         4.72%          0.00%          -8.55%         -19.86%         -8.97%
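
    The "% slower" row compares each total time with the "Disks average" column. A minimal sketch of that calculation, assuming the usual relative-difference formula (the helper names are hypothetical):

```python
def to_seconds(clock):
    """Convert 'mm:ss' or 'hh:mm:ss' to whole seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def percent_slower(candidate, baseline):
    """Relative difference in percent; negative means faster than the baseline."""
    base = to_seconds(baseline)
    return (to_seconds(candidate) - base) / base * 100.0


# 1 process: RAID 0 Striped total time against the disks average.
print(f"{percent_slower('00:12:34', '00:14:32'):.2f}%")  # prints -13.53%
```

    The result differs slightly from the published -13,56%, presumably because the table was computed from times that had not yet been rounded to whole seconds.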




    Percentage table


    The last row of every table is the average of the per-test percentages. It is used instead of the percentage of the total time so that the result does not depend on how long each benchmark lasts.







    Summary table 2: percentages

    1 process (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                0.29%          -0.29%         0.00%          -1.00%         -8.32%          6.50%
    dd null 7000M         -0.08%         0.08%          0.00%          1.18%          -21.20%         52.09%
    cp 7000M null         -0.95%         0.95%          0.00%          0.57%          -19.87%         7.56%
    cp linux.tar to RAID  -5.88%         5.88%          0.00%          0.00%          5.88%           23.53%
    tar -xf linux.tar     2.68%          -2.68%         0.00%          -11.61%        -9.82%          9.82%
    cp -rf linux test     1.31%          -1.31%         0.00%          -14.41%        -13.10%         5.68%
    Total time            00:14:32       00:14:31       00:14:32       00:14:19       00:12:34        00:16:44
    % slower              0.06%          -0.06%         0.00%          -1.46%         -13.56%         15.17%
    % average slower      -0.44%         0.44%          0.00%          -4.21%         -11.07%         17.53%

    2 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                -1.46%         1.46%          0.00%          -13.15%        -11.77%         -4.39%
    dd null 7000M         -1.06%         1.06%          0.00%          -0.28%         -28.49%         41.34%
    cp 7000M null         -13.72%        13.72%         0.00%          3.76%          -5.77%          -18.70%
    cp linux.tar to RAID  -32.04%        32.04%         0.00%          6.80%          -6.80%          53.40%
    tar -xf linux.tar     32.12%         -32.12%        0.00%          -49.24%        -49.69%         -37.45%
    cp -rf linux test     5.26%          -5.26%         0.00%          -31.04%        -26.53%         -27.39%
    Total time            00:32:41       00:33:49       00:33:15       00:29:38       00:27:42        00:32:13
    % slower              -1.70%         1.70%          0.00%          -10.87%        -16.70%         -3.11%
    % average slower      -1.81%         1.81%          0.00%          -13.86%        -21.51%         1.14%

    3 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                -2.25%         2.25%          0.00%          -14.39%        -15.26%         -9.22%
    dd null 7000M         -2.72%         2.72%          0.00%          1.31%          -30.41%         37.18%
    cp 7000M null         -15.30%        15.30%         0.00%          12.71%         -16.04%         -26.92%
    cp linux.tar to RAID  -25.39%        25.39%         0.00%          25.39%         8.76%           45.62%
    tar -xf linux.tar     3.13%          -3.13%         0.00%          -19.43%        -22.36%         -0.83%
    cp -rf linux test     3.60%          -3.60%         0.00%          -36.90%        -25.51%         -34.79%
    Total time            00:49:04       00:53:45       00:51:25       00:47:18       00:41:36        00:47:43
    % slower              -4.55%         4.55%          0.00%          -7.99%         -19.08%         -7.19%
    % average slower      -6.49%         6.49%          0.00%          -5.22%         -16.80%         1.84%

    4 processes (average)
                          Single disk 1  Single disk 2  Disks average  RAID 0 Linear  RAID 0 Striped  RAID 1
    Bonnie                -2.08%         2.08%          0.00%          -16.10%        -16.55%         -10.54%
    dd null 7000M         -3.21%         3.21%          0.00%          2.20%          -30.23%         36.65%
    cp 7000M null         -15.85%        15.85%         0.00%          14.51%         -16.97%         -33.93%
    cp linux.tar to RAID  -22.24%        22.24%         0.00%          17.40%         -0.76%          46.44%
    tar -xf linux.tar     7.26%          -7.26%         0.00%          -19.30%        -23.23%         0.18%
    cp -rf linux test     2.84%          -2.84%         0.00%          -44.82%        -26.01%         -33.57%
    Total time            01:05:26       01:11:55       01:08:41       01:02:48       00:55:02        01:02:31
    % slower              -4.72%         4.72%          0.00%          -8.55%         -19.86%         -8.97%
    % average slower      -5.55%         5.55%          0.00%          -7.69%         -18.96%         0.87%
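
    The "% average slower" row appears to be the plain mean of the six per-test percentages in the same column. The sketch below checks this for RAID 1 with one process, using the figures from the table; the variable and function names are hypothetical.

```python
from statistics import mean

# Per-test percentages for RAID 1, 1 process, taken from the table.
RAID1_ONE_PROCESS = {
    "Bonnie": 6.50,
    "dd null 7000M": 52.09,
    "cp 7000M null": 7.56,
    "cp linux.tar to RAID": 23.53,
    "tar -xf linux.tar": 9.82,
    "cp -rf linux test": 5.68,
}


def average_percent(percentages):
    """Unweighted mean: every test counts the same, whatever its duration."""
    return mean(percentages.values())


print(f"{average_percent(RAID1_ONE_PROCESS):.2f}%")  # prints 17.53%
```

    Because every test has the same weight, a short test such as the linux.tar copy influences this figure as much as the long Bonnie run, which is why it differs from the 15,17% obtained from the total time.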



    Graphs


    The graphs are based on the averages of the percentages, which are considered more representative than the times.





    Conclusions



    We can draw many conclusions; the most pertinent are:

  • RAID 1 Mirroring, 1 process. It is slower overall: 17.5% slower in the average of the per-test percentages. It trades speed for the security of having another disk in case one breaks, so it is safer but slower. How much slower depends on the operation. Writing to disk (dd from /dev/zero to a file) penalises it a lot (about 50% slower). The other slow point (cp /home/carles/linux.tar /mnt/md0) may not be representative because it is too short. Without the dd and cp linux.tar benchmarks, the result would have been between 5% and 10% slower (which agrees with the overall Bonnie result, about 6% slower). It is worth mentioning that for a network file server, a web server without a database and so on, the network is likely to be slower than the hard disk, so the final user won't be affected (databases are the exception, since they can make heavy use of the hard disk while sending only a little information).

  • RAID 1 Mirroring, 2, 3 and 4 simultaneous processes. The speed tends to level out with that of a single disk (with 2 and 3 processes it is 1% to 2% slower, with 4 processes less than 1%). Write operations seem to be the most penalised, so think about the mix of writes and reads in your system. In general there are more reads (queries) than writes: data is normally written once but read many times. You can also see that, except for the dd write and the cp of linux.tar to the RAID, it is faster than a single disk. Bonnie++ puts it 4% to 10% faster (9% with 3 processes).

  • RAID 0 Striped. It is faster than the others (with more than one process, in almost every benchmark). If we want to increase the speed of our system it is the best solution. For example, it is commonly used in video editing environments, where large files are handled. The good news is that you don't need Hardware RAID to gain speed: Software RAID is adequate and you can gain up to 20% (some operations, such as unpacking the tarball with 2 processes, come close to 50%).


  • RAID 1 Mirroring vs RAID 0 Striped. You can easily see that write operations penalise RAID 1 Mirroring much more than RAID 0 Striped. On the other hand, reading with several processes is better on RAID 1 than on RAID 0.
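
    The last comparison can be checked directly against the percentage table. The sketch below takes the "dd null 7000M" row as the write test and the "cp 7000M null" row as the read test, for 1 to 4 processes, and names the mode with the lowest percentage in each case. The data comes from the table; the helper and variable names are hypothetical.

```python
# Percent slower than the disks average, for 1, 2, 3 and 4 processes.
WRITE_DD = {
    "RAID 0 Striped": [-21.20, -28.49, -30.41, -30.23],
    "RAID 1": [52.09, 41.34, 37.18, 36.65],
}
READ_CP = {
    "RAID 0 Striped": [-19.87, -5.77, -16.04, -16.97],
    "RAID 1": [7.56, -18.70, -26.92, -33.93],
}


def winners(results):
    """For each process count, return the mode with the lowest percentage."""
    modes = list(results)
    runs = len(results[modes[0]])
    return [min(modes, key=lambda mode: results[mode][i]) for i in range(runs)]


print("write:", winners(WRITE_DD))
print("read: ", winners(READ_CP))
```

    RAID 0 Striped wins every write test, while for reads RAID 1 is ahead as soon as there are two or more processes.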


    Annex


    Benchmark relativity


    Any benchmark is completely relative and must be read very carefully. Indeed, in any given comparison product, configuration or application A may beat product B in most sections, yet that does not make A the better choice for every use.
    You have to be aware of this fact. There is no such thing as a 100% trustworthy benchmark; it is simply a guide that people can use for evaluation.
    In each particular case, we should ask whether the results depend on the hardware used. Will the results change if we increase or decrease the amount of RAM? They shouldn't, but we must think about it. Is our hard disk usage similar to the benchmark's, and if so in which sections? Or maybe just changing the RAID chunk size, or using another filesystem, will change the results. For all of these reasons, benchmarks should be read only as an indication. Every system has its own configuration and it is impossible to benchmark everybody's.

    Hardware


    I have used a Pentium 3 at 860 MHz, with 756 MB RAM and a 2.6.12 Kernel. The distribution used is Debian Sarge.

    One hard disk is a Western Digital with a capacity of 80 GB, configured with UDMA 5 as master of the primary IDE channel (hda). The other hard disk is a 120 GB Seagate, also configured with UDMA 5, as master of the secondary IDE channel (hdc). It may seem strange to have two different hard disks in a RAID, but it is more common than having two of the same model.

    Links



    The pages we consulted most are:

  • Gentoo Install on Software RAID
  • Software RAID Howto



    Statistics


    You can download a spreadsheet with all given data.

    Thanks


    Thanks to Scarborough LUG, who reviewed the language of the article. We added some text after the review, so we may have introduced new mistakes.
  • Categories: Articles, Server

