Does RAID1 increase performance with Linux mdadm?
Yes, the Linux implementation of RAID1 speeds up disk reads by up to a factor of two, as long as two separate read operations run at the same time. That means reading one 10 GB file won't be any faster on RAID1 than on a single disk, but reading two distinct 10 GB files will be faster.
To demonstrate it, just read some data with dd. Before running anything, clear the disk read cache with sync && echo 3 > /proc/sys/vm/drop_caches. Otherwise tools such as dd or hdparm will report unrealistically fast reads, because the data is served from the page cache instead of the disks.
Single file:
# COUNT=1000; dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT &
(...)
10485760000 bytes (10 GB) copied, 65,9659 s, 159 MB/s
Two files:
# COUNT=1000; dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT & dd if=/dev/md127 of=/dev/null bs=10M count=$COUNT skip=$COUNT &
(...)
10485760000 bytes (10 GB) copied, 64,9794 s, 161 MB/s
10485760000 bytes (10 GB) copied, 68,6484 s, 153 MB/s
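Reading 10 GB of data took 66 seconds, whereas reading 10 GB + 10 GB = 20 GB of data took 68.6 seconds in total. That is roughly 305 MB/s aggregate against 159 MB/s for the single read, which means multiple concurrent disk reads benefit greatly from RAID1 on Linux. The skip=$COUNT part is very important: it makes the second process read its 10 GB starting at the 10 GB offset, so the two processes never read the same blocks.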
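The comparison above can be wrapped in a small script. This is only a sketch: the function names (drop_caches, read_streams, timed_read) and the DEV, BS and COUNT defaults are made up for illustration, and it assumes a dd that accepts size suffixes such as 10M. Dropping the caches needs root.

```shell
#!/bin/sh
# Sketch: time N concurrent sequential readers on one device.
# DEV, BS and COUNT are illustrative defaults, override as needed.
DEV=${DEV:-/dev/md127}
BS=${BS:-10M}
COUNT=${COUNT:-1000}

drop_caches() {
    # Needs root; without it, repeated runs are served from the page cache.
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "warning: cannot drop caches (not root?)" >&2
    fi
}

read_streams() {
    # $1 = number of concurrent readers; reader i starts at block i*COUNT,
    # so no two readers touch the same region of the device.
    i=0
    while [ "$i" -lt "$1" ]; do
        dd if="$DEV" of=/dev/null bs="$BS" count="$COUNT" \
           skip=$((i * COUNT)) 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background dd has finished
}

timed_read() {
    # Prints wall-clock seconds (whole seconds only) for $1 readers.
    start=$(date +%s)
    read_streams "$1"
    end=$(date +%s)
    echo "$1 reader(s): $((end - start)) s"
}

# Typical use, as root:
#   drop_caches; timed_read 1
#   drop_caches; timed_read 2
```

If the array balances reads, the two-reader run should take about as long as the one-reader run while moving twice the data.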
Jared's answer and ssh's comments referring to http://www.unicom.com/node/459 are wrong. The benchmark there purports to prove that disk reads don't benefit from RAID1. However, the test was performed with the bonnie++ benchmarking tool, which doesn't perform two separate reads at the same time. The author explicitly states that bonnie++ is not usable for benchmarking RAID arrays (refer to the readme).
Yes, you will get a read performance boost plus the redundancy. Because every file is stored on both HDDs, different parts of the files can be read from the two disks at the same time.
So theoretically, if the RAID controller does its job right, you could gain a speedup of O(n), where n is the number of disks in the mirror.
man 4 md states: "… Note that the read balancing done by the driver does not make the RAID1 performance profile be the same as for RAID0; a single stream of input will not be accelerated (e.g. a single dd), but multiple sequential streams or a random workload will use more than one spindle. In theory, having an N-disk RAID1 will allow N sequential threads to read from all disks. …"
To top it off: in practice, based on iostat output observed on a typical two-HDD software RAID setup, there is no balancing at all. In fact, it effectively looks like mdadm's --write-mostly option is always on.
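One way to check this on your own array without iostat is to compare the kernel's per-disk read counters before and after a read workload. The sketch below is an assumption-laden alternative to the iostat observation, not the method used above: sda and sdb are placeholder member names, the helper names are invented, and SYSFS is only overridable so the functions can be tried against a fake directory. It relies on the third field of /sys/block/<disk>/stat being the number of sectors read, counted in 512-byte units.

```shell
#!/bin/sh
# Sketch: how much did each RAID1 member read during a workload?
SYSFS=${SYSFS:-/sys/block}

sectors_read() {
    # $1 = disk name (e.g. sda). Field 3 of the stat file is sectors read.
    awk '{ print $3 }' "$SYSFS/$1/stat"
}

read_delta_mib() {
    # $1 = disk name, $2 = an earlier sectors_read value for that disk.
    # Prints the MiB read since then (integer division, rounds down).
    now=$(sectors_read "$1")
    echo $(( (now - $2) * 512 / 1048576 ))
}

# Typical use (sda/sdb are placeholders for the array members):
#   a0=$(sectors_read sda); b0=$(sectors_read sdb)
#   ... run two concurrent dd readers against the md device ...
#   echo "sda: $(read_delta_mib sda "$a0") MiB"
#   echo "sdb: $(read_delta_mib sdb "$b0") MiB"
```

If nearly all of the reads land on one member, the array is behaving as described here; if both members show similar figures, the reads are being balanced.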