How to safely replace a not-yet-failed disk in a Linux RAID5 array?
Using mdadm 3.3
Since mdadm 3.3 (released 2013, Sep 3), if you have a 3.2+ kernel, you can proceed as follows:
# mdadm /dev/md0 --add /dev/sdc1
# mdadm /dev/md0 --replace /dev/sdd1 --with /dev/sdc1
sdd1 is the device you want to replace; sdc1 is the device you want to replace it with, and it must already be a spare in your array.
The --with option is optional: if it is not specified, any available spare will be used.
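Which of the approaches in this answer applies depends on the mdadm and kernel versions. A minimal sketch of that decision follows, assuming GNU `sort -V` is available; the function names `version_ge` and `pick_method` and the printed labels are made up for illustration and are not part of mdadm.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Assumes GNU coreutils `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# pick_method MDADM_VERSION KERNEL_VERSION: print a label for the approach
# that applies (labels are hypothetical, chosen for this sketch).
pick_method() {
    if ! version_ge "$2" 3.2; then
        echo "older-kernel"            # no hot-replace support in the kernel
    elif version_ge "$1" 3.3; then
        echo "mdadm-replace"           # mdadm --replace is available
    else
        echo "sysfs-want_replacement"  # write to sysfs by hand
    fi
}
```

The kernel version can be taken from `uname -r`; the mdadm version has to be read from the output of `mdadm --version`.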
Older mdadm version
Note: You still need a 3.2+ kernel.
First, add a new drive as a spare (replace md0 and sdc1 with your RAID and disk device, respectively):
# mdadm /dev/md0 --add /dev/sdc1
Then, initiate a copy-replace operation like this (sdd1 being the failing device):
# echo want_replacement > /sys/block/md0/md/dev-sdd1/state
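Because a typo in the sysfs path fails silently or hits the wrong member, it can help to build the path from the array and member names and check it before writing. This is only a sketch: `md_state_path` and `request_replacement` are hypothetical helper names, and the write itself needs root.

```shell
# md_state_path ARRAY MEMBER: print the sysfs state file for one array member.
# Accepts names with or without a leading /dev/.
md_state_path() {
    printf '/sys/block/%s/md/dev-%s/state\n' "${1#/dev/}" "${2#/dev/}"
}

# request_replacement ARRAY MEMBER: write want_replacement to the member's
# state file, but only if that file exists and is writable.
request_replacement() {
    state_file=$(md_state_path "$1" "$2")
    if [ ! -w "$state_file" ]; then
        echo "cannot write $state_file" >&2
        return 1
    fi
    echo want_replacement > "$state_file"
}
```

Usage, as root: `request_replacement md0 sdd1`.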
Result
The system will copy all readable blocks from sdd1 to sdc1. If it comes to an unreadable block, it will reconstruct it from parity. Once the operation is complete, the former spare (here: sdc1) will become active, and the failing drive will be marked as failed (F) so you can remove it.
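To see when the old drive has been marked as failed, you can look for the (F) flag in /proc/mdstat. The sketch below parses mdstat-formatted text from standard input; `failed_members` is a hypothetical helper name, and the parsing assumes the usual `md0 : active raid5 sdd1[3](F) ...` line layout.

```shell
# failed_members ARRAY: read /proc/mdstat-formatted text on stdin and print
# the members of ARRAY that carry the (F) flag, one per line.
failed_members() {
    awk -v array="$1" '
        $1 == array && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    name = $i
                    sub(/\[.*/, "", name)   # strip the "[slot](F)" suffix
                    print name
                }
            }
        }'
}
```

Usage: `failed_members md0 < /proc/mdstat`. Once sdd1 shows up there, it can be taken out with `mdadm /dev/md0 --remove /dev/sdd1`.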
Note: credit goes to frostschutz and Ansgar Esztermann who found the original solution (see the duplicate question).
Older kernels
Other answers suggest:
- Johnny's approach: convert array to RAID6, "replace" the disk, then back to RAID5,
- Hauke Laging's approach: briefly remove the disk from the RAID5 array, make it part of a RAID1 (mirror) with the new disk and add that mirror drive back to the RAID5 array (theoretical)...
If you don't mind running RAID-6 (2 parity disks rather than 1), and if you're running mdadm 3.1.x or higher, you could convert your RAID-5 array to RAID-6 to add an additional parity disk. This will place the array under stress during the rebuild, however. It also has some performance implications, since there is more parity to update during writes.
But if it completes successfully, then you can keep your failing disk in place, and when it ultimately fails, you've still got parity protection for the array. I think you can convert the array from RAID6 back to RAID5 if you don't want to keep it as RAID6.
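As a rough sketch of what the RAID-6 detour could look like, the helper below only prints the two `mdadm --grow` reshape commands rather than running them. It assumes the new disk has already been added as a spare, that a backup file may be needed for the reshape, and that the conversion back to RAID-5 works as hoped; `raid6_detour_commands` and the backup path are hypothetical.

```shell
# raid6_detour_commands ARRAY MEMBERS BACKUP_FILE: print (do not run) the two
# reshape commands. MEMBERS is the current number of RAID-5 member devices.
raid6_detour_commands() {
    array=$1
    members=$2
    backup=$3
    # Step 1: RAID-5 -> RAID-6, using the spare as the extra device.
    echo "mdadm --grow $array --level=6 --raid-devices=$((members + 1)) --backup-file=$backup"
    # Step 2 (optional, later): RAID-6 -> RAID-5 with the original device count.
    echo "mdadm --grow $array --level=5 --raid-devices=$members --backup-file=$backup"
}
```

For a four-disk array, `raid6_detour_commands /dev/md0 4 /root/md0-reshape.bak` prints a reshape to five devices followed by one back to four. Review the printed commands against your mdadm man page before running anything.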
I don't know of an online way to keep the array as RAID-5 and replace the disk without putting the array in degraded mode, as I think you have to mark it as failed to replace it. Your dd copy idea might be the way to do that.
This may be possible while meeting these requirements:
- online
- don't stress any disk except for the one which is to be replaced
But even if the following works, you will probably not find any recommendation of that kind "in the books"...
Idea:
- Take disk OLD out of the array (for a short moment):
mdadm --manage /dev/raid5 --fail /dev/OLD
- Create a new md device (RAID-1) from disks OLD and NEW:
mdadm --build /dev/md42 --level=mirror --raid-devices=2 /dev/OLD /dev/NEW
- Put the RAID-1 back in the array (instead of /dev/OLD):
mdadm --manage /dev/raid5 --re-add /dev/md42
What should :-) happen:
- The RAID-5 gets /dev/md42 in sync. This should not take long.
- The RAID-5 is normally operational again (but slower).
- /dev/NEW is synced with /dev/OLD.
Watch the sync progress (cat /proc/mdstat or mdadm --monitor). When the sync has finished, take the RAID-1 out of the RAID-5, stop the RAID-1, and re-add /dev/NEW to the RAID-5. If everything is fine, overwrite the mdraid superblocks on /dev/OLD in order to avoid problems: mdadm --zero-superblock /dev/OLD
Warning: The fast RAID-5 sync may work only if the array uses a bitmap. If you don't have one, either test the procedure on a dummy RAID-5 (without a bitmap) first, or add one; adding an external bitmap, at least, should be possible. Otherwise it may be necessary to stop the RAID-5 before changing the devices, which becomes a bit complicated if you boot from the RAID-5.
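Whether an array currently has a bitmap can be read from /proc/mdstat, where the array's stanza then contains a "bitmap:" line. A minimal sketch of that check follows; `has_bitmap` is a hypothetical helper name, and it assumes stanzas are separated by blank lines as in the usual mdstat output.

```shell
# has_bitmap ARRAY: read /proc/mdstat-formatted text on stdin; succeed when
# the stanza for ARRAY contains a "bitmap:" line.
has_bitmap() {
    awk -v array="$1" '
        $2 == ":" { in_array = ($1 == array); next }  # start of any stanza
        NF == 0   { in_array = 0 }                     # blank line ends it
        in_array && $1 == "bitmap:" { found = 1 }
        END { exit !found }'
}
```

Usage: `has_bitmap md0 < /proc/mdstat || echo "md0 has no bitmap"`. If none is present, an internal bitmap can usually be added to a running array with `mdadm --grow /dev/md0 --bitmap=internal`.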