How to get an inactive RAID device working again?
For your bonus question:
mdadm --examine --scan >> /etc/mdadm/mdadm.conf
I have found that I have to add the array manually to /etc/mdadm/mdadm.conf in order to make Linux mount it on reboot. Otherwise I get exactly what you have here - md_d1-devices that are inactive, etc.
The conf file should look like below - i.e. one ARRAY line for each md-device. In my case the new arrays were missing from this file, but if you already have them listed this is probably not a fix for your problem.
# definitions of existing MD arrays
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=f10f5f96:106599e0:a2f56e56:f5d3ad6d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=aa591bbe:bbbec94d:a2f56e56:f5d3ad6d
Add one ARRAY line per md-device, placing them after the comment shown above or, if no such comment exists, at the end of the file. You get the UUIDs by running sudo mdadm -E --scan:
$ sudo mdadm -E --scan
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=f10f5f96:106599e0:a2f56e56:f5d3ad6d
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=aa591bbe:bbbec94d:a2f56e56:f5d3ad6d
As you can see, you can pretty much just copy the scan output straight into the file.
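If you want to script that step, here is a minimal sketch for a Debian/Ubuntu-style layout; the config path and the update-initramfs step are assumptions about your setup, so adjust as needed:
sudo mdadm -E --scan | sudo tee -a /etc/mdadm/mdadm.conf   # append the ARRAY lines; check for duplicates afterwards
sudo update-initramfs -u                                   # Debian/Ubuntu also read mdadm.conf from the initramfs, so regenerate it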
I run Ubuntu Desktop 10.04 LTS, and as far as I remember this behavior differs from the server version of Ubuntu; however, it was so long ago that I created my md-devices on the server that I may be wrong. It may also be that I just missed some option.
Anyway, adding the arrays to the conf file seems to do the trick. I've run the above RAID 1 and RAID 5 for years with no problems.
Warning: First of all, let me say that the steps below (due to the use of "--force") seem risky to me, and if your data is irreplaceable I'd recommend making copies of the partitions involved before you try any of them. However, this worked for me.
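One way to take such a copy is to image each member partition; the target path below is just an example location with enough free space:
dd if=/dev/sda4 of=/mnt/backup/sda4.img bs=1M status=progress   # raw image of one member partition; repeat for each member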
I had the same problem, with an array showing up as inactive, and nothing I did, including the "mdadm --examine --scan >/etc/mdadm.conf" suggested by others here, helped at all.
In my case, when it tried to start the RAID-5 array after a drive replacement, it was saying that it was dirty (via dmesg):
md/raid:md2: not clean -- starting background reconstruction
md/raid:md2: device sda4 operational as raid disk 0
md/raid:md2: device sdd4 operational as raid disk 3
md/raid:md2: device sdc4 operational as raid disk 2
md/raid:md2: device sde4 operational as raid disk 4
md/raid:md2: allocated 5334kB
md/raid:md2: cannot start dirty degraded array.
This caused it to show up as inactive in /proc/mdstat:
md2 : inactive sda4[0] sdd4[3] sdc4[2] sde4[5]
3888504544 blocks super 1.2
I did find that all the devices had the same event count on them, except for the drive I had replaced (/dev/sdb4):
[root@nfs1 sr]# mdadm -E /dev/sd*4 | grep Event
mdadm: No md superblock detected on /dev/sdb4.
Events : 8448
Events : 8448
Events : 8448
Events : 8448
However, the array details showed that it had 4 out of 5 devices available:
[root@nfs1 sr]# mdadm --detail /dev/md2
/dev/md2:
[...]
Raid Devices : 5
Total Devices : 4
[...]
Active Devices : 4
Working Devices : 4
[...]
Number Major Minor RaidDevice State
0 8 4 0 inactive dirty /dev/sda4
2 8 36 2 inactive dirty /dev/sdc4
3 8 52 3 inactive dirty /dev/sdd4
5 8 68 4 inactive dirty /dev/sde4
(The "State" column above is from memory; I can't find it in my scroll-back buffer.)
Since the remaining four members agreed on the event count, I was able to resolve this by stopping the array and then re-assembling it:
mdadm --stop /dev/md2
mdadm -A --force /dev/md2 /dev/sd[acde]4
At that point the array was up and running with 4 of the 5 devices; I was able to add the replacement device, and it is now rebuilding. I can access the file system without any problem.
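For completeness, a sketch of that last step, assuming the replacement partition is /dev/sdb4 as in my layout:
mdadm --add /dev/md2 /dev/sdb4   # add the replacement partition; the rebuild starts automatically
watch cat /proc/mdstat           # follow the rebuild progress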