recovery - mdadm always degraded on reboot - how to solve?
2014-07
Whenever I reboot, my RAID6 array is always degraded, and I have to force add two drives (seen here after re-adding):
Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0] [linear] [multipath]
md127 : active raid6 sdf1[6] sdd1[4] sdb1[0] sdc1[5]
3907023872 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/2] [U__U]
[>....................] recovery = 0.4% (9160908/1953511936) finish=294.9min speed=109862K/sec
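(For the record, the force-add I run each time looks roughly like this; the device names are placeholders for whichever two drives dropped out:)
sudo mdadm /dev/md127 --add /dev/sdX1
sudo mdadm /dev/md127 --add /dev/sdY1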
I'm not sure why this happens, but in a similar vein, one of my other (non-RAID) hard-drives is never mounted successfully either, and I have to mount it after startup. dmesg seems to report these hard-drives spinning up after all the basic file systems have loaded.
Anyone know what I should be trying?
SuperUser won't let me post the entirety of my dmesg, so here's a pastebin: http://bpaste.net/raw/179955/
I have an mdadm/lvm2 volume with 4 HDs that I created in Ubuntu 10.04. I just upgraded the computer to Ubuntu 10.10.
I re-ran the mdadm commands to get the volume up and running, and did mdadm --detail --scan > /etc/mdadm/mdadm.conf to generate the configuration file.
But now, every time I reboot, it tells me that the volume is not ready. /proc/mdstat says that I always have one disk of the volume "inactive" as md_d127. I need to stop this volume and reassemble the whole thing to get it working.
This is what I get out of mdadm --detail --scan and put inside /etc/mdadm/mdadm.conf:
ARRAY /dev/md127 level=raid5 num-devices=4 metadata=01.02 name=:r0 UUID=7610a895:a54fe65b:c9876d2a:67f4a179
And this is my /proc/mdstat on boot:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sdb1[2](S) sdd1[0](S) sda1[4](S)
2930279595 blocks super 1.2
md_d127 : inactive sdc1[1](S)
976759865 blocks super 1.2
unused devices: <none>
I need to do mdadm -S /dev/md_d127, mdadm -S /dev/md127, mdadm -A --scan to get this volume working again.
What's going on? This did not happen with Ubuntu 10.04. I'm really fearing the loss of my raid5 data now.
The issue is that the updated version of mdadm relies on the mdadm.conf present in your initrd, which is probably not accurate/complete. To verify its contents, do this:
gunzip -c /boot/initrd.img-2.6.38-11-generic | cpio -i --quiet --to-stdout etc/mdadm/mdadm.conf
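You can compare whatever that prints against the UUID of the running array (assuming it is currently assembled as /dev/md127); if the initrd copy is empty or shows a different UUID, that stale copy is what boot-time assembly is working from:
sudo mdadm --detail /dev/md127 | grep UUID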
If the initrd's copy doesn't contain accurate ARRAY entries, mdadm will try to use the name configured in the superblock as the link name under /dev/md/, which will link to something like /dev/md127. This obviously does not match the earlier behavior.
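If you're curious what name is actually stored in the superblock, mdadm --examine on one of the member partitions will show it (the device below is just one of the members from the question):
sudo mdadm --examine /dev/sdb1 | grep Name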
Rather than directly using mdadm -Ds or mdadm -Es to generate /etc/mdadm/mdadm.conf, it's probably better to use the /usr/share/mdadm/mkconf script:
sudo /usr/share/mdadm/mkconf force-generate /etc/mdadm/mdadm.conf
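After it runs, a quick look at the generated ARRAY lines will confirm that every array you expect is actually listed:
grep ^ARRAY /etc/mdadm/mdadm.conf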
The most important step is to rebuild your initramfs to include the updated configuration:
sudo update-initramfs -u
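Once that finishes, repeating the check from the top of this answer should show the new initramfs carrying the updated file (adjust the kernel version to match yours):
gunzip -c /boot/initrd.img-$(uname -r) | cpio -i --quiet --to-stdout etc/mdadm/mdadm.conf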
Actually, thanks to the magic in /usr/share/initramfs-tools/hooks/mdadm, /usr/share/mdadm/mkconf will be run automatically if /etc/mdadm/mdadm.conf does not exist or contains no arrays. If it exists and contains only a subset of your active arrays, a warning is displayed for each missing array, and you should manually generate a new mdadm.conf.
I've resorted to reformatting the entire array. This works in Ubuntu 10.10.
sudo mdadm -C /dev/md0 -l 5 -n 4 -e 1.2 /dev/sd[bcde]1
sudo mdadm -Ds | sudo tee /etc/mdadm/mdadm.conf
sudo pvcreate /dev/md0
sudo vgcreate vg0 /dev/md0
sudo lvcreate vg0 --name lv0 --extents '100%FREE'
sudo mkfs.ext4 /dev/vg0/lv0
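Once the filesystem is created, the new logical volume can be mounted wherever you like; the mount point below is just an example:
sudo mkdir -p /mnt/raid
sudo mount /dev/vg0/lv0 /mnt/raid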
You may also check that udev is loading mdadm.
Look for /lib/udev/rules.d/85-mdadm.rules; make sure that it has something like this:
# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
If not, copy this into /etc/udev/rules.d/85-mdadm.rules (note: /etc, NOT /lib).
Please edit metadata=01.02 to metadata=1.02 in the ARRAY line, because the output of
mdadm --detail --scan > /etc/mdadm/mdadm.conf
isn't completely correct.
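A quick way to fix it in place, assuming the file only needs that one change, is something like:
sudo sed -i 's/metadata=01\.02/metadata=1.02/' /etc/mdadm/mdadm.conf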