recovery - mdadm always degraded on reboot - how to solve?

2014-07
  • Eddie Parker

    Whenever I reboot, my RAID6 array is always degraded, and I have to force add two drives (seen here after re-adding):

    Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0] [linear] [multipath]
    md127 : active raid6 sdf1[6] sdd1[4] sdb1[0] sdc1[5]
        3907023872 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/2] [U__U]
        [>....................]  recovery =  0.4% (9160908/1953511936) finish=294.9min speed=109862K/sec
    
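    The force-add I end up running each time is roughly this (the device names here are just placeholders for whichever two members dropped out):

    sudo mdadm --manage /dev/md127 --re-add /dev/sda1
    sudo mdadm --manage /dev/md127 --re-add /dev/sde1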

    I'm not sure why this happens, but in a similar vein, one of my other (non-RAID) hard drives never mounts successfully at boot either, and I have to mount it manually after startup. dmesg seems to show these drives spinning up only after all the basic file systems have already been mounted.
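
    (The manual mount itself is just something along these lines, with a placeholder device and mount point:)

    sudo mount /dev/sdg1 /mnt/data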

    Anyone know what I should be trying?

    SuperUser won't let me post the entirety of my dmesg, so here's a pastebin: http://bpaste.net/raw/179955/


    Related Question

    linux - mdadm volume works, but won't assemble/mount on startup?
  • Stride

    I have an mdadm/lvm2 volume with 4 HDs that I created in Ubuntu 10.04. I just upgraded the computer to Ubuntu 10.10.

    I redid the mdadm commands to get the volume up and running, and ran mdadm --detail --scan > /etc/mdadm/mdadm.conf to generate the configuration file.

    But now, every time I reboot, it tells me that the volume is not ready. /proc/mdstat says that I always have one disk of the volume "inactive" as md_d127. I need to stop this volume and reassemble the whole thing to get it working.

    This is what I get out of mdadm --detail --scan and put inside /etc/mdadm/mdadm.conf:

    ARRAY /dev/md127 level=raid5 num-devices=4 metadata=01.02 name=:r0 UUID=7610a895:a54fe65b:c9876d2a:67f4a179
    

    And this is my /proc/mdstat on boot:

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md127 : inactive sdb1[2](S) sdd1[0](S) sda1[4](S)
          2930279595 blocks super 1.2
    
    md_d127 : inactive sdc1[1](S)
          976759865 blocks super 1.2
    
    unused devices: <none>
    

    I need to run mdadm -S /dev/md_d127, then mdadm -S /dev/md127, and then mdadm -A --scan to get this volume working again.
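
    That is, roughly:

    sudo mdadm -S /dev/md_d127
    sudo mdadm -S /dev/md127
    sudo mdadm -A --scan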

    What's going on? This did not happen with Ubuntu 10.04. I'm really fearing the loss of my raid5 data now.


  • Related Answers
  • Trevor Robinson

    The issue is that the updated version of mdadm relies on the mdadm.conf present in your initrd, which is probably not accurate/complete. To verify its contents, do this:

    gunzip -c /boot/initrd.img-2.6.38-11-generic | cpio -i --quiet --to-stdout etc/mdadm/mdadm.conf
    

    If it doesn't contain accurate ARRAY entries, mdadm will try to use the name configured in the superblock as the link name under /dev/md/, which will link to something like /dev/md127. This obviously does not match the earlier behavior.
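
    An accurate entry pins the array to a stable device name, along these lines (the md device and the name here are illustrative; the UUID is the one from your existing config):

    ARRAY /dev/md0 metadata=1.2 name=myhost:0 UUID=7610a895:a54fe65b:c9876d2a:67f4a179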

    Rather than directly using mdadm -Ds or mdadm -Es to generate /etc/mdadm/mdadm.conf, it's probably better to use the /usr/share/mdadm/mkconf script:

    sudo /usr/share/mdadm/mkconf force-generate /etc/mdadm/mdadm.conf
    

    The most important step is to rebuild your initramfs to include the updated configuration:

    sudo update-initramfs -u
    

    Actually, thanks to the magic in /usr/share/initramfs-tools/hooks/mdadm, /usr/share/mdadm/mkconf will be run automatically if /etc/mdadm/mdadm.conf does not exist or contains no arrays. If it exists and contains only a subset of your active arrays, a warning is displayed for each missing array, and you should manually generate a new mdadm.conf.
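
    Either way, you can re-run the check from the top of this answer afterwards to confirm the regenerated configuration actually landed in the initramfs (substituting your running kernel version):

    gunzip -c /boot/initrd.img-$(uname -r) | cpio -i --quiet --to-stdout etc/mdadm/mdadm.conf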

  • Stride

    I've resorted to reformatting the entire array. This works in Ubuntu 10.10.

    sudo mdadm -C /dev/md0 -l 5 -n 4 -e 1.2 /dev/sd[bcde]1
    sudo mdadm -Ds | sudo tee /etc/mdadm/mdadm.conf
    
    sudo pvcreate /dev/md0
    sudo vgcreate vg0 /dev/md0
    sudo lvcreate vg0 --name lv0 --extents '100%FREE'
    
    sudo mkfs.ext4 /dev/vg0/lv0
    
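    As in the answer above, it's probably still worth rebuilding the initramfs afterwards so the freshly written mdadm.conf is the one used at boot:

    sudo update-initramfs -u
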
  • 8088

    You may also check that udev is loading mdadm.

    Look for /lib/udev/rules.d/85-mdadm.rules; make sure that it has something like this:

    # This file causes block devices with Linux RAID (mdadm) signatures to
    # automatically cause mdadm to be run.
    # See udev(8) for syntax
    
    SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
            RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
    

    If not, copy this into /etc/udev/rules.d/85-mdadm.rules (note: /etc, NOT /lib).
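
    For example (a sketch, assuming the packaged rule file is present under /lib/udev/rules.d/):

    sudo cp /lib/udev/rules.d/85-mdadm.rules /etc/udev/rules.d/85-mdadm.rules
    sudo udevadm control --reload-rules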

  • 8088

    Please edit this in your /etc/mdadm/mdadm.conf:

    metadata=01.02
    

    so that it reads:

    metadata=1.02
    

    This is needed because the output of

    # mdadm --detail --scan > /etc/mdadm/mdadm.conf
    

    isn't completely correct.
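
    One way to make that edit is a quick sed (just a sketch; double-check the file afterwards):

    sudo sed -i 's/metadata=01\.02/metadata=1.02/' /etc/mdadm/mdadm.conf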