Almost 2 years have gone by since I set up a RAID 10 array with Ubuntu 10.04.  It is amazing how solid the software RAID implementation in Ubuntu is.  I have had no problems with Ubuntu, and the same installation from September 2010 still works very well on the Acer h340 Home Server hardware.  The only thing that failed was one of the drives in the Sans Digital TowerRAID external enclosure: a 1TB Seagate Barracuda 7200.11 that died at the age of 4 years.  Luckily, the drive was covered by a 5-year warranty.

After I tested the defective drive with SeaTools in Windows, the tool generated a defect report and Seagate accepted an RMA.  A week later, my replacement drive arrived.  Unlike Western Digital, Seagate sent me a “Certified Repaired HDD”, not a new drive.  I have had a “Certified Repaired HDD” fail after only 1 month of use, but your mileage may vary.  Also, to get an advance replacement from Seagate, I had to pay US$9.99 to have the “Certified Repaired HDD” shipped to me before I sent the defective drive back.  Western Digital's warranty service, by contrast, would ship me a brand new drive to replace a reported defective one without charging me anything.  This will definitely affect my choice between these two manufacturers, since the pricing of their new drives is very similar these days.

I replaced the dead drive with the replacement, reattached the array to Ubuntu and rebooted.

In Disk Utility, I partitioned the drive with a GUID partition table, then formatted it with the ext4 file system.

After firing up the terminal app, I raised my privileges to superuser with the “su” command.  Once I had root-level access, I ran the following command:

> cat /proc/mdstat

This gave me the following output:

Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : inactive sdh1[1](S) sdg1[0](S) sdi1[2](S)
 2930287296 blocks

md2 : active raid10 sdc1[2] sdd1[3] sda1[0] sdb1[1]
 1953524864 blocks 64K chunks 2 near-copies [4/4] [UUUU]

unused devices: <none>

It shows that my RAID array “md1” has only 3 drives attached, all marked as spares with (S), and that it is inactive.
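
If you would rather not pick the inactive array out of the listing by eye, the check can be scripted. The following is only a minimal sketch: the function name list_inactive_arrays is made up for illustration, and it assumes the status word is the third field of each "mdN :" line, as in the output above.

```shell
# list_inactive_arrays [FILE]
# Print the name of every md array whose status field reads "inactive".
# FILE defaults to /proc/mdstat; a saved copy of the output works too.
# (Hypothetical helper, not part of mdadm.)
list_inactive_arrays() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" && $3 == "inactive" { print $1 }' \
        "${1:-/proc/mdstat}"
}

# Example, against a saved copy of the listing:
# list_inactive_arrays mdstat.txt
```

Run against the listing above, it should print only md1, since md2 is reported as active.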

To be on the safe side, I stopped the RAID array:

> mdadm --manage --stop /dev/md1

Terminal output of the above command is:

mdadm: stopped /dev/md1

To start the array, I used this command:

> mdadm --assemble /dev/md1

Terminal output of the above command is:

mdadm: /dev/md1 has been started with 3 drives (out of 4).
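
The "3 drives (out of 4)" message means the array is now running degraded. /proc/mdstat reports the same state as a counter such as [4/3] (expected members / active members). As a rough sketch, a helper like the one below could list degraded arrays; the name list_degraded_arrays is hypothetical, and it assumes the counter appears on a line following the "mdN :" line, as it does in my output.

```shell
# list_degraded_arrays [FILE]
# Print "NAME EXPECTED ACTIVE" for each array whose [expected/active]
# counter shows fewer active members than expected.
# FILE defaults to /proc/mdstat. (Hypothetical helper, not part of mdadm.)
list_degraded_arrays() {
    awk '
        # Remember the array name from its "mdN : ..." header line.
        $1 ~ /^md[0-9]+$/ && $2 == ":" { name = $1; next }
        # The first following line with a counter like [4/3] belongs to it.
        name != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name, n[1], n[2]
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

An array that is still inactive has no counter line at all, so it is skipped here rather than reported as degraded.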

Finally, I added the replacement drive back to the RAID array:

> mdadm /dev/md1 --manage --add /dev/sdj1

Terminal output of the above command is:

mdadm: added /dev/sdj1
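
The three mdadm steps above can be collected into one small script. This is only a sketch under assumptions: the function names run_or_print and readd_member and the DRY_RUN variable are my own inventions, and the device names are the ones from my system. By default it just prints the commands so you can review them before running anything as root.

```shell
# run_or_print CMD [ARGS...]
# Print the command when DRY_RUN is 1 (the default here); otherwise run it.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# readd_member ARRAY PARTITION
# Stop the array, reassemble it, then add the replacement partition,
# stopping at the first step that fails. (Hypothetical helper.)
readd_member() {
    [ "$#" -eq 2 ] || { echo "usage: readd_member ARRAY PARTITION" >&2; return 2; }
    run_or_print mdadm --manage --stop "$1" &&
    run_or_print mdadm --assemble "$1" &&
    run_or_print mdadm "$1" --manage --add "$2"
}

# Dry run with the names used in this post:
readd_member /dev/md1 /dev/sdj1
```

Setting DRY_RUN=0 would execute the commands for real, so double-check the partition name first; it may well differ on your machine.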

Once a drive is added back, the array recovers automatically by replicating the data from the surviving mirror to the new drive.  You can watch the progress of the recovery with:

> watch -n 1 cat /proc/mdstat

The terminal window will refresh once every second to display the rebuilding process:

Every 1.0s: cat /proc/mdstat Tue Jun 5 23:29:27 2012

Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid10 sdj1[4] sdg1[0] sdi1[2] sdh1[1]
 1953524864 blocks 64K chunks 2 near-copies [4/3] [UUU_]
 [====>................] recovery = 21.5% (210768064/976762432) finish=197.2min speed=64726K/sec

md2 : active raid10 sdc1[2] sdd1[3] sda1[0] sdb1[1]
 1953524864 blocks 64K chunks 2 near-copies [4/4] [UUUU]

unused devices: <none>

My RAID array took 4 hours to recover (replicate data to the new drive).
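
The finish estimate in the recovery line can be checked by hand: remaining time is (total blocks - blocks done) / speed. The sketch below does that arithmetic with awk. The function name recovery_eta_minutes is hypothetical, and it assumes the block counts and the speed are both in kilobytes, as the K suffix suggests.

```shell
# recovery_eta_minutes DONE_BLOCKS TOTAL_BLOCKS SPEED_K_PER_SEC
# Remaining minutes = (total - done) / speed / 60, to one decimal place.
# (Hypothetical helper; units assumed to be 1K blocks and K/sec.)
recovery_eta_minutes() {
    awk -v d="$1" -v t="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (t - d) / s / 60 }'
}

# Figures from the recovery snapshot above:
recovery_eta_minutes 210768064 976762432 64726   # time left at 21.5%
recovery_eta_minutes 0 976762432 64726           # whole rebuild at that speed
```

The first call prints 197.2, matching finish=197.2min, and the second prints 251.5, or roughly 4.2 hours, which is in line with the 4 hours my rebuild took.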


3 Responses to How to replace a defective drive from a Ubuntu RAID 10 Array

  1. Howie says:

    Intense read! This is over my head. I’m thinking of just getting a drobo.. what do you think? Nice to see you back blogging Kam! Don’t work too hard!

    • Kam says:

      Hey Howie, from what I heard Drobo is great and it has crossed my mind to get one as well. If you get one, make sure you don’t fill it up all the way. I heard Drobo needs some “working space” to expand and replicate files.

      Sorry work has taken a lot of my personal time and hope I will be able to get back to blogging from now :)

  2. […] that were they and set about replacing each drive, one by one. The instructions I found at kamlau.com worked great. The drives rebuilt fine and I got all the way to where I was going to issue the […]
