Hard Disk IDs in Linux

Daniel Fussell dfussell at byu.edu
Thu Mar 14 16:19:55 MDT 2013


On 03/12/2013 06:13 PM, Michael Torrie wrote:
> Though I've not
> personally experienced losing 20 TB of data
I've lost 4TB, 3 different times on the same server.  It sucked.  
Sixteen SAS disks in 2 RAID5 arrays and a small RAID1 for OS, with a hot 
spare on every array.  The disk manufacturer had one guy on the 
production line using the wrong lubricant for a 6 month period.  I'd 
lose a disk, replace it, and within hours of finishing the rebuild 
another would fail.  I'd replace that, and another would fail.  On my 
4th replacement in one week, I got a punctured stripe, and the whole 
array fell apart.  From then on, every time I replaced a bad drive, its 
replacement was marked bad after about twenty minutes of rebuilding.  This 
was within 3 weeks of buying the server, and it took a week for the tape 
robot to restore 4TB.  It happened two other times after that in an 18 
month period.  We lost all confidence in that server and moved the user 
group back to the SAN where they belonged.  Never had a problem after that.

Moral of the story is: don't trust a Filipino with a stack of disks 
and the wrong lube.  Four 8-disk arrays may cost you 32TB of raw space 
in parity and spares (with 4TB disks) compared to 8TB on one big 
32-disk RAID6 group; but they won't likely lose you all 120TB on your 
first triple-disk failure.
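
For anyone checking the arithmetic, here's a quick back-of-the-envelope 
sketch in Python (the 4TB-per-disk size is my assumption; it's the 
figure that makes the 120TB above work out):

    # RAID capacity vs. failure-domain math; 4TB disks assumed
    DISK_TB = 4
    TOTAL_DISKS = 32
    OVERHEAD = 2  # disks per array spent on parity or spares

    # One big 32-disk RAID6 group
    big_usable = (TOTAL_DISKS - OVERHEAD) * DISK_TB      # 120 TB
    # Four 8-disk arrays, two overhead disks each
    small_usable = 4 * (8 - OVERHEAD) * DISK_TB          # 96 TB

    # What a triple-disk failure can take out in each layout
    print(big_usable, "TB at risk in the single group")
    print((8 - OVERHEAD) * DISK_TB, "TB at risk per small array")

So you trade 24TB of usable space for failure domains a fifth the 
size: a triple-disk failure costs you at most 24TB instead of all 120TB.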

;-Daniel Fussell
