


Study says failure rates are up to 15 times what manufacturers indicate

A study released this week by Carnegie Mellon University revealed that hard drive manufacturers may be exaggerating the mean time between failures (MTBF) ratings on their hard drives. In fact, researchers at Carnegie Mellon indicated that, on average, failure rates were as much as 15 times higher than the rated MTBFs imply.

Rounding up roughly 100,000 hard drives from a variety of manufacturers, researchers at Carnegie Mellon tested the drives in various operating conditions as well as real-world scenarios. Some drives were at Internet service providers, others at large data centers, and some at research labs. According to the test results, the majority of the drives did not appear to be affected by their operating environment. In fact, researchers indicated that drive operating temperatures had little to no effect on failure rates -- a cool hard drive survived no longer than one running hot.

The types of drives used in the study ranged from Serial ATA (SATA) to SCSI and even high-end Fibre Channel (FC) drives. Customers typically pay a large premium for SCSI and FC drives, which also usually carry longer warranty periods and higher MTBF ratings.

Carnegie Mellon researchers found that these high-end drives did not outlast their mainstream counterparts:

"In our data sets, the replacement rates of SATA disks are not worse than the replacement rates of SCSI or FC disks. This may indicate that disk-independent factors, such as operating conditions, usage and environmental factors affect replacement rates more than component specific factors."

According to the study, the number one cause of drive failures was simply age: the longer a drive has been in operation, the more likely it is to fail. Drives tended to start showing signs of failure after roughly five to seven years of service, after which there was a significant increase in annualized failure rates (AFR). Failure rates in the first year of service were just as high as those after the seven-year mark.

According to Carnegie Mellon researchers, manufacturer MTBF ratings are greatly overstated. Take, for example, the Seagate Cheetah X15 series, which has an MTBF rating of 1.5 million hours. This equates to roughly 171 years of constant service before problems. The researchers said, however, that customers should expect a more reasonable 9 to 11 years. Interestingly, real-world tests in the study showed a consistent average time to failure of about six years.
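
The 171-year figure comes from dividing the rated MTBF by the number of hours in a year. The following is a minimal Python sketch of that arithmetic, not anything taken from the study: it assumes a 365-day year of continuous operation and a constant failure rate (the exponential lifetime model that MTBF ratings conventionally imply), and the function names are purely illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760; assumes a 365-day year of constant operation


def mtbf_to_years(mtbf_hours: float) -> float:
    """Express an MTBF rating as years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def nominal_afr(mtbf_hours: float) -> float:
    """Annual failure probability implied by an MTBF rating.

    Assumption: constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


if __name__ == "__main__":
    rating = 1_500_000  # hours, the rating quoted in the article
    print(f"{mtbf_to_years(rating):.1f} years")
    print(f"{nominal_afr(rating):.2%} implied annual failure rate")
```

Under these assumptions, a 1.5 million hour rating works out to about 171.2 years, or an implied annual failure rate of roughly 0.58 percent.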

The average replacement rate of drives ranged from 2 percent to a whopping 13 percent annually, indicating that manufacturers need to reevaluate the way an MTBF rating is generated. Worst of all, these rates were for drives with MTBF ratings between 1 million and 1.5 million hours.
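
To see how far these replacement rates sit from the ratings, one can compare them with the annual failure rate an MTBF implies. The sketch below is only an illustration under stated assumptions: it uses the simple linear approximation (hours per year divided by MTBF), treats a replacement as a failure, and uses hypothetical helper names that do not come from the study.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; assumes a 365-day year of constant operation


def datasheet_afr(mtbf_hours: float) -> float:
    """Linear approximation of the annual failure rate implied by an MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


def overshoot(observed_rate: float, mtbf_hours: float) -> float:
    """How many times the observed annual rate exceeds the implied rate."""
    return observed_rate / datasheet_afr(mtbf_hours)


if __name__ == "__main__":
    # MTBF ratings and observed annual replacement rates quoted in the article.
    for mtbf in (1_000_000, 1_500_000):
        for observed in (0.02, 0.13):
            print(
                f"MTBF {mtbf:>9,} h: implied {datasheet_afr(mtbf):.2%}, "
                f"observed {observed:.0%}, ratio {overshoot(observed, mtbf):.1f}x"
            )
```

For a 1 million hour rating the implied rate is under 0.9 percent per year, so a 13 percent replacement rate is a little under 15 times higher, which is consistent with the article's "15 times" figure.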

Garth Gibson, associate professor of computer science at Carnegie Mellon, indicated that the study was proof that MTBFs are not a reliable way of measuring drive quality. "We had no evidence that SATA drives are less reliable than the SCSI or Fiber Channel drives," said Gibson.

Carnegie Mellon researchers concluded that backup measures are a necessity for critically important data, no matter what kind of hard drive is being used. It is interesting to note that even Google's own data centers use mainly SATA and PATA drives. At the current rate, it is only a matter of time before SATA drives perform as well as or better than SCSI and FC drives, offering the same reliability for much less money.


Comments



RE: MTBF numbers are a lie
By PandaBear on 3/9/2007 9:15:53 PM , Rating: 2
Agree, MTBF is usually useful only for things that fail at random in infant mortality. It is a prediction of how many DOA drives you get if you order a large quantity. Once you power them on, it is not a prediction of how many of them die within 1, 3, 5, or 7 years.

Every drive is designed differently and eventually fails for a different reason, so there is no universal formula to quantify it. Most large OEMs (e.g. Dell or HP) run their own qualification tests on each new design/model before they place a huge order for millions of drives, so they know and always get the best prime drives.

The rest that failed, sold as 250GB rather than 300GB, or 120GB rather than 160GB, go into retail for average users. Big OEMs won't accept them with 1/4 of the heads clipped or the outer 1/4 ring disabled.

The ones that do very badly? They go to Fry's as white-box drives. I once saw an IBM Deskstar with a hand-soldered resistor on the PCB, clearly a reject.

Copyright 2014 DailyTech LLC.