Putting hard drive reliability to the test shows not all disks are equal

Failure rates vary from under 2 percent to 25 percent per year, depending on make, model.

Hitachi drives crush competing models from Seagate and Western Digital when it comes to reliability, according to data from cloud backup provider Backblaze. Their collection of more than 27,000 consumer-grade drives indicated that the Hitachi drives have a sub-two percent annualized failure rate, compared to three to four percent for Western Digital models, and as high as 25 percent for some Seagate units.

Hard drive manufacturers like to claim that their disks are extremely reliable. The main reliability measure used for hard disks is the mean time between failures (MTBF), and typically this is quoted as being somewhere between 100,000 and 1 million hours, or roughly between 11 and 114 years of continuous operation.
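To make the hours-to-years conversion concrete, here is a minimal sketch of the arithmetic. It assumes the drive runs continuously (8,760 hours per year); the function name is illustrative and not taken from any vendor's documentation.

```python
# Convert a quoted MTBF in hours to years of continuous (24/7) operation.
# Assumption: a year is 365 days, so 8,760 powered-on hours.
HOURS_PER_YEAR = 24 * 365


def mtbf_hours_to_years(mtbf_hours: float) -> float:
    """Return the MTBF expressed in years of nonstop operation."""
    return mtbf_hours / HOURS_PER_YEAR


for mtbf in (100_000, 1_000_000):
    print(f"{mtbf:>9,} hours = {mtbf_hours_to_years(mtbf):.1f} years")
```

This gives about 11.4 and 114.2 years, in line with the rounded range quoted above.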

These failures are generally assumed to follow a so-called bathtub curve, with relatively high failure rates when the drive is new—"infant mortality," caused by manufacturing defects—and similarly when the drive nears the end of its useful life, but low failure rates in between.

Data to actually support these beliefs, however, has always been a little scarce. Even when studies are published, the data within them is often anonymized. Backblaze's data names names and shows some big differences between the manufacturers.

The Hitachi disks are consistent performers. The Seagate ones are not.

Backblaze's data covers a range of drive models, capacities, and ages. In aggregate, the company has just under 40PB of Seagate storage, 36PB of Hitachi storage, and 2.6PB of Western Digital storage. The company also has a few Samsung units (Samsung's drive business was sold to Seagate in 2011) and Toshiba units, but too few to draw any meaningful conclusions.

Even from the same manufacturer there are some big differences. The least reliable drives are 1.5TB Seagate Barracudas that are nearing four years old on average, with an astonishing 25.4 percent annual failure rate. The newest Seagate drives, 4TB models, have a much more reasonable 3.8 percent annual failure rate. The units from Hitachi prove a lot more consistent, with the oldest drives, 2TB units averaging about three years old, having a failure rate of 1.1 percent, and the newest, 4TB units, having a 1.5 percent annual failure rate.

These numbers show just how useless the manufacturers' MTBF numbers can really be. Those 1.5TB Seagate drives are mostly split between two models. There are 539 model ST31500341AS Barracuda 7200.11 drives, which are specified by Seagate to have an annualized failure rate (AFR) of 0.34 percent and an MTBF of 0.7 million hours, and there are 1,929 model ST31500541AS Barracuda LP drives specified to have an AFR of 0.32 percent and an MTBF of 0.75 million hours.

Backblaze recorded AFRs of 25.4 percent and 9.9 percent respectively, substantially worse than the spec sheet numbers. However, there's a big difference in how the drives are used. Backblaze's drives are operated 24/7. They're powered on all the time and aren't spun down or put to sleep by the system software using them. Seagate's AFR and MTBF numbers assume that the drives are powered on for only 2,400 hours each year, but conversely are spun down and back up either 10,000 times per year (for the Barracuda 7200.11 units) or 50,000 times a year (for the Barracuda LP models).
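The spec sheet figures can be roughly reproduced from the MTBF and the assumed 2,400 powered-on hours per year. The sketch below assumes a constant failure rate (an exponential lifetime model), which is a common textbook simplification and not necessarily the method Seagate uses; the function name is illustrative.

```python
import math


def afr_from_mtbf(mtbf_hours: float, power_on_hours_per_year: float) -> float:
    """Annualized failure rate as a fraction.

    Assumption: failures arrive at a constant rate of 1/MTBF per
    powered-on hour, so the chance of at least one failure in a year
    is 1 - exp(-hours / MTBF).
    """
    return 1.0 - math.exp(-power_on_hours_per_year / mtbf_hours)


# MTBF figures quoted in the article for the two 1.5TB Seagate models.
specs = {
    "Barracuda 7200.11": 700_000,
    "Barracuda LP": 750_000,
}

for model, mtbf in specs.items():
    desktop = afr_from_mtbf(mtbf, 2_400)    # Seagate's assumed usage
    always_on = afr_from_mtbf(mtbf, 8_760)  # 24/7 operation
    print(f"{model}: {desktop:.2%} at 2,400 h/yr, {always_on:.2%} at 8,760 h/yr")
```

Under this model the 2,400-hour figures come out at about 0.34 percent and 0.32 percent, matching the spec sheet. Running the same MTBF around the clock still predicts an AFR of only a little over 1 percent, far below what Backblaze recorded.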

Clearly, Seagate's usage model doesn't correspond at all to the way Backblaze uses the drives. These are consumer desktop drives, being used in a server scenario. However, it's not clear that Seagate's model corresponds very well with even desktop usage. For example, if you leave your desktop turned on 24/7, Seagate's MTBF number is irrelevant to you. If your desktop doesn't power down its hard drive when idle, Seagate's MTBF number is irrelevant to you. While there are sure to be some people who fit the assumptions, it's difficult to say that the specified MTBF actually represents typical usage.

In contrast, the AFRs and MTBFs for the company's enterprise-oriented disks are calculated assuming 8,760 powered-on hours per year, which is to say, 24/7 operation. Even these have constraints, though, as they assume that only a certain amount of I/O is performed each year; do more than this amount, and the life of the disk may again be shortened.

In practice, it's likely that any large-scale collection of hard drive data is going to have these same discrepancies, leaving little good way to evaluate the accuracy of MTBF specs for desktop drives. The people using hard disks in bulk are going to be those operating data centers, not desktop systems. The use of consumer drives isn't unusual, due to their lower pricing, and clearly the drives do work well enough in servers.

The Western Digital drives do show an early drop followed by a leveling out. The Seagates for some reason show a big dip at 20 months.

Another feature of the Backblaze data is that it doesn't consistently show the bathtub curve. The Western Digital drives do appear to show a bathtub curve, with an initial burst of failures followed by long-term reliability, but neither the Seagate nor Hitachi drives appear to do the same.

Though backups are always important, none of this means that owners of Seagate drives should crack open their PCs and rush to replace their drives. Backblaze notes that its conditions are pretty hostile. Two particular kinds of drives, Western Digital 3TB units and Seagate LP (low power) 2TB units, suffered extreme failure rates. However, the company believes that this is primarily due to the level of vibration in its drive cages, which pack 45 disks into a 4U case, combined with both drives being energy-efficient models that aggressively spin down when not in use. These things are less likely to be an issue in regular desktop machines.

Even with the higher failure rate, Backblaze says that it is still buying Seagate drives, as they're cost-effective for the company's RAID usage. It might not be best to run out and buy Hitachi drives for their reliability, either. Hitachi sold its drive business to Western Digital last year, and Western Digital subsequently sold the 3.5-inch drive division to Toshiba. It's too soon to know whether this has had any impact on the drives' longevity.

