Hardware Reliability

Duncan Garland duncan.garland at ntlworld.com
Sun Jun 7 12:13:58 BST 2009


Most of our server drives are mirrored, so I wasn't really thinking about
the failure of individual drives. I was thinking about a more serious
failure which might take out the whole disk array or server.

The starting point then is "How often will a failure occur if we keep our
kit new (i.e. no more than three years old)?". The next step is to try to
categorize the failures and decide how much loss each would cause the company.

It's interesting that there aren't any independent statistics. I suppose
that's because the cards etc. are changed so often that any statistics are
out of date by the time they are published.

I wonder if the problem can be approached from the other end. I wonder if
there is a design standard (ISO or such like) which states that a
manufacturer should aim for an MTBF of whatever.
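
As a rough illustration of how an MTBF figure would turn into an expected
failure count, here is a minimal Python sketch. It assumes a constant failure
rate (the exponential model), which ignores both early-life failures and
wear-out. The 100,000-hour MTBF and the 20-server fleet are made-up numbers,
not vendor data, and the function names are only for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # ignores leap years


def failure_probability(mtbf_hours, years):
    """Chance that a single unit fails at least once within `years`,
    assuming a constant failure rate of 1 / mtbf_hours."""
    hours = years * HOURS_PER_YEAR
    return 1.0 - math.exp(-hours / mtbf_hours)


def expected_failures(mtbf_hours, years, units):
    """Expected number of failures across `units` machines over `years`,
    assuming each failed unit is repaired or replaced straight away."""
    hours = years * HOURS_PER_YEAR
    return units * hours / mtbf_hours


if __name__ == "__main__":
    mtbf = 100_000   # hypothetical MTBF in hours, not a real vendor figure
    fleet = 20       # hypothetical number of servers
    years = 3        # the replacement cycle under discussion

    print("P(single unit fails):", round(failure_probability(mtbf, years), 3))
    print("Expected fleet failures:", round(expected_failures(mtbf, years, fleet), 2))
```

If a real MTBF figure did turn up, swapping it in for the made-up one would
give a first estimate to compare against our own failure history.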

I'll let you know if I find anything.


-----Original Message-----
From: london.pm-bounces at london.pm.org
[mailto:london.pm-bounces at london.pm.org] On Behalf Of Avleen Vig
Sent: 06 June 2009 03:40
To: London.pm Perl M[ou]ngers
Subject: Re: Hardware Reliability

I don't have any written analysis for you, just 15+ years of experience.

Most server hardware (cheap or expensive) will run 5 years without
many issues, 10 years with some issues.
By "issues" I mean the occasional bad disk, etc.

IMO most drives which are going to die do so within the first 12
months. After that they often last 3+ years, and 5+ years isn't
unheard of.
The biggest killer of old drives is power cycling them. It requires
bearings which have been in constant motion for years to suddenly stop
and then be exposed to shear forces when starting up.

If you are happy running older, slower, less efficient hardware, you
can probably keep it longer than 3 years without a problem.

On Thu, Jun 4, 2009 at 9:49 PM, <duncan.garland at ntlworld.com> wrote:
> Hi,
> Can somebody please point me in the direction of some authoritative
> reliability statistics for server hardware, preferably including add-ons
> such as disc arrays?
> I need to put together a case for the number of failures we can expect if
> we replace our hardware every three years.
> Everybody has an opinion but I can't find any proper published data.
> Thanks
> Duncan
