Tier/Rated-4 data centers are considered to be Fault Tolerant, meaning that there is no Single Point Of Failure (SPOF) that could impact the critical IT load.
So how dare I claim that your Tier/Rated-4 data center, whether you reference Uptime or TIA-942, still has at least 10 SPOFs?
There is a very simple way to find out how many SPOFs you might have in your data center. Go to a meeting room. Invite all your operational staff, including facilities, floor management staff, etc., into the room. Now, count the number of people... Yes, each and every one of your staff is a potential point of failure who could bring your critical IT down!
In order to run a highly available, effective and efficient data center, you need to do three things right:
Recently, EPI conducted an industry-wide survey to find out the typical causes of downtime. Below are the results.
The outcome of this survey is very similar to the results of other organizations that have conducted these kinds of surveys over the last few years. However, we also asked another question, which gave a very interesting result.
This clearly indicates that the majority of data centers do know that these issues can be prevented.
But what proactive actions do data center owners need to take to get this matter under control?
As said, most data centers have done their due diligence on the site infrastructure, which is indeed the first step to take. The next part is to make sure the policies, procedures and work instructions are aligned with the business objectives. ISO standards are often used, but they have a number of shortcomings. You can refer to this article to read more about why ISO standards are a great start but certainly not enough.
How about training? It is interesting to see that most data centers have little hesitation in buying yet another network device costing 50-80k USD to add redundancy, but at the same time are not willing to invest in properly training their staff. We frequently hear that data centers run their own internal training programs with the aim of reducing training costs. But how effective is that really? Being a great car mechanic doesn't necessarily make someone a good driver. The same goes for data center engineers: they might be great engineers, but are they also good at training others and transferring knowledge? What about politics? Do you think that a senior engineer who has built up his/her competences over many years is willing to simply share that hard-earned knowledge with a new person and therefore potentially make himself/herself less valuable? Let's face it, we live in a world where competition is everywhere and senior engineers like to stay senior…
So how to fix the problem? There are two critical steps.
First of all, read up and/or download the DCOS®. Get access to it here. Do your own gap analysis, or even better, get an external audit conducted to ensure an impartial and unbiased view of the maturity of your processes etc. Click here if you want more information on DCOS® audits.
Second, for training, try out the DCPT by clicking here to see how you can, in a few simple steps, create a full-blown training plan for yourself or your staff based on the Data Center Competence Framework.
Author: Edward van Leent, Chairman & CEO, EPI Group of Companies