Alex Blake considers three key precautions to avoid a data centre meltdown.
06 January 2020 | Alex Blake
The UK has the largest data centre market in Europe and is second worldwide, behind the US. With fantastic network connectivity and a solid regulatory environment, growth of data centres in the UK has been significant in recent years.
As data centres have progressed, businesses have become reliant on information technology. The sudden loss of data could lead to lost income, which is why a reliable provider of technical cleans is essential.
Here are three considerations to prevent downtime.
Frequency of cleans
Before the team gets to work, it is important to fully survey the environment. Factors that influence the frequency of a clean include:
- How much foot traffic goes in and out;
- Whether it is located near a construction site;
- Seasonal conditions, such as pollen in the summer; and
- The age of the data centre.
We recommend scheduling a clean every six months and alternating between two types: a surface clean and a deep clean. A surface clean is quick and productive, and doesn't require going into the ceiling or floor voids.
A deep clean delves into the microscopic levels of a data centre to protect equipment from air and surface contamination.
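As a rough illustration, the alternating six-monthly pattern can be laid out as a simple calendar. The sketch below is hypothetical: the function names, the choice to start with a surface clean, and the start date are assumptions for the example, not part of any recommended programme.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (day clamped to 28 so it is always valid)."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    return date(year, month, min(d.day, 28))


def clean_schedule(start: date, visits: int, interval_months: int = 6):
    """Return (date, clean type) pairs, alternating surface and deep cleans.

    Assumption: the first visit is a surface clean; swap the tuple order
    if the survey suggests starting with a deep clean.
    """
    kinds = ("surface", "deep")
    return [
        (add_months(start, i * interval_months), kinds[i % 2])
        for i in range(visits)
    ]


if __name__ == "__main__":
    for when, kind in clean_schedule(date(2020, 1, 6), visits=4):
        print(when.isoformat(), kind)
```

The interval is a parameter because the survey factors listed above (foot traffic, nearby construction, season, age of the facility) may justify more frequent visits.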
Following a technical clean, include a recommendation for future cleans. Circumstances may have changed since the previous assignment, so keep track of the biggest problem areas. These include egress and ingress points, filters, racks and areas where snagging or works have been carried out.
In addition, keeping a cleaning programme will indicate whether any preventative measures (such as restricting access, introducing tacky mats or wearing shoe protectors) should be introduced. Regular and thorough cleans help keep the data centre working optimally and could deliver increased energy efficiency while reducing the risk of equipment failure.
Zinc whiskers
A common cause of system downtime is zinc whiskers. Almost impossible to see with the naked eye, these are tiny particles of zinc that form on surfaces that are electroplated or galvanised with zinc.
These can cause circuit trips and system failures when they enter critical spaces, so critical teams must have the correct monitoring in place to identify their presence. Keeping a stock of swabs and testing equipment means that swift action can be taken when the presence of zinc whiskers is suspected. The particles disintegrate when a short circuit occurs, so unless the technician is aware of this problem, the source of the failure may go undetected.
Mechanical, electrical and plant (MEP) areas should be checked periodically and during every technical clean for signs of zinc whiskers to ensure that clients are getting the best possible service and maximising the performance of their facility.
Specialist cleans, air and surface quality tests and visual inspections are vital even though system operators may not believe there is a problem. To confirm whether zinc whiskers are present, scanning electron microscopy (SEM) testing is carried out on the suspected areas: a sample is collected on a sticky tape stub, which is examined using a scanning electron microscope and energy-dispersive X-ray analysis (EDX).
In one project undertaken by ABM Critical Solutions, the team designed and built a protective frame that was installed around the IT equipment to protect the racks from zinc whiskers while technicians removed the contaminated material from inside the data centre's ceiling void for cleaning.
Sub floor cleans
Regular technical cleans of the sub floor are essential and should be completed using specialised equipment and materials, including triple-filtration high-efficiency particulate air (HEPA) or S-class vacuums.
Underfloor contaminants left within the floor void increase the risk of unexplained server outages, which can be caused by particulate matter with a conductive element being caught within the flow of conditioned air and finding its way onto printed circuit boards.
Low- and high-speed electric floor rotary machines should be used to clean the raised floor surface within the data hall, as these buffers allow interchangeable cleaning pads to be used.
Remember: Human error and infrastructure failure caused by insufficient maintenance are the main reasons for critical downtime and outages, but they can be avoided if the above steps are followed correctly.
Alex Blake is director at ABM Critical Solutions