Datacenter downtime is bad but not nuclear silo explosion bad

Source: This American Life.

Writing about datacenters and tech, I am always looking for parallels with other industries to help contextualise some of the issues that emerge.

Managing datacenters is challenging but what about other types of critical infrastructure like airports, railways and power stations?

I think I have found another great example.

I just listened to a recent episode of the always excellent This American Life podcast. Titled ‘Human Error in Volatile Situations’, it does pretty much what it says on the tin.

The first story in the episode is the most gripping and probably the most infamous. For anyone who’s had experience of managing complex facilities equipment, it’s a must-listen.

“In 1980, deep in a nuclear missile silo in Arkansas, a simple human error nearly caused the destruction of a giant portion of the Midwest.”

A devastating explosion, and a near nuclear incident, was caused by human error – use of the wrong tool – but exacerbated by extremely poor decision making from above and emergency operating procedures that seemed comprehensive but didn’t extend to the unthinkable.

Check out the podcast at This American Life.

Next, I’m planning to read the book on which some of the podcast is based – Command and Control: Nuclear Weapons, the Damascus Accident, and the Illusion of Safety – but I’m conscious that where nuclear incident safety is concerned, ignorance can be bliss.

Open Compute Project targets new markets including colocation

I was lucky enough to speak recently with Steve Helvie, VP of channel at the non-profit Open Compute Project (OCP) Foundation.

Helvie said OCP is targeting several key markets in 2018 as it looks to maintain its momentum and grow beyond hyperscalers. These include telcos, service providers (from SaaS to colocation), financial services (including blockchain), high-performance computing, healthcare, and government.

For colocation operators, the group has released guidelines and a checklist to help with the adoption of OCP equipment in colocation facilities. There are also plans for some kind of stamp or certification, which has been under discussion for over a year now.

However, the exact form the OCP-ready stamp will take is still being developed, according to Helvie. “We are likely not going to have another brand, but it will be a level of formal recognition. I want enterprises to be able to go into our marketplace and say, ‘Where can I find someone who is ready to host Open Compute?’”

Head to Data Center Knowledge for the full article.

Microsoft’s custom cloud servers, open sourced through the Open Compute Project, as seen at the OCP Summit 2017

Where and how to build your next datacenter for maximum energy and carbon efficiency

Andrew at the Catalonia Institute for Energy Research’s facility in Tarragona, Spain, where the RenewIT tool was developed

I was lucky enough to be involved recently in an in-depth European Union research project called RenewIT. The project had a number of outputs but the main one was a web-based tool to enable different datacenter designs, and locations for those designs, to be compared across Europe in terms of energy efficiency and carbon emission reduction.

I just published an overview of the tool, which was a finalist in the recent Datacenter Dynamics awards, over at Verne Global’s site. The tool is particularly relevant to the colocation and cloud services operator because its facilities are based in Iceland. Verne benefits from Iceland’s cheap and plentiful renewable energy and is encouraging more organisations to locate their workloads at its facilities.

The RenewIT tool enables locations across Europe to be compared in terms of cost and access to renewable energy

Head over to Verne’s website to access the full blog. The RenewIT tool also has its own dedicated site and there is a separate site with more background on the project and its other outputs.

Those [data center operators] that cannot remember the past are condemned to repeat it

The sentiment in the headline is a pithy reminder of the importance of understanding the past.

The unfortunately long list of datacenter operators that suffered outages in 2017 would do well to heed those words.

Specifically, how can operators that don’t undertake a thorough root-cause analysis after an outage expect to prevent further downtime in the future?

I’ve been working with UK datacenter design company Future-tech, which provides specialist forensic engineering services to root out the causes of downtime and help harden facilities against future outages.

Head over to Future-tech’s site to see their take on the importance of thoroughly investigating the causes of unplanned downtime.

Interview with Dominic Ward from Verne Global on new HPC cloud service

I recently spoke with Iceland-based colocation and cloud services provider Verne Global about its new HPC-as-a-service (HPCaaS) offering, hpcDIRECT.

Verne’s managing director Dominic Ward explained how hpcDIRECT is a natural extension of the company’s colocation services but will also take it into some new areas in the future.

“I think the balance over time will shift towards more customers wanting to consume more HPCaaS. However for now I think the balance will remain that customers will want the majority – anything over 50% – in a colocation environment while wanting to start to test our HPCaaS. But I do think there will be gradual migration in the same way we have seen that shifting for enterprise cloud environments, or enterprise applications, I do think that is coming for HPC as well”

Head to Verne’s website for the full interview and for more about hpcDIRECT.

As the Climate Changes, So Should Data Center Operations

The eye of Hurricane Irma is clearly visible from the International Space Station as it orbited over the Category 5 storm on Sept. 5, 2017.

My last column of the year is up over at Data Center Knowledge.

It’s based on a very informative and wide-ranging webcast from Uptime Institute earlier this week entitled: 10 Must-Answer Questions For Your 2018 Data Center Strategy.

One of the issues examined by the Uptime panel was how data center operators should respond to extreme weather events caused by global warming.

Uptime CTO Chris Brown argued that hardening facilities against extreme weather and temperatures was not the only issue. Operators also need to put the right procedures in place around data center staffing to better manage extreme weather events. “These last few storms have got people thinking about the operations personnel,” he said. “If you have a major storm coming through, people living and working in that area have their own homes, their own families, their own things to worry about. They are usually going to give those things their attention first before the data center. That is just human nature.”

Head to DCK for the full article.

Cloud is Shaking Up Japan’s Earthquake-Ready Data Centers


My latest column over at DCK is based on a conversation with Detlef Spang, chief executive of Colt Data Centre Services (DCS), whose company has just opened its latest facility in Japan.

He outlined some of the opportunities and challenges for building out new capacity in Japan. Land costs necessitate multi-story data centers despite the ever-present risk of earthquakes.

High energy costs also have to be weighed against summer humidity levels and temperatures, which make some free-cooling technologies challenging to deploy.

We also touched on Colt’s foray into prefabricated modular designs, which it has now scaled back.

For more, head over to Data Center Knowledge.

Verne Global moves into HPC as a service

Iceland-based data centre operator Verne Global is expanding beyond its core colocation services.

The company launched an HPC-as-a-service platform this week:

hpcDIRECT provides a fully scalable, bare metal service with the ability to rapidly provision the full performance of HPC servers uncontended and in a secure manner.

I recently spoke with Verne managing director Dominic Ward about the service and the resulting interview will be published on Verne’s website soon.

For more details on hpcDIRECT see the full release.

Subterranean data centers emerge from the underground

Lefdal Mine Data Center, Norway

My latest column over at Data Center Knowledge asks whether underground data centers, such as the recently opened Lefdal Mine facility in Norway, are becoming more commonplace.

Lefdal has taken the concept of underground data centers and run with it. The facility, backed by regional investors and Norwegian power company SFE, has the potential to reach a capacity of 120,000 square meters (1.3 million square feet) of data center space and more than 200MW of IT capacity. If fully utilized, it would be the biggest data center in Europe.

As with other underground data centers, the organizations behind LMD – which also include Rittal and IBM – make much of the site’s physical security. However, its cooling system and access to cheap renewable energy are probably the standout features of the site.

More at DCK.