How to Fly a Data Center – A Pilot’s View

January 21, 2021

What flying teaches you

As a certified airline transport pilot (ATP) with over 5,000 hours flying jets, aerobatics and experimental aircraft, I have learned much about critical infrastructure design and operations.

I’ve also learned about design, redundancy, the importance of establishing and managing processes and procedures, and human factors.

Between my job leading a data center company and my love of flying, this is a lifelong study.

Staying airborne, returning safely to the ground and keeping a data center operating all rely on understanding your aircraft or data center and on establishing safe, continuous operations.

Flying has taught me to appreciate both good design and reliable technology. Yet, it has also taught me that design, while very important, must evolve and be smart. Both in the air and on the ground, redundancy has the potential to lull a person into complacency.

There are reasons why four-engine airplanes are rapidly being replaced by two-engine aircraft, and why 2N data center builds are being replaced by N+x concurrently maintainable designs. Both industries learned that, at some point, more systems are less efficient and introduce more components, and therefore a higher potential for failure.
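The "more components, more potential for failure" point can be illustrated with a small calculation. The sketch below is only an illustration under simplifying assumptions: every component is assumed to fail independently, and the 1% fault probability and the function name `prob_any_fault` are hypothetical, not figures or terms from any real aircraft or data center.

```python
def prob_any_fault(component_count: int, fault_probability: float) -> float:
    """Chance that at least one of `component_count` components is faulted.

    Assumes independent components that each have the same probability of
    being in a faulted state at any given moment (a simplification).
    """
    if component_count < 0:
        raise ValueError("component_count must be non-negative")
    if not 0.0 <= fault_probability <= 1.0:
        raise ValueError("fault_probability must be between 0 and 1")
    # P(at least one fault) = 1 - P(no component is faulted)
    return 1.0 - (1.0 - fault_probability) ** component_count


if __name__ == "__main__":
    # Hypothetical per-component fault probability, chosen for illustration.
    p = 0.01
    for count in (2, 4, 8):
        chance = prob_any_fault(count, p)
        print(f"{count} components: {chance:.4f} chance of at least one fault")
```

Under these assumptions, doubling the number of components roughly doubles how often the crew has some fault to deal with, even when the extra redundancy keeps the plane or the load safe.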

More systems also mean that an operating crew dealing with even a noncritical outage operates in a degraded way, even if the plane or data center is safe. A degraded operating crew becomes the preeminent failure risk.

If something does go wrong, it is well-established procedures, training and a good understanding of the design that inform the proper decision. Risk avoidance, risk management and risk mitigation are achieved by quality maintenance, good training and by applying the correct procedures.

Flying has taught me the difference between a crisis and a drama. Experience and training teach you how to spot a routine warning and when an alarm means there is real cause for concern. More importantly, they teach you what to do about it.

For example, a door opening in flight on a nonpressurized aircraft (as has happened to me) is dramatic and will scare the passengers but, as long as the pilot doesn’t panic, it does not lead to a crisis.

There are many more movies about heroic acts in the sky than in data centers. But in reality, in both domains, operational safety is not about becoming the person who saves the day at the last second. When someone is forced to be a hero, it is usually because of a system or process failure. Real heroism is knowing how things work, how they should operate and what to do when things go wrong.

Check and check again

Flying safely starts on the ground, before you get into the pilot seat. The most important routine in any normal flight is the pre-flight check. The logbook, maintenance, process and communication are all vital. There is no substitute for doing the prep work. Not doing it will put you behind the machine, and these machines are fast and hard to catch up with.

I once found a flashlight in the engine compartment of a plane. It had just been in for maintenance. The logbook, the maintenance history, who last flew the plane, any comments on the state of the aircraft and how it handled must all be studied before take-off. Poor pilots fail to log issues. Often this is because they believe it reflects on them. No one likes talking about their near misses and outages. When handling aircraft there are strict rules about reporting when something breaks. Abnormalities must be investigated. Most things degrade over time rather than breaking all at once, and therefore trend monitoring is essential. Luckily, technology is of great assistance here.
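As a rough idea of what trend monitoring can look like in practice, here is a minimal Python sketch. It fits a straight line through a series of evenly spaced readings and flags the series when the upward slope passes a limit. The function names, the sample temperatures and the 0.1 limit are all hypothetical and chosen only for illustration; a real monitoring system would use its own data sources and thresholds.

```python
from typing import Sequence


def trend_slope(readings: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced readings (units per sample)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate a trend")
    mean_x = (n - 1) / 2  # mean of the sample indices 0..n-1
    mean_y = sum(readings) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def is_degrading(readings: Sequence[float], max_slope: float) -> bool:
    """True when the readings climb faster than `max_slope` per sample."""
    return trend_slope(readings) > max_slope


if __name__ == "__main__":
    # Hypothetical daily temperature readings; no single value looks alarming.
    stable = [20.0, 20.1, 19.9, 20.0]
    drifting = [20.0, 20.3, 20.5, 20.9]
    for name, series in (("stable", stable), ("drifting", drifting)):
        flagged = is_degrading(series, max_slope=0.1)
        print(f"{name}: slope={trend_slope(series):.3f} flagged={flagged}")
```

The point of the sketch is that the drifting series gets flagged on its trend alone, before any individual reading would trip a fixed alarm limit.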

In my business, I believe it would be great if there were a data center reporting system. Businesses are reluctant to share, which I understand. Therefore, we should establish an anonymous process for reporting incident investigations. This would bring remarkable advances to the industry. The most fatal attitude in aviation and data centers is the belief that ‘this will never happen to me.’ And the best remedy is to learn from mistakes. The more learning the better.

Dealing with the unexpected

Here are some examples of unexpected things I’ve experienced in the cockpit. The flight mistakes that worried me most were the ones where I made bad decisions. For example, I took off from Aspen, Colorado, in a piston twin aircraft in complete whiteout conditions. My radar fried as I took off, but I couldn’t go back and land due to the weather. Everything else went well. But I could not get out of my mind that if I lost an engine on that flight, with the mountainous terrain all around, I would be in real trouble. My biggest takeaway is that an airplane operator’s job is to make sure you don’t get into a situation that leaves you with narrow survival options.

Once, when descending for landing, I had an engine outage on a twin-engine airplane. It was a pilot mistake. It was a new plane for me, and the engines were not well set up. It was a piston engine with the fuel mixture (yes, they still exist on aircraft) set too rich. As I advanced the mixture during the descent, it choked the engine. I was in a descent and didn’t need much power. It took a while until I noticed the outage, as the plane didn’t have a central warning system (the airplane equivalent of a building management system), and when I did, I reversed the last input. The engine restarted and I proceeded to a safe landing.

I’ve also had instrument outages and radio malfunctions where redundancy and training kicked in. I’ve had a few false alarms caused by bad probes and faulty sensors, mostly after maintenance. A checklist and a call to the manufacturer confirmed the malfunction.

The unexpected can and does happen in the air and inside the data center. My approach to flying is reflected in our approach to operating our own data centers and helping manage those of our clients. We do everything possible to avoid heroic actions. We develop procedures with a very precise list of actions to take when something out of the ordinary occurs.

How we learn

The aircraft industry and commercial airline businesses are many decades older than the data center industry. Regulations and standards in the airline industry are well established and yet things still go wrong, sometimes tragically. The most infamous last words on a cockpit voice recorder are ‘what does this do?’ and ‘I know what I’m doing’.

In the data center space, we constantly try to learn from other industries. How are they improving reliability? How are they improving safety? What’s new in their training and operations?

Individual experience matters a lot. So, for example, because I’ve not been able to fly as much as I’d like in 2020, I’ve focused on doubling down on training.

Things can always be improved. No one can say that data centers are the world’s most efficient buildings. We constantly learn of ways to improve reliability and efficiency. The opportunity is to constantly strive for better by giving people the tools to view and change how they operate through daily, continuous advances.

Kicking the tires, lighting the fires and flying by the seat of your pants makes flying fun only when nothing goes wrong. It is not a recipe for a long life and it is not a smart way to operate any critical system.

In my next article I’m going to get into the details of what the data center industry can learn from aircraft makers and operators about redundancy – when it works and when it fails.
