
In a data center, Operations is responsible for keeping critical infrastructure up and running at all times. But as companies deploy servers for artificial intelligence (AI), they are using more power than ever before, and the increasing density requirements of AI present a new set of challenges for Sabey’s Operations teams.
“As our clients deploy servers supporting their AI applications in our facilities, we can expect them to use a much higher percentage of the power they’ve leased,” explains Cristifer R. Engel, Vice President of Operations at Sabey Data Centers. “The Ops team has to be on top of its game. With this kind of density, we have to react much quicker to maintenance emergencies. We also have to ensure proper hot aisle containment in less forgiving densities, predict upcoming maintenance needs and provide service to critical equipment before it breaks down and causes a problem.”
In any data center facility, Operations takes care of the following:
With traditional IT infrastructure, a data center customer might use only a fraction of their total leased space and power. Ramp periods are also traditionally long: a customer can take several months or even years to scale up to their full deployment.
But with high-performance computing (HPC) infrastructure, a customer can ramp up to their full leased capacity very quickly, even if the HPC deployment occupies the same amount of floor space as their regular IT deployment.
“In 2011, when we built out Sabey’s first data center building in our Quincy, Washington campus, the building had 7.2 megawatts IT capacity,” says Engel. “During the time I was there, the building usually operated at a density between four and five megawatts.”
“But today, the first building on Sabey’s Austin, Texas campus offers 34 megawatts, and we expect our clients to utilize this building to nearly full capacity. Their HPC infrastructure is using a scale of power that is so much bigger now.”
As the power requirements of HPC deployments increase dramatically, so does the need for more efficient data center operations.
“When a data center has a regular load, the risk of running the facility is not as bad,” says Engel. “If an air-cooling fan fails or a piece of equipment goes offline, the Operations team has time to step in and repair or replace it. But when the data center is fully loaded, that time to react goes down exponentially. If an emergency happens, the Ops team has to be on top of it.”
At each Sabey facility, the Operations team is constantly looking for new ways to improve efficiency. The team must understand the useful life of every power and cooling system component and anticipate when a piece of equipment needs to be replaced, before it breaks.
“The funny thing is, we can now use AI to monitor system health and predict our maintenance needs,” says Engel. “The same technology that requires us to maximize power capacity in each facility can also help us run those facilities more smoothly.”
In the next blog in our AI series, we’ll look at how Sabey’s Operations team is using and planning to use AI applications to optimize maintenance schedules for critical power and cooling infrastructure. We’ll also look at how AI is helping Sabey to move from a reactive to a predictive model for operations support.
For information about pre-leasing critical environments for HPC deployments for AI, contact Sabey Data Centers today.