News | 08/26/24
Using AI to Improve Data Center Services
Today, many companies are setting up deployments in our data centers to host and develop Artificial Intelligence (AI) technologies. Sabey’s facilities provide essential mission-critical environments for its customers’ infrastructure, with high power densities and liquid- or air-cooled environments designed to ensure 24/7 uptime.
It falls to Sabey’s operations team to provide critical power and cooling to high-performance computing (HPC) infrastructure through ongoing maintenance of equipment. Now, the very technology we support is coming full circle: our operations team is leveraging AI tools to drive operational excellence, increase efficiency and maximize uptime.
“Using AI has allowed us to take the next steps toward a predictive stance in maintaining our facilities,” explains Cristofer R. Engel, VP of Data Center Operations at Sabey Data Centers. “Traditionally, the industry has largely operated in a reactive stance. In other words, we closely monitor our environments and react as quickly as possible when a piece of equipment breaks down. Now we can feed the massive amounts of facility-monitoring data directly into AI tools that analyze it to identify possible problems in our critical equipment. Now we can fix things before they break and eliminate that risk of downtime entirely.”
The Sabey Data Centers operations team has already implemented AI tools to:
1. Improve Processes
With the sheer number of maintenance events in the normal annual rotation of a data center operations team, Engel thought there could be an opportunity for optimization. Using AI to compare the frequency of maintenance cycles in each facility against the data center industry’s best practices, the ops team can adjust its schedules for maintaining mission-critical equipment.
“When we started using generative AI for this comparison, it was very interesting to see the differences,” says Engel. “Some maintenance activities we were doing too often, while others we weren’t doing enough. But using AI allowed us to see the places where we needed to align our maintenance calendar with industry standards.”
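The comparison Engel describes can be sketched in a few lines of Python. This is a minimal illustration, not Sabey's actual tooling: the equipment names, the interval values and the `compare_intervals` helper are all hypothetical, and a real reference schedule would come from manufacturer guidance or published industry standards.

```python
# Hypothetical data: days between maintenance events per equipment type.
# None of these values come from Sabey or from a published standard.
current_days = {"generator": 30, "ups_battery": 180, "chiller": 90}
reference_days = {"generator": 30, "ups_battery": 90, "chiller": 120}


def compare_intervals(current, reference):
    """Label each equipment type by how its interval compares to the reference."""
    findings = {}
    for equipment, ref in reference.items():
        cur = current.get(equipment)
        if cur is None:
            findings[equipment] = "not scheduled"
        elif cur > ref:
            findings[equipment] = "too infrequent"  # longer gap than the reference
        elif cur < ref:
            findings[equipment] = "too frequent"  # shorter gap than the reference
        else:
            findings[equipment] = "aligned"
    return findings


findings = compare_intervals(current_days, reference_days)
for equipment, status in findings.items():
    print(f"{equipment}: {status}")
```

The result mirrors the quote: some activities come out as done too often, others as not done often enough, and the rest as already aligned.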
2. Detect Anomalies
By analyzing data from sensors deployed across Sabey facilities, AI identifies anomalies in mission-critical equipment and alerts the operations team to possible maintenance issues.
“We’re not necessarily talking about a sudden spike or drop in power levels,” explains Engel. “We’d utilize AI to look for things like a gradual decline in power to a customer’s HPC environment over several weeks. An integrated AI system would notify us of this trend, and indicate that we may need to provide service to a Power Distribution Unit (PDU) or replace a defective cable.”
“Likewise, if a sensor indicates that a certain deployment is gradually increasing in temperature, that might indicate that its liquid- or air-cooling units have an issue that needs to be addressed. If a battery starts to go out of our defined thermal resistance ranges, the unit might be approaching end-of-life. With AI to identify this anomaly, we have the opportunity to replace the battery before it dies.”
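One simple way to catch the kind of gradual drift Engel describes is to fit a straight line to recent readings and flag a persistent downward slope. The sketch below uses an ordinary least-squares fit as a stand-in; it is not the method Sabey's AI tools use, and the function names, the sample readings and the 0.02 kW-per-day threshold are assumptions made for illustration.

```python
def slope_per_day(readings):
    """Least-squares slope of evenly spaced daily readings (units per day)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def has_gradual_decline(readings, max_drop_per_day):
    """True when the fitted trend falls faster than the allowed daily drop."""
    return slope_per_day(readings) < -max_drop_per_day


# Hypothetical daily average power draw (kW) over three weeks:
# a small, steady loss of 0.05 kW per day with no sudden spike or drop.
declining = [10.0 - 0.05 * day for day in range(21)]
steady = [10.0] * 21

print(has_gradual_decline(declining, max_drop_per_day=0.02))
print(has_gradual_decline(steady, max_drop_per_day=0.02))
```

A fixed alarm limit would miss the declining series, since no single reading is far from the previous one. Looking at the slope over several weeks is what surfaces the trend. The same check, with the sign reversed, would apply to a slowly rising temperature.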
3. Analyze Data
In data center management, efficiency and reliability are critical. We are now leveraging AI to gain deeper insights into our operations and transform raw performance data into actionable intelligence. One powerful application is analyzing Key Performance Indicators (KPIs) to identify trends and pinpoint underlying issues in facility management.
“Imagine a large-scale data center operation that experiences 100 incidents over a quarter,” says Engel. “Of these, AI might categorize 40 as false fire alarms, 35 as electrical failures, and 25 as HVAC-related issues. Beyond identifying these trends, AI digs deeper to uncover the root causes—such as delayed maintenance checks, incomplete repairs or unaddressed system changes. These insights allow our management team to quickly identify recurring patterns and address systemic problems, such as insufficient staffing or outdated procedures, before they lead to more significant disruptions.”
“By scaling this capability across multiple data centers, we have proactively optimized our operations, improved safety and prevented costly downtime. AI has become an indispensable tool for our modern data center management.”
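Once incidents carry a category and a root-cause label, the tallying step in Engel's example is straightforward. The sketch below reproduces his 40/35/25 split with a made-up incident log; the record layout and the pairing of each category with a single root cause are assumptions for illustration only, and in practice the labels would come from the AI tool.

```python
from collections import Counter

# Hypothetical incident log matching the proportions in Engel's example.
# The root-cause labels and their pairing with categories are invented.
incidents = (
    [{"category": "false fire alarm", "root_cause": "delayed maintenance check"}] * 40
    + [{"category": "electrical failure", "root_cause": "incomplete repair"}] * 35
    + [{"category": "HVAC", "root_cause": "unaddressed system change"}] * 25
)

by_category = Counter(incident["category"] for incident in incidents)
by_root_cause = Counter(incident["root_cause"] for incident in incidents)

total = sum(by_category.values())
for category, count in by_category.most_common():
    print(f"{category}: {count} ({count / total:.0%})")
```

Grouping the same log by `root_cause` instead of `category` is what points management toward systemic problems rather than individual events.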
4. Perform Administrative Duties
Sabey is using AI to optimize routine administrative tasks. In our day-to-day work, for example, AI tools automatically surface email messages and documents in our file structure related to whatever we are working on. They also summarize meetings and provide action items. AI is also being used to help create business documents, presentations and even video for all areas of operations, including training, scheduling and communicating with customers. At each data center, the operations staff often use AI to check emails and other communications for accuracy, grammar and tone before sending them to customers.
The Future of AI in Data Center Operations
AI can also be used in more comprehensive ways. As the technology advances and becomes more proven in mission-critical environments, Sabey will be able to integrate more fully automated AI systems to assist in operations. For example, Sabey plans to integrate AI with its Building Management System (BMS). This will allow its operations team to compare internal and external conditions (e.g., inside and outside temperatures and humidity levels) for each facility against client expectations based on Service Level Agreements (SLAs). AI will use this data to suggest how the operations team can further optimize mission-critical systems and energy usage in each of Sabey’s data centers. Sabey will also integrate AI with its data center infrastructure management (DCIM) software to identify maintenance needs and automate workflows.
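As a rough illustration of the SLA comparison, the sketch below checks a set of indoor readings against allowed ranges and reports anything out of bounds. It covers only the inside readings, not the outside conditions a BMS integration would also consider. The `SlaRange` class, the variable names and the limits are hypothetical; they are not taken from any Sabey SLA or BMS interface.

```python
from dataclasses import dataclass


@dataclass
class SlaRange:
    """Allowed range for one environmental variable (hypothetical values)."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def check_against_sla(readings, sla):
    """Return the variables whose readings fall outside their SLA range."""
    return {
        name: value
        for name, value in readings.items()
        if name in sla and not sla[name].contains(value)
    }


# Hypothetical SLA limits and sensor readings for one data hall.
sla = {
    "temperature_c": SlaRange(18.0, 27.0),
    "relative_humidity_pct": SlaRange(20.0, 80.0),
}
inside = {"temperature_c": 28.5, "relative_humidity_pct": 45.0}

violations = check_against_sla(inside, sla)
print(violations)
```

In a recommendation-only setup like the one Sabey describes today, the output of a check like this would be passed to the operations team rather than acted on automatically.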
“Today, AI is just making recommendations for improvements and leaving it up to our ops team to make the changes,” says Engel. “But I look forward to the day when we can train AI to run data center systems under our supervision. A safety toggle would allow us to disable the automation and resume control whenever necessary, but with all the data at its disposal, the AI will manage the environmental variables more efficiently than any human could.”
“AI has already transformed the way we work. In seconds, AI analyzes data and performs tedious tasks that previously took up much of our time. As AI drives perpetual improvements in the power and cooling performance of Sabey facilities, our guarantee to ensure the uptime of our customers’ IT and HPC servers is strengthened.”
For more information about establishing HPC deployments for AI in Sabey’s data centers, contact us.