Qumulo, the Accelerated Data company, today announced the Qumulo Cloud AI Accelerator, a new approach to enterprise AI infrastructure that presents distributed enterprise data in real time to GPU resources across regions, clouds, and hybrid environments without replication, staging delays, or data-consistency trade-offs. According to a recent analysis, average enterprise GPU utilization hovers around a staggering 5%. In other words, hundreds of billions of dollars' worth of accelerated compute infrastructure sits idle roughly 95% of the time because data must be staged, replicated, and moved into position before a workload can even start.
Qumulo launches Cloud AI Accelerator to create GPU liquidity for enterprise AI infrastructure.
Average enterprise GPU utilization is approximately 5%, with 95% idle time due to data staging delays.
Cloud AI Accelerator eliminates data-gravity constraints without replication or staging delays.
Connects to Microsoft AI Foundry, AWS Bedrock, and Google Vertex AI without copying data.
Built on Cisco UCS and Cisco networking for hybrid and multi-cloud deployments.
Available now across AWS, Azure, Google Cloud, and Oracle Cloud Infrastructure.
"Every enterprise we talk to is focused on GPU availability, but availability is only half the problem. The deeper issue is utilization, and the culprit is data gravity," said Douglas Gourlay, President and CEO, Qumulo. "The industry's response has been to sell enterprises more tightly coupled storage attached directly to GPU clusters, which optimizes a tiny window of active compute time while doing nothing about the idle time that surrounds it. This only leads to more expensive tokens and storage islands to maintain. Cloud AI Accelerator was built to solve the actual problem of getting the data to the GPUs instantly, wherever they are, without ever copying it."
Rather than forcing enterprises to move massive datasets wherever GPUs happen to be available, Qumulo Cloud AI Accelerator takes a fundamentally different approach. It eliminates the data-gravity and data-staging bottlenecks that cause GPU idle time in the first place, enabling enterprises to build agile AI infrastructure that adapts in minutes to changing GPU availability across clouds and regions, creating true enterprise GPU liquidity.
Qumulo Cloud AI Accelerator creates GPU liquidity by building an intelligent data fabric that integrates Cloud Native Qumulo (CNQ), Qumulo Cloud Data Fabric, and Qumulo NeuralCache across on-premises, edge, and multi-cloud environments. The fabric allows enterprises to run workloads wherever GPU capacity is available, rather than wherever data happens to be trapped. The result transforms GPU hunting from a costly logistical problem into a flexible scheduling operation, delivering any enterprise dataset in real time to any GPU farm in any cloud.
With Qumulo Cloud AI Accelerator, enterprises can:
Connect Without Copying: Seamlessly and securely connect on-premises or cloud-native Qumulo systems to Microsoft AI Foundry, AWS Bedrock, and Google Vertex AI without copying data.
Capture Global GPU Capacity: Run AI workloads wherever and whenever GPU capacity becomes available, across any region, cloud, or availability zone.
Eliminate Staging Delays: Wipe out the weeks-long data-staging delays that keep GPU infrastructure idle before training or inference workloads begin.
Eradicate Storage Islands: Avoid maintaining multiple, isolated, and replicated storage silos across every environment where GPUs might be sourced.
Slash Idle Compute Costs: Drastically reduce idle GPU costs by eliminating the heavy data-loading phase into GPU-attached flash storage.
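To make the cost of idle time concrete, the arithmetic behind the utilization figure cited above can be sketched in a few lines of Python. This is an illustrative calculation only, not part of the Qumulo product: the function names and the hourly rate are hypothetical placeholders, and the only input taken from this announcement is the approximately 5% utilization figure.

```python
def idle_fraction(utilization: float) -> float:
    """Share of time a GPU sits idle, given the share of time it is busy."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return 1.0 - utilization


def cost_per_productive_hour(hourly_rate: float, utilization: float) -> float:
    """Effective price of one hour of useful GPU work.

    If a GPU is paid for around the clock but busy only a fraction of
    the time, every productive hour also carries the cost of the idle
    hours around it, so the effective rate is hourly_rate / utilization.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be greater than 0 and at most 1")
    return hourly_rate / utilization


if __name__ == "__main__":
    UTILIZATION = 0.05   # approximate figure cited in this announcement
    HOURLY_RATE = 2.00   # hypothetical price per GPU-hour, for illustration

    print(f"Idle share of paid time: {idle_fraction(UTILIZATION):.0%}")
    print(
        "Effective cost per productive GPU-hour: "
        f"{cost_per_productive_hour(HOURLY_RATE, UTILIZATION):.2f}"
    )
```

Under these assumptions, a GPU that is busy 5% of the time costs 20 times its nominal hourly rate for each hour of useful work, which is why raising utilization, rather than adding capacity, has such a large effect on the cost of AI workloads.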
Cisco's networking, security, and compute play a foundational role in the Cloud AI Accelerator architecture. Cisco Unified Computing System (UCS) provides scalable enterprise AI compute infrastructure for on-premises and hybrid deployments, while Cisco's high-performance networking enables secure, low-latency data movement across hybrid and multi-cloud AI environments. Together, Qumulo and Cisco enable enterprises to build agile AI infrastructure that adapts in minutes to changing GPU availability, providing the operational flexibility that makes GPU liquidity achievable at enterprise scale.
About Qumulo
Qumulo is the only seven-time Leader in the Gartner Magic Quadrant for Distributed File and Object Storage and the foremost provider of cloud data platforms. With exabytes under management and more than 1,000 production customers, Qumulo is trusted by Fortune 500 companies and global enterprises to manage, store, curate, and protect their data, unlocking new possibilities and driving innovation across diverse industries.