
Gcore Adds NVIDIA Dynamo for 6x AI Inference Throughput

February 25, 2026

Gcore, a global provider of AI, cloud, network, and security infrastructure, has integrated NVIDIA Dynamo into its AI inference solutions. This addition delivers up to 6x higher throughput and 2x lower latency as a fully managed, one-click deployment available on Gcore Everywhere Inference and Gcore Everywhere AI.

Quick Intel

  • NVIDIA Dynamo, an open-source inference framework, optimizes large-scale generative AI by addressing GPU underutilization, static allocation, memory bottlenecks, and data transfer inefficiencies.
  • Gcore offers Dynamo as a pre-optimized, managed service with single-click activation in the Customer Portal, eliminating the need to manage routing, KV cache logic, or GPU scheduling.
  • Performance gains include significantly higher effective throughput, steadier tail latency, and reduced cost per token through better GPU utilization and efficient prefill/decode disaggregation.
  • The integration supports private cloud, hybrid, and on-premises environments across Gcore's Everywhere AI and Inference platforms.
  • Dynamo uses KV cache-aware routing and NIXL for inter-node communication to process more requests on the same hardware, improving ROI.
  • Gcore will showcase Dynamo-powered inference at MWC Barcelona (March 2–5) and NVIDIA GTC San Jose (March 16–19).

Modern AI inference demands sophisticated handling of batching, dynamic workloads, extended contexts, and strict service-level objectives. Traditional approaches often result in substantial performance and cost penalties from minor inefficiencies in scheduling and utilization. Gcore's managed integration of NVIDIA Dynamo brings advanced GPU optimization directly into the inference runtime, enabling customers to achieve superior results without operational complexity.

Seva Vayner, Product Director of Edge Cloud and AI at Gcore, comments: "Modern inference isn't just 'run a model'—it's batching, routing, dynamic workloads, longer contexts, and tight SLOs. In that reality, small scheduling and utilization losses become big performance and cost penalties. By integrating Dynamo as a managed service in Gcore, we bring advanced GPU optimization directly into the runtime path so customers see higher effective throughput and steadier tail latency, without operating the complexity themselves."

NVIDIA Dynamo tackles key scaling challenges by disaggregating prefill and decode phases, applying intelligent KV cache routing, and optimizing inter-node data movement with NIXL. These enhancements maximize hardware efficiency, reduce wasted cycles during decode and cache recomputation, and allow more inference requests to run concurrently. Gcore's fully managed approach makes these benefits immediately accessible, simplifying deployment while delivering measurable cost savings and performance improvements.
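To make KV cache-aware routing concrete, the sketch below shows the core idea: send each request to the worker whose cached token prefix overlaps most with the incoming prompt, so fewer prompt tokens must be recomputed during prefill. This is an illustrative toy in Python, not Dynamo's actual API; all names (`route`, `shared_prefix_len`, the worker map) are hypothetical.

```python
# Toy sketch of KV cache-aware routing (illustrative only, not Dynamo code).
# Each worker holds a cached token prefix; routing prefers the worker with
# the longest shared prefix, breaking ties toward the least-loaded worker.

def shared_prefix_len(a, b):
    """Length of the common token prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(prompt_tokens, workers):
    """Pick a worker for the prompt.

    `workers` maps worker name -> (cached_tokens, active_load).
    Longest cached prefix wins; ties go to the lighter load.
    """
    def score(item):
        _name, (cached, load) = item
        return (shared_prefix_len(prompt_tokens, cached), -load)
    best_name, _ = max(workers.items(), key=score)
    return best_name

workers = {
    "gpu-0": ([1, 2, 3, 4], 5),  # long cached prefix, heavier load
    "gpu-1": ([1, 2, 9], 1),     # shorter overlap, lighter load
}
print(route([1, 2, 3, 4, 5], workers))  # gpu-0: longest shared prefix wins
```

A real scheduler would also weigh cache eviction, queue depth, and transfer cost over NIXL, but the prefix-matching heuristic captures why cache-aware routing reduces redundant prefill work.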

Dynamo-powered inference is live today on Gcore Everywhere Inference and Gcore Everywhere AI. Interested organizations can see demonstrations in person at MWC Barcelona from March 2–5 or NVIDIA GTC in San Jose from March 16–19.


About Gcore

Gcore is a global infrastructure and software provider for AI, cloud, network, and security solutions. Headquartered in Luxembourg, Gcore operates its own sovereign infrastructure across six continents, delivering ultra-low latency and compliance-ready performance for mission-critical workloads. Its AI-native cloud stack combines software innovation with hyperscaler-grade functionality, enabling enterprises and service providers to build, train, and scale AI everywhere—across public, private, and hybrid environments. By integrating AI, compute, networking, and security into a single platform, Gcore accelerates digital transformation and empowers organizations to unlock the full potential of AI-driven services.
