
VDURA Unveils RDMA Support and Context-Aware Tiering for GPU-Native AI Infrastructure

March 17, 2026

VDURA has announced major advancements at NVIDIA GTC 2026, including immediate availability of Remote Direct Memory Access (RDMA) for GPU-native data transfers, optimized infrastructure configurations using AMD EPYC Turin processors and NVIDIA ConnectX-7 networking, and a preview of Context-Aware Tiering technology planned for general availability later in 2026. These innovations deliver low-latency, high-throughput storage performance while improving efficiency and cost control for large-scale AI training and inference workloads.

Quick Intel

  • RDMA capability now available, enabling direct GPU-to-storage data transfers that bypass CPU bottlenecks for lower latency and higher throughput.
  • Optimized configurations combine AMD EPYC Turin processors with NVIDIA ConnectX-7 adapters for peak AI cluster performance.
  • Context-Aware Tiering Phase 1 (coming later in 2026) introduces intelligent data placement across tiers based on workload patterns and access needs.
  • Features include an extended DirectFlow buffer on local NVMe SSD, KVCache writeback for persistence SLAs, and unified Context Cache Tiering for LMCache-speed access.
  • Positions the VDURA Data Platform to span the full data hierarchy, from memory to long-term retention, with no performance compromises.
  • Supports production AI environments by keeping GPU clusters saturated while delivering hyperscale durability and economics.

AI workloads increasingly demand storage systems that match GPU speeds without introducing CPU overhead or unnecessary data movement. Traditional architectures create bottlenecks during high-volume training and inference, limiting scalability and raising costs. VDURA addresses these challenges by enabling direct, GPU-native access and intelligent, context-driven tiering.

RDMA Enables GPU-Native Direct Memory Access

VDURA's RDMA implementation allows GPU server nodes to transfer data directly to and from the VDURA Data Platform over the network, eliminating CPU involvement in the data path. This zero-CPU-overhead approach sustains peak throughput, reduces end-to-end latency, and frees GPU compute resources for model execution rather than data handling. Available now on V5000 and V7000-class systems, RDMA leverages NVIDIA ConnectX-7 high-speed networking for seamless integration into modern AI clusters.
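The announcement does not include client-side code, but the data flow it describes matches NVIDIA's GPUDirect Storage (GDS) model, in which reads land directly in GPU memory without staging through host RAM. Below is a minimal sketch using the open-source KvikIO library; the mount point, file path, and buffer size are illustrative assumptions, not published VDURA interfaces.

```python
# Sketch only: assumes a VDURA mount at /mnt/vdura that exposes a
# GPUDirect Storage-compatible path; names and sizes are hypothetical.
import cupy
import kvikio

def load_shard_to_gpu(path: str, num_floats: int) -> cupy.ndarray:
    """Read a training shard directly into GPU memory.

    When GDS/RDMA is available, the transfer bypasses host bounce
    buffers; KvikIO silently falls back to a POSIX read otherwise.
    """
    buf = cupy.empty(num_floats, dtype=cupy.float32)
    with kvikio.CuFile(path, "r") as f:
        # read() blocks until the device buffer is filled and
        # returns the number of bytes transferred.
        nbytes = f.read(buf)
    assert nbytes == buf.nbytes
    return buf

shard = load_shard_to_gpu("/mnt/vdura/train/shard-000.bin", 1 << 20)
```

Whether VDURA exposes this path through GDS, NFS-over-RDMA, or a proprietary client is not stated; the sketch only illustrates the workflow the announcement implies: allocate on the GPU, then read straight into device memory.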

Context-Aware Tiering Brings Intelligence to Storage Hierarchy

The upcoming first phase of Context-Aware Tiering (general availability later in 2026) dynamically places data across tiers (DRAM, local NVMe SSD, and durable storage) based on real-time workload characteristics and access patterns. Key initial features include (a simplified policy sketch follows the list):

  • Extension of DirectFlow buffer to local SSD, reducing network dependency for hot data and minimizing latency in active AI workloads.
  • Intelligent KVCache writeback that persists only critical data to durable storage, optimizing I/O while meeting inference SLA requirements.
  • Unified Context Cache Tiering framework enabling high-speed read/write access across local SSD and DRAM at LMCache-equivalent speeds, ideal for long-context LLM serving and retrieval-augmented generation (RAG).
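VDURA has not published the placement policy or a client API, so the following is a hedged sketch of what "context-aware" placement could look like in practice: hot KV-cache blocks stay in DRAM, warm blocks spill to local NVMe, and only SLA-flagged blocks are written back to durable storage. Every name below is a hypothetical illustration, not a VDURA interface.

```python
import time
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    DRAM = 1        # hottest: context for active decoding
    LOCAL_NVME = 2  # warm: recently used, cheap to re-fetch
    DURABLE = 3     # cold or SLA-pinned: survives node loss

@dataclass
class CacheBlock:
    key: str
    sla_persist: bool = False  # e.g. session must survive restarts
    last_access: float = field(default_factory=time.monotonic)

def place(block: CacheBlock, hot_secs: float = 5.0,
          warm_secs: float = 300.0) -> Tier:
    """Hypothetical recency-based placement rule.

    Recency stands in for the "workload characteristics and access
    patterns" in the announcement; a production policy would also
    weigh block size, reuse frequency, and the cost of recomputing
    the KV cache from the original prompt.
    """
    age = time.monotonic() - block.last_access
    if age < hot_secs:
        return Tier.DRAM
    if age < warm_secs:
        return Tier.LOCAL_NVME
    return Tier.DURABLE

def writeback_targets(blocks: list[CacheBlock]) -> list[CacheBlock]:
    # KVCache writeback as described above: persist only the blocks
    # that carry a persistence SLA instead of flushing everything.
    return [b for b in blocks if b.sla_persist]
```

Separating place() from writeback_targets() mirrors the distinction the announcement draws: tier placement is a performance decision, while writeback is a durability decision, and treating them independently avoids paying durable-storage I/O for cache entries that can simply be recomputed.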

VDURA has outlined a roadmap through 2027 for additional capabilities, including deeper application-directed placement, cross-node cache coherence, and expanded support for NVIDIA BlueField-4 DPUs.

Full-Stack Optimization for Production AI

Combining RDMA with Context-Aware Tiering creates a comprehensive AI storage platform that eliminates CPU bottlenecks while ensuring data resides in the optimal tier at all times. This delivers the performance required to run larger models and serve more inference requests, alongside the efficiency to scale infrastructure economically and reliably.

"Today's announcements at GTC 2026 reflect our commitment to delivering the AI storage platform that spans the full data hierarchy — from memory to long-term retention — with no compromises on performance,” said Ken Claffey, CEO of VDURA. “RDMA gives AI teams direct, zero-CPU-overhead access to their data. Context-Aware Tiering brings intelligence to every tier of the extended storage hierarchy, so data is always in the right place at the right time. Together, these capabilities enable organizations to run larger models, serve more inference requests, and efficiently scale AI infrastructure with the operational reliability that production AI demands.”

 

About VDURA

VDURA builds the world's most powerful data platform for AI and high-performance computing, bringing hyperscale-class storage to the rest of the world. The platform is powered by HYDRA, the only high-performance distributed architecture purpose-built to unify memory, flash, and disk in a single software-defined platform that keeps GPU clusters saturated while delivering hyperscale-class durability and economics. Visit vdura.com for more information.

  • Context-Aware Tiering
  • AI Data Platform
  • AI Storage