
Nvidia GTC 2026: Vdura Unveils RDMA Support and Context-Aware Tiering for GPU-Native AI Infrastructure
Key Takeaways
- RDMA enables direct GPU-to-storage transfers without CPU involvement.
- Vdura builds on AMD EPYC Turin CPUs and Nvidia ConnectX‑7 networking.
- Context‑Aware Tiering dynamically places data across local NVMe SSD, DRAM and persistent tiers.
- KVCache write‑back reduces unnecessary I/O while meeting SLAs.
- Phase 1 tiering arrives later in 2026, cutting AI inference latency.
Summary
Vdura announced GPU‑native RDMA support for its Data Platform, allowing direct memory access between GPUs and storage without CPU involvement. The company also previewed Phase 1 of Context‑Aware Tiering, which will automatically move data across local NVMe SSD, DRAM and persistent tiers based on workload patterns. Both features are built on AMD EPYC Turin CPUs and Nvidia ConnectX‑7 networking, and are aimed at accelerating AI training and inference while cutting infrastructure costs. RDMA is shipping now on V5000 and V7000 systems; tiering is slated for general availability later this year.
Pulse Analysis
The introduction of GPU‑native RDMA marks a pivotal shift for AI infrastructure. By moving data directly from GPU memory to storage over high‑speed ConnectX‑7 fabric, Vdura removes the traditional CPU mediation layer that throttles bandwidth and adds latency. This architectural change frees CPU cycles for model computation, enabling higher sustained throughput during both training and inference. In practice, AI clusters can now sustain peak data rates with lower end‑to‑end latency, a critical advantage as models grow larger and datasets become more demanding.
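Vdura has not published client-side code for this path, but NVIDIA's cuFile (GPUDirect Storage) API illustrates the general pattern of a CPU-bypass read: storage DMAs directly into GPU memory, with no host-side bounce buffer. The file path, buffer size, and omitted error handling below are hypothetical; this is a minimal sketch of the technique, not Vdura's implementation.

```c
// Minimal GPUDirect Storage sketch using NVIDIA's cuFile API.
// NOTE: /mnt/vdura/shard.bin and the 64 MiB size are hypothetical.
#define _GNU_SOURCE            // for O_DIRECT
#include <cufile.h>
#include <cuda_runtime.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    const size_t size = 64 << 20;        // 64 MiB
    void *dev_buf = NULL;

    cuFileDriverOpen();                  // bring up the GDS driver
    cudaMalloc(&dev_buf, size);          // destination lives in GPU memory
    cuFileBufRegister(dev_buf, size, 0); // register the buffer for DMA

    int fd = open("/mnt/vdura/shard.bin", O_RDONLY | O_DIRECT);
    CUfileDescr_t descr = {0};
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    descr.handle.fd = fd;

    CUfileHandle_t fh;
    cuFileHandleRegister(&fh, &descr);

    // The read DMAs straight from storage into GPU memory:
    // no CPU bounce buffer, no host-side copy.
    ssize_t n = cuFileRead(fh, dev_buf, size, /*file_offset=*/0, /*dev_offset=*/0);
    printf("read %zd bytes directly into GPU memory\n", n);

    cuFileHandleDeregister(fh);
    cuFileBufDeregister(dev_buf);
    close(fd);
    cudaFree(dev_buf);
    cuFileDriverClose();
    return 0;
}
```

Eliminating that host-side copy is exactly where the CPU-cycle and latency savings described above come from.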
Context‑Aware Tiering adds a layer of intelligence to the storage hierarchy. The system monitors workload characteristics and automatically migrates hot tensors to local NVMe SSD or DRAM, while less frequently accessed data resides on slower, cost‑effective media. Features such as the extended DirectFlow buffer and KVCache write‑back ensure that only persistence‑critical information hits durable storage, trimming unnecessary I/O and preserving service‑level agreements. For use cases like long‑context language model serving or retrieval‑augmented generation, this dynamic placement reduces inference latency and improves overall response times.
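Vdura has not disclosed its placement policy, so the following is a minimal sketch under assumed rules: extents above an access-frequency threshold are promoted to faster local media, and only persistence-critical data is written back to durable storage. The type and function names (extent_t, place, should_write_back) and the threshold scheme are hypothetical.

```c
// Hypothetical tiering heuristic: hot data is promoted, and
// recomputable cache entries are filtered out of durable writes.
#include <stdbool.h>
#include <stdint.h>

typedef enum { TIER_DRAM, TIER_LOCAL_NVME, TIER_SHARED_FLASH } tier_t;

typedef struct {
    uint64_t id;
    double   accesses_per_sec;     /* observed access frequency        */
    bool     persistence_critical; /* e.g. checkpoint vs. KV cache     */
} extent_t;

/* Promote hot extents to DRAM, warm extents to local NVMe,
 * and leave cold data on slower, cost-effective shared media. */
tier_t place(const extent_t *e, double hot, double warm) {
    if (e->accesses_per_sec >= hot)  return TIER_DRAM;
    if (e->accesses_per_sec >= warm) return TIER_LOCAL_NVME;
    return TIER_SHARED_FLASH;
}

/* KVCache-style write-back filtering: skip durable I/O for data
 * the system can recompute, trimming unnecessary writes. */
bool should_write_back(const extent_t *e) {
    return e->persistence_critical;
}
```

The key design idea is that placement and durability are decided separately: a KV-cache block can live in the hottest tier while still never generating a write to persistent storage, since it can be recomputed if lost.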
Strategically, Vdura’s combined RDMA and tiering capabilities position it as a strong contender in the emerging AI‑storage market, competing with offerings from Dell, HPE and IBM. Its foundation of AMD EPYC Turin CPUs and Nvidia ConnectX‑7 networking scales with future AI workloads. The roadmap extending through 2027 promises deeper application‑directed placement and broader DPU support, signaling a long‑term commitment that could attract enterprises seeking to future‑proof their AI pipelines while controlling operational expenses.