NSDI '26 - Co-Designing Traffic Control with NVMe-oF for Disaggregated Storage: A Comparative Study

USENIX Association
Jun 2, 2026

Why It Matters

Switchless NVMe‑oF fabrics match the throughput of switched SANs while roughly halving capital costs, which could accelerate adoption of storage disaggregation.

Key Takeaways

  • Switchless storage fabric achieves comparable throughput with lower latency.
  • Network delay negligible; SSD access dominates overall I/O latency.
  • Pairwise, asymmetric traffic enables proactive bandwidth reservation for reads.
  • In‑network telemetry guides symmetric routing to avoid congested paths.
  • Switchless architecture cuts capital costs by roughly 50% at scale.

Summary

The presentation compares two disaggregated‑storage fabrics, a traditional switched SAN and a switchless mesh, and proposes traffic‑control mechanisms co‑designed with NVMe‑oF. As PCIe bandwidth and drive density grow, disaggregated storage needs far more fabric bandwidth, which forces a choice between expensive high‑port‑count switches and cheaper multi‑hop adapters.

Three observations drive the design: network round‑trip time is a tiny fraction of total I/O latency, NVMe‑oF traffic is inherently pairwise and asymmetric, and SSD performance throttles the entire fabric, causing back‑pressure that fills switch buffers. Existing congestion‑control schemes ignore these traits, so the authors introduce in‑network telemetry‑assisted symmetric routing, eager bandwidth reservation, and storage‑driven traffic scheduling.
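To make the first two mechanisms concrete, here is a minimal Python sketch of how telemetry‑guided path choice and eager reservation for reads might fit together. It is an illustration under stated assumptions, not the authors' implementation: the `Path` model, the `queue_depth` telemetry field, and the functions `pick_path`, `reserve_for_read`, and `release` are all hypothetical names, and the policy (least queueing first, then most headroom) is one plausible choice among many.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One candidate route between an initiator and a target (hypothetical model)."""
    name: str
    capacity_gbps: float        # assumed bottleneck capacity of the path
    queue_depth: int = 0        # assumed latest telemetry sample for the path
    reserved_gbps: float = 0.0  # bandwidth already promised to in-flight reads

    def headroom(self) -> float:
        """Capacity not yet reserved on this path."""
        return self.capacity_gbps - self.reserved_gbps


def pick_path(paths, demand_gbps):
    """Pick the feasible path with the least reported queueing, then most headroom."""
    feasible = [p for p in paths if p.headroom() >= demand_gbps]
    if not feasible:
        return None
    return min(feasible, key=lambda p: (p.queue_depth, -p.headroom()))


def reserve_for_read(paths, demand_gbps):
    """Reserve return-path bandwidth at the moment a read command is issued.

    Reads are asymmetric: the command is small and the data coming back is
    large, so the reservation is made for the response before it exists.
    The chosen path is assumed to carry both directions (symmetric routing).
    """
    path = pick_path(paths, demand_gbps)
    if path is not None:
        path.reserved_gbps += demand_gbps
    return path


def release(path, demand_gbps):
    """Return reserved bandwidth once the read data has been delivered."""
    path.reserved_gbps = max(0.0, path.reserved_gbps - demand_gbps)


if __name__ == "__main__":
    paths = [
        Path("direct", capacity_gbps=100.0, queue_depth=8),
        Path("two-hop", capacity_gbps=100.0, queue_depth=2),
    ]
    chosen = reserve_for_read(paths, demand_gbps=40.0)
    print(chosen.name if chosen else "no path available")
```

In this toy setup the longer but less congested path is chosen, which mirrors the summary's point that extra hops cost little when SSD access dominates I/O latency.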

Experiments on four nodes equipped with Samsung PM93 SSDs show that both architectures deliver similar throughput, but the switchless mesh reduces average latency by 20‑28% and tail latency by about 25% across YCSB workloads. Cost analysis reveals that at 1,000 ports the switchless design saves roughly 50% of capital expenditure compared to a fully switched fabric.

The findings suggest that, for disaggregated storage, a switchless topology combined with traffic‑aware control can deliver lower latency and substantial cost savings without sacrificing performance, offering a practical blueprint for next‑generation data‑center storage networks.

Original Description

Co-Designing Traffic Control with NVMe-oF for Disaggregated Storage: A Comparative Study of Switched and Switchless SAN Architectures
Chendong Wang, Joontaek Oh, and Ming Liu, University of Wisconsin–Madison
Disaggregated storage is a pivotal component of today’s cluster infrastructures. With the advent of high-bandwidth server interconnects and new NVMe form factors, commodity storage appliances are becoming denser, delivering tens of millions of IOPS. This calls for today’s storage area network (SAN) fabric to expand the bandwidth capacity drastically. Industry practices tackle this issue via either (i) a scale-up approach, upgrading the per-port bandwidth in a switched SAN, or (ii) a scale-out strategy, integrating more paths in a switchless SAN. However, it is unclear which network architecture is more suitable for scaling storage disaggregation.
This paper presents a comparative study of switched and switchless SAN architectures from several angles. We begin by developing an experimental methodology that integrates both small-scale real-system prototypes and large-scale simulations, providing the flexibility needed to explore architectural trade-offs. We then characterize NVMe-oF I/O flows and co-design SAN traffic control mechanisms around these characteristics to improve I/O transmission efficiency in both settings. Our evaluation yields several key findings. First, the switchless SAN achieves throughput comparable to that of the switched SAN, despite involving additional routing hops, while simultaneously reducing latency through the use of multiple load-aware I/O paths that mitigate interference. Second, the switchless SAN reduces capital costs by obviating the need for expensive high-radix switches, scales effectively under heterogeneous I/O workloads, and avoids the single point of failure associated with top-of-rack (ToR) switches. Collectively, these results demonstrate that switchless SANs provide a compelling alternative to traditional switched designs for disaggregated storage environments.
