
NSDI '26 - Co-Designing Traffic Control with NVMe-oF for Disaggregated Storage: A Comparative Study
The presentation compares two disaggregated‑storage fabrics, a traditional switched SAN and a novel switchless mesh, and proposes traffic‑control mechanisms co‑designed with NVMe‑oF. As data centers decouple compute from storage, rapidly scaling PCIe bandwidth and higher drive density force a choice between expensive high‑port‑count switches and cheaper multi‑hop adapters. Three observations drive the design: network round‑trip time is a tiny fraction of total I/O latency, NVMe‑oF traffic is inherently pairwise and asymmetric, and SSD performance throttles the entire fabric, causing back‑pressure that fills switch buffers. Existing congestion‑control schemes ignore these traits, so the authors introduce in‑network telemetry‑assisted symmetric routing, eager bandwidth reservation, and storage‑driven traffic scheduling. Experiments on four nodes equipped with Samsung PM93 SSDs show that both architectures deliver similar throughput, but the switchless mesh reduces average latency by 20‑28% and tail latency by about 25% across YCSB workloads. The cost analysis shows that at 1,000 ports the switchless design saves roughly 50% of capital expenditure compared with a fully switched fabric. The findings suggest that, for disaggregated storage, a switchless topology combined with traffic‑aware control can lower latency and cut costs substantially without sacrificing throughput, offering a practical blueprint for next‑generation data‑center storage networks.

NSDI '26 - Defending Against Traffic Analysis Attacks with Flexible In-Network Obfuscation
The NSDI ’26 presentation introduced “Securities,” a flexible in‑network obfuscation system designed to thwart traffic‑analysis attacks without relying on external proxy services. By moving the obfuscation logic to the user’s edge network, the framework eliminates the need for cooperation from...

NSDI '26 - Net-P4ct: Enhanced WAN Bandwidth Fair Sharing Using P4 Programmable Switches
The talk introduces Net‑P4ct, a WAN‑wide bandwidth management system that shifts traffic policing from per‑host eBPF agents to line‑rate P4 programmable switches. By installing service‑specific policies at ingress points, Net‑P4ct can recognize jobs via a unique identifier and enforce guaranteed...

SREcon26 Americas - Intelligent Load Balancing in Kubernetes
The SREcon26 talk details Databricks’ effort to solve request‑imbalance issues in its Kubernetes‑based services by moving from the platform’s default load‑balancing to a custom, intelligent solution. Databricks discovered that Kubernetes distributes connections uniformly, not individual requests. Because their traffic relies heavily...

SREcon26 Americas - Intelligent Load Balancing in Kubernetes
Databricks engineers Gaurav Nanda and Vincent Cheng revealed that Kubernetes’ default kube‑proxy and DNS model struggles with long‑lived HTTP streams and high‑throughput gRPC, leading to pod hot‑spotting and tail‑latency spikes. They propose a client‑side, control‑plane‑driven load balancer that removes kube‑proxy...

SREcon26 Americas - Stop Reading Changelogs: Safer Kubernetes Upgrades with Simulation
The talk, “Stop Reading Changelogs: Safer Kubernetes Upgrades with Simulation,” opens with a vivid reminder of Reddit’s 314‑minute outage in March 2023, caused by a label change in a Kubernetes 1.23‑to‑1.24 upgrade that broke Calico’s node selectors. Speaker David “Dr....

FAST '26 - AdaCheck: An Adaptive Checkpointing System for Efficient LLM Training with Redundancy...
The video introduces AdaCheck, an adaptive checkpointing framework designed to curb the massive resource waste inherent in large‑language‑model (LLM) training. By recognizing that parallelism and model‑parallel architectures create duplicated tensors across workers, the authors propose a system that dynamically trims...

FAST '26 - Rearchitecting Buffered I/O in the Era of High-Bandwidth SSDs
The presentation, delivered by Chao of Huazhong University of Science and Technology, tackles the growing mismatch between buffered I/O architectures and today’s ultra‑high‑bandwidth SSDs. Over the past 15 years, SSD throughput has leapt from roughly 500 MB/s to 28 GB/s, a 56‑fold increase, rendering the...

OSDI '20 - AGAMOTTO: How Persistent Is Your Persistent Memory Application?
The presentation introduced AGAMOTTO, a symbolic‑execution framework designed to automatically uncover persistency bugs in applications that use Intel’s emerging persistent‑memory (PM) technology. By mapping PM directly into a process’s address space, developers can avoid file‑system overhead, but they must also...