FAST '26 - Rearchitecting Buffered I/O in the Era of High-Bandwidth SSDs
Why It Matters
By eliminating page‑cache bottlenecks, WS‑Buffer enables enterprise storage to fully exploit high‑bandwidth SSDs, delivering faster, more cost‑effective data services.
Key Takeaways
- SSD bandwidth grew 56×, challenging traditional buffered I/O.
- Page cache management overhead limits buffered I/O throughput.
- Partial-page writes incur 1.15–64× higher latency than full pages.
- WS‑Buffer redesign cuts memory use and eliminates read‑before‑write.
- WS‑Buffer delivers up to 6.3× latency improvement and 4.5× throughput gains.
Summary
The presentation, delivered by Chao of Hajing University of Science and Technology, tackles the growing mismatch between buffered I/O architectures and today’s ultra‑high‑bandwidth SSDs. Over the past 15 years, SSD throughput has leapt from roughly 500 MB/s to 28 GB/s, a 56‑fold increase that leaves the legacy page‑cache‑centric buffered I/O model increasingly inefficient.
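To make the scale of that jump concrete, the rough calculation below estimates how much time the page cache has to handle each page if it is to keep pace with the device. It is only a back-of-the-envelope sketch: the 4 KiB page size and decimal units (1 GB = 10^9 bytes) are assumptions, and the function name is illustrative rather than taken from the talk.

```python
PAGE_SIZE = 4096  # bytes; assumed 4 KiB page, not stated in the talk


def per_page_budget_ns(bandwidth_bytes_per_s: float,
                       page_size: int = PAGE_SIZE) -> float:
    """Nanoseconds available per page if the cache must match device bandwidth."""
    return page_size / bandwidth_bytes_per_s * 1e9


old_bw = 500e6  # roughly 500 MB/s, the older figure quoted above
new_bw = 28e9   # roughly 28 GB/s, the current figure quoted above

ratio = new_bw / old_bw                  # the 56-fold increase
old_budget = per_page_budget_ns(old_bw)  # about 8192 ns per page
new_budget = per_page_budget_ns(new_bw)  # about 146 ns per page

print(f"bandwidth ratio: {ratio:.0f}x")
print(f"per-page budget: {old_budget:.0f} ns -> {new_budget:.0f} ns")
```

Under these assumptions the per-page budget shrinks from several microseconds to well under a fifth of a microsecond, which is why per-page allocation and bookkeeping costs that were once negligible can come to dominate.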
The authors identify three core bottlenecks: costly page‑cache allocation and state‑maintenance, excessive memory consumption to sustain write throughput, and the read‑before‑write penalty that inflates latency for partial‑page writes by up to 64×. Experiments comparing buffered I/O, direct I/O, and hybrid schemes on eight PCIe 4.0 Samsung drives show that conventional buffered I/O lags direct I/O by 1.1‑4.5×, especially under write‑intensive workloads.
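The read‑before‑write penalty follows from page granularity: a write that covers only part of a page forces the cache to fetch the rest of that page from the device first. The sketch below identifies which pages of a write are only partially covered. It is a simplified illustration of general page‑cache behavior, not the authors' code; it assumes 4 KiB pages, assumes none of the touched pages are already cached, and ignores end‑of‑file handling. The function name is hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumed 4 KiB page


def pages_needing_read(offset: int, length: int,
                       page_size: int = PAGE_SIZE) -> list[int]:
    """Indices of pages only partially covered by the write [offset, offset + length).

    Simplification: every partially covered page is assumed to be absent
    from the cache, so each one would need a read before the write.
    """
    if length <= 0:
        return []
    end = offset + length
    first = offset // page_size
    last = (end - 1) // page_size
    partial = []
    # The first page is partial if the write starts mid-page, or if the
    # whole write fits in one page and stops before the page ends.
    if offset % page_size != 0 or (first == last and end % page_size != 0):
        partial.append(first)
    # The last page (when distinct from the first) is partial if the write
    # stops before the page boundary.
    if last != first and end % page_size != 0:
        partial.append(last)
    return partial


print(pages_needing_read(0, 4096))     # aligned full-page write
print(pages_needing_read(100, 200))    # small write inside page 0
print(pages_needing_read(2048, 8192))  # unaligned write spanning three pages
```

Only the head and tail pages of a write can ever be partial, so the penalty is at most two extra reads per write; for small writes those reads can nonetheless outweigh the write itself.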
To address these issues, the team proposes WS‑Buffer, a rearchitected buffered I/O layer that introduces a lightweight scratch buffer, a two‑stage “OT‑Flash” mechanism, and contention‑aware page management. Benchmarks on XFS/Linux 6.8 reveal up to 6.3× lower write latency for full‑page writes and up to 2.18× improvement for partial writes, with memory usage cut by as much as 99.6% and CPU utilization reduced by 28.4%.
The findings suggest that storage stacks can retain the usability of buffered I/O while unlocking the full potential of modern SSDs. Data‑center operators and file‑system developers are urged to consider WS‑Buffer‑style designs to achieve higher throughput, lower latency, and better resource efficiency as SSD bandwidth continues to close the gap with DRAM.