Understanding & Designing Modern Storage Systems - L2: Basics of NAND Flash-Based SSDs (Spring 2026)
Why It Matters
Grasping SSD internals enables engineers to design systems that maximize performance and lifespan, while informing procurement decisions for data‑intensive enterprises.
Key Takeaways
- SSD controllers embed multiple cores to handle requests in parallel
- A DRAM cache buffers writes, reducing latency and wear on flash cells
- Logical-to-physical mapping enables out-of-place writes and wear leveling
- Garbage collection reclaims invalid pages, and data refresh mitigates retention errors
- The NVMe interface offers far more parallelism than the legacy SATA protocol
Summary
The lecture walks through the architecture of a modern NAND flash-based SSD, starting with a high-level view of the SSD PCB, which houses multiple flash packages, a low-power DRAM cache, and a multi-core controller. It explains how the host interface layer receives I/O commands over SATA or NVMe, and how the flash translation layer (FTL) orchestrates data caching, address translation, garbage collection, wear leveling, and data refresh.

Key technical insights include the role of the DRAM write buffer, which aggregates writes to avoid frequent program-erase (P/E) cycles. The buffer is kept small because power-loss protection dictates its capacity: whatever it holds must still be saved if power fails. Logical-to-physical mapping is kept at 4 KB page granularity with roughly 4 bytes per entry, so the mapping table takes about 0.1 % of SSD capacity (4 bytes for every 4 KB, or roughly 1 GB of mapping data for a 1 TB drive).

The FTL's out-of-place write policy, combined with garbage collection and block-level wear leveling, spreads P/E cycles evenly across blocks and prolongs device lifespan. The instructor also highlights how hot data is steered to fresh blocks while aged blocks store cold data. Illustrative examples feature the Samsung BM99‑53T TLC SSD, which integrates eight 128 GB NAND packages for a 1 TB capacity, and the use of ECC (≈72 parity bits per kilobyte) to correct bit errors such as those caused by retention loss, together with data randomization to counteract pattern-dependent errors.

For practitioners, understanding these mechanisms is crucial for optimizing SSD performance, reliability, and endurance in data-center and consumer workloads. The lecture underscores that NVMe's parallel command queues deliver much higher throughput than SATA, and that FTL design directly affects latency, wear, and overall cost of ownership.
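
The interplay between out-of-place writes, the logical-to-physical map, and garbage collection can be illustrated with a toy model. The sketch below is not the lecture's code or any vendor's firmware: the class `ToyFTL`, the 4-page block size, the greedy victim policy, and the one-spare-block rule are all assumptions chosen to keep the example small. It tracks only which logical page number (LPN) each physical page holds, not the data itself.

```python
PAGES_PER_BLOCK = 4  # toy value (assumption); real blocks hold far more pages


class ToyFTL:
    """Hypothetical page-level FTL: out-of-place writes plus greedy GC.

    Assumes at least 2 blocks and fewer logical pages than physical pages.
    """

    def __init__(self, num_blocks):
        # pages[b][p] is the LPN stored in that physical page, or None if the
        # page is erased or its contents have been invalidated.
        self.pages = [[None] * PAGES_PER_BLOCK for _ in range(num_blocks)]
        self.written = [0] * num_blocks      # pages programmed since last erase
        self.erase_count = [0] * num_blocks  # P/E cycles per block
        self.l2p = {}                        # LPN -> (block, page)
        self.free = list(range(num_blocks))  # erased blocks
        self.relocated = 0                   # valid pages copied by GC
        self.active = self._take_free_block()

    def _take_free_block(self):
        # Wear-aware allocation: prefer the erased block with the fewest erases.
        block = min(self.free, key=lambda b: self.erase_count[b])
        self.free.remove(block)
        return block

    def _valid(self, block):
        return sum(lpn is not None for lpn in self.pages[block])

    def _place(self, lpn):
        # Program the next unwritten page of the active block, update the map.
        page = self.written[self.active]
        self.pages[self.active][page] = lpn
        self.written[self.active] += 1
        self.l2p[lpn] = (self.active, page)

    def _garbage_collect(self):
        # Greedy policy: pick the full block with the fewest valid pages,
        # breaking ties toward the less-worn block.
        full = [b for b in range(len(self.pages))
                if b != self.active and b not in self.free]
        victim = min(full, key=lambda b: (self._valid(b), self.erase_count[b]))
        survivors = [lpn for lpn in self.pages[victim] if lpn is not None]
        if len(survivors) == PAGES_PER_BLOCK:
            raise RuntimeError("no invalid pages to reclaim")
        for lpn in survivors:                # copy valid pages forward
            self._place(lpn)
            self.relocated += 1
        self.pages[victim] = [None] * PAGES_PER_BLOCK  # block erase
        self.written[victim] = 0
        self.erase_count[victim] += 1
        self.free.append(victim)

    def write(self, lpn):
        # Out-of-place update: invalidate the old copy, never overwrite it.
        if lpn in self.l2p:
            old_block, old_page = self.l2p[lpn]
            self.pages[old_block][old_page] = None
        if self.written[self.active] == PAGES_PER_BLOCK:
            self.active = self._take_free_block()
            if not self.free:                # last spare block is now in use
                self._garbage_collect()
        self._place(lpn)


# Usage: fill 8 logical pages once, then keep overwriting two "hot" pages.
ftl = ToyFTL(num_blocks=4)
for i in range(200):
    ftl.write(i if i < 8 else i % 2)
print("erase counts:", ftl.erase_count, "relocated pages:", ftl.relocated)
```

In this skewed workload the erases pile up on the blocks that hold the hot pages while the blocks holding cold data are left alone, which is the kind of imbalance that wear leveling and hot/cold data steering are meant to correct. A real FTL would add those mechanisms, along with a write buffer, ECC, and data refresh, all of which this sketch omits.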