P&S: Arch. & Algo. For Health & Life Sciences- L3: Storage Centric (Meta)Genomics I (Spr 2026)

Onur Mutlu Lectures
Mar 17, 2026

Why It Matters

Embedding genomics filters in storage cuts data‑movement costs and accelerates analysis, enabling faster, cheaper insights for precision medicine and public‑health applications.

Key Takeaways

  • Storage-centric filters reduce data movement in genomics pipelines.
  • GenStore identifies exact‑match and non‑match reads inside SSDs.
  • Read‑size k‑mers and sorted indexes enable sequential access.
  • In-storage filtering significantly improves performance, energy efficiency, and cost.
  • Approach adapts to varied read lengths and genetic variation.
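
The exact-match takeaway can be illustrated with a minimal Python sketch. It assumes all reads have the same length and that both the read-size k-mers of the reference and the reads are sorted, so a single forward merge replaces per-read random lookups. The function name `exact_match_filter` and the toy data are hypothetical; this is not GenStore's implementation, which runs inside the SSD on a prebuilt index, and details such as reverse complements are omitted.

```python
def exact_match_filter(reads, reference, read_len):
    """Split reads into (exact_matches, remaining) using one sequential merge.

    Illustrative assumption: every read has length `read_len`.
    """
    # Read-size k-mers: every reference substring as long as a read.
    # A real system would build and sort this index once, offline.
    ref_kmers = sorted(
        {reference[i:i + read_len] for i in range(len(reference) - read_len + 1)}
    )
    sorted_reads = sorted(reads)

    matched, remaining = [], []
    j = 0  # cursor into the sorted k-mer index; it only ever moves forward
    for read in sorted_reads:
        # Skip index entries that sort before the current read.
        while j < len(ref_kmers) and ref_kmers[j] < read:
            j += 1
        if j < len(ref_kmers) and ref_kmers[j] == read:
            matched.append(read)    # exact match: no alignment needed
        else:
            remaining.append(read)  # must be sent on for further analysis
    return matched, remaining


if __name__ == "__main__":
    reference = "ACGTACGGTC"
    reads = ["GTAC", "ACGA", "CGGT", "TTTT", "GTAC"]
    print(exact_match_filter(reads, reference, read_len=4))
```

Because both inputs are sorted, each is scanned once from start to end, which is the access pattern that suits flash storage.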

Summary

The lecture introduces storage-centric architectures for genomics and metagenomics, focusing on how embedding filtering logic directly inside storage devices can alleviate the data-movement and data-preparation bottlenecks that dominate current pipelines. By moving simple, low-cost operations (such as exact-match detection and non-match elimination) into SSDs, systems like GenStore aim to send only the reads that truly require expensive alignment to downstream CPUs or accelerators. Key insights include the use of read-size k-mers, which collapse multiple index lookups into a single operation, and the sorting of both the k-mers and the read tables, which turns random accesses into sequential scans. Experiments show that an ideal in-storage filter can dramatically cut both computation and I/O overhead, and that hardware accelerators shift the bottleneck from compute to I/O, underscoring the value of storage-side processing.

The presenter highlights concrete examples: GenStore-EM filters out exact-matching reads, while GenStore-NM discards reads with no viable alignment. By relying on a single index lookup per read and simple comparison logic, the design achieves high throughput with minimal DRAM and flash resources. Real-world case studies on human and microbial datasets demonstrate performance gains and energy savings without significant hardware cost.

Overall, embedding genomics-specific filters in storage promises faster turnaround for precision medicine, outbreak monitoring, and agricultural research, while reducing operational expenses and power consumption. These factors become critical as sequencing data volumes surge past 100 TB and continue to grow.
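
The non-match filtering idea mentioned in the summary can be illustrated with a small Python sketch. The summary does not describe how GenStore-NM decides that a read has no viable alignment, so the criterion used here (counting how many of a read's short k-mers occur in the reference and discarding reads below a threshold) is purely an illustrative assumption. The names `non_match_filter`, `k`, and `min_hits` are hypothetical.

```python
def non_match_filter(reads, reference, k, min_hits):
    """Split reads into (kept, discarded) by counting shared k-mers.

    Assumed criterion (not taken from the lecture): a read with fewer than
    `min_hits` k-mers in common with the reference is treated as a non-match.
    """
    # Set of all k-mers in the reference; a real system would use a
    # prebuilt index instead of rebuilding this for every batch.
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}

    kept, discarded = [], []
    for read in reads:
        # Count the read's k-mers that also occur in the reference.
        hits = sum(
            1 for i in range(len(read) - k + 1) if read[i:i + k] in ref_kmers
        )
        if hits >= min_hits:
            kept.append(read)       # may align: pass on to the aligner
        else:
            discarded.append(read)  # too few shared seeds: filter out early
    return kept, discarded


if __name__ == "__main__":
    reference = "ACGTACGGTC"
    reads = ["ACGTAC", "TTTTTT", "ACGAAA"]
    print(non_match_filter(reads, reference, k=3, min_hits=2))
```

The threshold trades filtering rate against the risk of dropping reads that would have aligned, so in practice it would need to be chosen conservatively.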

Original Description

Project & Seminar (P&S), ETH Zürich, Spring 2026
Architectures & Algorithms for Health & Life Sciences (https://safari.ethz.ch/projects_and_seminars/spring2026/doku.php?id=archforhealt
Lecture 3: Storage Centric Designs for Genomics and Metagenomics I
Lecturer: Nika Mansouri Ghiasi
Date: March 18, 2026
Recommended Reading:
====================
A Modern Primer on Processing in Memory
Memory-Centric Computing: Solving Computing's Memory Problem
Memory-Centric Computing: Recent Advances in Processing-in-DRAM
Intelligent Architectures for Intelligent Computing Systems
RowHammer: A Retrospective
Fundamentally Understanding and Solving RowHammer
Accelerating Genome Analysis via Algorithm-Architecture Co-Design
From Molecules to Genomic Variations: Accelerating Genome Analysis via Intelligent Algorithms and Architectures
