Reducing data movement by colocating memory and compute can cut latency, energy use, and cost in critical, data-heavy domains, enabling faster machine-learning training and real-time genomic analysis that can directly affect clinical outcomes. Shifting to memory-centric designs will be key to scaling next-generation workloads and avoiding prohibitive infrastructure and energy bottlenecks.
The tutorial frames memory-centric computing as a response to rapidly growing data demands that are outpacing traditional compute-centric architectures. The speaker highlights that modern workloads, including large neural networks, databases, graph analytics, and mobile applications, are increasingly bottlenecked by memory bandwidth, capacity, and data movement rather than raw compute. He reviews approaches such as processing-in-memory, processing-in-storage, and wafer-scale integration (citing Cerebras' large on-chip SRAM) as ways to keep compute and data colocated, and uses genomics as a concrete example: sequencing throughput has exploded, yet analysis remains constrained by data transfer and processing. The talk argues for architectural change to reduce off-chip data movement and accelerate time-sensitive, data-intensive applications.
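The memory-vs-compute bottleneck the talk describes is often reasoned about with a roofline-style comparison: a kernel is memory-bound when its arithmetic intensity (operations per byte of off-chip traffic) falls below the machine balance (peak throughput divided by peak memory bandwidth). The sketch below is illustrative only; the function name and all hardware numbers are assumptions, not figures from the talk.

```python
# Hypothetical roofline-style check: classify a kernel as compute-bound
# or memory-bound. All names and numbers here are illustrative
# assumptions, not measurements cited in the tutorial.

def bound_type(flops, bytes_moved, peak_flops, peak_bw):
    """Compare a kernel's arithmetic intensity (FLOPs per byte of
    off-chip traffic) to the machine balance (peak FLOP/s divided
    by peak memory bandwidth, also in FLOP/byte)."""
    intensity = flops / bytes_moved          # FLOP/byte
    machine_balance = peak_flops / peak_bw   # FLOP/byte
    return "compute-bound" if intensity >= machine_balance else "memory-bound"

# Example: a sparse/graph-style kernel doing ~0.25 FLOP per byte moved,
# on an assumed machine with 10 TFLOP/s peak and 1 TB/s DRAM bandwidth.
print(bound_type(flops=1e9, bytes_moved=4e9,
                 peak_flops=10e12, peak_bw=1e12))  # memory-bound
```

Under these assumed numbers the kernel's intensity (0.25 FLOP/byte) sits far below the machine balance (10 FLOP/byte), so adding compute cannot help; only reducing data movement, e.g. via processing-in-memory, raises performance.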