
The penultimate lecture of the Spring 2022 Digital Design & Computer Architecture series focuses on prefetching: proactively loading data into a cache or register before the processor demands it. The instructor emphasizes that prefetching is one of the most impactful techniques for alleviating memory-hierarchy bottlenecks, especially as systems adopt deeper, heterogeneous memory stacks. Done well, prefetching reduces both cache-miss rates and miss latency by bringing blocks into the appropriate cache level ahead of time. The design space spans every level from L1 to main memory and even remote nodes, raising coordination challenges across levels. The lecture also covers software-driven prefetching (e.g., browsers preloading likely links) alongside hardware mechanisms, noting that aggressive strategies can introduce inter-core interference and bandwidth pressure. Concrete examples include a web browser predicting user clicks, prefetches issued to remote memory nodes, and the subtle interplay with cache-coherence protocols such as MESI: a prefetch triggered by one core can evict useful data for another, underscoring the need for intelligent prediction and throttling. For system architects the takeaway is clear: effective prefetching can unlock significant performance gains, but only if designers carefully balance prediction accuracy, placement, and coherence overhead. As multicore systems and emerging memory technologies (e.g., PCM, FeRAM) become mainstream, refined prefetch strategies will be essential to sustain scaling and latency targets.
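The software-prefetching idea described above can be sketched in a few lines of C. This is a minimal illustration, not code from the lecture: it uses the GCC/Clang `__builtin_prefetch` intrinsic inside a strided loop, and the prefetch distance is an assumed tuning parameter that in practice must be chosen to match memory latency and loop iteration time.

```c
#include <stddef.h>

/* Illustrative sketch of software prefetching: request the block that
 * will be needed PREFETCH_DISTANCE iterations from now, so it arrives
 * in cache before the demand access. The distance of 16 is an assumed
 * example value, not a recommendation from the lecture. */
#define PREFETCH_DISTANCE 16

long sum_with_prefetch(const long *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* args: address, rw (0 = read), temporal locality (3 = high) */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}
```

A prefetch issued too early can evict data still in use (the cross-core interference problem the lecture warns about); issued too late, it hides no latency. That timing trade-off is exactly why the instructor stresses throttling and prediction accuracy.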

Professor Onur Mutlu outlined the case for memory-centric computing, arguing that modern workloads—especially machine learning and genomics—generate far more data than current systems can efficiently process. He highlighted trends like wafer-scale processor designs and high-bandwidth memory attachments as steps toward...

The lecture examines cache design challenges in multicore and multithreaded systems, highlighting trade-offs between private and shared caches. Shared caches improve utilization, reduce data replication and communication latency, and align with shared-memory programming, while private caches avoid contention and offer...

In this lecture on advanced caches the instructor reviews memory hierarchy principles and current extensions, including remote memory and memory-blade architectures used to support data‑intensive applications. He revisits basic cache designs (direct‑mapped, set‑associative, fully associative), explaining how associativity trades off...
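The address breakdown behind the cache designs mentioned above can be made concrete with a short sketch. The cache geometry here (64-byte blocks, 128 sets) is an assumed example configuration, not one given in the lecture; it shows how a set-associative cache splits an address into block offset, set index, and tag fields.

```c
#include <stdint.h>

/* Assumed example geometry: 64-byte blocks, 128 sets.
 * OFFSET_BITS = log2(BLOCK_SIZE), INDEX_BITS = log2(NUM_SETS). */
#define BLOCK_SIZE  64
#define NUM_SETS    128
#define OFFSET_BITS 6
#define INDEX_BITS  7

/* Byte position within the cache block. */
uint64_t block_offset(uint64_t addr) { return addr & (BLOCK_SIZE - 1); }

/* Which set the block maps to; associativity decides how many
 * blocks share this set. */
uint64_t set_index(uint64_t addr) {
    return (addr >> OFFSET_BITS) & (NUM_SETS - 1);
}

/* Remaining high bits, stored and compared on lookup. */
uint64_t tag(uint64_t addr) { return addr >> (OFFSET_BITS + INDEX_BITS); }
```

A direct-mapped cache is the special case of one block per set; a fully associative cache has a single set, so the index field disappears entirely and every lookup compares tags across all blocks.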

In a recorded talk for the Montenegro Academy of Sciences, the speaker outlined the urgent need to accelerate genome analysis, concentrating on the read-mapping bottleneck that impedes turning high-throughput sequencing outputs into actionable genomic insight. He traced advances in sequencing—especially...

The lecture is an introductory session to digital design and computer architecture, framing the course as a ground-up exploration of how computers are built—starting from CMOS transistors as the fundamental switching element and progressing to logic, arithmetic, memory, and whole...

The lecture reviewed fundamentals of memory organization and the design of memory hierarchies and caches, emphasizing why SRAM is used for on-chip caches while DRAM serves as main memory due to differing fabrication and capacitor requirements. It surveyed memory technologies...

In a workshop on memory-centric computing, the speaker argued that modern computing is bottlenecked by data movement rather than raw compute, urging co-design of hardware and software to keep memory and compute tightly coupled. He highlighted neural networks and genome...

In this lecture the professor shifts focus from processors to memory, arguing that memory and storage are the dominant bottlenecks in modern computing. He frames the discussion with Amdahl’s law to show why accelerating computation alone yields limited system speedups...
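The Amdahl's-law framing above can be made concrete with a short sketch (the numeric examples are illustrative, not the professor's): if a fraction p of execution time is accelerated by factor s, overall speedup is 1 / ((1 - p) + p/s), so speeding up only computation leaves the memory-bound fraction as a hard ceiling.

```c
/* Amdahl's law: overall speedup when a fraction p of execution time
 * is accelerated by factor s. Even s -> infinity caps speedup at
 * 1 / (1 - p), which is the lecture's point about memory bottlenecks. */
double amdahl_speedup(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}
```

For example, if compute is half the execution time and memory the other half, even an infinitely fast processor yields at most a 2x system speedup; a 10x speedup on 90% of the time yields only about 5.26x overall.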

The tutorial frames memory-centric computing as a response to rapidly growing data demands that are outpacing traditional compute-centric architectures. The speaker highlights that modern workloads—large neural networks, databases, graph analytics, and mobile applications—are increasingly bottlenecked by memory bandwidth, capacity, and...