Digital Design & Comp. Arch: L20: GPU Architectures (Spring 2026)

•May 5, 2026

Onur Mutlu Lectures


Why It Matters

Understanding GPU architecture’s blend of SIMD, memory banking, and scalar cores helps engineers design systems that avoid bottlenecks and fully exploit parallelism, directly impacting performance and cost in AI and graphics markets.

Key Takeaways

• GPUs combine array and vector processing for flexible SIMD execution.
• Memory banking and avoiding bank conflicts are critical for GPU throughput.
• Modern GPUs integrate scalar cores to mitigate Amdahl’s serial bottleneck.
• Compiler auto‑vectorization enables massive pixel‑level parallelism in graphics.
• Historical SIMD extensions (MMX, AVX) paved the way for GPU acceleration.
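
To illustrate the bank-conflict takeaway, here is a minimal sketch. It assumes a simple word-interleaved mapping in which the bank index is the word address modulo the number of banks. That mapping, the function name `bank_conflict_degree`, and the 16-bank configuration are illustrative assumptions, not details taken from the lecture.

```python
from collections import Counter


def bank_conflict_degree(num_banks, stride, num_accesses):
    """Return the largest number of accesses that land in a single bank.

    Assumption: word-interleaved banking, bank = word_address % num_banks.
    A result of 1 means every access hits a different bank (no conflict);
    a result of k means the busiest bank must serve k accesses one after another.
    """
    hits = Counter((i * stride) % num_banks for i in range(num_accesses))
    return max(hits.values())


# 16 parallel accesses to a hypothetical 16-bank memory at several strides.
for stride in (1, 2, 16, 17):
    print(f"stride {stride}: worst-case accesses per bank =",
          bank_conflict_degree(16, stride, 16))
```

Under this assumed mapping, a stride equal to the bank count sends every access to the same bank, while a stride that shares no common factor with the bank count spreads the accesses evenly.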

Summary

The lecture focuses on modern GPU architectures, positioning them as flexible extensions of classic SIMD, array, and vector processors. Building on last week's SIMD fundamentals, the professor explains how GPUs blend parallelism in space and time: scalar instructions are dispatched across thousands of threads while the hardware retains vector-like throughput.

Key insights include the central role of memory banking in sustaining massive parallel loads, the need to avoid bank conflicts, and the integration of scalar execution units to address the serial bottleneck described by Amdahl's law. The discussion also highlights automatic code vectorization as a practical way to turn pixel-wise graphics loops into highly parallel GPU kernels.

Illustrative examples reference Bob Rau's seminal 35-year-old paper on VLIW architectures, Cray's early memory-banking strategy, and Intel's controversial MMX rollout, which ultimately seeded today's AVX extensions and GPU-friendly instruction sets. The professor underscores how compiler analysis determines whether a loop can be vectorized, using image-processing loops as a canonical case.

For practitioners, the lecture signals that future GPU designs must co-optimize the memory hierarchy, bank-conflict mitigation, and scalar performance. It also suggests that continued ISA evolution and compiler sophistication will be decisive in extending GPUs beyond graphics into broader AI and high-performance workloads.
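
The serial bottleneck mentioned in the summary can be made concrete with the standard form of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n is the number of parallel units. The sketch below is only an illustration; the function name `amdahl_speedup` and the sample values are assumptions, not figures from the lecture.

```python
def amdahl_speedup(parallel_fraction, num_units):
    """Speedup predicted by Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0.0 to 1.0).
    num_units: number of parallel execution units (at least 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_units)


# Hypothetical workload: 95% parallel, run on a growing number of units.
for units in (1, 10, 100, 1000):
    print(f"{units:>4} units -> speedup {amdahl_speedup(0.95, units):.2f}")
```

With 5% serial work the speedup can never reach 20, no matter how many parallel units are added. This is why a faster scalar path for the serial portion matters even on a massively parallel machine.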

Original Description

Digital Design and Computer Architecture, ETH Zürich, Spring 2026 (https://safari.ethz.ch/ddca/spring2026/)

Lecture 20: GPU Architectures

Lecturer: Prof. Onur Mutlu

Date: 7 May 2026

Slides (pptx): https://safari.ethz.ch/ddca/spring2026/lib/exe/fetch.php?media=onur-ddca-2026-lecture20-gpu-beforelecture.pptx

Slides (pdf): https://safari.ethz.ch/ddca/spring2026/lib/exe/fetch.php?media=onur-ddca-2026-lecture20-gpu-beforelecture.pdf

Recommended Reading:

====================

A Modern Primer on Processing in Memory

https://arxiv.org/pdf/2012.03112.pdf

Memory-Centric Computing: Solving Computing's Memory Problem

https://www.arxiv.org/pdf/2505.00458

Memory-Centric Computing: Recent Advances in Processing-in-DRAM

https://arxiv.org/pdf/2412.19275

Intelligent Architectures for Intelligent Computing Systems

https://people.inf.ethz.ch/omutlu/pub/intelligent-architectures-for-intelligent-computingsystems-invited_paper_DATE21.pdf

RowHammer: A Retrospective

https://people.inf.ethz.ch/omutlu/pub/RowHammer-Retrospective_ieee_tcad19.pdf

Fundamentally Understanding and Solving RowHammer

https://arxiv.org/pdf/2211.07613.pdf

Accelerating Genome Analysis via Algorithm-Architecture Co-Design

https://people.inf.ethz.ch/omutlu/pub/AcceleratingGenomeAnalysis_dac23.pdf

From Molecules to Genomic Variations: Accelerating Genome Analysis via Intelligent Algorithms and Architectures

https://people.inf.ethz.ch/omutlu/pub/IntelligentGenomeAnalysis_csbj22.pdf

RECOMMENDED LECTURE VIDEOS & PLAYLISTS:

========================================

Digital Design and Computer Architecture Spring 2025 Livestream Lectures Playlist:

https://www.youtube.com/watch?v=ubhxKNlOlRg&list=PL5Q2soXY2Zi9Eo29LMgKVcaydS7V1zZW3&index=3

Fundamentals of Computer Architecture Fall 2025 Livestream Lectures Playlist:

https://www.youtube.com/watch?v=uKgMFj1eQQc&list=PL5Q2soXY2Zi_ZMtqz1r-GHm-zzuE1QfIg&index=2

Seminar in Computer Architecture Spring 2025 Livestream Lectures Playlist:

https://www.youtube.com/watch?v=rqeKNZrLzng&list=PL5Q2soXY2Zi-oIW66TLOjtiqQxlDwNHng&index=2

Computer Architecture Fall 2024 Lectures Playlist:

https://www.youtube.com/watch?v=ziMRjDlLEwo&list=PL5Q2soXY2Zi-LfDdGgWyLcTSqzm6a26wD&index=2

Interview with Professor Onur Mutlu:

https://www.youtube.com/watch?v=8ffSEKZhmvo&list=PL5Q2soXY2Zi8VrmOTz44l2WupethSdh-M&index=9

TCuARCH meets Prof. Onur Mutlu

https://www.youtube.com/watch?v=6Hpn4SAX0dI

Arch. Mentoring Workshop @ISCA'21 - Doing Impactful Research

https://www.youtube.com/watch?v=83tlorht7Mc

The Story of RowHammer Lecture:

https://www.youtube.com/watch?v=sgd7PHQQ1AI&list=PL5Q2soXY2Zi8D_5MGV6EnXEJHnV2YFBJl&index=39

Accelerating Genome Analysis Lecture:

https://www.youtube.com/watch?v=r7sn41lH-4A&list=PL5Q2soXY2Zi8D_5MGV6EnXEJHnV2YFBJl&index=41

Memory-Centric Computing Systems Tutorial at IEDM 2021:

https://www.youtube.com/watch?v=H3sEaINPBOE&list=PL5Q2soXY2Zi8D_5MGV6EnXEJHnV2YFBJl&index=35

Intelligent Architectures for Intelligent Machines Lecture:

https://www.youtube.com/watch?v=GTieZPY4Wmc&list=PL5Q2soXY2Zi8D_5MGV6EnXEJHnV2YFBJl&index=38

Featured Lectures:

https://www.youtube.com/watch?v=jVYCchBGNVc&list=PL5Q2soXY2Zi8VrmOTz44l2WupethSdh-M&index=1
