Digital Design & Computer Architecture D10: Problem-Solving Session 10 (Spring 2026)

Onur Mutlu Lectures
May 11, 2026

Why It Matters

Understanding VLIW’s dependency‑driven bundling and systolic arrays’ regular‑pattern constraints helps engineers design compilers and workloads that fully exploit parallel hardware, directly impacting performance and energy efficiency.

Key Takeaways

  • VLIW relies on the compiler to bundle independent instructions for parallel execution.
  • Ideal IPC equals bundle width, but stalls reduce actual throughput.
  • Systolic arrays excel with regular, repetitive data‑flow patterns.
  • The scheduling exercise shows poor functional-unit utilization caused by instruction dependencies.
  • Total execution cycles grow linearly with the loop iteration count n, so the per-iteration schedule length is the bottleneck.

Summary

The session opened with a dual focus on VLIW architectures and systolic‑array designs, providing a quick theoretical refresher before diving into hands‑on exercises. The instructor emphasized that VLIW’s performance hinges on the compiler’s ability to group independent instructions into bundles, allowing the hardware to execute them in lockstep without runtime dependency checks. The ideal instructions‑per‑cycle (IPC) equals the bundle width (two for a two‑wide bundle), but a stall in any one instruction forces the entire bundle to wait, which lowers the achieved IPC.
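The lockstep effect on IPC can be illustrated with a small model. This is a minimal sketch, not material from the session: the function name `lockstep_ipc` and the representation of a bundle as a list of per-operation stall cycles are assumptions made for illustration, and each bundle is assumed to take one base cycle plus the longest stall among its operations.

```python
def lockstep_ipc(bundles):
    """Estimate IPC for a lockstep VLIW machine (illustrative model).

    bundles: list of bundles; each bundle is a list with one entry per
    useful operation, giving that operation's extra stall cycles.
    Assumption: a bundle occupies 1 cycle plus the longest stall in it,
    because every operation in the bundle waits for the slowest one.
    """
    if not bundles:
        raise ValueError("need at least one bundle")
    total_ops = sum(len(bundle) for bundle in bundles)
    total_cycles = sum(1 + max(bundle, default=0) for bundle in bundles)
    return total_ops / total_cycles


# Two full two-wide bundles with no stalls reach the ideal IPC.
ideal = lockstep_ipc([[0, 0], [0, 0]])

# One operation stalling for 3 cycles holds back its whole bundle.
stalled = lockstep_ipc([[0, 3], [0, 0]])
```

In this model the stall-free case executes 4 operations in 2 cycles, while the stalled case needs 5 cycles for the same 4 operations, so a single slow operation pulls the IPC of the whole instruction stream below 1.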

Key insights included the contrast between VLIW’s static scheduling and traditional out‑of‑order execution, as well as the core principles of systolic arrays: multiple processing elements transform data streams in a regular, weight‑driven pattern. The lecturer illustrated how systolic designs thrive on uniform compute and memory access patterns, yet become inefficient when faced with irregular code structures. The example of a multiply‑accumulate pipeline highlighted the need for tightly coupled data flow to achieve high throughput.
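The regular data flow of a multiply‑accumulate systolic array can be sketched in software. The following is an illustrative simulation under stated assumptions, not the design shown in the session: it assumes an output‑stationary 2‑D array in which PE(i, j) accumulates one element of the product, rows of A enter from the left skewed by one cycle per row, and columns of B enter from the top skewed by one cycle per column. The function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(A, B):
    """Simulate an output-stationary systolic array computing A @ B.

    Assumptions (illustrative): A is n x k, B is k x m, PE(i, j) holds
    the running sum for C[i][j], A values move one PE to the right per
    cycle, and B values move one PE down per cycle.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # accumulator inside each PE
    a_reg = [[0] * m for _ in range(n)]  # A operand held by each PE
    b_reg = [[0] * m for _ in range(n)]  # B operand held by each PE

    for t in range(n + m + k - 2):       # cycles until the last operand pair meets
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:
                    # Left edge: row i of A is injected with a skew of i cycles.
                    idx = t - i
                    new_a[i][j] = A[i][idx] if 0 <= idx < k else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]   # take from left neighbour
                if i == 0:
                    # Top edge: column j of B is injected with a skew of j cycles.
                    idx = t - j
                    new_b[i][j] = B[idx][j] if 0 <= idx < k else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]   # take from upper neighbour
                acc[i][j] += new_a[i][j] * new_b[i][j]  # one MAC per PE per cycle
        a_reg, b_reg = new_a, new_b
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Every PE performs the same multiply‑accumulate each cycle and only exchanges data with its neighbours, which is why the structure suits uniform computations such as matrix multiplication and maps poorly onto irregular control flow.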

During the exercise, students mapped a seven‑instruction loop (three loads, an add, a multiply, a store, and a branch) onto a VLIW processor with three load units and one unit each for store, add, multiply, and branch. The optimal schedule packed the three independent loads into a single bundle, followed by separate bundles for the multiply, add, store, and branch, leaving most slots filled with no‑ops. The resulting ratio of useful operations to bundles was 7/5 (1.4 operations per bundle), and total execution cycles grew linearly with the loop counter n, exposing the under‑utilization caused by instruction dependencies.
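The shape of this schedule can be reproduced with a greedy list scheduler. The sketch below is illustrative only: the exact instructions of the exercise are not given here, so the instruction names and the dependency edges (multiply after two loads, add after the multiply and the third load, store after the add, branch after the store) are assumptions chosen to match the five‑bundle schedule described above. It also assumes single‑cycle latency for every unit.

```python
def schedule(instrs, units):
    """Greedy list scheduling into VLIW bundles (illustrative).

    instrs: list of (name, unit_kind, deps) in program order.
    units:  dict mapping unit_kind -> number of such units per bundle.
    Assumption: every operation completes in one cycle, so a consumer
    can issue in the bundle right after its producers.
    """
    done_cycle = {}            # name -> cycle in which it was issued
    remaining = list(instrs)
    bundles = []
    while remaining:
        cycle = len(bundles)
        free = dict(units)     # unit slots still available in this bundle
        bundle = []
        for name, kind, deps in remaining:
            ready = all(d in done_cycle and done_cycle[d] < cycle for d in deps)
            if ready and free.get(kind, 0) > 0:
                free[kind] -= 1
                bundle.append(name)
        if not bundle:
            raise ValueError("no instruction can issue: bad deps or missing unit")
        for name in bundle:
            done_cycle[name] = cycle
        remaining = [ins for ins in remaining if ins[0] not in done_cycle]
        bundles.append(bundle)
    return bundles


# Hypothetical loop body; names and dependency edges are assumptions.
LOOP = [
    ("ld1", "load", []),
    ("ld2", "load", []),
    ("ld3", "load", []),
    ("mul", "mul", ["ld1", "ld2"]),
    ("add", "add", ["mul", "ld3"]),
    ("st", "store", ["add"]),
    ("br", "branch", ["st"]),
]
UNITS = {"load": 3, "store": 1, "add": 1, "mul": 1, "branch": 1}

bundles = schedule(LOOP, UNITS)
ops_per_bundle = len(LOOP) / len(bundles)
```

Under these assumptions the scheduler emits one bundle with the three loads and four single‑operation bundles, giving 7 operations in 5 bundles. If iterations are not overlapped, n iterations would occupy 5n bundles, consistent with the linear growth noted above.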

The discussion underscored that VLIW efficiency is a compiler‑driven problem: without careful dependency analysis, hardware resources remain idle. Likewise, systolic arrays demand algorithmic regularity, limiting their applicability. These lessons inform hardware designers and software engineers about the trade‑offs between static scheduling simplicity and dynamic execution flexibility, guiding future processor and accelerator architectures.

Original Description

Digital Design and Computer Architecture, ETH Zürich, Spring 2026 (https://safari.ethz.ch/ddca/spring2026/)
D10: Problem-Solving Session 10
Lecturer: Prof. Onur Mutlu
Date: 11 May 2026
