Veteran Windows Dev Shows Off AI Running on 47-Year-Old PDP11 with 6 MHz CPU and 64KB of RAM — 'Gloriously Absurd' Project Runs Transformer Model Written in PDP-11 Assembly Language

Tom's Hardware · Apr 14, 2026

Why It Matters

The demo proves that the core learning algorithm behind today’s AI can run on extreme low‑resource hardware, highlighting efficiency as a competitive lever as compute costs rise. It also offers a tangible teaching tool to demystify neural‑network fundamentals for engineers and students.

Key Takeaways

  • Attention 11 transformer runs on 6 MHz PDP‑11 with 64 KB RAM
  • Model uses 1,216 parameters and 8‑bit fixed‑point math
  • Achieves 100% accuracy on digit‑reversal after 350 training steps
  • Demonstrates AI fundamentals can operate on ultra‑low‑resource hardware

Pulse Analysis

The PDP‑11 experiment is more than a nostalgic stunt; it underscores a timeless engineering principle: constraints drive innovation. By fitting a transformer into a 47‑year‑old minicomputer designed long before modern GPUs and gigabyte memories, Dave Plummer showcases how algorithmic efficiency can offset raw compute scarcity. The 1,216‑parameter network, stripped down to 8‑bit fixed‑point operations, performs the same essential arithmetic as today's massive models, reminding developers that the underlying mathematics remain unchanged regardless of hardware scale.
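To make the fixed‑point idea concrete, here is a minimal sketch (in Python for readability, not Plummer's actual PDP‑11 assembly) of signed 8‑bit fixed‑point arithmetic. The Q4.4 format chosen here — 4 integer bits, 4 fractional bits — is an assumption for illustration; the point is that multiplication becomes an integer multiply followed by a shift, which a 6 MHz CPU without a floating‑point unit can do cheaply.

```python
# Illustrative Q4.4 fixed-point arithmetic: real numbers are stored as
# signed 8-bit integers scaled by 2**FRAC_BITS, so all math is integer-only.

FRAC_BITS = 4  # assumed split: 4 integer bits, 4 fractional bits

def to_fixed(x: float) -> int:
    """Quantize a real number to a signed 8-bit Q4.4 value, saturating."""
    v = round(x * (1 << FRAC_BITS))
    return max(-128, min(127, v))

def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q4.4 values; the raw product is Q8.8, so shift back."""
    return (a * b) >> FRAC_BITS

def to_float(v: int) -> float:
    """Convert a Q4.4 value back to a real number (for inspection only)."""
    return v / (1 << FRAC_BITS)

a = to_fixed(1.5)    # stored as 24
b = to_fixed(2.25)   # stored as 36
print(to_float(fixed_mul(a, b)))  # 1.5 * 2.25 = 3.375, exactly representable
```

The trade‑off is precision and range: Q4.4 only spans roughly −8 to +8 in sixteenths, which is why quantized networks must keep their weights and activations tightly bounded.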

For educators and AI practitioners, the project offers a concrete, hands‑on illustration of how attention mechanisms function. Training the model to reverse an eight‑digit sequence forces the network to internalize a structural rule, mirroring the way large‑language models learn patterns from massive text corpora. Achieving perfect accuracy in just 350 steps demonstrates that even a modest parameter count can capture non‑trivial transformations when the training task is well‑defined, providing a valuable teaching example for courses on neural‑network fundamentals.
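The "structural rule" the network learns can be shown directly. The toy below (an illustration, not the Attention 11 code) hand‑sets attention scores so that each output position attends to the mirrored input position — the rule a trained model must discover for itself from examples:

```python
import math

def softmax(xs):
    """Standard numerically-stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def reverse_by_attention(digits):
    """Reverse a digit sequence with one hand-crafted attention pass:
    the query at position i scores highest against key position n-1-i."""
    n = len(digits)
    out = []
    for i in range(n):
        # A large score at the mirrored position makes the softmax
        # weight there close to 1 and everywhere else close to 0.
        scores = [10.0 if j == n - 1 - i else 0.0 for j in range(n)]
        weights = softmax(scores)
        # Attention output: weighted sum of the values (the digits).
        out.append(round(sum(w * d for w, d in zip(weights, digits))))
    return out

print(reverse_by_attention([3, 1, 4, 1, 5, 9, 2, 6]))  # [6, 2, 9, 5, 1, 4, 1, 3]
```

In the real model the scores come from learned query/key projections rather than being hard‑coded; training nudges those projections until they encode this mirroring pattern, which is why perfect accuracy is reachable with so few parameters.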

From a business perspective, the demonstration arrives as the industry grapples with soaring compute costs and environmental concerns. Companies that prioritize low‑power, highly optimized inference—whether on edge devices or specialized ASICs—stand to gain a strategic edge. Plummer’s work suggests that revisiting classic hardware philosophies, such as tight memory footprints and fixed‑point math, could yield substantial savings while maintaining model performance, a lesson increasingly relevant as AI deployment scales across diverse environments.
