How Might LLMs Store Facts | Deep Learning Chapter 7

Grant Sanderson • August 31, 2024

Why It Matters

If factual storage is indeed concentrated in the MLP blocks, knowing this aids interpretability and targeted model editing, with implications for debiasing, correcting misinformation, and improving model safety and reliability.

Summary

The video explains how factual knowledge in transformer language models may be stored primarily in the feedforward multilayer perceptron (MLP) blocks rather than in the attention layers. Using a toy example, how the fact “Michael Jordan plays basketball” could be encoded, the presenter shows that directions in the high-dimensional embedding space can represent a first name, a last name, and concepts such as a sport, and that an MLP can map a vector encoding a person’s full name onto the direction for their sport using two matrix multiplications with a nonlinearity between them. The walkthrough emphasizes that the MLP acts on each token vector independently and in parallel (no cross-token communication), and that interpreting what these computations do is hard even though each step is simple. The discussion draws on recent DeepMind work and frames this as a partial, mechanistic explanation of where models “memorize” facts.
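
The toy mechanism in the summary can be made concrete with a minimal sketch. Everything in it is an assumption chosen for illustration: a 4-dimensional embedding space, perfectly orthogonal hand-picked directions for "Michael", "Jordan", "Phelps" and "basketball", a single hidden neuron, a bias of -1, and a residual add of the output onto the input. It is not the weights or layout of any real model, only one way two matrix multiplications and a ReLU could turn "first name Michael AND last name Jordan" into "basketball".

```python
# Toy sketch of an MLP block storing one fact. All names, sizes, and weights
# are hypothetical and hand-picked; nothing here is read from a real model.

# Assumed orthogonal unit directions in a 4-dimensional embedding space.
MICHAEL = [1.0, 0.0, 0.0, 0.0]
JORDAN = [0.0, 1.0, 0.0, 0.0]
PHELPS = [0.0, 0.0, 1.0, 0.0]
BASKETBALL = [0.0, 0.0, 0.0, 1.0]


def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(rows, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, v) for row in rows]


def relu(v):
    """The nonlinearity: clip negative entries to zero."""
    return [max(0.0, a) for a in v]


# Up-projection: one hidden neuron whose row is MICHAEL + JORDAN.
# With a bias of -1 it is positive only when BOTH names are present.
W_UP = [add(MICHAEL, JORDAN)]
B_UP = [-1.0]

# Down-projection: one column per hidden neuron; column 0 is BASKETBALL.
W_DOWN = [[component] for component in BASKETBALL]


def mlp_block(x):
    """Apply up-projection, ReLU, down-projection, then add back the input."""
    hidden = relu(add(matvec(W_UP, x), B_UP))
    return add(x, matvec(W_DOWN, hidden))


# Each token vector is processed on its own; no token sees another.
tokens = [add(MICHAEL, JORDAN), add(MICHAEL, PHELPS), JORDAN]
outputs = [mlp_block(token) for token in tokens]
basketball_scores = [dot(out, BASKETBALL) for out in outputs]
print(basketball_scores)  # [1.0, 0.0, 0.0]
```

Only the vector that contains both name directions picks up a basketball component. "Michael Phelps" and a bare "Jordan" leave the neuron at zero, which is the AND-like behavior the bias and ReLU are there to produce in this sketch.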

Original Description

Unpacking the multilayer perceptrons in a transformer, and how they may store facts
Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support
An equally valuable form of support is to share the videos.
AI Alignment Forum post from the DeepMind researchers referenced at the video's start:
https://www.alignmentforum.org/posts/iGuwZTHWb6DFY3sKB/fact-finding-attempting-to-reverse-engineer-factual-recall
Anthropic posts about superposition referenced near the end:
https://transformer-circuits.pub/2022/toy_model/index.html
https://transformer-circuits.pub/2023/monosemantic-features
Some added resources for those interested in learning more about mechanistic interpretability, offered by Neel Nanda
Mechanistic interpretability paper reading list
https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite
Getting started in mechanistic interpretability
https://www.neelnanda.io/mechanistic-interpretability/getting-started
An interactive demo of sparse autoencoders (made by Neuronpedia)
https://www.neuronpedia.org/gemma-scope#main
Coding tutorials for mechanistic interpretability (made by ARENA)
https://arena3-chapter1-transformer-interp.streamlit.app/
Russian audio track: Vlad Burmistrov.
Sections:
0:00 - Where facts in LLMs live
2:15 - Quick refresher on transformers
4:39 - Assumptions for our toy example
6:07 - Inside a multilayer perceptron
15:38 - Counting parameters
17:04 - Superposition
21:37 - Up next
------------------
These animations are largely made using a custom Python library, manim. See the FAQ comments here:
https://3b1b.co/faq#manim
https://github.com/3b1b/manim
https://github.com/ManimCommunity/manim/
All code for specific videos is visible here:
https://github.com/3b1b/videos/
The music is by Vincent Rubinetti.
https://www.vincentrubinetti.com
https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown
https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u
------------------
3blue1brown is a channel about animating math, in all senses of the word animate. If you're reading the bottom of a video description, I'm guessing you're more interested than the average viewer in lessons here. It would mean a lot to me if you chose to stay up to date on new ones, either by subscribing here on YouTube or otherwise following on whichever platform below you check most regularly.
Mailing list: https://3blue1brown.substack.com
Twitter: https://twitter.com/3blue1brown
Instagram: https://www.instagram.com/3blue1brown
Reddit: https://www.reddit.com/r/3blue1brown
Facebook: https://www.facebook.com/3blue1brown
Patreon: https://patreon.com/3blue1brown
Website: https://www.3blue1brown.com