AI

Coding a ChatGPT-Like Transformer From Scratch in PyTorch

Josh Starmer • July 1, 2024

Why It Matters

The tutorial demystifies the key building blocks of modern large language models, giving practitioners a practical, runnable blueprint for understanding and prototyping decoder-only Transformers. That lowers the barrier to experimentation for researchers and engineers working on generative NLP models.

Summary

In a hands‑on tutorial, StatQuest walks through building a decoder‑only Transformer (the architecture behind ChatGPT) from scratch in PyTorch and PyTorch Lightning. The video covers creating a minimal token vocabulary and dataset for two prompt–response pairs, mapping tokens to IDs, packaging inputs and labels into TensorDataset/DataLoader, and implementing embeddings and sinusoidal positional encodings. It then assembles the core Transformer components — attention, decoder blocks, and training loop — and shows how to precompute positional encodings and train the model end‑to‑end. The episode emphasizes readable code and links to a complete, downloadable example for follow‑along practice.
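The dataset step described above can be sketched in a few lines. The vocabulary, example sequences, and variable names below are illustrative assumptions rather than the video's exact code; only the overall pattern (a tiny token-to-ID map, next-token labels, and packaging into TensorDataset/DataLoader) follows the summary.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical minimal vocabulary for two prompt–response pairs.
token_to_id = {"what": 0, "is": 1, "statquest": 2, "awesome": 3, "<EOS>": 4}
id_to_token = {i: t for t, i in token_to_id.items()}

# Two training sequences, e.g. "what is statquest <EOS> awesome"
# and "statquest is what <EOS> awesome", mapped to IDs.
inputs = torch.tensor([
    [token_to_id["what"], token_to_id["is"], token_to_id["statquest"],
     token_to_id["<EOS>"], token_to_id["awesome"]],
    [token_to_id["statquest"], token_to_id["is"], token_to_id["what"],
     token_to_id["<EOS>"], token_to_id["awesome"]],
])

# Labels are the inputs shifted left by one token (next-token prediction).
labels = torch.tensor([
    [token_to_id["is"], token_to_id["statquest"], token_to_id["<EOS>"],
     token_to_id["awesome"], token_to_id["<EOS>"]],
    [token_to_id["is"], token_to_id["what"], token_to_id["<EOS>"],
     token_to_id["awesome"], token_to_id["<EOS>"]],
])

dataset = TensorDataset(inputs, labels)
dataloader = DataLoader(dataset)
```

With a dataset this small, each DataLoader batch is a single (input, label) pair of shape (1, 5), which keeps every tensor easy to inspect while learning the architecture.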

Original Description

In this StatQuest, we walk through the code required to build your own ChatGPT-like Transformer in PyTorch, one step at a time, with every little detail clearly explained.
NOTE: This StatQuest assumes that you are already familiar with the concepts behind...
Decoder-Only Transformers: https://youtu.be/bQ5BoolX9Ag
The Essential Matrix Algebra for Neural Networks: https://youtu.be/ZTt9gsGcdDo
The Matrix Math Behind Transformers: https://youtu.be/KphmOJnLAdI
You can get the code here: https://github.com/StatQuest/decoder_transformer_from_scratch
The full Neural Networks playlist, from the basics to AI, is here: https://www.youtube.com/watch?v=CqOfi41LfDw&list=PLblh5JKOoLUIxGDQs4LFFD--41Vzf-ME1
Learn more about GiveInternet.org: https://giveinternet.org/StatQuest NOTE: Donations up to $30 will be matched by an Angel Investor - so a $30 donation would give $60 to the organization. DOUBLE BAM!!!
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
https://twitter.com/joshuastarmer
0:00 Awesome song and introduction
1:12 Loading the modules
2:04 Creating the training dataset
6:17 Coding Position Encoding
14:09 Coding Attention
21:04 Coding a Decoder-Only Transformer
26:39 Running the model (untrained)
29:18 Training and using the model
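The position-encoding step in the outline above (6:17) can be sketched as a small module that precomputes sinusoidal encodings and adds them to the word embeddings. This is a minimal illustration with our own class and variable names, not the video's exact code; the constants (a 2-dimensional model, 6 positions) are assumptions chosen to keep the tensors readable.

```python
import math

import torch
import torch.nn as nn


class PositionEncoding(nn.Module):
    """Sinusoidal position encoding, precomputed once and stored as a buffer."""

    def __init__(self, d_model: int = 2, max_len: int = 6):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
        # Frequencies fall off geometrically across embedding dimensions.
        div_term = torch.exp(torch.arange(0, d_model, 2).float()
                             * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sine
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dims: cosine
        # register_buffer keeps pe on the module (and in state_dict)
        # without making it a trainable parameter.
        self.register_buffer("pe", pe)

    def forward(self, word_embeddings: torch.Tensor) -> torch.Tensor:
        # Add the precomputed encodings for the positions actually used.
        return word_embeddings + self.pe[: word_embeddings.size(0), :]
```

Because the encodings are fixed functions of position, they can be computed once in `__init__` and reused for every forward pass, which is the precomputation the summary mentions.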
#StatQuest #PyTorch #chatgpt
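The attention step (14:09) reduces to scaled dot-product attention with a causal mask so each token can only attend to itself and earlier tokens, which is what makes the model decoder-only. The sketch below is a single-head illustration under that assumption; class and variable names are ours, and the video's implementation may differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Attention(nn.Module):
    """Single-head scaled dot-product attention with an optional causal mask."""

    def __init__(self, d_model: int = 2):
        super().__init__()
        self.W_q = nn.Linear(d_model, d_model, bias=False)  # queries
        self.W_k = nn.Linear(d_model, d_model, bias=False)  # keys
        self.W_v = nn.Linear(d_model, d_model, bias=False)  # values

    def forward(self, x: torch.Tensor, mask: torch.Tensor = None) -> torch.Tensor:
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        # Similarity scores, scaled by sqrt of the key dimension.
        scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
        if mask is not None:
            # Hide future positions so token i only sees tokens 0..i.
            scores = scores.masked_fill(mask, float("-inf"))
        return F.softmax(scores, dim=-1) @ v


# Causal mask: True above the diagonal marks the positions to hide.
seq_len = 4
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
```

With this mask, the first row of the softmaxed scores is all weight on position 0, so the first output token depends only on itself, exactly the property a decoder-only Transformer needs at generation time.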