This tutorial demystifies the key building blocks of modern large‑language models, giving practitioners a practical, runnable blueprint for understanding and prototyping decoder‑only Transformers. It lowers the barrier to experimentation and learning for researchers and engineers working on generative NLP models.
In a hands‑on tutorial, StatQuest walks through building a decoder‑only Transformer (the architecture behind ChatGPT) from scratch in PyTorch and PyTorch Lightning. The video covers creating a minimal token vocabulary and dataset for two prompt–response pairs, mapping tokens to IDs, packaging inputs and labels into TensorDataset/DataLoader, and implementing embeddings and sinusoidal positional encodings. It then assembles the core Transformer components — attention, decoder blocks, and training loop — and shows how to precompute positional encodings and train the model end‑to‑end. The episode emphasizes readable code and links to a complete, downloadable example for follow‑along practice.
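The first steps above — a minimal token vocabulary, token‑to‑ID mapping, and packaging inputs and labels into a TensorDataset/DataLoader — can be sketched roughly as follows. The specific tokens and prompt–response pairs here are placeholders, not necessarily the ones used in the video:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical mini-vocabulary; the video builds its own small token set.
token_to_id = {"what": 0, "is": 1, "statquest": 2, "awesome": 3, "<EOS>": 4}
id_to_token = {i: t for t, i in token_to_id.items()}

# Two prompt–response pairs as token-ID sequences. For next-token
# prediction, each label sequence is the input shifted left by one position.
inputs = torch.tensor([
    [token_to_id[t] for t in ["what", "is", "statquest", "<EOS>", "awesome"]],
    [token_to_id[t] for t in ["statquest", "is", "what", "<EOS>", "awesome"]],
])
labels = torch.tensor([
    [token_to_id[t] for t in ["is", "statquest", "<EOS>", "awesome", "<EOS>"]],
    [token_to_id[t] for t in ["is", "what", "<EOS>", "awesome", "<EOS>"]],
])

# Package (input, label) pairs so a training loop can iterate over batches.
dataset = TensorDataset(inputs, labels)
dataloader = DataLoader(dataset)  # batch_size defaults to 1
```

With only two training examples, a batch size of 1 keeps each optimization step tied to a single prompt–response pair, which makes the training loop easy to trace.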
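The sinusoidal positional encodings mentioned above follow the standard Transformer formulation: sine values on even embedding dimensions, cosine on odd ones, at geometrically spaced frequencies. A minimal sketch of precomputing them (function name and arguments are illustrative, not the video's exact code):

```python
import math
import torch

def positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Precompute a (max_len, d_model) table of sinusoidal position encodings."""
    pe = torch.zeros(max_len, d_model)
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    # Frequencies decay geometrically across the embedding dimensions.
    div_term = torch.exp(
        torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
    )
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe

pe = positional_encoding(max_len=6, d_model=2)
```

Because the table depends only on position and dimension, not on the data, it can be computed once and added to the token embeddings at every forward pass.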
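The attention at the heart of a decoder‑only Transformer is causal (masked) self‑attention: each position may attend only to itself and earlier positions. A single‑head sketch under that assumption (the video's own implementation may organize this differently, e.g. as an `nn.Module` with learned query/key/value projections):

```python
import torch
import torch.nn.functional as F

def masked_self_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Single-head causal self-attention over (seq_len, d_model) tensors."""
    seq_len = q.size(0)
    # Scaled dot-product similarity between every pair of positions.
    scores = q @ k.T / (k.size(-1) ** 0.5)
    # Causal mask: True above the diagonal blocks attention to future tokens.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

torch.manual_seed(0)
q = torch.randn(4, 3)
k = torch.randn(4, 3)
v = torch.randn(4, 3)
out = masked_self_attention(q, k, v)
```

The mask is what makes the model autoregressive: the first position can see only itself, so its output is exactly its own value vector, and predictions never leak information from later tokens.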