Duration: 27:14
But what is a GPT? Visual intro to Transformers | Deep learning, chapter 5

2024-04-01

887K views, 108K likes

3Blue1Brown

Breaking down how Large Language Models work

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

---

Here are a few other relevant resources:

Build a GPT from scratch, by Andrej Karpathy

https://youtu.be/kCc8FmEb1nY

If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic:

https://youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX

If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the value and output matrices together as a single low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources did.

https://transformer-circuits.pub/2021/framework/index.html
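
To make the "low-rank map" idea concrete, here is a minimal sketch in Python with NumPy. It assumes a single attention head with a value matrix that maps from the embedding space down to a smaller head space and an output matrix that maps back up. The names `W_V`, `W_O`, `W_OV`, the toy sizes, and the random weights are all illustrative assumptions, not taken from any particular model or from the linked post.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16  # embedding dimension (toy value, assumed for illustration)
d_head = 4    # per-head dimension (toy value, assumed for illustration)

# Hypothetical weights: the value matrix maps the embedding space down to
# the smaller head space, and the output matrix maps back up.
W_V = rng.standard_normal((d_head, d_model))
W_O = rng.standard_normal((d_model, d_head))

# Composing the two gives one square map from the embedding space to itself.
W_OV = W_O @ W_V

x = rng.standard_normal(d_model)  # a stand-in embedding vector

two_step = W_O @ (W_V @ x)  # apply value, then output
one_step = W_OV @ x         # apply the combined map directly

# The combined matrix is d_model x d_model, but its rank cannot exceed d_head.
rank = np.linalg.matrix_rank(W_OV)

print("shape of combined map:", W_OV.shape)
print("rank of combined map:", rank)
print("same result either way:", np.allclose(two_step, one_step))
```

The point of the sketch is only that applying the two matrices in sequence is the same as applying their product, and that the product, although square, has rank bounded by the smaller head dimension.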

Site with exercises related to ML programming and GPTs

https://www.gptandchill.ai/codingproblems

History of language models by Brit Cruise, @ArtOfTheProblem

https://youtu.be/OFS90-FX6pg

An early paper on how directions in embedding spaces have meaning:

https://arxiv.org/pdf/1301.3781.pdf

---

Timestamps

0:00 - Predict, sample, repeat

3:03 - Inside a transformer

6:36 - Chapter layout

7:20 - The premise of Deep Learning

12:27 - Word embeddings

18:25 - Embeddings beyond words

20:22 - Unembedding

22:22 - Softmax with temperature

26:03 - Up next


Get more from 3Blue1Brown on Patreon:
https://www.3blue1brown.com/early-attention