How might LLMs store facts | Chapter 7, Deep Learning

2024-08-31

134K views, 26.5K likes

3Blue1Brown

Unpacking the multilayer perceptrons in a transformer, and how they may store facts

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos.

AI Alignment Forum post from the DeepMind researchers referenced at the video's start:

https://www.alignmentforum.org/posts/iGuwZTHWb6DFY3sKB/fact-finding-attempting-to-reverse-engineer-factual-recall

Anthropic posts about superposition referenced near the end:

https://transformer-circuits.pub/2022/toy_model/index.html

https://transformer-circuits.pub/2023/monosemantic-features
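The superposition idea in the posts above — that a d-dimensional space can hold many more than d nearly perpendicular directions — can be checked with a short numpy experiment. This is a hedged illustration, not code from the video; the sizes and variable names are my own choices:

```python
import numpy as np

# Draw n >> d random unit vectors in d dimensions and measure how far
# the worst pair is from perpendicular (cosine 0 means exactly 90 degrees).
rng = np.random.default_rng(0)
d, n = 100, 1000  # ten times more directions than dimensions

V = rng.normal(size=(n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)  # normalize to unit length

cos = V @ V.T                # pairwise cosine similarities
np.fill_diagonal(cos, 0.0)   # ignore each vector's similarity with itself
worst = np.abs(cos).max()
mean = np.abs(cos).mean()
print(worst, mean)  # worst stays well below 1; the typical pair is within a few degrees of 90°
```

Even with 10x more vectors than dimensions, no two directions come close to overlapping, which is the geometric fact superposition exploits.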

Some additional resources for those interested in learning more about mechanistic interpretability, offered by Neel Nanda:

Mechanistic interpretability paper reading list

https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite

Getting started in mechanistic interpretability

https://www.neelnanda.io/mechanistic-interpretability/getting-started

An interactive demo of sparse autoencoders (made by Neuronpedia)

https://www.neuronpedia.org/gemma-scope#main

Coding tutorials for mechanistic interpretability (made by ARENA)

https://arena3-chapter1-transformer-interp.streamlit.app/

Sections:

0:00 - Where facts in LLMs live

2:15 - Quick refresher on transformers

4:39 - Assumptions for our toy example

6:07 - Inside a multilayer perceptron

15:38 - Counting parameters

17:04 - Superposition

21:37 - Up next
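The "Inside a multilayer perceptron" and "Counting parameters" sections describe an MLP block that projects each token vector up, applies a nonlinearity, and projects back down. A minimal numpy sketch of that shape of computation, with small illustrative sizes rather than the GPT-3 sizes used in the video (all names here are my own):

```python
import numpy as np

# Illustrative dimensions; real models use e.g. d_hidden = 4 * d_model.
d_model, d_hidden = 128, 512

rng = np.random.default_rng(0)
W_up = rng.normal(scale=0.02, size=(d_hidden, d_model))    # up-projection
b_up = np.zeros(d_hidden)
W_down = rng.normal(scale=0.02, size=(d_model, d_hidden))  # down-projection
b_down = np.zeros(d_model)

def gelu(x):
    # tanh approximation of GeLU, a common transformer nonlinearity
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x):
    # Each token vector is processed independently:
    # project up, apply the nonlinearity, project down, add the residual.
    return x + (W_down @ gelu(W_up @ x + b_up) + b_down)

x = rng.normal(size=d_model)
out = mlp(x)
assert out.shape == (d_model,)

# Counting parameters, as in the video's parameter-counting section:
# two weight matrices plus two bias vectors.
n_params = W_up.size + b_up.size + W_down.size + b_down.size
print(n_params)  # 2 * d_model * d_hidden + d_hidden + d_model
```

Scaling `d_model` and `d_hidden` up to a real model's sizes makes the MLP blocks the largest share of the parameter count, which is part of why the video looks there for stored facts.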

------------------

These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/

All code for specific videos is visible here:

https://github.com/3b1b/videos/

The music is by Vincent Rubinetti.

https://www.vincentrubinetti.com

https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown

https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u

------------------

3blue1brown is a channel about animating math, in all senses of the word animate. If you're reading the bottom of a video description, I'm guessing you're more interested than the average viewer in lessons here. It would mean a lot to me if you chose to stay up to date on new ones, either by subscribing here on YouTube or otherwise following on whichever platform below you check most regularly.

Mailing list: https://3blue1brown.substack.com

Twitter: https://twitter.com/3blue1brown

Instagram: https://www.instagram.com/3blue1brown

Reddit: https://www.reddit.com/r/3blue1brown

Facebook: https://www.facebook.com/3blue1brown

Patreon: https://patreon.com/3blue1brown

Website: https://www.3blue1brown.com

