Why Deep Learning Works Unreasonably Well

2025-08-10

[public] 16.0K views, 1.94K likes

Welch Labs

Take your personal data back with Incogni! Use code WELCHLABS and get 60% off an annual plan: http://incogni.com/welchlabs

New Patreon Rewards (33:31) - own a piece of Welch Labs history!

https://www.patreon.com/welchlabs

Books & Posters

https://www.welchlabs.com/resources

Sections

0:00 - Intro

4:49 - How Incogni Saves Me Time

6:32 - Part 2 Recap

8:10 - Moving to Two Layers

9:15 - How Activation Functions Fold Space

11:45 - Numerical Walkthrough

13:42 - Universal Approximation Theorem

15:45 - The Geometry of Backpropagation

19:52 - The Geometry of Depth

24:27 - Exponentially Better?

30:23 - Neural Networks Demystified

31:50 - The Time I Quit YouTube

33:31 - New Patreon Rewards!

Special Thanks to Patrons https://www.patreon.com/welchlabs

Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin, Nicolas baumann, Jason Singh, Robert Riley, vornska, Barry Silverman, Jake Ehrlich, Mitch Jacobs, Lauren Steely, Jeff Eastman, Rodolfo Ibarra, Clark Barrus, Rob Napier, Andrew White, Richard B Johnston, abhiteja mandava, Burt Humburg, Kevin Mitchell, Daniel Sanchez, Ferdie Wang, Tripp Hill, Richard Harbaugh Jr, Prasad Raje, Kalle Aaltonen, Midori Switch Hound, Zach Wilson, Chris Seltzer, Ven Popov, Hunter Nelson, Amit Bueno, Scott Olsen, Johan Rimez, Shehryar Saroya, Tyler Christensen, Beckett Madden-Woods, Darrell Thomas, Javier Soto

References

Simon Prince, Understanding Deep Learning. https://udlbook.github.io/udlbook/

Liang, Shiyu, and Rayadurgam Srikant. "Why deep neural networks for function approximation?" arXiv preprint arXiv:1610.04161 (2016).

Hanin, Boris, and David Rolnick. "Deep relu networks have surprisingly few activation patterns." *Advances in neural information processing systems* 32 (2019).

Hanin, Boris, and David Rolnick. "Complexity of linear regions in deep networks." *International Conference on Machine Learning*. PMLR, 2019.

Fan, Feng-Lei, et al. "Deep ReLU networks have surprisingly simple polytopes." arXiv preprint arXiv:2305.09145 (2023).

All Code:

https://github.com/stephencwelch/manim_videos

100k-neuron-wide training code: https://github.com/stephencwelch/manim_videos/blob/master/_2025/backprop_3/notebooks/BaarleNet%20%5BPart%206d%20-%20long%20run%5D.ipynb

Written by: Stephen Welch

Produced by: Stephen Welch, Sam Baskin, and Pranav Gundu

Premium Beat IDs

EEDYZ3FP44YX8OWTe

MWROXNAY0SPXCMBS

