But what is a GPT? Visual intro to transformers | Chapter 5 Deep Learning

3Blue1Brown

Breaking down how Large Language Models work
Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support



Here are a few other relevant resources:

Build a GPT from scratch, by Andrej Karpathy
   • Let's build GPT: from scratch, in cod...  

If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic:
   • What does it mean for computers to un...  

If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the value and output matrices together as a single low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources.
https://transformer-circuits.pub/2021...
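
To make the low-rank remark concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than something taken from the video or the Transformer Circuits posts: the toy sizes (d_model = 6, d_head = 2), the random integer entries, and the names W_V, W_O, matmul and rank. It only shows that composing a "down" map with an "up" map gives a square matrix on the embedding space whose rank cannot exceed the smaller head dimension.

```python
from fractions import Fraction
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Matrix rank via Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Toy sizes, chosen only for illustration (real models are far larger).
d_model, d_head = 6, 2
random.seed(0)

# Hypothetical value matrix: maps an embedding (d_model) down to d_head.
W_V = [[random.randint(-3, 3) for _ in range(d_model)] for _ in range(d_head)]
# Hypothetical output matrix: maps d_head back up to the embedding space.
W_O = [[random.randint(-3, 3) for _ in range(d_head)] for _ in range(d_model)]

# The combined map goes from the embedding space to itself.
combined = matmul(W_O, W_V)

# Applying the two matrices in sequence equals applying the combined one.
x = [[random.randint(-3, 3)] for _ in range(d_model)]  # column vector
two_step = matmul(W_O, matmul(W_V, x))
one_step = matmul(combined, x)

print("shape of combined map:", len(combined), "x", len(combined[0]))
print("rank of combined map is at most d_head:", rank(combined) <= d_head)
print("two-step and one-step results agree:", two_step == one_step)
```

The combined matrix has d_model * d_model entries, but it is fully described by the 2 * d_model * d_head entries of its two factors, which is what "low-rank" buys you.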

Site with exercises related to ML programming and GPTs
https://www.gptandchill.ai/codingprob...

History of language models by Brit Cruise, @ArtOfTheProblem
   • ChatGPT: 30 Year History | How AI Lea...  

An early paper on how directions in embedding spaces have meaning:
https://arxiv.org/pdf/1301.3781.pdf



Timestamps

0:00 Predict, sample, repeat
3:03 Inside a transformer
6:36 Chapter layout
7:20 The premise of Deep Learning
12:27 Word embeddings
18:25 Embeddings beyond words
20:22 Unembedding
22:22 Softmax with temperature
26:03 Up next

posted by Hemme1v