
Coding a Transformer from scratch in PyTorch, with full explanation, training and inference.

Umar Jamil

In this video I show how to code a Transformer model from scratch using PyTorch. I highly recommend watching my previous video to understand the underlying concepts, but I will also review them again in this video while coding. All of the code is mine, except for the attention visualization function used to plot the charts, which I found online on Harvard University's website.

Paper: "Attention Is All You Need": https://arxiv.org/abs/1706.03762

The full code is available on GitHub: https://github.com/hkproj/pytorchtra...
It also includes a Colab Notebook so you can train the model directly on Colab.

Chapters
00:00:00 Introduction
00:01:20 Input Embeddings
00:04:56 Positional Encodings
00:13:30 Layer Normalization
00:18:12 Feed Forward
00:21:43 MultiHead Attention
00:42:41 Residual Connection
00:44:50 Encoder
00:51:52 Decoder
00:59:20 Linear Layer
01:01:25 Transformer
01:17:00 Task overview
01:18:42 Tokenizer
01:31:35 Dataset
01:55:25 Training loop
02:20:05 Validation loop
02:41:30 Attention visualization
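
To give a rough sense of the building blocks covered in the first chapters (Input Embeddings and Positional Encodings), here is a minimal PyTorch sketch that follows the formulas from the "Attention Is All You Need" paper. It is an illustration only, not the exact code from the video or the repository; class and parameter names such as InputEmbeddings, PositionalEncoding, d_model, vocab_size and seq_len are chosen here for clarity.

# Minimal sketch (not the video's exact code) of the token embedding and
# sinusoidal positional encoding layers from "Attention Is All You Need".
import math
import torch
import torch.nn as nn

class InputEmbeddings(nn.Module):
    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(vocab_size, d_model)

    def forward(self, x):
        # The paper scales the embeddings by sqrt(d_model)
        return self.embedding(x) * math.sqrt(self.d_model)

class PositionalEncoding(nn.Module):
    def __init__(self, d_model: int, seq_len: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # Precompute a (seq_len, d_model) matrix of sine/cosine encodings
        pe = torch.zeros(seq_len, d_model)
        position = torch.arange(0, seq_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # Stored as a buffer: part of the model state, but not a learned parameter
        self.register_buffer("pe", pe.unsqueeze(0))  # shape (1, seq_len, d_model)

    def forward(self, x):
        # Add the fixed positional encodings to the token embeddings
        x = x + self.pe[:, : x.shape[1], :]
        return self.dropout(x)

The remaining chapters (multi-head attention, residual connections, the encoder/decoder stacks, the projection layer, and the training and validation loops) build on these two modules; see the video and the linked repository for the full implementation.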
