MAMBA from Scratch: Neural Nets Better and Faster than Transformers

Algorithmic Simplicity

Mamba is a new neural network architecture that came out this year, and it performs better than transformers at language modelling! This is probably the most exciting development in AI since transformers were introduced in 2017. In this video I explain how to derive Mamba from the perspective of linear RNNs. And don't worry, there's no state space model theory needed!
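
For the curious, here is a minimal sketch (my own illustration, not code from the video) of the core idea behind the "Linear Recurrent Neural Networks" and "Parallelizing Linear RNNs" chapters: a linear recurrence h_t = a_t * h_{t-1} + b_t can be evaluated with an associative scan instead of a sequential loop, which is what lets these models train as fast as transformers. The function names and the choice of JAX's associative_scan are assumptions for the sake of the example.

import jax
import jax.numpy as jnp

def combine(left, right):
    # Composing two linear updates: applying (a1, b1) then (a2, b2) to h gives
    # a2*(a1*h + b1) + b2 = (a2*a1)*h + (a2*b1 + b2), so the operation is associative.
    a1, b1 = left
    a2, b2 = right
    return a1 * a2, a2 * b1 + b2

def linear_rnn_parallel(a, b):
    # a, b: (seq_len, hidden) arrays; diagonal recurrence per channel.
    # Returns h with h[t] = a[t] * h[t-1] + b[t] (taking h[-1] = 0),
    # computed in O(log seq_len) parallel depth.
    _, h = jax.lax.associative_scan(combine, (a, b))
    return h

# Sanity check against the plain sequential recurrence.
key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.uniform(key_a, (8, 4))   # decay factors in [0, 1) for stability
b = jax.random.normal(key_b, (8, 4))    # per-step inputs
h_par = linear_rnn_parallel(a, b)

h, hs = jnp.zeros(4), []
for t in range(8):
    h = a[t] * h + b[t]
    hs.append(h)
assert jnp.allclose(h_par, jnp.stack(hs), atol=1e-5)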

Mamba paper: https://openreview.net/forum?id=AL1fq...
Linear RNN paper: https://openreview.net/forum?id=M3Yd3...

#mamba
#deeplearning
#largelanguagemodels

00:00 Intro
01:33 Recurrent Neural Networks
05:24 Linear Recurrent Neural Networks
06:57 Parallelizing Linear RNNs
15:33 Vanishing and Exploding Gradients
19:08 Stable initialization
21:53 State Space Models
24:33 Mamba
25:26 The High Performance Memory Trick
27:35 The Mamba Drama
