In this video, I go over how LoRA works and why it's crucial for affordable Transformer finetuning.
LoRA learns lowrank matrix decompositions to slash the costs of training huge language models. It adapts only lowrank factors instead of entire weight matrices, achieving major memory and performance wins.
LoRA Paper: https://arxiv.org/pdf/2106.09685.pdf
Intrinsic Dimensionality Paper: https://arxiv.org/abs/2012.13255
About me:
Follow me on LinkedIn: / csalexiuk
Check out what I'm working on: https://getox.ai/