Napkin Math For Fine Tuning Pt. 1 w/Johno Whitaker

Hamel Husain

We will show you how to build intuition around training performance with a focus on GPU-poor fine-tuning.

Part 2 of this talk:    • Napkin Math For Fine Tuning Pt. 2 w/J...  

More resources available here:
https://parlancelabs.com/education/f...


00:00 Introduction
Johno introduces the topic "Napkin Math for Fine Tuning," aiming to answer common questions about model training, especially for beginners fine-tuning large existing models.

01:23 About Johno and AnswerAI
Johno shares his background and his work at AnswerAI, an applied R&D lab focusing on the societal benefits of AI.

03:18 Plan for the Talk
Johno outlines the structure of the talk, including objectives, running experiments, and live napkin math to estimate memory use.

04:40 Training and Fine-Tuning Loop
Description of the training loop: feeding data through a model, measuring accuracy, updating the model, and repeating the process.
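
A minimal sketch of that loop in PyTorch (the tiny model, random data, and hyperparameters are placeholders, not from the talk):

import torch
from torch import nn

# Toy setup: a tiny linear model on random data, just to show the loop's shape.
model = nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = loss_fn(model(x), y)   # feed data through, measure how wrong it is
    loss.backward()               # compute gradients
    optimizer.step()              # update the model
    optimizer.zero_grad()         # reset, then repeat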

09:05 Hardware Considerations
Discussion on the different hardware components (CPU, GPU, RAM) and how they affect training performance.
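
A quick way to ground that discussion is to ask PyTorch what hardware you actually have (a sketch; assumes a CUDA-capable machine):

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # total_memory is the hard ceiling for weights + gradients + optimizer state + activations
    print(props.name, f"{props.total_memory / 1e9:.1f} GB GPU RAM")
else:
    print("No GPU visible; training falls back to the much slower CPU")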

12:28 Tricks for Efficient Training
Overview of various techniques to optimize training efficiency, including LoRA, quantization, and CPU offloading.

13:12 Full Fine-Tuning
Describes the parameters and memory involved in full fine-tuning.
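
A napkin-math illustration of why this is expensive (my example numbers, not Johno's): with Adam, every parameter drags along a gradient and two optimizer states.

# Full fine-tuning a 7B-parameter model with Adam (illustrative assumptions).
params = 7e9
weights = params * 2          # bf16 weights: 2 bytes per parameter
grads   = params * 2          # gradients in the same dtype
optim   = params * 2 * 4      # Adam's momentum + variance, kept in fp32 (4 bytes each)
print(f"~{(weights + grads + optim) / 1e9:.0f} GB before activations")  # ~84 GB

That is already far past a single consumer GPU before counting activations, which motivates the tricks above.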

18:14 LoRA
Detailed explanation of full fine-tuning versus parameter-efficient fine-tuning techniques like LoRA.
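
The LoRA idea in a few lines (rank and sizes chosen for illustration): freeze the big pretrained matrix and learn a low-rank update on top of it.

import torch
from torch import nn

d, r = 4096, 8                                 # hidden size and LoRA rank (illustrative)
W = torch.randn(d, d)                          # frozen pretrained weight: no gradients
A = nn.Parameter(torch.randn(r, d) * 0.01)     # only these two small factors train
B = nn.Parameter(torch.zeros(d, r))            # zero-init so training starts from W exactly

x = torch.randn(1, d)
y = x @ (W + B @ A).T                          # effective weight is W + BA

print(d * d, 2 * d * r)                        # ~16.8M frozen vs ~65K trainable per layer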

21:04 Quantization and Memory Savings
Discussion on quantization methods to reduce memory usage and enable training of larger models.
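
The napkin math here is just bytes per parameter (quantization-constant overhead ignored for simplicity):

params = 7e9                      # a 7B model, for illustration
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{params * bits / 8 / 1e9:5.1f} GB")
# 4-bit shrinks the 7B weights from ~14 GB (16-bit) to ~3.5 GB,
# leaving room on one consumer GPU for adapters and activations.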

22:55 Running Experiments
Importance of running controlled experiments to understand the impact of various training parameters.
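
PyTorch's memory counters make such experiments cheap to run; a sketch of the harness (train_one_step is a hypothetical stand-in for your actual step):

import torch

def peak_memory_gb(train_one_step):
    """Run one step and report peak GPU memory, so configs compare fairly."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    train_one_step()          # change exactly one variable between runs
    return torch.cuda.max_memory_allocated() / 1e9

# e.g. sweep one knob at a time: batch size, sequence length, LoRA rank, ...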

23:10 Combining Techniques
Combining different techniques like quantization and LoRA to maximize training efficiency.
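
A hedged sketch of how the pieces usually fit together in the Hugging Face stack (the model name is illustrative; assumes the transformers, peft, and bitsandbytes packages):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the frozen base weights in 4-bit NF4...
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf",  # illustrative
                                             quantization_config=bnb)

# ...and train only small LoRA adapters on top of them.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"]))
model.print_trainable_parameters()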

25:46 CPU Offloading
How CPU offloading works and the tradeoffs.
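
The mechanism, sketched at its simplest: park tensors in plentiful CPU RAM and stream each one to the GPU just before it is needed, trading PCIe transfer time for GPU memory (sizes here are illustrative):

import torch

layers = [torch.randn(4096, 4096).pin_memory() for _ in range(8)]  # kept in CPU RAM

x = torch.randn(1, 4096, device="cuda")
for w in layers:
    w_gpu = w.to("cuda", non_blocking=True)  # copy one layer over PCIe just in time
    x = x @ w_gpu.T
    del w_gpu                                # release it before the next layer arrives
# Tradeoff: peak GPU memory is one layer instead of eight, but every step pays the copies.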

28:31 Real-World Example
Demo of memory optimization and problem-solving during model training, with code. This also includes pragmatic ways to profile your code.
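
For the profiling side, torch.profiler is one pragmatic option (a sketch with a stand-in workload, not necessarily the exact tooling from the demo):

import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(4096, 4096).cuda()   # stand-in for the real training step
x = torch.randn(8, 4096, device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
             profile_memory=True) as prof:
    model(x).sum().backward()

# Shows which ops dominate time and memory: the place to start optimizing.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))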

45:44 Case Study: QLoRA + FSDP
Discussion of QLoRA with FSDP and the tradeoffs involved.
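
The napkin version of why the combination is attractive (illustrative numbers, not a quote from the talk): FSDP shards the 4-bit base weights across GPUs, so each card holds only its slice plus the layer currently being gathered.

# 70B model, 4-bit weights, sharded across 2 GPUs with FSDP (illustrative).
params, bits, n_gpus = 70e9, 4, 2
print(f"~{params * bits / 8 / n_gpus / 1e9:.1f} GB of weights per GPU")  # ~17.5 GB
# Tradeoff: layers must be all-gathered every step, so you pay in communication time.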

54:25 Recap / Conclusion
Johno summarizes the key points of his talk.
