Build Your Own Finance LLM for FREE with SEC Data

Adam Lucek

In this video we walk through setting up a custom financial Q&A LLM and connecting it to current SEC Form 10-K filings for context retrieval, all using completely free, local models in a Google Colab notebook! We'll cover preparing a financial Q&A dataset for fine-tuning, using LoRA adapters for parameter-efficient fine-tuning of a local model, and running inference on the fine-tuned model. From there we build a data pipeline into the SEC Form 10-K database, create embeddings with a local embedding model, and store the 10-K in a local vector store. Finally, we put it all together to create a custom Finance LLM Assistant.
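
To make the "parameter-efficient" part concrete, here is a minimal pure-Python sketch of the LoRA idea: the frozen weight W gets a low-rank update (alpha / rank) * B @ A, and only A and B are trained. The function names, layer shapes, rank, and alpha below are assumptions chosen for illustration, not values taken from the notebook.

```python
# Illustrative sketch of Low-Rank Adaptation (LoRA) arithmetic.
# All names, shapes, and hyperparameters here are assumptions for the example.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def full_update_params(d_out, d_in):
    """Trainable parameters if the whole d_out x d_in weight is fine-tuned."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable parameters with LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

def lora_forward(x, W, A, B, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x); W stays frozen."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * l for b, l in zip(base, low_rank)]

# Parameter counts for one hypothetical 4096 x 4096 projection at rank 16.
full = full_update_params(4096, 4096)
lora = lora_params(4096, 4096, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")

# B starts at zero in LoRA, so the adapted layer initially matches the base layer.
W = [[1, 2], [3, 4]]
A = [[1, 0]]          # rank 1: A is 1 x 2
B_zero = [[0], [0]]   # B is 2 x 1
y_start = lora_forward([1, 1], W, A, B_zero, alpha=16, rank=1)
```

Because B starts at zero, training begins from the unchanged base model and only the small A and B matrices receive gradient updates.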

Build With Me for Free Here! https://colab.research.google.com/dri...

Resources:
SEC-API: https://sec-api.io/
Unsloth Library: https://github.com/unslothai/unsloth
LLaMa 3 8B Instruct: https://huggingface.co/meta-llama/Met...
Financial Q&A Dataset: https://huggingface.co/datasets/virat...
HF Supervised Fine Tuning: https://huggingface.co/docs/trl/sft_t...
BAAI bge-large-en-v1.5 Embedding Model: https://huggingface.co/BAAI/bge-large...
Facebook AI Similarity Search (FAISS): https://ai.meta.com/tools/faiss/

Chapters:
00:00 Introduction & Overview
01:07 Colab Notebook Setup
02:14 Installing Dependencies
03:39 Fine Tuning: Picking Model & FT Package
05:15 Fine Tuning: Loading Model with Unsloth
08:18 Fine Tuning: Low Rank Adaptation Overview
09:31 Fine Tuning: Adding LoRA Adapters
13:06 Fine Tuning: Financial QA Dataset Preparation
13:50 Fine Tuning: Dataset Prompt Defining
15:21 Fine Tuning: Training Data Creation
17:32 Fine Tuning: Creating Supervised Fine Tuning Trainer
19:23 Fine Tuning: SFTT Arguments
21:34 Fine Tuning: Training the LLM!
23:42 Fine Tuning: Saving the Trained LLM
24:33 Chatting w/Trained LLM
27:36 Data Pipeline: Overview of SEC 10-K Retrieval Process
28:57 Data Pipeline: Using SEC API to get Form 10-K Data
30:12 Data Pipeline: Overview of Embeddings
32:03 Data Pipeline: Loading Local Embedding Model
32:50 Data Pipeline: Using Embeddings w/Vector DB Overview
35:10 Data Pipeline: Embedding Form 10-K Data
37:08 Data Pipeline: Setting Up Retrieval Functions
38:10 Putting it All Together: Running the Main Script
40:51 Pros & Cons
42:35 Future Ideas for Improvement
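
The data pipeline chapters above (embed the filing, store the vectors, retrieve context, then prompt the model) can be sketched roughly as follows. This is a toy stand-in, not the notebook's code: `toy_embed` replaces the real embedding model with simple word counts, `ToyVectorStore` replaces a FAISS index with a linear scan, and the text chunks are made-up placeholders rather than real filing content.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """Hypothetical stand-in for a vector index: stores (chunk, vector) pairs."""
    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, toy_embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = toy_embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, contexts):
    """Place retrieved context ahead of the question for the fine-tuned model."""
    context = "\n".join(contexts)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Placeholder chunks (invented for this sketch, not real 10-K text).
chunks = [
    "Placeholder risk factors text: supply chain disruption could affect revenue.",
    "Placeholder liquidity text: cash and equivalents fund operating expenses.",
    "Placeholder legal text: the company is party to routine litigation.",
]
store = ToyVectorStore()
for c in chunks:
    store.add(c)

question = "What supply chain risk could affect revenue?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real setup, a dense embedding model and an approximate nearest-neighbor index would replace the two toy pieces, but the embed, store, search, and prompt steps keep the same shape.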

#artificialintelligence #ai #finance

Disclaimer:

The information provided from this notebook and video is for educational purposes only and does not constitute financial advice. I do not provide personalized investment, financial, or legal advice. All financial products, services, and investments carry inherent risks, and you should conduct your own research or consult with a qualified financial advisor before making any financial decisions. I am not responsible for any losses or damages that may arise from the use of the information provided on this platform. Always consider your individual financial situation and objectives before making any investment or financial decision.
