How to build and automate a Python ETL pipeline with Airflow on AWS EC2 | Data Engineering Project

tuplespectra

In this data engineering project, we will learn how to build and automate an ETL process that extracts current weather data from the OpenWeatherMap API, transforms the data, and loads it into an S3 bucket using Apache Airflow. Apache Airflow is an open-source platform for orchestrating and scheduling workflows of tasks and data pipelines. This project will be carried out entirely on the AWS cloud platform.
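To make the transform step concrete, here is a minimal sketch, not the code from the video. It assumes the default OpenWeatherMap current-weather response (temperatures in Kelvin, `dt` as a Unix timestamp, `timezone` as a UTC offset in seconds). The function names, output field names, and sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def kelvin_to_celsius(temp_k: float) -> float:
    """Convert a Kelvin temperature to Celsius, rounded to 2 decimals."""
    return round(temp_k - 273.15, 2)


def transform_weather(payload: dict) -> dict:
    """Flatten one current-weather response into a single record.

    Assumes the default response layout (Kelvin temperatures, `dt` in
    Unix seconds, `timezone` as the city's UTC offset in seconds).
    """
    # Build a fixed-offset timezone for the city so the timestamp is local.
    city_tz = timezone(timedelta(seconds=payload.get("timezone", 0)))
    recorded_at = datetime.fromtimestamp(payload["dt"], tz=city_tz)
    return {
        "city": payload["name"],
        "description": payload["weather"][0]["description"],
        "temperature_c": kelvin_to_celsius(payload["main"]["temp"]),
        "feels_like_c": kelvin_to_celsius(payload["main"]["feels_like"]),
        "humidity_pct": payload["main"]["humidity"],
        "wind_speed": payload["wind"]["speed"],
        "recorded_at_local": recorded_at.strftime("%Y-%m-%d %H:%M:%S"),
    }


# Hypothetical sample payload, trimmed to the fields used above.
sample = {
    "name": "Portland",
    "dt": 1700000000,
    "timezone": -18000,
    "weather": [{"description": "light rain"}],
    "main": {"temp": 293.15, "feels_like": 291.15, "humidity": 80},
    "wind": {"speed": 3.6},
}

record = transform_weather(sample)
print(record)
```

With pandas and s3fs installed, a record like this could be written to a bucket through a DataFrame's `to_csv` with an `s3://` path. The bucket name is not given here, so that step is left out.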
We will cover fundamental Apache Airflow concepts such as DAGs and Operators, and I will show you how to install Apache Airflow from scratch and schedule your ETL pipeline. I will also show you how to use a sensor in your ETL pipeline.
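As a rough illustration of how a DAG, operators, and a sensor fit together, here is a sketch that assumes Airflow 2.4 or later with the HTTP provider package installed. It is not the DAG from the video: the DAG id, task ids, city, environment variable, and the `weathermap_api` connection (which would have to be created in Airflow and point at the OpenWeatherMap host) are all hypothetical.

```python
import json
import os
from datetime import datetime, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.http.sensors.http import HttpSensor

CITY = "Portland"  # hypothetical city
API_KEY = os.environ.get("OPENWEATHER_API_KEY", "")  # hypothetical variable name


def extract_weather() -> dict:
    """Fetch the current weather; the return value is pushed to XCom."""
    query = urlencode({"q": CITY, "appid": API_KEY})
    url = f"https://api.openweathermap.org/data/2.5/weather?{query}"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def transform_and_report(ti) -> None:
    """Pull the raw payload from XCom and log a small summary.

    Writing the result to S3 is omitted because the bucket is not known.
    """
    payload = ti.xcom_pull(task_ids="extract_weather")
    temp_c = round(payload["main"]["temp"] - 273.15, 2)  # Kelvin by default
    print(f"{payload['name']}: {temp_c} C")


with DAG(
    dag_id="weather_etl_sketch",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older 2.x releases use schedule_interval
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=2)},
) as dag:
    # Sensor: wait until the API endpoint responds before extracting.
    api_ready = HttpSensor(
        task_id="is_weather_api_ready",
        http_conn_id="weathermap_api",  # hypothetical Airflow connection
        endpoint="/data/2.5/weather",
        request_params={"q": CITY, "appid": API_KEY},
        poke_interval=30,
        timeout=300,
    )

    extract = PythonOperator(
        task_id="extract_weather",
        python_callable=extract_weather,
    )

    transform = PythonOperator(
        task_id="transform_and_report",
        python_callable=transform_and_report,
    )

    # Task order: sensor, then extract, then transform.
    api_ready >> extract >> transform
```

The sensor is placed first so that a temporary API outage delays the run instead of failing the extract task outright.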
As this is a hands-on project, I highly encourage you to first watch the video in its entirety without following along, so that you can better understand the concepts and the workflow. After that, either try to replicate the example on your own, consulting the video only when you are stuck, or watch the video a second time in its entirety while following along.

Remember the best way to learn is by doing it yourself – Get your hands dirty!
If you have any questions or comments, feel free to leave them in the comment section below.

Books I recommend
1. Grit: The Power of Passion and Perseverance https://amzn.to/3EZKSgb
2. Think and Grow Rich!: The Original Version, Restored and Revised: https://amzn.to/3Q2K68s
3. The Book on Rental Property Investing: How to Create Wealth With Intelligent Buy and Hold Real Estate Investing: https://amzn.to/3LLpXRy
4. How to Invest in Real Estate: The Ultimate Beginner's Guide to Getting Started: https://amzn.to/48RbuOb
5. Introducing Python: Modern Computing in Simple Packages https://amzn.to/3Q4driR
6. Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter 3rd Edition: https://amzn.to/3rGF73G

**************** Commands used in this video ****************
sudo apt update
sudo apt install python3-pip
sudo apt install python3.10-venv
python3 -m venv airflow_venv
sudo pip install pandas
sudo pip install s3fs
sudo pip install apache-airflow
airflow standalone
sudo apt install awscli
aws configure
aws sts get-session-token
**************** USEFUL LINKS ****************
Extract current weather data from Open Weather Map API using Python on AWS EC2: Extract current weather data from Ope...

How to remotely SSH (connect) Visual Studio Code to AWS EC2: How to remotely SSH (connect) Visual ...

PostgreSQL Playlist: Tutorial 1  What is Data? | What is ...

Weather Map API: https://openweathermap.org/api

Github Repo: https://github.com/YemiOla/data_engin...

Please don’t forget to LIKE, SHARE, COMMENT and SUBSCRIBE to our channel for more AWESOME videos.

DISCLAIMER: This video and description contain affiliate links. This means that when you buy through one of these links, we receive a small commission at no cost to you. This helps support us so we can continue making awesome and valuable content for you.

posted by fornlegh6