In this video, we will cover how to automate a Python ETL (Extract, Transform, Load) pipeline with Apache Airflow. We will use the TaskFlow API introduced in Airflow 2.0, which makes it much easier to author clean ETL code without extra boilerplate via the @task decorator. Airflow organizes your workflows as Directed Acyclic Graphs (DAGs) composed of tasks.
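As a minimal sketch of what a TaskFlow DAG looks like (the DAG id etl_demo, the schedule, and the sample record are illustrative, not taken from the video):

from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule_interval="@daily", start_date=datetime(2021, 1, 1), catchup=False)
def etl_demo():
    @task
    def extract():
        # Pull a record from the source (hypothetical sample data)
        return {"product_id": 1, "name": "Road Bike"}

    @task
    def transform(record: dict):
        # Apply a simple transformation
        record["name"] = record["name"].upper()
        return record

    @task
    def load(record: dict):
        # Write the record to the target (printed here for illustration)
        print(f"Loading: {record}")

    # TaskFlow wires the dependencies from these calls
    load(transform(extract()))

etl_dag = etl_demo()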
In this tutorial, we will see how to design an ETL pipeline with Python. We will use SQL Server's AdventureWorks database as the source and load the data into PostgreSQL with Python, as sketched below. We will focus on the product hierarchy and enhance our initial data pipeline to give you a complete overview of the extract, transform, and load process.
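A rough sketch of that SQL Server-to-PostgreSQL flow with pandas and SQLAlchemy (the connection strings, the DimProduct table, and the column names are assumptions, not necessarily the ones used in the video):

import pandas as pd
from sqlalchemy import create_engine

# Source: SQL Server (AdventureWorks); target: PostgreSQL (placeholder credentials)
src_engine = create_engine("mssql+pyodbc://user:pass@server/AdventureWorksDW?driver=ODBC+Driver+17+for+SQL+Server")
dst_engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/adventureworks")

# Extract: read a product table from SQL Server
df = pd.read_sql("SELECT * FROM DimProduct", src_engine)

# Transform: keep a few columns of the product hierarchy
df = df[["ProductKey", "EnglishProductName", "ProductSubcategoryKey"]]

# Load: write the result into PostgreSQL
df.to_sql("dim_product", dst_engine, if_exists="replace", index=False)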
Link to Medium article on this topic: [ Link ]
Link to previous video: [ Link ]
Link to Pandas video: [ Link ]
Link to GitHub repo: [ Link ]
Link to Cron Expressions: [ Link ]
Subscribe to our channel:
[ Link ]
---------------------------------------------
Follow me on social media!
GitHub: [ Link ]
Instagram: [ Link ]
LinkedIn: [ Link ]
---------------------------------------------
#ETL #Python #Airflow
Topics covered in this video:
0:00 - Introduction to Airflow
2:49 - The Setup
3:40 - Script ETL pipeline: Extract
5:52 - Transform
7:39 - Load
8:00 - Define Directed Acyclic Graph (DAG)
9:36 - Airflow UI: DAG enable & run
10:09 - DAG Overview
10:29 - Test ETL Pipeline