In this video we continue with the data load tool (dlt) library and explore how to perform incremental data loads. In ETL (Extract, Transform, Load), an incremental load means loading only new or changed data. This approach processes minimal data, uses fewer resources, and therefore takes less time. dlt calls this the merge write disposition.
We keep the latest snapshot of each record in the data warehouse: new records are inserted into the dimension table and existing ones are updated in place. This is referred to as an upsert.
Link to GitHub repo: [ Link ]
Link to previous video (db setup): [ Link ]
Python-based incremental load (Source Change Detection): [ Link ]
Python-based incremental load (Destination Change Comparison): [ Link ]
dlt docs on incremental load: [ Link ]
Link to channel's site: [ Link ]
--------------------------------------------------------------
💥Subscribe to our channel:
[ Link ]
📌 Links
-----------------------------------------
Follow me on social media!
🔗 GitHub: [ Link ]
📸 Instagram: [ Link ]
📝 LinkedIn: [ Link ]
🔗 [ Link ]
🚀 [ Link ]
-----------------------------------------
#ETL #incremental #dlt
Topics in this video (click to jump around):
==================================
0:00 - Introduction to data load tool (dlt) incremental load
0:42 - Source Change Detection: Merge Write Disposition
1:38 - How Merge Write Disposition works
2:08 - Source SQL Server DB setup
2:33 - DLT Incremental Load Function
3:27 - Test Incremental Load Function
5:03 - Update/Insert records in Source SQL DB
5:29 - Run the dlt pipeline
5:36 - Review pipelines results
6:00 - Coming Soon