Lightweight library to write, orchestrate, and test your SQL ETL, with data integrity in mind.
Updated Jan 11, 2024 - Python
Data Engineering portfolio projects, resources used to study data tools...
A first course for data engineering workflows
Examples that I use to learn and show Apache Beam
This repository collects all of the projects completed during the Udacity Data Engineering Nanodegree program.
Automated trade data pipeline for analyzing the impact of Free Trade Agreements (FTAs) and tariffs on U.S. agricultural exports. Integrates global datasets from USDA, WTO, World Bank, and IMF.
An ETL pipeline built on AWS cloud services that transforms YouTube video statistics data. Data is downloaded from Kaggle, uploaded to an S3 bucket, and cataloged with AWS Glue for querying with Athena. AWS Lambda and Glue convert the data to Parquet format and store it in a cleansed S3 bucket. AWS QuickSight then visualizes the materialized data.