data-engineering-project

Here are 24 public repositories matching this topic...

End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.

  • Updated Jan 19, 2026
  • Jupyter Notebook

A hybrid data pipeline architecture that combines Microsoft Fabric, Azure Cloud, and Power BI. The pipeline is designed for real-time data ingestion, multi-layered processing, and analytics that deliver business-critical insights.

  • Updated Dec 29, 2024
  • Python

Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.

  • Updated Aug 25, 2024
  • Python

MCP agent for semantic metrics governance with built-in trust scoring, lineage visualization, and conversational metric definition. Designed for data teams working with dbt, LookML, and modern semantic layers.

  • Updated Jan 4, 2026
  • Python

Automated trade data pipeline for analyzing the impact of Free Trade Agreements (FTAs) and tariffs on U.S. agricultural exports. Integrates global datasets from USDA, WTO, World Bank, and IMF.

  • Updated Mar 19, 2026
  • Python
