rickard-garnau/README.md

Hi there! I'm Rickard

Data Engineering student at STI (2025–2027) with a background in operations and team leadership. I build ETL pipelines, data platforms and medallion architectures with a focus on data integrity, automation and scalability.

Stack

Languages: Python (Pandas, OOP) · SQL (Advanced, CTEs)
Databases: PostgreSQL · DuckDB
Platforms & Tools: Databricks · PySpark · Delta Live Tables · Apache Kafka · Docker · FastAPI · Git
Modeling: ER modeling · 3NF normalization · Dimensional modeling
BI: Power BI · Streamlit · Evidence.dev

Projects

End-to-end medallion pipeline on Databricks for 7.4M ultra marathon results (1798–2022).

  • Streaming ingestion via Delta Live Tables into bronze
  • 20+ silver transformations: unit standardization, date parsing, performance normalization, deduplication
  • Dimensional model in gold: fct_results, dim_athlete, dim_event + analytical views
  • Genie space for ad hoc queries with manual verification notebook
  • Databricks dashboard built on gold views
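The silver-layer steps listed above (unit standardization, date parsing, deduplication) can be sketched in plain Python. This is a minimal illustration, not the project's PySpark/DLT code: the field names (`athlete_id`, `event_date`, `distance`, `unit`), the unit codes and the accepted date formats are all assumptions made for the example.

```python
from datetime import date, datetime

MILES_TO_KM = 1.609344  # international mile in kilometres


def standardize_distance(value: float, unit: str) -> float:
    """Convert a distance to kilometres. Unit codes 'km'/'mi' are assumed."""
    unit = unit.strip().lower()
    if unit == "km":
        return round(value, 3)
    if unit == "mi":
        return round(value * MILES_TO_KM, 3)
    raise ValueError(f"unknown unit: {unit!r}")


def parse_event_date(raw: str) -> date:
    """Try a few assumed date formats and return the first that parses."""
    for fmt in ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unparseable date: {raw!r}")


def to_silver(bronze_rows: list[dict]) -> list[dict]:
    """Clean bronze rows and drop duplicates on the cleaned key."""
    seen = set()
    silver = []
    for row in bronze_rows:
        cleaned = {
            "athlete_id": row["athlete_id"],
            "event_date": parse_event_date(row["event_date"]),
            "distance_km": standardize_distance(row["distance"], row["unit"]),
        }
        # Deduplicate after standardization so "50 mi" and "80.467 km" collide.
        key = (cleaned["athlete_id"], cleaned["event_date"], cleaned["distance_km"])
        if key in seen:
            continue
        seen.add(key)
        silver.append(cleaned)
    return silver
```

Deduplicating on the cleaned key rather than the raw row is the design point: two source records that differ only in unit or date format describe the same result and should collapse into one.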

FoodHub — Data Platform (FastAPI + Kafka + PostgreSQL + Docker)

End-to-end data pipeline with a Kafka producer/consumer for async streaming from the Spoonacular API into PostgreSQL (staging → curated). Cache-first strategy to minimize external API calls. ETL with Pydantic validation, NaN handling and fuzzy ingredient matching. Exposed via FastAPI with search, history and query statistics endpoints. Served as Scrum Master in a team of 5.
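A minimal sketch of the cache-first lookup and fuzzy ingredient matching described above. It makes several assumptions: an in-memory dict stands in for whatever store the project actually uses as its cache, `fetch` is a hypothetical callable wrapping the external API, and the standard library's `difflib` stands in for the project's fuzzy-matching approach, which the README does not specify.

```python
from difflib import get_close_matches
from typing import Callable, Optional


class CacheFirstClient:
    """Serve from cache when possible; call the external API only on a miss."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch              # hypothetical wrapper around the API
        self._cache: dict[str, dict] = {}
        self.external_calls = 0          # simple counter for query statistics

    def get(self, query: str) -> dict:
        key = query.strip().lower()      # normalize so "Pasta" and "pasta " share a key
        if key not in self._cache:
            self.external_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]


def match_ingredient(raw: str, known: list[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known ingredient, or None if nothing is similar enough."""
    hits = get_close_matches(raw.strip().lower(), known, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```

Normalizing the cache key before lookup matters as much as the cache itself: without it, trivially different spellings of the same query each cost an external call.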

Stock Data Pipeline (FastAPI + PostgreSQL + Docker)

REST API ingestion of stock data stored as JSONB in PostgreSQL. ELT pipeline with Pandas for cleaning, validation and outlier detection. Flagging and rejection logic for data quality. Credentials loaded from .env; containerized with Docker.
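One way the flagging and rejection logic could look, sketched with the standard library rather than Pandas. The split is an assumption: rows with an invalid price are rejected outright, while valid but statistically unusual rows are flagged for review. The `close` field name and the IQR rule with `k = 1.5` are also illustrative choices, not taken from the repository.

```python
from statistics import quantiles


def classify_prices(rows: list[dict], k: float = 1.5):
    """Split rows into (accepted, flagged, rejected) by price quality."""
    valid, rejected = [], []
    for row in rows:
        price = row.get("close")
        # Reject missing, non-numeric, NaN (NaN != NaN) or non-positive prices.
        if not isinstance(price, (int, float)) or price != price or price <= 0:
            rejected.append(row)
        else:
            valid.append(row)

    # Too few points for meaningful quartiles: accept all valid rows unflagged.
    if len(valid) < 4:
        return valid, [], rejected

    q1, _, q3 = quantiles([r["close"] for r in valid], n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr

    accepted, flagged = [], []
    for row in valid:
        # Flag (rather than drop) rows outside the IQR fences.
        if lower <= row["close"] <= upper:
            accepted.append(row)
        else:
            flagged.append(row)
    return accepted, flagged, rejected
```

Keeping flagged rows separate from rejected ones preserves real but extreme market moves for inspection instead of silently discarding them.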

Background

15 years as Team Leader at Citymail Sweden AB — responsible for process optimization, flow management and daily KPI delivery across teams of 6–10 people. That background makes me take reliability and edge cases seriously.


Let's talk data. I'm happy to discuss pipeline architecture, medallion design or why your silver layer is lying to you.

More projects and course work under my repositories.

Seeking an LIA (work-based learning) internship. Open to data engineering roles.

Pinned

  1. marathos_rickard_garnau (Public)

    Ultra marathon analytics pipeline built with Databricks, PySpark and DLT — bronze to gold medallion architecture.

    Jupyter Notebook

  2. visualization-project-streamlit (Public)

    Forked from LisaYllander92/Visualization_project

    A data-driven cultural guide for Stockholm built with Streamlit. Aggregates event data from Ticketmaster, VisitStockholm, Fasching and Berns, combined with real-time weather forecasts from Open-Met…

    Jupyter Notebook

  3. stock-data-pipeline (Public)

    ELT data pipeline for stock data using FastAPI, PostgreSQL & Pandas

    Python 1

  4. data-platform-project (Public)

    Recipe search platform with fuzzy matching, Kafka streaming and PostgreSQL. Built with FastAPI, Docker and Supabase.