GA4 E-commerce Analytics

Turn the GA4 public sample dataset into business insights, Power BI dashboards, a trained ML model, and a FastAPI prediction service running in Docker.


Preview

Architecture

[Image: architecture diagram (images/architecture.png)]

Entity Relationship Diagram (ERD)

[Image: ERD (images/erd.png)]

Power BI Dashboards

1. Channel Performance & Conversion [Image: channel dashboard]

2. E-Commerce Performance Trends [Image: trends dashboard]

3. Funnel Performance [Image: funnel dashboard]


1) Business Case & Questions

Use GA4 e-commerce events to improve funnel conversion and channel ROI.

Key questions

  • Which channels convert best from sessions → purchases?
  • How do sessions, revenue, and conversion change by day/week/month?
  • Where are the biggest drop-offs in the funnel?
  • (ML) Can we predict purchase propensity from simple session features?

2) Data & ERD

Source (BigQuery public dataset)
bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*

Star schema

  • dim_date (PK = date_pk)
  • dim_channel (PK = channel_key)
  • fact_funnel_by_date_channel
    • FK: date_pk → dim_date.date_pk
    • FK: channel_key → dim_channel.channel_key
    • Metrics: sessions, add_to_cart, purchases, revenue, conversion_rate, aov
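The two derived metrics in the fact table can be read as ratios of the base metrics. The sketch below is a minimal illustration, assuming conversion_rate = purchases / sessions and aov = revenue / purchases. The README does not state these formulas, so treat the definitions and the helper names (safe_divide, derive_kpis) as assumptions; the SQL in /sql is the reference.

```python
from typing import Optional


def safe_divide(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def derive_kpis(sessions: int, purchases: int, revenue: float) -> dict:
    """Compute the derived metrics for one (date, channel) row.

    Assumed definitions (not confirmed by the repository):
      conversion_rate = purchases / sessions
      aov             = revenue / purchases
    """
    return {
        "conversion_rate": safe_divide(purchases, sessions),
        "aov": safe_divide(revenue, purchases),
    }


# Illustrative numbers only, not values from the dataset.
row = derive_kpis(sessions=200, purchases=5, revenue=250.0)
```

Returning None for a zero denominator keeps rows with no sessions or no purchases from raising an error; a BI tool would typically show these as blanks.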

SQL models (/sql)

  • 10_dim_date.sql
  • 20_dim_channel.sql
  • 30_fact_funnel_by_date_channel.sql

Visual ERD
See images/erd.png (rendered above).


3) Project Architecture

Flow: GA4 → BigQuery → SQL models → Python notebook (exports + model) → Power BI → FastAPI (Docker)

  • BigQuery builds the star schema.
  • Notebook creates exports/*.csv for BI and trains a purchase propensity model.
  • Power BI reads exports for dashboards.
  • FastAPI serves a /predict endpoint using the trained model.
  • Docker packages the API for consistent local runs.

Diagram: images/architecture.png (rendered above).


4) Analysis (Python)

Notebook: notebooks/01_eda_kpis.ipynb

Covers

  • Channel performance (sessions, purchases, revenue, conversion rate)
  • Time trends (weekly/monthly)
  • Funnel analysis (Sessions → Add to Cart → Purchases)
  • Trains a RandomForestClassifier for purchase propensity

Exports produced (used by BI)

  • exports/channel_summary.csv
  • exports/time_summary.csv
  • exports/funnel_summary.csv
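As a rough illustration of the funnel analysis, step-to-step conversion and drop-off could be computed as below. This is a sketch, not the notebook's code: the function name funnel_step_rates and the counts are hypothetical.

```python
from typing import Dict, List, Tuple


def funnel_step_rates(stages: List[Tuple[str, int]]) -> List[Dict]:
    """For each consecutive pair of stages, compute conversion and drop-off."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty previous stage to avoid dividing by zero.
        conversion = count / prev_count if prev_count else 0.0
        rates.append({
            "step": f"{prev_name} -> {name}",
            "conversion": conversion,
            "drop_off": 1.0 - conversion,
        })
    return rates


# Illustrative counts only, not results from the dataset.
stages = [("sessions", 1000), ("add_to_cart", 200), ("purchases", 50)]
steps = funnel_step_rates(stages)

# The step with the highest drop-off is the first candidate for optimization.
largest_drop = max(steps, key=lambda s: s["drop_off"])
```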


6) Machine Learning

Model: RandomForestClassifier for purchase propensity.

Example features

  • Numeric: sessions, add_to_cart
  • Categorical one-hots: channel_group, day_name, month

Artifacts (/models)

  • purchase_rf.joblib
  • expected_cols.json ← columns used at inference
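The role of expected_cols.json can be sketched as follows: at inference time, a single request is turned into a feature vector whose columns appear in the same order the model saw during training. The column naming (channel_group_Referral and so on) follows the default pandas.get_dummies convention and is an assumption here, as are the function name one_hot_row and the example column list; the real list is whatever the notebook writes to expected_cols.json.

```python
import json


def one_hot_row(payload: dict, expected_cols: list) -> list:
    """Build one feature vector in the column order saved at training time."""
    features = {
        "sessions": payload["sessions"],
        "add_to_cart": payload["add_to_cart"],
        # Assumed one-hot naming: "<column>_<value>".
        f"channel_group_{payload['channel_group']}": 1,
        f"day_name_{payload['day_name']}": 1,
        f"month_{payload['month']}": 1,
    }
    # Columns missing from the request become 0; categories the model
    # never saw are silently dropped because they are not in expected_cols.
    return [features.get(col, 0) for col in expected_cols]


# Hypothetical stand-in for the contents of models/expected_cols.json.
expected_cols = json.loads(
    '["sessions", "add_to_cart", "channel_group_Direct", '
    '"channel_group_Referral", "day_name_Saturday", "month_12"]'
)

payload = {"sessions": 3, "add_to_cart": 1, "channel_group": "Referral",
           "day_name": "Saturday", "month": 12}
vector = one_hot_row(payload, expected_cols)
```

In a pandas-based pipeline the same alignment is commonly done by one-hot encoding the request and reindexing it to the saved column list with a fill value of 0.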

7) Prediction API (FastAPI) + Docker

Code: src/api.py

Run locally (virtual env)

python -m venv .venv

# Windows (PowerShell)
.\.venv\Scripts\Activate.ps1

# macOS/Linux
source .venv/bin/activate

pip install -r requirements.txt
uvicorn src.api:app --host 127.0.0.1 --port 8000 --reload

Health check

curl http://127.0.0.1:8000/health

Example prediction request

curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"sessions":3,"add_to_cart":1,"channel_group":"Referral","day_name":"Saturday","month":12}'

Docker (local)

docker build -t ga4-api .
docker run -p 8000:8000 ga4-api
curl http://127.0.0.1:8000/health

Endpoints

  • GET /health → service status
  • POST /predict → { "prediction": 0|1, "probability_purchase": float }
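For callers that prefer Python over curl, the /predict endpoint could be called with the standard library as sketched below. The URL and payload mirror the curl example in this section; the helper names build_request and predict are hypothetical, and the response keys are those listed under Endpoints.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/predict"


def build_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Create a JSON POST request for the /predict endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload: dict, url: str = API_URL, timeout: float = 5.0) -> dict:
    """Send the payload and return the decoded JSON response."""
    request = build_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {"sessions": 3, "add_to_cart": 1, "channel_group": "Referral",
           "day_name": "Saturday", "month": 12}
request = build_request(payload)

# Requires the API to be running locally:
# result = predict(payload)
# print(result["prediction"], result["probability_purchase"])
```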

8) Results & Business Impact

  • Identified top-performing channels by conversion rate and revenue.
  • Highlighted the largest funnel drop-offs (Sessions → Add to Cart, Cart → Purchase).
  • The ML model flagged high-propensity sessions for remarketing and CRO (conversion rate optimization).
  • Delivered a full end-to-end pipeline (SQL → Python → BI → ML → API → Docker) showing both technical depth and business value.

9) Reproduce

Clone & setup

git clone https://github.com/Egbe34/ga4-ecommerce-analytics.git
cd ga4-ecommerce-analytics
python -m venv .venv

# Windows (PowerShell)
.\.venv\Scripts\Activate.ps1

# macOS/Linux
source .venv/bin/activate

pip install -r requirements.txt

Run notebook → generate exports & models

Open notebooks/01_eda_kpis.ipynb

Run all cells to create exports/* and models/*
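Before starting the API, it can help to confirm that the notebook produced every file named in this README. The check below is a small sketch; the file list is taken from the Exports and Artifacts sections above, and the function name missing_artifacts is hypothetical.

```python
from pathlib import Path

# Files this README says the notebook produces.
EXPECTED_ARTIFACTS = [
    "exports/channel_summary.csv",
    "exports/time_summary.csv",
    "exports/funnel_summary.csv",
    "models/purchase_rf.joblib",
    "models/expected_cols.json",
]


def missing_artifacts(repo_root: str = ".") -> list:
    """Return the expected files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_ARTIFACTS if not (root / rel).is_file()]


# Run from the repository root; an empty list means everything is in place.
missing = missing_artifacts(".")
```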

Start API

uvicorn src.api:app --host 127.0.0.1 --port 8000 --reload

10) Folder Structure

ga4-ecommerce-analytics/
├─ sql/
├─ notebooks/
├─ exports/
├─ models/
├─ src/
├─ dashboard/
│  └─ powerbi/
├─ docs/
├─ images/
├─ Dockerfile
└─ README.md

License: MIT


