Fast, easy, and surprisingly satisfying.
Craving something good?
FoodHub turns your leftovers into something worth eating.
Save money, cut waste, and satisfy your cravings, with zero planning required.
FoodHub is a robust data platform and backend API designed to bridge the gap between raw ingredients and delicious meals. It features:
- Smart Search: Search for recipes by entering up to 10 ingredients.
- Fuzzy Matching: Handles minor typos and misspellings (e.g., "avocdo" -> "avocado") using RapidFuzz.
- Ranking Logic: Recipe suggestions are ranked by the number of matching ingredients.
- Cache-First Strategy: Checks the database before calling the Spoonacular API to minimize API costs.
- Search Statistics: Tracks and visualizes the most popular ingredient searches using Matplotlib.
- Frontend: A lightweight web interface for searching recipes and viewing search history and search statistics.
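As a rough illustration of how fuzzy matching and ranking could fit together, here is a minimal sketch. It is not the project's actual code: the standard-library `difflib` stands in for RapidFuzz, and `KNOWN_INGREDIENTS`, the 0.8 cutoff, `normalize()`, and `rank_recipes()` are all hypothetical names and values chosen for the example.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; the real ingredient list is not shown in this README.
KNOWN_INGREDIENTS = ["avocado", "chicken", "tomato", "garlic", "rice"]
MAX_INGREDIENTS = 10  # the search accepts up to 10 ingredients


def normalize(term: str) -> str:
    """Lowercase a term and correct small typos against the known vocabulary."""
    term = term.strip().lower()
    # difflib is a stand-in for RapidFuzz; the 0.8 cutoff is an assumed threshold.
    matches = get_close_matches(term, KNOWN_INGREDIENTS, n=1, cutoff=0.8)
    return matches[0] if matches else term


def rank_recipes(query: list[str], recipes: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Return (title, match_count) pairs, best first, skipping recipes with no match."""
    wanted = {normalize(term) for term in query[:MAX_INGREDIENTS]}
    scored = [(title, len(wanted & set(ings))) for title, ings in recipes.items()]
    scored = [item for item in scored if item[1] > 0]
    # sorted() is stable, so recipes with equal counts keep their original order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


recipes = {
    "Guacamole": ["avocado", "tomato", "garlic"],
    "Chicken rice": ["chicken", "rice"],
    "Plain rice": ["rice"],
}
ranking = rank_recipes(["avocdo", "tomato", "rice"], recipes)
```

Here the misspelled "avocdo" is corrected before counting, so "Guacamole" ranks first with two matching ingredients.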
The project is organized into a modular directory structure to ensure a clean separation of concerns between the API, data processing, and infrastructure.
.
├── app/                      # Core application (FastAPI)
│   ├── api/                  # API endpoints (routes)
│   ├── clients/              # External API clients (Spoonacular)
│   ├── consumer/             # Kafka Consumer (listens and saves data)
│   ├── producer/             # Kafka Producer (sends messages)
│   ├── repositories/         # Database operations
│   ├── schema/               # Pydantic models (internal & external)
│   ├── services/             # Business logic (search, filtering)
│   └── transformers/         # Data transformation logic
├── data/                     # Data cleaning and validation scripts
│   ├── cleaning_recipe.py    # Logic for cleaning recipe data
│   └── flagged_recipe.py     # Logic for handling invalid data
├── docs/                     # Architecture models and sprint logs
├── frontend/                 # Web interface (HTML, CSS, JS)
├── database.py               # Database connection pool (PostgreSQL)
├── main.py                   # FastAPI application entry point
├── docker-compose.yml        # Infrastructure and container orchestration
├── Dockerfile                # Backend build instructions
├── init.sql                  # Database initialization script
├── pyproject.toml            # Project metadata and dependencies (uv)
├── uv.lock                   # Dependency lock file
├── README.md                 # Project documentation and setup guide
└── REQUIREMENTS.md           # Detailed project requirements
The system is built as a modern data engineering pipeline within a Docker Compose environment, ensuring seamless communication between microservices, streaming components, and cloud storage:
The journey begins at the Frontend (localhost:8000). Users can search for recipes by ingredients, view their search history, and access data insights through automated Matplotlib visualizations.
The backend acts as the system's brain, managing the flow of data:
- Fuzzy Search: Uses RapidFuzz to handle typos, ensuring "chiken" still returns "chicken" recipes.
- Database Connectivity: Leverages psycopg for high-performance communication with the Supabase instance.
- Smart Caching: FastAPI first checks the `curated_recipes` table. On a "cache hit," data is returned instantly to save API tokens. On a "miss," it fetches fresh data from the Spoonacular API.
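The cache-first lookup can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: a plain dict stands in for the `curated_recipes` table, `fetch_remote` stands in for the Spoonacular client, and `cache_key()` and `search_cache_first()` are hypothetical helpers. In the real system the write path goes through Kafka, whereas the sketch writes to the dict directly.

```python
from typing import Callable

Recipe = dict[str, object]


def cache_key(ingredients: list[str]) -> str:
    """Build an order-independent key so 'rice, chicken' and 'chicken, rice' share a cache entry."""
    return ",".join(sorted({item.strip().lower() for item in ingredients}))


def search_cache_first(
    ingredients: list[str],
    cache: dict[str, list[Recipe]],
    fetch_remote: Callable[[list[str]], list[Recipe]],
) -> tuple[list[Recipe], str]:
    """Return the recipes plus "hit" or "miss" to show which path was taken."""
    key = cache_key(ingredients)
    if key in cache:
        return cache[key], "hit"  # no external API call needed
    recipes = fetch_remote(ingredients)  # only reached on a miss
    cache[key] = recipes  # simplification: the real write path goes through Kafka
    return recipes, "miss"
```

A second search for the same ingredients, in any order, is then served from the cache without calling `fetch_remote` again.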
To ensure asynchronous processing and scalability, we utilize Apache Kafka:
- Producer: When new data is fetched from Spoonacular, FastAPI acts as a Producer, pushing the results as events into the Kafka Cluster.
- Consumer: A dedicated Kafka Consumer listens to the stream, reads the incoming data, and persists the raw payloads into the staging layer.
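To make the producer and consumer roles concrete, the sketch below mimics the flow with an in-memory `queue.Queue` in place of a Kafka topic. It only illustrates the hand-off and is an assumption throughout: the real services use a Kafka client and a database, and `produce()` and `consume_all()` are hypothetical names.

```python
import json
from queue import Empty, Queue


def produce(topic: Queue, recipe: dict) -> None:
    """Serialize a fetched recipe and push it onto the topic (producer side)."""
    topic.put(json.dumps(recipe).encode("utf-8"))


def consume_all(topic: Queue, staging: list[dict]) -> int:
    """Drain the topic and append each raw payload to the staging store (consumer side)."""
    saved = 0
    while True:
        try:
            message = topic.get_nowait()
        except Empty:
            return saved  # nothing left to read
        staging.append(json.loads(message))
        saved += 1


topic = Queue()   # stand-in for a Kafka topic
staging = []      # stand-in for the staging layer
produce(topic, {"id": 1, "title": "Guacamole"})
produce(topic, {"id": 2, "title": "Chicken rice"})
saved = consume_all(topic, staging)
```

Because the producer only enqueues messages, the API request does not have to wait for the database write to finish.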
Data is persisted into three distinct functional layers within Supabase to separate concerns:
- `staging_recipes` (Raw Data): Stores unprocessed JSON payloads from Kafka for historical auditing and backup.
- `curated_recipes` (Validated Data): Holds cleaned, structured, and validated recipe data, optimized for frontend performance.
- `search_log` (Analytics): Logs user search queries (ingredients) to provide the data source for search frequency statistics.
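As an illustration of how the logged searches could feed the frequency statistics, the following sketch counts ingredients with `collections.Counter`. The in-memory list of searches and the `top_ingredients()` helper are assumptions made for the example. The actual project reads from the `search_log` table, and its query is not shown here.

```python
from collections import Counter


def top_ingredients(search_log: list[list[str]], limit: int = 10) -> list[tuple[str, int]]:
    """Count how often each ingredient was searched and return the most frequent ones."""
    counts = Counter(
        ingredient.strip().lower()
        for search in search_log
        for ingredient in search
    )
    return counts.most_common(limit)


# Each inner list represents the ingredients of one logged search.
log = [["chicken", "rice"], ["Chicken"], ["tomato", "rice"], ["chicken"]]
top_two = top_ingredients(log, limit=2)
```

The resulting (ingredient, count) pairs are the kind of data a bar chart of popular searches can be drawn from.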
Throughout the flow, data moves through a classic Extract → Transform → Load process:
- Extract: Raw recipe data is fetched from the Spoonacular API via `get_recipe_information()`, returned as JSON with camelCase fields that may contain `NaN` values and HTML-tagged instructions.
- Transform: Data is cleaned and normalized in `recipe_transformers.py` and `ingredient_service.py`:
  - `clean_numeric()` replaces `NaN`/`Inf` with valid defaults
  - camelCase fields are converted to snake_case (e.g. `readyInMinutes` → `ready_in_minutes`)
  - HTML tags are stripped from instructions using regex
  - Ingredients are normalized to lowercase for fuzzy matching
- Load: Cleaned data is persisted into two layers following a Medallion Architecture:
  - `staging_recipes`: raw JSON payloads for auditing (via Kafka Consumer)
  - `curated_recipes`: validated, structured data ready for the frontend
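The transform step can be approximated with a few small functions. This is a hedged re-creation, not the code in `recipe_transformers.py`: the signature of `clean_numeric()` is assumed, and `camel_to_snake()`, `strip_html()`, `transform()`, and the `instructions` and `ingredients` field names are hypothetical.

```python
import math
import re


def clean_numeric(value: float, default: float = 0.0) -> float:
    """Replace NaN or infinite values with a default (assumed signature)."""
    return value if math.isfinite(value) else default


def camel_to_snake(name: str) -> str:
    """Convert a camelCase key such as readyInMinutes to ready_in_minutes."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def strip_html(text: str) -> str:
    """Remove HTML tags with a simple regex (adequate for tags like <p> and <b>)."""
    return re.sub(r"<[^>]+>", "", text).strip()


def transform(raw: dict) -> dict:
    """Apply all cleaning rules to one raw recipe payload."""
    cleaned = {}
    for key, value in raw.items():
        key = camel_to_snake(key)
        if isinstance(value, float):
            value = clean_numeric(value)
        elif key == "instructions" and isinstance(value, str):
            value = strip_html(value)
        elif key == "ingredients" and isinstance(value, list):
            value = [str(item).strip().lower() for item in value]
        cleaned[key] = value
    return cleaned


raw = {
    "readyInMinutes": float("nan"),
    "instructions": "<p>Mix <b>well</b>.</p>",
    "ingredients": ["Avocado ", "RICE"],
}
curated = transform(raw)
```

A regex is enough for simple markup like this. Heavily nested or malformed HTML would call for a real parser.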
- Python 3.12
- Docker Desktop
- uv (install here)
- A Supabase account and project (get started here)
- Clone the repository
git clone https://github.com/LisaYllander92/LAB2_Data_platform_FoodHub.git
cd LAB2_Data_platform_FoodHub
- Install dependencies
uv sync
- Set up your `.env` file with your Supabase and Spoonacular credentials:
DB_HOST=your-db-host
DB_PORT=6543
DB_NAME=postgres
DB_USER=postgres
DB_PASSWORD=your-supabase-password
SPOONACULAR_API_KEY=your-api-key
SPOONACULAR_USERNAME=your-spoonacular-username
SPOONACULAR_HASH=your-spoonacular-hash
- Initialize the database schema in Supabase by running `init.sql` in the Supabase SQL editor.
- Start all services
docker compose up --build
- Access the app:
- Frontend: http://localhost:8000
- API docs (Swagger): http://localhost:8000/docs
- Kafka UI: http://localhost:8080
- Search statistics plot: http://localhost:8000/api/recipes/stats/plot
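For orientation, the database variables from the `.env` step could be combined into a PostgreSQL connection string along these lines. This is an assumption about how `database.py` might consume them, not a copy of it. `build_dsn()` is a hypothetical helper, and the keyword/value format shown is the standard libpq connection string that psycopg accepts.

```python
import os
from collections.abc import Mapping

REQUIRED = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")


def build_dsn(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a libpq-style connection string from the .env variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    port = env.get("DB_PORT", "6543")  # default mirrors the sample .env above
    # Note: values containing spaces or quotes would need libpq quoting.
    return (
        f"host={env['DB_HOST']} port={port} dbname={env['DB_NAME']} "
        f"user={env['DB_USER']} password={env['DB_PASSWORD']}"
    )


example_env = {
    "DB_HOST": "your-db-host",
    "DB_NAME": "postgres",
    "DB_USER": "postgres",
    "DB_PASSWORD": "your-supabase-password",
}
dsn = build_dsn(example_env)
```

Failing early on missing variables surfaces a misconfigured `.env` at startup instead of on the first query.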
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/recipes/search` | Search recipes by ingredients |
| GET | `/api/recipes/detail/{title}` | Get full recipe details |
| GET | `/api/recipes/history` | View recently saved recipes |
| GET | `/api/recipes/popular-searches` | Top 10 most searched ingredients |
| GET | `/api/recipes/stats/plot` | Bar chart of popular searches |
| POST | `/api/recipes` | Send a recipe to Kafka |
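A client might build request URLs for the search and detail endpoints as sketched below. The paths come from the table above, but the query parameter name `ingredients` and its comma-separated format are assumptions. Check the Swagger docs at `/docs` for the actual parameters. `search_url()` and `detail_url()` are hypothetical helpers.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"


def search_url(ingredients: list[str]) -> str:
    """Build the search URL; the `ingredients` parameter name is an assumption."""
    query = urlencode({"ingredients": ",".join(ingredients[:10])})
    return f"{BASE_URL}/api/recipes/search?{query}"


def detail_url(title: str) -> str:
    """Build the detail URL, percent-encoding the title for use in the path."""
    return f"{BASE_URL}/api/recipes/detail/{quote(title, safe='')}"


print(search_url(["chicken", "rice"]))
print(detail_url("Chicken Soup"))
```

Encoding the title matters because recipe titles often contain spaces or slashes that would otherwise break the path.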
Detailed documentation of our architectural design and agile development process.
Visualizing our database structure from concept to final implementation.
| Phase | View Model |
|---|---|
| Step 1: Conceptual | View Conceptual Model |
| Step 2: Initial Logical | View First Logical Model |
| Step 3: Final Implementation | View Final Logical Model |
We followed an agile methodology, documenting every step through activity logs and retrospectives.
| Sprint | Activity Logs | Retrospectives |
|---|---|---|
| Sprint 1 | View Log | View Retro |
| Sprint 2 | View Log | View Retro |
| Sprint 3 | View Log | View Retro |
| Sprint 4 | View Log | View Retro |
Tip
Sources & AI Usage: You can find our detailed documentation on tools, sources, and AI-assisted development here: FoodHub Sources PDF

