@yp-ai-lab

Add CR-HyperVR: GPU-free hypergraph-vector recommender for media discovery

Summary

This PR adds CR-HyperVR (Cloud Run Hypergraph-Vector Recommender) to the apps/ directory as a contribution to the Entertainment Discovery track.

CR-HyperVR directly addresses the hackathon's core challenge: the 45-minute decision problem. It combines semantic vector search with hypergraph signal propagation to deliver relevant media recommendations, all running on CPU-only infrastructure with no GPU dependencies at any stage (including fine-tuning).

Why this matters

Most production recommendation systems require expensive GPU infrastructure for both fine-tuning and inference. CR-HyperVR demonstrates that competitive results are achievable using:

  • CPU-only fine-tuning via Cloud Run Jobs
  • INT8 quantised inference via ONNX Runtime
  • Hypergraph augmentation to compensate for model size constraints

This makes the system accessible to teams without GPU budgets and deployable to edge environments where GPUs aren't available.
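To make the INT8 claim concrete, here is a minimal numpy sketch (not taken from the CR-HyperVR code) of the symmetric per-tensor weight quantisation that ONNX Runtime's dynamic quantisation applies; the function names and the toy matrix are illustrative.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantisation: map the largest weight
    magnitude to 127 and round everything else onto that grid."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for inference-time matmuls."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Reconstruction error is bounded by half a quantisation step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

The 4x memory reduction (float32 to int8) is what lets a MiniLM-class encoder serve embeddings at acceptable latency on Cloud Run's CPU-only instances.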

Architecture overview

```
Query → INT8 ONNX MiniLM embedding
           ↓
      pgvector seed candidates
           ↓
    ┌──────┴──────┐
    ↓             ↓
Co-watch      Genre
neighbours    neighbours
    ↓             ↓
    └──────┬──────┘
           ↓
   Weighted score fusion
           ↓
      Top-K results
```
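The fusion step at the bottom of the diagram can be sketched as a weighted blend of the three per-movie signals. The weights and movie IDs below are made up for illustration and do not reflect the deployed configuration.

```python
def fuse_scores(vector_sim, cowatch_hits, genre_hits,
                w_vec=0.6, w_cowatch=0.25, w_genre=0.15):
    """Blend vector similarity with hypergraph neighbour signals into a
    single ranking score per movie (weights are illustrative)."""
    scores = {}
    for movie_id, sim in vector_sim.items():
        scores[movie_id] = (
            w_vec * sim
            + w_cowatch * cowatch_hits.get(movie_id, 0.0)
            + w_genre * genre_hits.get(movie_id, 0.0)
        )
    # Highest fused score first: these become the Top-K results.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

top = fuse_scores(
    vector_sim={"m1": 0.9, "m2": 0.7},
    cowatch_hits={"m2": 1.0},
    genre_hits={"m1": 0.5, "m2": 0.5},
)
```

Note how a strong co-watch signal can promote a movie ("m2") past one with higher raw vector similarity ("m1"): this is how the hypergraph compensates for what a small quantised encoder misses.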

The system runs entirely on GCP's serverless stack:

| Component | Purpose |
| -- | -- |
| Cloud Run Services | Auto-scaling API endpoints |
| Cloud SQL (PostgreSQL 15 + pgvector) | Embeddings and hyperedge storage |
| Cloud Run Jobs | GPU-free fine-tuning pipeline execution |
| Cloud Storage | Model artifacts and datasets |
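In pgvector the seed-candidate step is a single `ORDER BY embedding <=> $1 LIMIT k` query, where `<=>` is cosine distance (the table and column names in that snippet are hypothetical). The numpy sketch below shows the exact quantity that operator ranks by:

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Same quantity as pgvector's <=> operator: 1 - cosine similarity."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def seed_candidates(query_vec, movie_vecs, k=5):
    """Rank stored movie embeddings by cosine distance to the query."""
    ranked = sorted(movie_vecs,
                    key=lambda m: cosine_distance(query_vec, movie_vecs[m]))
    return ranked[:k]

query = np.array([1.0, 0.0])
movies = {"m1": np.array([1.0, 0.0]),
          "m2": np.array([0.0, 1.0]),
          "m3": np.array([1.0, 1.0])}
nearest = seed_candidates(query, movies, k=2)
```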

Example: embed free text

```sh
curl -s -X POST \
  https://embedding-service-5pgvctvdpq-nw.a.run.app/embed/text \
  -H 'Content-Type: application/json' \
  -d '{"text":"neo-noir heist with witty banter"}'
```

Example: graph-powered recommendations

```sh
curl -s -X POST \
  https://infra-service-5pgvctvdpq-nw.a.run.app/graph/recommend \
  -H 'Content-Type: application/json' \
  -d '{"query":"space opera adventure","top_k":5}'
```

API surface

Embedding endpoints:

  • POST /embed/text — Embed free text
  • POST /embed/batch — Batch embed multiple texts
  • POST /embed/movie — Embed from title + genres + description
  • POST /embed/user — Embed user taste profile

Search endpoints:

  • POST /search/similar — Vector similarity search
  • POST /search/recommend — User profile recommendations
  • POST /graph/recommend — Hypergraph-enhanced recommendations

Operations:

  • GET /healthz, GET /ready, GET /metrics

Data sources

  • TMDB (tmdb-movies-dataset-2023-930k-movies): Movie metadata and descriptions
  • MovieLens 25M: User ratings for collaborative signal extraction
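One way the MovieLens ratings can yield a collaborative signal is to link two movies whenever the same user rated both highly; the sketch below illustrates that idea with a toy dataset (the threshold and function names are assumptions, not CR-HyperVR's actual pipeline code).

```python
from collections import Counter
from itertools import combinations

def cowatch_edges(ratings, like_threshold=4.0):
    """Build co-watch edge weights from (user, movie, rating) rows:
    two movies are linked once per user who rated both >= threshold."""
    liked = {}
    for user, movie, rating in ratings:
        if rating >= like_threshold:
            liked.setdefault(user, set()).add(movie)
    edges = Counter()
    for movies in liked.values():
        for pair in combinations(sorted(movies), 2):
            edges[pair] += 1
    return edges

edges = cowatch_edges([
    ("u1", "m1", 5.0), ("u1", "m2", 4.5),
    ("u2", "m1", 4.0), ("u2", "m2", 5.0),
    ("u2", "m3", 2.0),  # below threshold, contributes no edge
])
```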

Files added

```
apps/cr-hypervr/
├── app/              # FastAPI service implementation
├── db/               # Database schemas and migrations
├── pipeline/         # Data ingestion and processing
├── training/         # Fine-tuning, ONNX export, INT8 quantisation
├── scripts/          # Utilities and validation
├── Dockerfile
├── Makefile          # Full deployment automation
├── cloudbuild.yaml   # GCP Cloud Build configuration
└── README.md
```

Roadmap

The README outlines planned enhancements:

  • Curriculum sampling with temperature-controlled hard negatives
  • Weak supervision from genre and co-watch edges during fine-tuning
  • TinyBERT/MiniLM cross-encoder reranker
  • Nightly retraining with drift detection
  • Canary deployments with automated guardrails
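The first roadmap item is not implemented yet, but the general idea of temperature-controlled hard-negative sampling can be sketched as softmax sampling over negative similarities; everything below is an illustration of the concept, not planned code.

```python
import math
import random

def sample_hard_negatives(neg_sims, k, temperature=0.1, rng=None):
    """Sample negatives with probability proportional to
    exp(similarity / temperature): lower temperature concentrates the
    draw on the hardest (most query-similar) negatives."""
    rng = rng or random.Random(0)
    ids, sims = zip(*sorted(neg_sims.items()))
    m = max(sims)  # subtract max for numerical stability
    weights = [math.exp((s - m) / temperature) for s in sims]
    return rng.choices(ids, weights=weights, k=k)

# At a very low temperature, the hard negative is chosen almost surely.
picks = sample_hard_negatives({"easy": 0.1, "hard": 0.9},
                              k=4, temperature=0.01)
```

Raising the temperature flattens the distribution toward uniform sampling, which is the usual knob for scheduling a curriculum from easy to hard negatives during fine-tuning.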

Licence

MIT (compatible with repository licence)
