Threat model highlights:
- Untrusted user inputs (files/strings) that drive execution
- Resource exhaustion (denial of service): high CPU/memory usage and storage exhaustion
- SSRF / data exfiltration attempts via outbound network calls
Current controls (implemented):
- Pydantic schema validation on all job requests
- Per-job working directory under STORAGE_DIR
- Containerized execution for API/worker
- Read-only root filesystem for worker (compose)
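
As an illustration of the per-job working directory control, here is a minimal Python sketch of how a job directory could be created safely under STORAGE_DIR. The function name `make_job_dir`, the job ID pattern, and the `0o700` mode are assumptions made for this example, not the project's actual code.

```python
import re
from pathlib import Path

# Assumed rule for this sketch: job IDs are short, filesystem-safe tokens.
_JOB_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_job_dir(storage_dir: Path, job_id: str) -> Path:
    """Create and return an isolated working directory for one job."""
    # Reject anything that could carry path separators or "..".
    if not _JOB_ID_RE.fullmatch(job_id):
        raise ValueError(f"invalid job id: {job_id!r}")

    root = storage_dir.resolve()
    job_dir = (root / job_id).resolve()

    # Defense in depth: the resolved path must sit directly under the root
    # (this also catches a pre-existing symlink pointing elsewhere).
    if job_dir.parent != root:
        raise ValueError(f"job dir escapes storage root: {job_dir}")

    # exist_ok=False: a job never reuses another job's directory.
    job_dir.mkdir(mode=0o700, parents=False, exist_ok=False)
    return job_dir
```

Validating the ID before building the path, then re-checking the resolved path, keeps user-supplied strings from steering writes outside the storage root.
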
Controls (recommended / planned):
- Strict schema validation + max payload sizes
- Per-job resource limits (CPU/mem/time)
- Execute R as non-root on a read-only base FS, with writes confined to the per-job working dir
- Network egress allow-list (only KEGG endpoints if required)
- Quotas + rate limits + auth for production
- Clear retention policy + automatic cleanup
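
The retention and cleanup item above could be sketched roughly as follows. This is a hedged example, not a planned implementation: the function name `cleanup_expired_jobs`, the age-based policy, and the use of directory mtime as the "last activity" signal are all assumptions. It also assumes every subdirectory of the storage root is a job directory.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def cleanup_expired_jobs(
    storage_dir: Path,
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Remove job directories older than max_age_seconds.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in storage_dir.iterdir():
        # Only touch real directories; never follow or delete via symlinks.
        if entry.is_symlink() or not entry.is_dir():
            continue
        # Assumption: directory mtime approximates the job's last activity.
        # It only changes when entries are added or removed at the top level.
        if now - entry.stat().st_mtime > max_age_seconds:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return sorted(removed)
```

A scheduled task could call `cleanup_expired_jobs(Path(storage_dir), max_age_seconds=7 * 24 * 3600)`, where the seven-day window is a placeholder for whatever retention policy is eventually chosen.
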