Run init evals on CI #290

@betegon

Summary

Set up a CI workflow to run the test/init-eval/ eval suite automatically (on PRs and/or on a schedule).

Context

The eval tests exercise sentry init against real project templates and use an LLM judge to score correctness. They currently only run locally because they require a running Mastra API server.

The Mastra agent lives in https://github.com/getsentry/cli-init-api/.

Requirements

  • Mastra server: Either deploy a persistent dev instance or spin one up as a service container in the workflow (pull from getsentry/cli-init-api).
  • GitHub environment: Create an init-eval environment with secrets:
    • MASTRA_API_URL — URL of the Mastra server
    • OPENAI_API_KEY — for the LLM judge
  • Workflow: Add .github/workflows/init-eval.yml with push, pull_request, and workflow_dispatch triggers. A previous version existed but was removed in d5d0b22; it can be used as a starting point.
  • Concurrency: Use a concurrency group to cancel in-progress runs on new pushes.
  • Matrix: Run each platform (express, nextjs, python-fastapi, python-flask, react-vite, sveltekit) as a separate matrix job with fail-fast: false.
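
A rough sketch of how the requirements above could fit together in one workflow file. It assumes the persistent-deployment option for the Mastra server (the URL comes from the init-eval environment), assumes main is the default branch, and uses a hypothetical script path as a stand-in for the real eval command, which this issue does not specify. The removed workflow in d5d0b22 may differ from this.

```yaml
# .github/workflows/init-eval.yml
# Sketch only: the run command and branch name below are assumptions.
name: Init evals

on:
  push:
    branches: [main]        # assumption: default branch is "main"
  pull_request:
  workflow_dispatch:

concurrency:
  # One group per workflow and ref, so a new push cancels the in-progress run.
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  init-eval:
    runs-on: ubuntu-latest
    environment: init-eval  # supplies MASTRA_API_URL and OPENAI_API_KEY
    strategy:
      fail-fast: false      # one failing platform does not cancel the others
      matrix:
        platform:
          - express
          - nextjs
          - python-fastapi
          - python-flask
          - react-vite
          - sveltekit
    steps:
      - uses: actions/checkout@v4
      - name: Run init eval (${{ matrix.platform }})
        # Hypothetical placeholder: replace with the actual command that runs
        # the test/init-eval/ suite for a single platform.
        run: ./scripts/run-init-eval.sh "${{ matrix.platform }}"
        env:
          MASTRA_API_URL: ${{ secrets.MASTRA_API_URL }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

One thing to keep in mind when choosing triggers: GitHub does not pass environment or repository secrets to pull_request runs that originate from forks, so the evals would fail on external contributions unless those runs are skipped or triggered manually.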

Open questions

  • Should the Mastra server be a persistent deployment (e.g. Cloudflare Worker) or spun up per-run?
  • Should this run on every PR or only on a schedule / manual trigger to save costs?
