ci(pr): use CNCF Oracle runners for e2e #9

Closed
kvaps wants to merge 1 commit into main from ci/e2e-oracle-runner

kvaps (Member) commented May 22, 2026

Summary

Switch the e2e job from GitHub-hosted `ubuntu-latest` to the CNCF-provided Oracle runner pool on the cozystack org, mirroring the pattern in cozystack/cozystack's pull-requests workflow.

```yaml
runs-on: ${{ contains(github.event.pull_request.labels.*.name, 'debug') && 'self-hosted' || 'oracle-vm-24cpu-96gb-x86-64' }}
```

| PR label | Runner | Why |
| --- | --- | --- |
| (default) | `oracle-vm-24cpu-96gb-x86-64` (24 CPU / 96 GB / x86-64, CNCF-funded Oracle pool) | Ephemeral, big enough for real-DRBD QEMU stands |
| `debug` | `self-hosted` | Long-lived so the breakpoint step can attach SSH |
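
For context, here is a minimal sketch of how the `runs-on` expression might sit inside the job. The job id `e2e`, the `labeled` trigger type, and the step contents are assumptions for illustration, not taken from the actual workflow. In GitHub Actions expressions, `cond && A || B` behaves like a ternary as long as `A` is truthy, which holds here because `'self-hosted'` is a non-empty string.

```yaml
# Hypothetical excerpt; job id, trigger types, and steps are assumptions.
on:
  pull_request:
    # `labeled` is added so that attaching the `debug` label starts a new run;
    # the default types are opened, synchronize, and reopened.
    types: [opened, synchronize, reopened, labeled]

jobs:
  e2e:
    # contains(...) is true when any label on the PR is named `debug`.
    # true  -> 'self-hosted'
    # false -> 'oracle-vm-24cpu-96gb-x86-64'
    runs-on: ${{ contains(github.event.pull_request.labels.*.name, 'debug') && 'self-hosted' || 'oracle-vm-24cpu-96gb-x86-64' }}
    steps:
      - uses: actions/checkout@v4
      - name: Show which runner was selected
        # RUNNER_NAME is a default environment variable set by GitHub Actions.
        run: echo "Running on ${RUNNER_NAME}"
```

Printing the runner name in the first step is one cheap way to confirm from the job log which pool a given PR landed on.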

Why

`ubuntu-latest` (4 CPU / 16 GB / no KVM) only fits the kind-based tier. The real-DRBD QEMU stand (`make e2e` → Talos VMs in `.work/`) needs ~50 GB RAM and KVM nested virt — Oracle 24 CPU / 96 GB has both. With this swap the e2e job can drive the cli-matrix suite end-to-end without a separate bare-metal stand.
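
The two requirements named above (KVM access and roughly 50 GB of RAM) could be asserted early in the job so that a mis-scheduled run fails fast instead of timing out. The step below is a hypothetical sketch, not part of this PR; the step name and the 50 GB threshold (taken from the "~50 GB" estimate above) are assumptions.

```yaml
# Hypothetical preflight step; not part of the actual workflow.
steps:
  - name: Preflight check for KVM and memory
    run: |
      # /dev/kvm must exist and be readable and writable by the runner user.
      if [ ! -r /dev/kvm ] || [ ! -w /dev/kvm ]; then
        echo "/dev/kvm is missing or not accessible"
        exit 1
      fi
      # MemTotal in /proc/meminfo is reported in kB; convert to whole GB.
      mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
      echo "MemTotal: ${mem_gb} GB"
      if [ "${mem_gb}" -lt 50 ]; then
        echo "Less than 50 GB RAM; the real-DRBD QEMU stand is unlikely to fit"
        exit 1
      fi
```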

Also

  • Drop `continue-on-error: true` from the e2e job. It was added when the runner was the cramped `ubuntu-latest`; with real hardware the job should gate the PR again.
  • Bump `timeout-minutes` from 60 → 180 to match cozystack's e2e budget (real-DRBD scenarios can run long).
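
The two job-level changes above might look like the following sketch. The job id `e2e` and the single `make e2e` step are assumptions for illustration; the real job likely has more steps.

```yaml
# Hypothetical excerpt showing only the settings discussed above.
jobs:
  e2e:
    # Raised from 60 to leave room for long real-DRBD scenarios.
    timeout-minutes: 180
    # `continue-on-error: true` is intentionally absent, so a failing
    # e2e run now marks the job (and the PR check) as failed.
    steps:
      - uses: actions/checkout@v4
      - name: Run e2e suite
        run: make e2e
```

Note that a failing job only blocks merging if the check is also marked as required in the repository's branch protection rules; whether that is configured here is not stated in this PR.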

Test plan

  • CI picks the Oracle runner (regular PR).
  • Add `debug` label → CI picks `self-hosted`; breakpoint step has somewhere to attach.
  • e2e suite completes within 180 min.

Mirror cozystack/cozystack's PR-pipeline runner selection: a labelled
`debug` PR lands on a long-lived `self-hosted` runner (so the
breakpoint step has somewhere stable to attach SSH); regular PRs land
on the CNCF-provided Oracle pool `oracle-vm-24cpu-96gb-x86-64` (24
CPU / 96 GB / x86-64). Both labels are registered org-wide on the
cozystack org by the CNCF infra team — no per-repo setup needed.

96 GB RAM is enough headroom for real-DRBD QEMU stands (Talos VMs in
.work/<stand>, ~50 GB RAM, KVM nested virt), so this also lifts the
cli-matrix tier into CI once the workflow drives `make e2e<N>`
explicitly.

Drop continue-on-error: true (added when the runner was the cramped
ubuntu-latest) — with real hardware the e2e job should gate the PR
again. Bump timeout-minutes 60 → 180 to match cozystack's e2e budget.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@gemini-code-assist

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

@coderabbitai

coderabbitai Bot commented May 22, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Member Author

kvaps commented May 22, 2026

Superseded by #10 (bundled with the other two CI cleanups so the pipeline runs once).

kvaps closed this May 22, 2026
kvaps deleted the ci/e2e-oracle-runner branch May 22, 2026 12:01