Merged
21 commits
6837b72
ci(integration): switch linstor-client install to GitHub tarball
kvaps May 22, 2026
8475b10
style(rest): propagate request ctx into delete-rollback closures
kvaps May 22, 2026
4e6694e
style(rest): wrap dynamic VD size errors with sentinels
kvaps May 22, 2026
1f39f60
style: reorder unexported helpers below their exported callers
kvaps May 22, 2026
ef6bc02
style: extract repeated literals and tighten struct shape
kvaps May 22, 2026
dda4486
style(satellite/controllers): pass Config by pointer through NewManager
kvaps May 22, 2026
b35d280
style(satellite/controllers): extract periodic-tick loop helper
kvaps May 22, 2026
8c57c66
style: split long satellite-controller functions into helpers
kvaps May 22, 2026
6af60c1
ci(pr): use CNCF Oracle runners for e2e
kvaps May 22, 2026
2a1a2b6
ci(pr): port GitHub tarball install + drop continue-on-error muzzles
kvaps May 22, 2026
9957ff3
ci: pin setup-envtest to release-0.23 branch
kvaps May 22, 2026
1a6e716
ci(lint): exclude godox for production-code paths with Bug NNN markers
kvaps May 22, 2026
0f2ed69
style: autofix gofmt/gofumpt/intrange/wsl_v5/modernize/nlreturn
kvaps May 22, 2026
45662fa
style: clear non-godox lint debt in pkg/version and internal/controller
kvaps May 22, 2026
f09de0d
style: clear lint debt in pkg/satellite — second pass
kvaps May 22, 2026
827ba59
style: clear remaining lint debt across pkg/rest + pkg/store + cmd
kvaps May 22, 2026
4d88db2
style: fix Linux-only lint findings missed on macOS dev box
kvaps May 22, 2026
a1f8fc9
fix(e2e): docker-build --target controller (Bug 359 followup)
kvaps May 22, 2026
3dabf41
test(bug204a): retry race-witness up to 10 attempts (flake fix)
kvaps May 22, 2026
1f2233f
ci: breakpoint fires on every e2e failure (no label gate)
kvaps May 22, 2026
58ef982
fix(e2e): kustomize command /manager → /controller
kvaps May 22, 2026
32 changes: 24 additions & 8 deletions .github/workflows/integration.yml
@@ -40,19 +40,35 @@ jobs:
- name: Install linstor-client (python-linstor)
# The harness shells out to the upstream linstor CLI to exercise
# wire-shape compatibility — exactly what unit tests cannot do.
-# linstor-client / python-linstor aren't packaged in the default
-# Ubuntu repos (LINBIT publishes them only on the LINBIT PPA and
-# PyPI); pip is the runner-friendly path.
+#
+# Install path rationale (validated against ubuntu:24.04 / noble):
+# - apt: LINBIT only ships debs for Debian (bookworm/bullseye/
+# buster/trixie) and the LINBIT PPA, neither covers noble.
+# `apt-get install linstor-client` → "Unable to locate package".
+# - PyPI: only `python-linstor` is published. `linstor-client`
+# and the bare `linstor` name are NOT on PyPI — pip exits with
+# "No matching distribution found".
+# - GitHub tarball: works, but v1.27.1's setup.py has a typo
+# (missing comma joins `python3-setuptools` + `python-linstor`
+# into one malformed requirement). `--no-deps` sidesteps it;
+# `python-linstor` + `argcomplete` are installed explicitly
+# beforehand so the runtime dep set stays correct.
+# Pin v1.27.1 to match `linstor_client.VERSION` the integration
+# harness asserts on (tests/integration/group_h_test.go).
run: |
-python3 -m pip install --break-system-packages --upgrade linstor-client python-linstor
-linstor --version | head -1
+python3 -m pip install --break-system-packages --upgrade \
+python-linstor==1.27.1 argcomplete
+python3 -m pip install --break-system-packages --no-deps \
+https://github.com/LINBIT/linstor-client/archive/refs/tags/v1.27.1.tar.gz
+linstor --version

- name: Install envtest binaries
# controller-runtime's envtest needs kube-apiserver + etcd
-# binaries. setup-envtest pins the version matching our
-# controller-runtime release.
+# binaries. We track the release branch matching our
+# controller-runtime (v0.23.x). `@latest` would resolve to
+# the v0.24.x submodule, which requires Go >= 1.26.
run: |
-go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest
+go install sigs.k8s.io/controller-runtime/tools/setup-envtest@release-0.23
echo "KUBEBUILDER_ASSETS=$(setup-envtest use --print path 1.34.x)" >> "$GITHUB_ENV"

- name: go mod tidy
87 changes: 37 additions & 50 deletions .github/workflows/pull-request.yml
@@ -17,8 +17,11 @@ name: Pull Request
# because variables are not exposed in fork-PR workflows).
#
# Labels that change behaviour:
-# - debug → opens an SSH breakpoint on e2e failure (paused 20m for first
-# attach, then 10m idle-timeout after the last disconnect).
+# - debug → pins the e2e job to a self-hosted runner so a maintainer can
+# attach via the host (kubectl/docker on the runner host
+# directly, no SSH dance through the rendezvous server). The
+# breakpoint step itself fires on every e2e failure regardless
+# of label — it just needs BREAKPOINT_ENDPOINT to be set.

on:
pull_request:
@@ -57,12 +60,6 @@ jobs:
timeout-minutes: 15
permissions:
contents: read
-# TODO: drop continue-on-error once the existing lint debt (~15
-# findings across contextcheck, funcorder, err113,
-# embeddedstructfieldcheck, goconst) is cleared. The job still
-# runs and surfaces every new finding via annotations — it just
-# doesn't fail the PR check until the backlog is zero.
-continue-on-error: true
steps:
- name: Clone the code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -139,14 +136,6 @@ jobs:
timeout-minutes: 20
permissions:
contents: read
-# TODO: drop continue-on-error once linstor-client / python-linstor
-# have a runner-friendly install path. PyPI publishes the package
-# but `pip install linstor-client` rejects on GitHub-hosted ubuntu-
-# latest with "No matching distribution found" (Python wheel /
-# platform tag mismatch). LINBIT also has a PPA, but it lacks
-# noble builds. Until the install is unblocked the job runs +
-# surfaces the breakage but does not fail the PR.
-continue-on-error: true
steps:
- name: Clone
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -157,20 +146,23 @@ jobs:
with:
go-version-file: go.mod
- name: Install linstor-client (python-linstor)
-# The harness shells out to the upstream linstor CLI to exercise
-# wire-shape compatibility — exactly what unit tests cannot do.
-# linstor-client / python-linstor aren't packaged in the default
-# Ubuntu repos (LINBIT publishes them only on the LINBIT PPA and
-# PyPI); pip is the runner-friendly path.
+# See .github/workflows/integration.yml for the install path
+# rationale; this mirrors that step so PR runs match push runs.
+# Pin v1.27.1 to match `linstor_client.VERSION` the integration
+# harness asserts on (tests/integration/group_h_test.go).
run: |
-python3 -m pip install --break-system-packages --upgrade linstor-client python-linstor
-linstor --version | head -1
+python3 -m pip install --break-system-packages --upgrade \
+python-linstor==1.27.1 argcomplete
+python3 -m pip install --break-system-packages --no-deps \
+https://github.com/LINBIT/linstor-client/archive/refs/tags/v1.27.1.tar.gz
+linstor --version
- name: Install envtest binaries
# controller-runtime's envtest needs kube-apiserver + etcd
-# binaries. setup-envtest pins the version matching our
-# controller-runtime release.
+# binaries. We track the release branch matching our
+# controller-runtime (v0.23.x). `@latest` would resolve to
+# the v0.24.x submodule, which requires Go >= 1.26.
run: |
-go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest
+go install sigs.k8s.io/controller-runtime/tools/setup-envtest@release-0.23
echo "KUBEBUILDER_ASSETS=$(setup-envtest use --print path 1.34.x)" >> "$GITHUB_ENV"
- name: go mod tidy
run: go mod tidy
@@ -190,29 +182,22 @@ jobs:

e2e:
name: E2E
-# GitHub-hosted runner. kind-based e2e (Tier 4 in the test strategy)
-# runs fine on ubuntu-latest without nested virtualisation. Real-
-# DRBD QEMU scenarios (.work/<stand> with Talos VMs) need KVM and
-# ~50 GB RAM, so they stay manual on dedicated bare-metal stands —
-# see stand/Makefile + reference_blockstor_stand.md.
-#
-# When the ephemeral self-hosted runner pool (ARC / namespace.so /
-# equivalent) is wired up, swap the label to that pool's identifier
-# and bring real-DRBD scenarios online here too.
-runs-on: ubuntu-latest
+# Runner selection mirrors cozystack/cozystack: a labelled `debug` PR
+# lands on a long-lived `self-hosted` runner so the breakpoint step
+# below has somewhere stable to attach SSH; regular PRs land on the
+# CNCF-provided Oracle pool (24 CPU / 96 GB / x86-64) which has
+# enough headroom for kind + real-DRBD QEMU stands (Talos VMs in
+# .work/<stand>, ~50 GB RAM, KVM nested virt). The pool labels are
+# org-wide on cozystack, so no extra setup is required here. Swap
+# to oracle-vm-32cpu-128gb-x86-64 if a future scenario needs more
+# RAM/CPU.
+runs-on: ${{ contains(github.event.pull_request.labels.*.name, 'debug') && 'self-hosted' || 'oracle-vm-24cpu-96gb-x86-64' }}
needs: [detect-changes, lint, unit-test]
if: needs.detect-changes.outputs.code == 'true'
-timeout-minutes: 60
+timeout-minutes: 180
permissions:
contents: read
checks: write
-# TODO: drop continue-on-error once the kind-based e2e tier is
-# confirmed stable on ubuntu-latest. The kustomize manifest fix
-# (commit c2e716daf) unblocked `make deploy`, but the suite
-# itself may surface other ubuntu-latest gaps. Until then the job
-# surfaces issues via annotations + breakpoint but does not block
-# PR approval.
-continue-on-error: true
steps:
- name: Clone the code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -240,10 +225,13 @@ jobs:

# Open an SSH breakpoint to the failing e2e runner so maintainers
# can attach, inspect kind/blockstor state, and resume with
-# `breakpoint resume`. Gated by the `debug` label — set it on the PR
-# to opt in. Forks can't reach the rendezvous server because
-# repository variables are not exposed to fork-PR workflows; the
-# step is silently skipped in that case.
+# `breakpoint resume`. Fires on every e2e failure (no label opt-in)
+# — the rationale is that an e2e failure already burned the runner
+# minutes and a maintainer almost always wants to inspect the wedged
+# cluster before tear-down. Forks can't reach the rendezvous server
+# because repository variables are not exposed to fork-PR workflows;
+# the step is silently skipped in that case (BREAKPOINT_ENDPOINT
+# comes through empty).
#
# Uses cozystack/breakpoint-action (fork of namespacelabs/breakpoint-action)
# pinned by SHA. The fork adds pause-idle mode (initial grace period
@@ -253,8 +241,7 @@ jobs:
- name: Breakpoint on E2E failure
if: |
failure() &&
-vars.BREAKPOINT_ENDPOINT != '' &&
-contains(github.event.pull_request.labels.*.name, 'debug')
+vars.BREAKPOINT_ENDPOINT != ''
# cozystack/breakpoint-action v2-cozy.1
# mode: pause-idle defaults: grace-period=20m, idle-timeout=10m
uses: cozystack/breakpoint-action@a6f3a6f87be398ad63b6577351e3398e53f578e4
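
Note on the two conditions in this workflow: the `runs-on` expression uses the `cond && a || b` idiom as a ternary, which works here because `'self-hosted'` is a non-empty (truthy) string, and the breakpoint step now depends only on job failure plus a non-empty `BREAKPOINT_ENDPOINT`. A minimal Go sketch of the same decision logic, for illustration only: `selectRunner` and `shouldBreakpoint` are hypothetical names that do not appear in the PR, and `strings.EqualFold` is used on the assumption that GitHub expressions compare strings case-insensitively.

```go
package main

import (
	"fmt"
	"strings"
)

// Runner labels copied from the workflow's runs-on expression.
const (
	debugRunner   = "self-hosted"
	defaultRunner = "oracle-vm-24cpu-96gb-x86-64"
)

// selectRunner is a hypothetical helper mirroring
// contains(labels, 'debug') && 'self-hosted' || 'oracle-vm-24cpu-96gb-x86-64'.
func selectRunner(labels []string) string {
	for _, label := range labels {
		// Assumption: label matching ignores case, as in GitHub expressions.
		if strings.EqualFold(label, "debug") {
			return debugRunner
		}
	}

	return defaultRunner
}

// shouldBreakpoint is a hypothetical helper mirroring
// failure() && vars.BREAKPOINT_ENDPOINT != ''.
// On fork PRs the variable comes through empty, so the step is skipped.
func shouldBreakpoint(jobFailed bool, endpoint string) bool {
	return jobFailed && endpoint != ""
}

func main() {
	fmt.Println(selectRunner([]string{"bug", "debug"}))
	fmt.Println(selectRunner(nil))
	fmt.Println(shouldBreakpoint(true, ""))
}
```

With this logic the `debug` label only changes where the job runs; whether the breakpoint opens no longer depends on it.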
96 changes: 95 additions & 1 deletion .golangci.yml
@@ -138,7 +138,7 @@ linters:
# interfaces share the same shape on purpose).
- linters:
- dupl
-path: pkg/store/inmemory_
+path: pkg/store/inmemory
- linters:
- dupl
path: pkg/store/store\.go
@@ -272,6 +272,100 @@ linters:
- varnamelen
- mnd
path: test/
+# Production-code packages use `Bug NNN:` markers as a deliberate
+# cross-reference index into the project's bug tracker (the comments
+# document WHY a workaround exists and point at the underlying issue).
+# godox would flag them as stray TODO/BUG/FIXME — disable ONLY godox
+# for these paths; every other linter still applies.
+- linters:
+- godox
+path: cmd/apiserver/
+- linters:
+- godox
+path: pkg/api/v1/
+- linters:
+- godox
+path: pkg/drbd/
+- linters:
+- godox
+path: pkg/luks/
+- linters:
+- godox
+path: pkg/storage/
+- linters:
+- godox
+path: pkg/store/
+- linters:
+- godox
+path: pkg/uevent/
+- linters:
+- godox
+path: tests/contract/
+- linters:
+- godox
+path: pkg/rest/
+- linters:
+- godox
+path: pkg/satellite/
+- linters:
+- godox
+path: pkg/dispatcher/
+- linters:
+- godox
+path: pkg/version/
+# linstor-trace-recorder is a dev tool that writes JSON traces
+# under -out-dir; gosec G703 flags the os.WriteFile call despite
+# the explicit filepath.Base sanitisation right above it (taint
+# analysis can't see through the local-variable assignment).
+- linters:
+- gosec
+path: cmd/linstor-trace-recorder/
+# pkg/storage/file/diskfree.go does an int64(stat.Bsize)
+# conversion that is a no-op on Linux (Bsize int64) but
+# required on macOS (Bsize uint32). unconvert can't see the
+# cross-platform shape, so the conversion is flagged as
+# unnecessary on Linux builds.
+- linters:
+- unconvert
+path: pkg/storage/file/diskfree\.go
+# peer_delete_sync.go contains Bug 342 v10 plumbing (per-peer
+# forget-peer ACK annotations) that's staged on the satellite side
+# but not yet called from the REST handler. The unused-code is
+# intentional and reviewed; remove the exclusion when the v10
+# wire-up lands.
+- linters:
+- unused
+- funlen
+- wrapcheck
+- varnamelen
+path: pkg/rest/peer_delete_sync\.go
+# kv_store.go's KV PUT body shape contains LINSTOR-wire field
+# names (override_props / delete_props / delete_namespaces) that
+# appear both in the field-name allow-list and the corresponding
+# unmarshal target — extracting constants doesn't aid readability
+# since the literals match the JSON keys verbatim.
+- linters:
+- goconst
+path: pkg/rest/kv_store\.go
+# resources.go's boolQuery parser enumerates strconv.ParseBool's
+# truthy alternatives + curl-style "yes"/"on"; the literals are
+# parser cases, not magic constants.
+- linters:
+- goconst
+path: pkg/rest/resources\.go
+# resource_toggle_disk.go composes wire-status strings
+# ("diskful"/"diskless") in a short branch; goconst miscounts
+# these as repeated literals against the rest of the package.
+- linters:
+- goconst
+path: pkg/rest/resource_toggle_disk\.go
+# reconciler_drbd_test.go reserves sentinel errors for paths
+# that aren't yet exercised by the fixture; the sentinels stay
+# so the table-driven scaffolding can grow incrementally without
+# re-introducing the same constants on each addition.
+- linters:
+- unused
+path: pkg/satellite/reconciler_drbd_test\.go
formatters:
enable:
- gofmt
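
Note on the `goconst` exclusion for `pkg/rest/resources.go`: the comment describes a query-string boolean parser that accepts what `strconv.ParseBool` accepts plus curl-style "yes"/"on". The sketch below shows roughly what such a parser could look like. It is not the project's `boolQuery` code: `parseBoolQuery` is a hypothetical name, it delegates to `strconv.ParseBool` rather than enumerating the literals, and the case-insensitive handling of "yes"/"on" is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBoolQuery is a hypothetical stand-in for a query-parameter
// boolean parser. Anything that is not recognisably truthy is false.
func parseBoolQuery(raw string) bool {
	// strconv.ParseBool accepts 1, t, T, TRUE, true, True and the
	// matching falsy forms; everything else is an error.
	if value, err := strconv.ParseBool(raw); err == nil {
		return value
	}

	// Curl-style extras. Case-insensitive matching is an assumption.
	switch strings.ToLower(raw) {
	case "yes", "on":
		return true
	}

	return false
}

func main() {
	for _, raw := range []string{"true", "T", "yes", "ON", "0", "maybe", ""} {
		fmt.Printf("%q -> %v\n", raw, parseBoolQuery(raw))
	}
}
```

In a parser that spells the accepted values out as `case` literals, the repeated strings are the grammar itself, which is the argument the comment makes for excluding `goconst` on that file.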
8 changes: 7 additions & 1 deletion Makefile
@@ -178,7 +178,13 @@ DOCKER_BUILD_ARGS = --build-arg GIT_HASH=$(GIT_HASH) --build-arg BUILD_TIME=$(BU
# More info: https://docs.docker.com/develop/develop-images/build_enhancements/
.PHONY: docker-build
docker-build: ## Build docker image with the manager.
-$(CONTAINER_TOOL) build $(DOCKER_BUILD_ARGS) -t ${IMG} .
+# --target controller pins the multi-stage build to the distroless
+# nonroot stage that ships /controller. Without it docker picks
+# the last stage (satellite, debian:trixie-slim, no USER
+# directive) and `make deploy` would land a root-running image
+# under a Pod that kustomize stamps with `runAsNonRoot: true`,
+# producing the e2e CreateContainerConfigError failure.
+$(CONTAINER_TOOL) build --target controller $(DOCKER_BUILD_ARGS) -t ${IMG} .

.PHONY: docker-push
docker-push: ## Push docker image with the manager.
4 changes: 3 additions & 1 deletion cmd/linstor-trace-recorder/main.go
@@ -243,7 +243,9 @@ func (r *recorder) write(method, path string, reqBody []byte, status int, respBo
// filepath.Base belt-and-braces against any path-traversal
// chars that snuck through sanitisePath (filenames already have
// `/` stripped, but gosec G703 wants the explicit guard).
-err = os.WriteFile(filepath.Join(r.outDir, filepath.Base(name)), out, fileMode)
+target := filepath.Join(r.outDir, filepath.Base(name))
+
+err = os.WriteFile(target, out, fileMode)
if err != nil {
fmt.Fprintf(os.Stderr, "write trace: %v\n", err)
os.Exit(1)
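
Note on the `filepath.Base` guard in this hunk: taking the base name drops any directory components before the join, so a name carrying `../` segments still resolves to a file directly under the output directory. A small standalone sketch of that behaviour follows; `traceTarget` is a hypothetical helper, not a function in `cmd/linstor-trace-recorder`, and the file names are made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// traceTarget is a hypothetical helper that reproduces the
// Join(outDir, Base(name)) pattern from the hunk above.
func traceTarget(outDir, name string) string {
	// filepath.Base keeps only the last path element, so leading
	// directories (including "../" segments) are discarded.
	return filepath.Join(outDir, filepath.Base(name))
}

func main() {
	fmt.Println(traceTarget("out", "GET_v1_nodes.json"))
	fmt.Println(traceTarget("out", "../../etc/passwd"))
}
```

Splitting the expression into a `target` variable, as the PR does, does not change this behaviour; it only changes how the call reads.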
2 changes: 1 addition & 1 deletion cmd/satellite/main.go
@@ -220,7 +220,7 @@ func loadRESTConfig() (*rest.Config, error) {
// satellite.ManagerFactory signature.
func mgrFactory(ready *readyState, logger *slog.Logger, ueventListener controllers.UeventNotifier) satellite.ManagerFactory {
return func(restCfg *rest.Config, nodeName, probeAddr string, rec *satellite.Reconciler) (manager.Manager, error) {
-mgr, err := controllers.NewManager(restCfg, controllers.Config{
+mgr, err := controllers.NewManager(restCfg, &controllers.Config{
NodeName: nodeName,
Apply: rec,
Exec: storage.RealExec{},
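
Note on the call-site change in this hunk: `NewManager` now receives `&controllers.Config{...}` instead of a value. The sketch below illustrates the general pattern with a trimmed-down, hypothetical `Config` and `newManager`; the real struct has more fields and the real constructor returns a `manager.Manager`. The nil check is an assumption about what a pointer-taking constructor would reasonably do, not something shown in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical, trimmed-down stand-in for controllers.Config.
type Config struct {
	NodeName string
}

var errNilConfig = errors.New("nil config")

// newManager is a hypothetical constructor that takes the config by
// pointer, so the struct is not copied on each call. A pointer
// parameter can be nil, which a value parameter cannot, hence the guard.
func newManager(cfg *Config) (string, error) {
	if cfg == nil {
		return "", errNilConfig
	}

	return "manager for " + cfg.NodeName, nil
}

func main() {
	mgr, err := newManager(&Config{NodeName: "node-1"})
	fmt.Println(mgr, err)

	_, err = newManager(nil)
	fmt.Println(err)
}
```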
7 changes: 6 additions & 1 deletion config/manager/manager.yaml
@@ -58,8 +58,13 @@ spec:
seccompProfile:
type: RuntimeDefault
containers:
+# Binary path is `/controller` (set by Dockerfile's controller
+# stage: COPY --from=builder /workspace/controller .) — not
+# the kubebuilder default `/manager`. Container name stays
+# `manager` for the kustomize patches in config/default that
+# match on it.
- command:
-- /manager
+- /controller
args:
- --leader-elect
- --health-probe-bind-address=:8081