Commit 909e9b8

feat(doccano-django): real Python line coverage via coverage.py overlay
Replaces the prior API-route-surface "coverage" (which counted fired routes / known routes, a proxy that read like real coverage but didn't measure code execution) with actual line coverage via coverage.py 7.6.1.

Architecture:
- `Dockerfile.coverage` extends `doccano/doccano:backend` to install coverage[toml] and drop a `coverage_subprocess.pth` file into site-packages, so every gunicorn worker that forks auto-starts `coverage.process_startup()`.
- `.coveragerc` runs in parallel mode (one .coverage.<pid> per worker) with sigterm = true so flushing happens on graceful shutdown.
- `docker-compose.coverage.yml` is an OVERLAY: the GH Actions coverage workflow applies it via `-f docker-compose.yml -f docker-compose.coverage.yml`. The base `Dockerfile` and `docker-compose.yml` are untouched, so keploy/integrations and keploy/enterprise CI lanes consume the base compose and pay zero coverage-instrumentation cost.
- `flow.sh::doccano_report_coverage` shells into the running backend, runs `coverage combine` + `coverage report --format=total`, emits `Covered N/M (XX.X%)` matching the helper script's regex. When called against the base image (no overlay) it prints "INFO: ... uninstrumented" and exits 0 so enterprise lanes' `flow.sh coverage || true` informational calls keep working.

Removed:
- `doccano_list_routes` (the Django URL-resolver walk).
- `doccano_list_recorded_routes` (the keploy-tests / fired-routes reader).
- The legacy route-surface `doccano_report_coverage` body.
- `list-routes` subcommand (was diagnostic only for the surface metric).

Validated locally: e2e run produced `coverage=59.0` to GITHUB_OUTPUT against a clean stack (gunicorn 4 workers, traffic loop fired, SIGTERM flush, combine+report inside container). 59% reflects bootstrap + the sample's small traffic surface; adding curls to flow.sh::doccano_record_traffic moves the number up.

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
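The "uninstrumented" fallback described in the commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the shell code in `flow.sh` (which this page does not show); the function names `find_data_files` and `report_or_skip` and the exact message strings are hypothetical.

```python
import glob
import os
from typing import List, Tuple


def find_data_files(directory: str) -> List[str]:
    """Return coverage.py parallel-mode data files (.coverage.*) in `directory`."""
    # The pattern itself starts with a dot, so glob matches these hidden files.
    return sorted(glob.glob(os.path.join(directory, ".coverage.*")))


def report_or_skip(directory: str) -> Tuple[int, str]:
    """Return (exit_code, message); exit 0 either way so `|| true` callers are unaffected."""
    files = find_data_files(directory)
    if not files:
        # Hypothetical wording: the real message is elided in the commit text.
        return 0, "INFO: no coverage data files found; image looks uninstrumented"
    return 0, f"found {len(files)} data file(s) to combine"
```

Returning 0 on the empty case mirrors the stated goal that informational `flow.sh coverage || true` calls in other lanes keep working against the base image.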
1 parent ea1fef2 commit 909e9b8

7 files changed

Lines changed: 206 additions & 292 deletions

.github/workflows/doccano-django.yml

Lines changed: 3 additions & 3 deletions
@@ -175,8 +175,8 @@ jobs:
 if python3 -c "import sys; sys.exit(0 if (${RELEASE} - ${BUILD}) > ${THRESHOLD} else 1)"; then
   echo "::error::doccano-django coverage dropped from ${RELEASE}% → ${BUILD}% (-${drop}pp), exceeding the ${THRESHOLD}pp threshold."
   echo "Suggested actions:"
-  echo " * Add curl(s) to flow.sh::doccano_record_traffic that exercise the routes you changed/touched."
-  echo " * If the route(s) was intentionally retired, drop it from doccano-django/flow.sh::doccano_list_routes' SCOPE_PREFIXES too so it's removed from the denominator."
+  echo " * Add curl(s) to flow.sh::doccano_record_traffic that exercise the new code paths."
+  echo " * Or extend the .coveragerc 'omit' list if the new module is not part of the runtime backend (migrations, management commands, tests)."
   exit 1
 fi
 echo "OK — coverage delta within ${THRESHOLD}pp threshold."
@@ -196,4 +196,4 @@ jobs:
 
 Threshold: PR may not drop coverage by more than **${{ env.COVERAGE_THRESHOLD }}pp**. Override per-repo via the `DOCCANO_COVERAGE_THRESHOLD` actions variable.
 
-Coverage measures the API surface (`/v1/projects/*` + `/v1/me` + `/v1/users` + `/v1/health` + `/v1/auth`) that `flow.sh::doccano_record_traffic` actually exercises against the running backend's URL resolver. Reports are attached as artifacts on each job ("coverage-build" / "coverage-release").
+Coverage is **Python line coverage** (coverage.py 7.6.1) of the doccano backend code — the bytes that `flow.sh::doccano_record_traffic` actually executes. Instrumentation lives in a separate `Dockerfile.coverage` + `docker-compose.coverage.yml` overlay; the base `docker-compose.yml` consumed by keploy/integrations and keploy/enterprise CI lanes runs uninstrumented and pays zero coverage cost. Per-file reports are attached as artifacts on each job (`coverage-build` / `coverage-release`).
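The gate in the workflow hunk above reduces to one comparison: fail when the release percentage minus the build percentage exceeds the threshold. A minimal sketch of that check as a standalone function, where the name `coverage_dropped` is hypothetical and the values are illustrative:

```python
def coverage_dropped(release: float, build: float, threshold: float) -> bool:
    """True when coverage fell by strictly more than `threshold` percentage points."""
    # Same expression as the workflow's inline python3 -c check:
    # (RELEASE - BUILD) > THRESHOLD
    return (release - build) > threshold
```

Because the comparison is strict, a drop exactly equal to the threshold passes the gate, and any increase in coverage passes as well.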
Lines changed: 66 additions & 51 deletions
@@ -1,54 +1,51 @@
 #!/usr/bin/env bash
 #
-# run-and-measure.sh — bring doccano up via the sample's compose,
-# run flow.sh bootstrap + record-traffic with the per-call audit
-# log enabled, run flow.sh coverage, and emit `coverage=PCT`
-# onto $GITHUB_OUTPUT for the downstream coverage-gate job.
+# run-and-measure.sh — bring doccano up under the coverage overlay,
+# run flow.sh bootstrap + record-traffic, flush coverage from each
+# gunicorn worker, run flow.sh coverage to combine + report, and
+# emit `coverage=PCT` onto $GITHUB_OUTPUT for the downstream
+# coverage-gate job.
 #
-# Called from .github/workflows/doccano-django.yml's
-# build-coverage and release-coverage jobs (one per ref under
-# comparison). Both jobs source the same script so the
-# measurement is identical across refs — any drift in the
-# numerator definition would otherwise produce a misleading
-# delta.
+# Called from .github/workflows/doccano-django.yml's build-coverage
+# and release-coverage jobs (one per ref under comparison). Both
+# jobs source the same script so the measurement is identical
+# across refs — any drift in the numerator definition would
+# otherwise produce a misleading delta.
 #
-# Inputs (all from the workflow env):
-#   DOCCANO_FIRED_ROUTES_FILE — per-call audit log path; passed
-#       through to flow.sh so its
-#       record-traffic loop logs each
-#       (METHOD, URL) pair, and so its
-#       coverage subcommand uses that
-#       file as the standalone
-#       numerator.
-#   DOCCANO_PHASE — label spliced into the project
-#       name so build vs. release runs
-#       don't collide on volume names
-#       (compose project naming inside
-#       the GH runner is per-job
-#       anyway, but DOCCANO_PHASE shows
-#       up in the test fixtures and
-#       is useful for diffing logs).
-#   GITHUB_OUTPUT — standard GH Actions sink for
-#       step outputs.
+# Coverage isolation contract:
+#   * Base `Dockerfile` and `docker-compose.yml` are untouched.
+#   * The overlay `Dockerfile.coverage` + `docker-compose.coverage.yml`
+#     adds coverage.py + the auto-start .pth file. ONLY this script
+#     applies the overlay; the keploy/integrations and
+#     keploy/enterprise CI lanes consume the base compose and pay
+#     zero coverage-instrumentation cost.
+#
+# Inputs (from the workflow env):
+#   DOCCANO_PHASE — label spliced into the project name so
+#       build vs release runs don't collide.
+#   GITHUB_OUTPUT — standard GH Actions sink for step outputs.
 set -Eeuo pipefail
 
 export DOCCANO_BACKEND_CONTAINER="${DOCCANO_BACKEND_CONTAINER:-doccano_backend}"
 export DOCCANO_DB_CONTAINER="${DOCCANO_DB_CONTAINER:-doccano_db}"
 export DOCCANO_APP_PORT="${DOCCANO_APP_PORT:-18080}"
 export DOCCANO_FIXED_TOKEN="${DOCCANO_FIXED_TOKEN:-ac38262065f0ae1476b6a707d9d697a101764a6b}"
-: "${DOCCANO_FIRED_ROUTES_FILE:?DOCCANO_FIRED_ROUTES_FILE must be set by the workflow}"
 
-# Reset audit log for this run; otherwise a prior run's entries
-# would inflate the numerator on a re-trigger.
-: >"$DOCCANO_FIRED_ROUTES_FILE"
+mkdir -p coverage
+chmod 777 coverage # worker UID inside container differs from runner UID
+sudo rm -rf coverage/.coverage* 2>/dev/null || rm -rf coverage/.coverage* 2>/dev/null || true
+
+COMPOSE=(docker compose -f docker-compose.yml -f docker-compose.coverage.yml)
 
-# Stage 1: bring up doccano with bootstrap so the admin user +
-# fixed token persist into the named volume.
-DOCCANO_SKIP_BOOTSTRAP=0 docker compose up -d
+# Stage 1: bring up doccano with bootstrap so the schema migrations
+# and the admin user persist into the named DB volume. The overlay
+# image runs gunicorn with coverage.process_startup() auto-armed in
+# every forked worker.
+DOCCANO_SKIP_BOOTSTRAP=0 "${COMPOSE[@]}" up -d --build
 
-# Wait for the backend to start serving (not just port-open).
-# Cold doccano boot runs Django migrations + admin user create,
-# which on a GH runner can hit 90-120s.
+# Wait for the backend to start serving (cold doccano boot runs
+# Django migrations + admin user create — on a GH runner this can
+# hit 90-120s).
 for i in $(seq 1 120); do
   code=$(curl -sS -o /dev/null -w '%{http_code}' \
     "http://127.0.0.1:${DOCCANO_APP_PORT}/v1/health/" 2>/dev/null || echo "")
@@ -57,31 +54,49 @@ for i in $(seq 1 120); do
 done
 
 bash flow.sh bootstrap 240
-docker compose down --remove-orphans
+"${COMPOSE[@]}" down --remove-orphans
 
 # Stage 2: re-launch in skip-bootstrap mode against the populated
-# volume — same shape the keploy lanes use.
-DOCCANO_SKIP_BOOTSTRAP=1 docker compose up -d
+# volume; same shape the keploy lanes use. The overlay layer is
+# preserved across compose-down (only `down -v` would wipe the
+# named volume), so coverage tooling is still wired in.
+DOCCANO_SKIP_BOOTSTRAP=1 "${COMPOSE[@]}" up -d
 
-# Drive traffic. flow.sh::doccano_record_traffic gates on
-# doccano_wait_for_fixed_token internally, so this won't fire
-# curls at a half-booted backend.
+# flow.sh::doccano_record_traffic gates on doccano_wait_for_fixed_token
+# internally, so this won't fire curls at a half-booted backend.
 bash flow.sh record-traffic
 
-# Coverage report — uses DOCCANO_FIRED_ROUTES_FILE as numerator
-# since no keploy/test-set-* tree exists in the standalone case.
+# Flush coverage from each gunicorn worker. coverage.py with
+# sigterm = true writes the in-flight per-worker .coverage.<pid>
+# data file to /coverage on SIGTERM; `compose kill -s SIGTERM`
+# delivers it to the container's main process which propagates to
+# its workers via gunicorn's graceful shutdown.
+"${COMPOSE[@]}" kill -s SIGTERM backend
+# coverage.py's sigterm hook is synchronous but the OS-level
+# write+fsync needs a moment.
+sleep 3
+
+# Bring backend back up so `flow.sh coverage` can docker-exec
+# `coverage combine` + `coverage report` inside.
+"${COMPOSE[@]}" up -d backend
+for i in $(seq 1 60); do
+  if docker exec "$DOCCANO_BACKEND_CONTAINER" sh -c 'ls /coverage/.coverage.* >/dev/null 2>&1'; then
+    break
+  fi
+  sleep 1
+done
+
 COVERAGE_REPORT_FILE="$PWD/coverage_report.txt" bash flow.sh coverage
 
-# Pull the percentage out of the report's `Covered N/M (XX.X%)`
-# line. Anchored on the parenthesised form so a future change to
-# the report's prose doesn't break the parse.
+# Parse `Covered N/M (XX.X%)` — anchored on the parenthesised form
+# so a future report-prose change doesn't break the parse.
 pct=$(grep -oE '\([0-9]+\.[0-9]+%\)' coverage_report.txt | head -1 | tr -d '()%')
 if [ -z "$pct" ]; then
   echo "::error::Could not parse coverage percentage from coverage_report.txt"
   cat coverage_report.txt || true
   exit 1
 fi
 echo "coverage=${pct}" >>"$GITHUB_OUTPUT"
-echo "coverage: ${pct}% (audit log: $DOCCANO_FIRED_ROUTES_FILE)"
+echo "coverage: ${pct}% (Python line coverage via coverage.py)"
 
-docker compose down -v --remove-orphans
+"${COMPOSE[@]}" down -v --remove-orphans

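The percentage parse near the end of the script (`grep -oE ... | head -1 | tr -d '()%'`) can be mirrored in Python for local checks. This is a sketch under the assumption that the report contains a `Covered N/M (XX.X%)` line as the script comment states; the function name `parse_coverage_pct` is hypothetical.

```python
import re
from typing import Optional

# Same shape as the shell pattern: a parenthesised decimal followed by "%".
_PCT = re.compile(r"\(([0-9]+\.[0-9]+)%\)")


def parse_coverage_pct(report_text: str) -> Optional[str]:
    """Return the first parenthesised percentage as a string, or None if absent."""
    match = _PCT.search(report_text)
    return match.group(1) if match else None
```

As in the shell version, a value without a decimal point such as `(59%)` does not match, so the caller has to treat `None` as a parse failure.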
doccano-django/.coveragerc

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+[run]
+# Per-process line coverage of the backend Django code.
+#
+# parallel + sigterm: gunicorn forks WORKERS subprocesses; each
+# writes its own .coverage.<host>.<pid> file under /coverage.
+# `combine` merges them at report time. `sigterm = true` flushes
+# the in-flight data on SIGTERM so the reaper from the workflow
+# captures it.
+parallel = true
+sigterm = true
+branch = false
+data_file = /coverage/.coverage
+source = /backend
+
+omit =
+    */tests/*
+    */migrations/*
+    */__pycache__/*
+    /backend/manage.py
+    /backend/config/wsgi.py
+    /backend/config/asgi.py
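To get a feel for which files the `omit` list above would exclude, the patterns can be approximated with the standard library's `fnmatch`. This is only a sketch: coverage.py uses its own matcher, which may differ in edge cases, and the example paths in the usage note are hypothetical rather than taken from the doccano tree.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .coveragerc omit list.
OMIT_PATTERNS = [
    "*/tests/*",
    "*/migrations/*",
    "*/__pycache__/*",
    "/backend/manage.py",
    "/backend/config/wsgi.py",
    "/backend/config/asgi.py",
]


def is_omitted(path: str) -> bool:
    """Approximate check: does `path` match any omit pattern?"""
    # fnmatchcase's "*" also matches "/" characters, so "*/migrations/*"
    # matches a migrations directory at any depth.
    return any(fnmatchcase(path, pattern) for pattern in OMIT_PATTERNS)
```

For instance, `is_omitted("/backend/projects/migrations/0001_initial.py")` is true under this approximation, while `is_omitted("/backend/projects/views.py")` is false.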

doccano-django/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+coverage/
+coverage_report.txt

doccano-django/Dockerfile.coverage

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Coverage-instrumented variant of the doccano backend image.
+#
+# Base `Dockerfile` (and `docker-compose.yml`) are deliberately
+# untouched so the keploy enterprise / integrations lanes — which
+# consume them as-is — pay zero coverage-instrumentation cost. This
+# overlay image is built and run ONLY by the standalone GitHub
+# Actions workflow under `.github/workflows/doccano-django.yml`,
+# wired in via `docker-compose.coverage.yml`.
+#
+# What the overlay adds:
+#   * `coverage` (Python coverage.py) installed into the same
+#     site-packages as gunicorn / Django.
+#   * `.coveragerc` placed at /backend/.coveragerc — the working
+#     directory the upstream image starts gunicorn from. With
+#     `COVERAGE_PROCESS_START=/backend/.coveragerc` exported into
+#     the container env (set in the compose overlay), every
+#     gunicorn worker that imports `coverage.process_startup` via
+#     site-packages will pick the rcfile up; combined with `parallel
+#     = true` and `sigterm = true` in the rcfile, this gives us
+#     real per-worker line coverage that flushes on SIGTERM.
+FROM doccano/doccano:backend
+
+USER root
+RUN pip install --no-cache-dir 'coverage[toml]==7.6.1'
+
+# Subprocess auto-start: a .pth file in site-packages is processed
+# at every Python startup, so each gunicorn worker that forks calls
+# coverage.process_startup() before any Django code runs. This is
+# the canonical way coverage.py instruments forked subprocesses
+# (see "Measuring sub-processes" in the coverage.py docs).
+RUN echo 'import coverage; coverage.process_startup()' \
+    > /usr/local/lib/python3.10/site-packages/coverage_subprocess.pth
+
+COPY .coveragerc /backend/.coveragerc
+RUN mkdir -p /coverage \
+    && chown -R doccano:doccano /coverage /backend/.coveragerc
+USER doccano
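The Dockerfile writes the `.pth` file to a hard-coded `python3.10` site-packages path. One possible alternative, sketched here as an assumption rather than as part of this commit, is to resolve the directory from the running interpreter with `sysconfig`. The helper names `default_pth_path` and `write_pth` are hypothetical.

```python
import os
import sysconfig

PTH_NAME = "coverage_subprocess.pth"
# Lines in a .pth file that start with "import" are executed by the site module.
PTH_LINE = "import coverage; coverage.process_startup()\n"


def default_pth_path() -> str:
    """Path of the .pth file inside this interpreter's site-packages."""
    return os.path.join(sysconfig.get_paths()["purelib"], PTH_NAME)


def write_pth(directory: str) -> str:
    """Write the auto-start line into `directory` and return the file path."""
    path = os.path.join(directory, PTH_NAME)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(PTH_LINE)
    return path
```

In an image build this would be invoked as `write_pth(os.path.dirname(default_pth_path()))`, which keeps working if the base image moves to a different Python minor version.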
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+# Coverage overlay — applied with:
+#
+#   docker compose -f docker-compose.yml -f docker-compose.coverage.yml up -d --build
+#
+# Used ONLY by the standalone .github/workflows/doccano-django.yml
+# CI workflow. Keploy CI lanes (enterprise, integrations) ignore
+# this file and run the base compose unchanged, so they pay zero
+# coverage-instrumentation cost.
+services:
+  backend:
+    build:
+      context: .
+      dockerfile: Dockerfile.coverage
+    environment:
+      # Tells the .pth-file-installed coverage.process_startup() to
+      # actually start tracking; see Dockerfile.coverage.
+      COVERAGE_PROCESS_START: /backend/.coveragerc
+    volumes:
+      - ./coverage:/coverage
