Commit 5d5ce0f
feat(doccano-django): keploy compat lane sample + Python line coverage gate (#101)
* feat: add doccano-django sample (keploy postgres-v3 simple-Query bind regression)
Minimum reproducer for the polymorphic-resourcetype failure that
motivated keploy/integrations#177. Wraps doccano v1.8.5 +
django-rest-polymorphic + postgres 13.3-alpine — the same shape
the bug originally surfaced on (keploy/enterprise PRs #1889 / #1964,
pipelines 3556 / 3572).
Per the keploy-ci-debug skill, the sample owns ALL orchestration
the lane scripts in keploy/integrations and keploy/enterprise need:
the docker-compose, the admin-bootstrap flow, the API traffic loop,
the noise filter (via keploy.yml.template), and a coverage-report
helper. Future lanes that exercise the same backend re-use this
directory; they don't redefine compose / bootstrap / traffic in
their own scripts. The intent is to migrate
enterprise/.ci/scripts/doccano-linux.sh from its current ~400-line
inlined-everything shape down to a thin "clone sample → wrap in
keploy → assert" wrapper in a follow-up PR.
Layout:
* `Dockerfile` — `FROM doccano/doccano:backend`. Wrapper exists
so a future doccano patch (or a backport of an upstream fix that
changes the bug-triggering shape) is a one-line edit here, not
scattered across lane scripts.
* `docker-compose.yml` — postgres + doccano backend on a fixed
subnet, every name fully env-driven (DOCCANO_BACKEND_CONTAINER /
DOCCANO_DB_CONTAINER / DOCCANO_APP_PORT / DOCCANO_DB_IP /
DOCCANO_NETWORK_SUBNET). Lane scripts running multiple matrix
cells in parallel pass per-cell values so the cells don't
collide on container names. Two-phase boot
(DOCCANO_SKIP_BOOTSTRAP=0 → migrations + admin; named volume
retained; DOCCANO_SKIP_BOOTSTRAP=1 → gunicorn-only against the
populated volume) so record/replay see a deterministic state.
* `flow.sh` — four subcommands:
bootstrap — log in as admin, install the deterministic
authtoken_token row so record-time and
replay-time Authorization headers match.
record-traffic — drive the API: 16-call /v1/me warmup hammer
(gunicorn worker contenttypes-cache warmup,
necessary for the SIGINT-driven shutdown
pattern lanes use), POST a polymorphic
TextClassificationProject, GET / PATCH it,
plus dependent category-types / examples /
categories / metrics reads that exercise the
multi-bind django_content_type lookups the
fix targets. Fire-and-forget; keploy is the
assertion layer at replay.
coverage — walk the running backend's URL resolver
(introspecting actual served methods, not
Django's permissive http_method_names default)
and the just-recorded keploy/test-set-*
tests; emit a (method, path) coverage
percentage for the v1/projects + accessory
surface.
list-routes — print the route table the coverage report
uses as its denominator (diagnostic).
* `keploy.yml.template` — globalNoise filter for the inherently
non-deterministic fields (Date/Expires headers, created_at/
updated_at body fields). Centralised here so a future doccano
version that adds another auto-timestamp field is one edit
rather than a fan-out across lane scripts. Lane scripts
envsubst this template into the per-cell run dir.
* `README.md` — bug shape, local-run instructions, lane pointers.
Sample is keploy-independent: `docker compose up && bash flow.sh
bootstrap && bash flow.sh record-traffic` works against bare
doccano. Verified locally: 25/25 calls return expected status,
polymorphic resourcetype is `TextClassificationProject` end-to-end.
The route walker emits 144 (method, path) pairs for the v1/projects
+ /v1/me + /v1/users + /v1/health + /v1/auth surface; coverage
matching against synthetic recorded tests rounds correctly.
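The (method, path) percentage described above can be sketched roughly as follows. This is a minimal illustration, not the code in flow.sh: the function name `route_coverage` is hypothetical, and it assumes both inputs are already normalised to the same path form (the real helper has to map concrete URLs onto resolver patterns).

```python
def route_coverage(known_routes, recorded_routes):
    """Return (covered, total, percent) over (method, path) pairs.

    known_routes    -- pairs served by the backend (the denominator)
    recorded_routes -- pairs seen in recorded tests (the numerator)
    """
    # Sets deduplicate repeated calls; methods are compared case-insensitively.
    known = {(method.upper(), path) for method, path in known_routes}
    recorded = {(method.upper(), path) for method, path in recorded_routes}
    # Only recorded pairs that the backend actually serves count as covered.
    covered = known & recorded
    total = len(known)
    percent = round(100.0 * len(covered) / total, 1) if total else 0.0
    return len(covered), total, percent
```

With 144 known pairs and 16 matching recorded pairs this yields 11.1, which is consistent with the figures quoted elsewhere in this message.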
Lanes that consume this sample (pinned to the
feat/doccano-django-sample branch via --branch until this PR
merges):
* keploy/integrations `.woodpecker/doccano-postgres.yml` —
three-way matrix (record-build × replay-build, record-latest ×
replay-build, record-build × replay-latest); depends_on
prepare-and-run.
* keploy/enterprise `.woodpecker/doccano-linux.yml` — being
migrated to consume this sample in a follow-up PR; today still
uses inline compose generation.
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* fix(doccano-django): gate record-traffic on a real readiness signal
Pipeline 3597 / 909 (post-compose-render fix) failed at:
Container ... Error dependency postgres failed to start
Misleading. Real cause: doccano_record_traffic fired its very
first POST /v1/projects against a backend whose port was open
while gunicorn was still booting; the 5xx response failed
`curl -fsS`, `set -e` killed the script silently, the lane saw a
zero-second "traffic done", SIGINTed keploy ~3s later, and the
recording captured nothing. The "dependency postgres failed" line
in the log is downstream noise from the SIGINT compose-down.
Fix: gate doccano_record_traffic on doccano_wait_for_fixed_token
before any curl fires. /v1/me with the fixed Authorization
header is a stronger readiness signal than wait_for_port: it
proves gunicorn is past boot, auth is wired, the named-volume
token is loaded, and the DB is responsive — all four guarantees
the first POST needs.
Lane scripts can keep their own port-level wait (wait_for_port),
but the sample's flow.sh now refuses to fire traffic until the
backend is genuinely serving. Local smoke-test pattern is
unchanged: bootstrap + record-traffic still work standalone.
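For illustration, the readiness gate amounts to polling an authenticated endpoint until it answers 200 before any traffic is sent. The sketch below is in Python rather than the sample's bash, and everything in it is an assumption made for the example: the names `wait_until_ready` and `me_probe`, the retry budget, and the `Token <key>` header format (the Django REST framework default).

```python
import time
import urllib.error
import urllib.request


def me_probe(base_url, token):
    """Build a probe that GETs /v1/me and returns the HTTP status code."""
    def probe():
        request = urllib.request.Request(
            f"{base_url}/v1/me",
            headers={"Authorization": f"Token {token}"},  # assumed header format
        )
        try:
            with urllib.request.urlopen(request, timeout=5) as response:
                return response.status
        except urllib.error.HTTPError as err:
            return err.code  # 4xx/5xx still carries a status code
    return probe


def wait_until_ready(probe, attempts=60, delay=1.0, sleep=time.sleep):
    """Poll probe() until it returns 200; True on success, False on timeout."""
    for _ in range(attempts):
        try:
            if probe() == 200:
                return True
        except OSError:
            pass  # connection refused or reset while the server is booting
        sleep(delay)
    return False
```

A caller would run `wait_until_ready(me_probe(base_url, token))` and refuse to send traffic when it returns False. A bare port check cannot distinguish a listening socket from a backend that is actually serving, which is the gap this gate closes.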
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* ci: doccano-django sample coverage gate (build vs release)
Adds .github/workflows/doccano-django.yml — runs ONLY on changes
under doccano-django/ (or this workflow file) so unrelated
samples in this repo don't pay the doccano runtime cost.
Three jobs:
* `build-coverage` — checks out the PR's HEAD ref, brings up
the sample's compose, drives flow.sh bootstrap +
record-traffic with a per-call audit log enabled, runs
flow.sh coverage. Captures the percentage as a job output.
* `release-coverage` — same end-to-end against
github.event.pull_request.base.ref (typically main) so we
have a baseline to compare against. Skipped on direct push
events to main (no baseline to diff against — main IS the
baseline).
* `coverage-gate` — fails the PR if build's coverage drops
more than COVERAGE_THRESHOLD pp below release.
COVERAGE_THRESHOLD defaults to 1.0pp; override with the
`DOCCANO_COVERAGE_THRESHOLD` actions variable per-repo.
Sticky-comments the PR with the diff via
marocchino/sticky-pull-request-comment so reviewers see the
delta inline.
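The gate rule itself is small. A minimal sketch, assuming both coverages are percentages and the threshold is in percentage points; the function name `coverage_gate` and the rounding step are illustrative choices, not taken from the workflow:

```python
def coverage_gate(build, release, threshold_pp=1.0):
    """Return (passed, drop_pp) for a build-vs-release coverage comparison.

    The gate fails only when build coverage falls more than
    threshold_pp percentage points below release coverage.
    """
    # Round to absorb floating-point noise such as 59.0 - 57.9 = 1.1000000000000014.
    drop = round(release - build, 2)
    return drop <= threshold_pp, drop
```

A drop of exactly the threshold passes under this reading of "drops more than"; an improvement shows up as a negative drop and always passes.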
The two measurement jobs share their body via
.github/workflows/scripts/run-and-measure.sh — same script,
different ref. Lifting it out of the YAML keeps the YAML focused
on orchestration (matrix / outputs / artifacts) and the bash on
the actual workflow logic.
Coverage source uses flow.sh's per-call audit log
(DOCCANO_FIRED_ROUTES_FILE). That makes the measurement genuinely
keploy-independent: the workflow doesn't run keploy at all,
doesn't compare against recorded test sets, just measures what
the sample's flow.sh ACTUALLY exercises against doccano's URL
resolver. Lane scripts in keploy/integrations and keploy/enterprise
consume the same flow.sh but use the keploy/test-set-*/tests/*.yaml
tree as their numerator (authoritative — only calls keploy actually
captured count). Both modes are wired into
flow.sh::doccano_list_recorded_routes via the
DOCCANO_FIRED_ROUTES_FILE fallback.
Sample-side changes:
* flow.sh::doccano_wait_for_fixed_token extracted as its own
function. It was previously inlined into doccano_bootstrap_token,
which broke doccano_record_traffic's forward reference to it and
made the script fail fast, silently, under set -e.
* flow.sh::doccano_record_traffic gates on
doccano_wait_for_fixed_token before any curl fires. An open
port isn't a sufficient readiness signal under SIGINT-driven
shutdown: the very first curl -fsS POST would get a 5xx from a
still-booting gunicorn and silently kill the script.
* flow.sh::log_fired writes (METHOD, URL) to
DOCCANO_FIRED_ROUTES_FILE before each curl in
doccano_record_traffic. Cheap, optional (no-op when env var
unset), and keeps the audit log adjacent to the curl that
produces it so future contributors can't add a curl without
also adding the log entry.
* flow.sh::doccano_list_recorded_routes falls back to the audit
log when no keploy/test-set-*/tests/*.yaml exists — the
standalone-mode numerator the workflow needs.
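The audit log and its fallback role can be sketched as below. This is a Python illustration of the behaviour described above, not the bash in flow.sh: the one-line-per-call `METHOD URL` file format and the name `numerator_source` are assumptions, while `log_fired` and `DOCCANO_FIRED_ROUTES_FILE` are the names used in this commit.

```python
import os


def log_fired(method, url, env=os.environ):
    """Append one fired call to the audit log; no-op when the env var is unset."""
    path = env.get("DOCCANO_FIRED_ROUTES_FILE")
    if not path:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{method} {url}\n")  # assumed line format
    return True


def numerator_source(keploy_routes, env=os.environ):
    """Prefer routes parsed from keploy tests; fall back to the audit log."""
    if keploy_routes:
        return "keploy", list(keploy_routes)
    path = env.get("DOCCANO_FIRED_ROUTES_FILE")
    if not path or not os.path.isfile(path):
        return "none", []
    with open(path, encoding="utf-8") as handle:
        pairs = [tuple(line.split(None, 1)) for line in handle if line.strip()]
    # dict.fromkeys keeps first-seen order while dropping duplicate pairs.
    return "audit-log", list(dict.fromkeys((m, u.strip()) for m, u in pairs))
```

Keeping the keploy tree as the preferred source matches the note above that only calls keploy actually captured are authoritative for the lanes.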
Verified locally: workflow body (`run-and-measure.sh`) runs
end-to-end against bare doccano in ~3 minutes, captures 16
unique (method, path) pairs, emits coverage=11.1% to
GITHUB_OUTPUT. The gate logic itself is plain bash + python3
arithmetic; no codecov/coveralls dependency, no hosted service
needed.
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* ci(doccano-django): graceful bootstrap when base ref lacks the sample
Run 25196349264 (the very PR introducing doccano-django/) failed
in release-coverage with:
An error occurred trying to start process '/usr/bin/bash' with
working directory '.../doccano-django'. No such file or directory
Expected: the workflow checks out the PR's base ref to compute
the baseline coverage, but on the introducing PR there's no
baseline — `doccano-django/` doesn't exist on main yet.
Fix: a `detect` step inspects whether `doccano-django/flow.sh`
exists on the checked-out base ref. If yes, the measurement
runs as before. If no (first-PR-bootstrap case), an
`empty-baseline` step emits coverage=0.0 onto the job output,
the measurement step is skipped via `if:`, and the upload-
artifact step is also skipped (so we don't claim a non-existent
report file). The job's `outputs.coverage` falls back through
`||` so the gate sees 0.0 either way.
Net effect on the introducing PR: build's coverage (currently
~11%) is compared against 0%, gate trivially passes. After
this PR merges and a future PR edits doccano-django/, the
detect step finds the sample on main, real measurement runs,
real diff applies.
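The detect-then-fallback logic reduces to a file-existence check. A hedged sketch in Python (the workflow itself expresses this with step conditions; `baseline_coverage` and the `measure` callback are hypothetical names used only here):

```python
from pathlib import Path


def baseline_coverage(base_checkout, measure):
    """Return baseline coverage, or 0.0 when the base ref lacks the sample.

    base_checkout -- directory where the PR's base ref is checked out
    measure       -- callable that runs the real measurement on that checkout
    """
    flow_script = Path(base_checkout) / "doccano-django" / "flow.sh"
    if not flow_script.is_file():
        return 0.0  # first-PR bootstrap case: nothing to measure yet
    return measure(base_checkout)
```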
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* feat(doccano-django): real Python line coverage via coverage.py overlay
Replaces the prior API-route-surface "coverage" (which counted
fired routes / known routes — a proxy that read like real coverage
but didn't measure code execution) with actual line coverage via
coverage.py 7.6.1.
Architecture:
- `Dockerfile.coverage` extends `doccano/doccano:backend` to
install coverage[toml] and drop a `coverage_subprocess.pth`
file into site-packages, so every gunicorn worker that forks
auto-starts `coverage.process_startup()`.
- `.coveragerc` runs in parallel mode (one .coverage.<pid> per
worker) with sigterm = true so flushing happens on graceful
shutdown.
- `docker-compose.coverage.yml` is an OVERLAY: the GH Actions
coverage workflow applies it via `-f docker-compose.yml -f
docker-compose.coverage.yml`. The base `Dockerfile` and
`docker-compose.yml` are untouched, so keploy/integrations and
keploy/enterprise CI lanes consume the base compose and pay
zero coverage-instrumentation cost.
- `flow.sh::doccano_report_coverage` shells into the running
backend, runs `coverage combine` + `coverage report
--format=total`, emits `Covered N/M (XX.X%)` matching the
helper script's regex. When called against the base image
(no overlay) it prints "INFO: ... uninstrumented" and exits 0
so enterprise lanes' `flow.sh coverage || true` informational
calls keep working.
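As an illustration of how a consumer might read the `Covered N/M (XX.X%)` line, here is a small parser. The regular expression is an assumption written for this example; the helper script's actual regex is not shown in this commit message.

```python
import re

# Assumed pattern for lines shaped like "Covered N/M (XX.X%)".
COVERED_RE = re.compile(r"Covered (\d+)/(\d+) \((\d+(?:\.\d+)?)%\)")


def parse_coverage_line(text):
    """Return (covered, total, percent), or None when no such line is present."""
    match = COVERED_RE.search(text)
    if match is None:
        return None  # e.g. output from an uninstrumented base image
    return int(match.group(1)), int(match.group(2)), float(match.group(3))
```

Returning None rather than raising lets callers treat the uninstrumented case as informational, in line with the exit-0 behaviour described above.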
Removed:
- `doccano_list_routes` (the Django URL-resolver walk).
- `doccano_list_recorded_routes` (the keploy-tests / fired-routes
reader).
- The legacy route-surface `doccano_report_coverage` body.
- `list-routes` subcommand (was diagnostic only for the surface
metric).
Validated locally: e2e run produced `coverage=59.0` to
GITHUB_OUTPUT against a clean stack (gunicorn 4 workers, traffic
loop fired, SIGTERM flush, combine+report inside container).
59% reflects bootstrap + the sample's small traffic surface;
adding curls to flow.sh::doccano_record_traffic moves the
number up.
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* ci(doccano-django): drop trailing prose from sticky comment
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* docs(doccano-django): split run section into smoke / coverage / keploy modes
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* fix(doccano-django): own skip-bootstrap replay mode
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
* fix(doccano-django): nest globalNoise schema so the filter actually applies
keploy's GlobalNoise type is map[string]map[string][]string —
outer key is the response section ("header" / "body"), inner
key is the field name. Flat dotted keys like `body.created_at: []`
get put into the outer map as literal key "body.created_at" and
never match any section, so the noise is silently dropped. The
template's drift-suppression list was a no-op; only Date got
ignored at compare time because keploy auto-stamps Date as per-test
noise on every recording, and everything else slipped through.
Same shape of fix landed in samples-typescript/umami-postgres in
93bbdae; documenting the gotcha in this template's comments so
the next sample doesn't repeat it.
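To make the gotcha concrete, here is a simplified Python model of a section-then-field lookup. It is not keploy's matcher, only a sketch of why a flat dotted key never matches; `is_noisy` and `nest_flat_keys` are hypothetical helpers.

```python
def is_noisy(global_noise, section, field):
    """Model lookup: outer key is the section, inner key is the field name."""
    return field in global_noise.get(section, {})


def nest_flat_keys(flat):
    """Convert {"body.created_at": [...]} into {"body": {"created_at": [...]}}.

    Assumes every key has the form "<section>.<field>".
    """
    nested = {}
    for key, patterns in flat.items():
        section, _, field = key.partition(".")
        nested.setdefault(section, {})[field] = list(patterns)
    return nested


flat = {"body.created_at": [], "header.Expires": []}
nested = nest_flat_keys(flat)

# The flat form has no "body" section at all, so the lookup finds nothing.
assert not is_noisy(flat, "body", "created_at")
assert is_noisy(nested, "body", "created_at")
```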
Validation pending: doccano cells haven't run yet on the active
keploy/enterprise PR (#1889). The matrix-cell collision fix that
landed in keploy/enterprise#1889 (commit 84cd64b1) opens up the
lane enough for the noise filter to actually be exercised against
real drift.
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
---------
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
1 parent 57856de commit 5d5ce0f
12 files changed
Lines changed: 1027 additions & 0 deletions
File tree
- .github/workflows
- scripts
- doccano-django
0 commit comments