feat(segment_membership): Daily Snowflake-backed per-env segment counts by khvn26 · Pull Request #7464 · Flagsmith/flagsmith

khvn26 · 2026-05-08T23:05:48Z

Thanks for submitting a PR! Please check the boxes below:

I have read the Contributing Guide.
I have added information to docs/ if required so people know about the feature.
I have filled in the "Changes" section below.
I have filled in the "How did you test this code" section below.

Changes

Contributes to Segment Membership Inspection.

Adds a daily pipeline that backfills Dynamo identities into Snowflake, materialises per-(segment, environment) match counts via flagsmith-sql-flag-engine, and exposes them on the segment endpoint as memberships: [{environment, count, last_synced_at}] for env-dropdown badges. Gated behind the org-scoped segment_membership_inspection FoF flag; no-ops when SNOWFLAKE_* env vars are unset.
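The two gates described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the variable names in `REQUIRED_VARS` are hypothetical (the description only says `SNOWFLAKE_*`), and `flag_enabled` stands in for whatever performs the org-scoped `segment_membership_inspection` check.

```python
from collections.abc import Callable, Mapping

# Hypothetical names: the PR only refers to these as "SNOWFLAKE_*".
REQUIRED_VARS = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER")


def is_snowflake_configured(environ: Mapping[str, str]) -> bool:
    """True only when every required variable is set and non-empty."""
    return all(environ.get(name) for name in REQUIRED_VARS)


def organisations_to_process(
    organisation_ids: list[int],
    environ: Mapping[str, str],
    flag_enabled: Callable[[int], bool],
) -> list[int]:
    """Apply both gates: global configuration first, then the per-org flag."""
    if not is_snowflake_configured(environ):
        # Unconfigured deployments (e.g. self-hosted) do no work at all.
        return []
    return [org_id for org_id in organisation_ids if flag_enabled(org_id)]
```

Passing `environ` and `flag_enabled` in explicitly keeps the sketch testable without a real environment or flag client; the actual helper may well read `os.environ` or Django settings directly.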

Review complexity: 4/5. Three datastores, two new runtime deps, and a new Django app with recurring + handler tasks push it up; the FoF flag gate and a purely additive read path pull it down.

Review order: models.py (cache table) → services.py (compile + count, parameterised SQL) → tasks.py (daily recurring backfill fans out per-project refresh) → mappers.py (Dynamo doc → IDENTITIES row) → migrations/0002_* (Snowflake DDL RunPython, no-op when unconfigured) → segments/serializers.py + views.py (read-side memberships field, prefetched).

How did you test this code?

36 unit tests + 2 integration tests; 100% coverage on segment_membership/. make lint and make typecheck (mypy strict) green.

Backfills identities from Dynamo to Snowflake daily, then refreshes per-(segment, environment) match counts in the new `SegmentMembership` cache. The translator from `flagsmith-sql-flag-engine` turns each canonical segment into a SQL `WHERE` predicate; counts are materialised as `COUNT(*) ... GROUP BY environment_id` per segment. The serializer surfaces them as a list of `{environment, count, last_synced_at}`, ready to back per-env count badges in the Identities-tab environment dropdown.

Pipeline shape:

- `backfill_identities_to_snowflake` is the daily recurring task (`timeout=4h` to fit large environments). After backfilling each project's environments it dispatches one `refresh_project_segment_counts(project_id)` per project so the count refresh always sees the freshly backfilled snapshot rather than racing a separate schedule.
- `refresh_project_segment_counts` opens its own Snowpark session, re-checks the FoF flag at execution time so a stale fan-out skips orgs that have since been disabled, and bulk-upserts via Postgres `ON CONFLICT` (single statement per project).
- `compute_segment_counts_for_project` returns a list of unsaved `SegmentMembership` instances; the task stamps `last_synced_at` consistently across the batch. Untranslatable segments emit a structlog `compute.segment.skipped` error event so we hear about predicate gaps rather than silently dropping rows.

Both tasks short-circuit when SNOWFLAKE_* env vars are unset and skip per-organisation when the `segment_membership_inspection` Flagsmith-on-Flagsmith flag is False, so SaaS rolls out gradually and self-hosted is unaffected.

DELETE-then-INSERT runs without an explicit transaction. Snowflake holds micropartition locks for the lifetime of an open transaction, and at 10M+ identities a BEGIN/COMMIT around the whole env partition would keep that lock open for minutes. Per-statement implicit commits leave a brief mid-refresh window where readers see an empty partition; acceptable under the FoF flag's gradual rollout.

Backfill writes via Snowpark DataFrames against the canonical IDENTITIES schema, with `DynamoIdentity` documents projected through `segment_membership.mappers.map_identity_document_to_snowflake_row`. Refresh issues a single batched UNION ALL using parameterised SQL: env keys are bound, and predicates from the engine are already escape-safe.

Schema setup is a `RunPython` migration gated on `is_snowflake_configured()`, so it no-ops on self-hosted and in the test suite.

The segment serializer surfaces cached counts via a new `memberships` list field; absence of an entry is the read-side signal, no flag check on the read path. `SegmentMembershipSerializer` gives drf-spectacular a typed schema.

Adds a generic `batched` helper to `api/util/util.py` for the per-INSERT batching.

beep boop
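To make the batching and the single UNION ALL refresh concrete, here is a small sketch that runs against an in-memory SQLite database rather than Snowflake. Everything in it is an assumption for illustration: the table layout, the `build_count_query` helper, the `?` parameter style, and the signature of `batched` are not taken from the PR. The predicates are interpolated as trusted SQL, mirroring the statement that engine output is already escape-safe, while environment keys are always bound.

```python
import sqlite3
from collections.abc import Iterable, Iterator
from itertools import islice


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def build_count_query(predicates: dict[int, str], env_keys: list[str]) -> tuple[str, list]:
    """Build one UNION ALL statement with a COUNT(*) branch per segment.

    Predicates are trusted SQL fragments (assumed to come from the translator);
    environment keys are only ever passed as bound parameters.
    """
    placeholders = ", ".join("?" for _ in env_keys)
    branches, params = [], []
    for segment_id, predicate in predicates.items():
        branches.append(
            f"SELECT {int(segment_id)} AS segment_id, environment_id, COUNT(*) AS n "
            f"FROM identities WHERE environment_id IN ({placeholders}) "
            f"AND ({predicate}) GROUP BY environment_id"
        )
        params.extend(env_keys)  # each branch binds its own copy of the keys
    return " UNION ALL ".join(branches), params


# Stand-in data: (environment_id, plan, age) rows inserted in batches of two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE identities (environment_id TEXT, plan TEXT, age INTEGER)")
rows = [
    ("env-a", "pro", 30),
    ("env-a", "free", 17),
    ("env-b", "pro", 45),
    ("env-b", "pro", 16),
    ("env-c", "pro", 50),
]
for batch in batched(rows, 2):
    conn.executemany("INSERT INTO identities VALUES (?, ?, ?)", batch)

sql, params = build_count_query({1: "plan = 'pro'", 2: "age >= 18"}, ["env-a", "env-b"])
counts = sorted(conn.execute(sql, params).fetchall())
```

Each resulting `(segment_id, environment_id, n)` tuple corresponds to one cached count row; environments with no matching identities produce no tuple at all, which lines up with the note that absence of an entry is the read-side signal.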

vercel · 2026-05-08T23:05:54Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	May 9, 2026 3:06am

2 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
flagsmith-frontend-preview	Ignored	Preview	May 9, 2026 3:06am
flagsmith-frontend-staging	Ignored	Preview	May 9, 2026 3:06am

github-actions · 2026-05-08T23:06:55Z

Docker builds report

Image	Build Status	Security report
`ghcr.io/flagsmith/flagsmith-e2e:pr-7464`	Finished ✅	Skipped
`ghcr.io/flagsmith/flagsmith-api-test:pr-7464`	Finished ✅	Skipped
`ghcr.io/flagsmith/flagsmith-frontend:pr-7464`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith-api:pr-7464`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith:pr-7464`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith-private-cloud:pr-7464`	Finished ✅	Results ✅

…ps prefetch

The new `prefetch_related("memberships")` adds one IN-clause query per list response, even when no rows exist. Update the regression expectations so the existing test suite reflects the new baseline.

beep boop

codecov · 2026-05-08T23:35:41Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.45%. Comparing base (e4651d1) to head (bff85b4).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7464      +/-   ##
==========================================
+ Coverage   98.44%   98.45%   +0.01%     
==========================================
  Files        1398     1410      +12     
  Lines       52654    53117     +463     
==========================================
+ Hits        51834    52297     +463     
  Misses        820      820

… pre-release

Switches the api dep from a private-repo git URL, which the Docker build can't clone in CI, to a versioned pin against Flagsmith's staging CodeArtifact PyPI (`flagsmith-pypi-staging`, account 302456015006, eu-west-2). Initial published release: 0.1.0a1.

The reusable docker-build workflow now unconditionally assumes the OIDC role `arn:aws:iam::302456015006:role/codeartifact-github-actions-staging` (trust policy allows any `repo:Flagsmith/*`), fetches an authorisation token, and exposes it to every build as the `codeartifact_token` BuildKit secret. Builds that don't mount the secret simply ignore it; the OIDC + token cost is a couple of seconds per build.

`Dockerfile`'s four `make install*` lines mount the `codeartifact_token` secret and export `POETRY_HTTP_BASIC_FLAGSMITH_PYPI_STAGING_*` so poetry resolves the dep from CodeArtifact. The header documents the `--secret="id=codeartifact_token,env=..."` incantation for local builds.

beep boop

github-actions · 2026-05-09T00:11:55Z

Playwright Test Results (oss - depot-ubuntu-latest-16)

1 passed

Details

1 test across 1 suite
32.1 seconds
fede18a
🔄 Run: #16619 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

1 passed

Details

1 test across 1 suite
42.4 seconds
fede18a
🔄 Run: #16619 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

3 passed

Details

3 tests across 3 suites
37.3 seconds
fede18a
🔄 Run: #16619 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

2 failed

Details

2 tests across 2 suites
22.8 seconds
fede18a
📦 Artifacts: View test results and HTML report
🔄 Run: #16619 (attempt 1)

Failed tests

2 tests across 2 suites
42.8 seconds
4d954a2
🔄 Run: #16625 (attempt 1)

github-actions · 2026-05-09T00:13:33Z

Visual Regression

16 screenshots compared. See report for details.
View full report

…fact

The unit-test, MCP-schema-push, makefile-target, and update-flagsmith workflows all run `make install-packages`, which now needs CodeArtifact credentials to resolve the `flagsmith-sql-flag-engine` pre-release. Encapsulate the OIDC role assumption + token fetch in a composite action, reuse it from the Docker build workflow, and wire it into every workflow that runs poetry install.

beep boop

Adds four global Prometheus metrics covering the daily Dynamo→Snowflake backfill and the per-project count refresh: identities mirrored, per-environment backfill duration, refresh duration, and refresh failures. Metrics are global because env/project labels would blow Prometheus cardinality at SaaS scale.

Snowpark sessions now carry a QUERY_TAG for spend attribution, set via Snowpark's `session.query_tag` setter. Backfill tags by org+project per env iteration; refresh tags by org+project. Spend grouped by tag is queryable from Snowflake's QUERY_HISTORY for 365 days.

beep boop
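As a rough illustration of the org+project tagging, the sketch below builds a tag string and assigns it through a `query_tag` attribute. The JSON layout and the `build_query_tag` helper are assumptions, since the commit message does not show the tag format, and a `SimpleNamespace` stands in for a real Snowpark session so the example runs without Snowflake.

```python
import json
from types import SimpleNamespace


def build_query_tag(organisation_id: int, project_id: int) -> str:
    """Serialise org and project into a stable string (hypothetical format)."""
    return json.dumps(
        {"organisation_id": organisation_id, "project_id": project_id},
        sort_keys=True,
    )


# Stand-in for a Snowpark Session; only the `query_tag` attribute is modelled.
session = SimpleNamespace(query_tag=None)
session.query_tag = build_query_tag(organisation_id=7, project_id=42)
```

A structured tag like this would make it straightforward to group spend by organisation or project when querying history, though a plain delimited string would serve equally well.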

khvn26 requested review from a team as code owners May 8, 2026 23:05

khvn26 requested review from gagantrivedi and removed request for a team May 8, 2026 23:05

claude Bot reviewed May 8, 2026

github-actions Bot added the api, docs, and feature labels and removed the docs label May 8, 2026

github-advanced-security AI found potential problems May 8, 2026

Comment thread api/segment_membership/mappers.py Fixed

github-actions Bot added the docs and feature labels and removed the feature and docs labels May 8, 2026

khvn26 requested a review from a team as a code owner May 9, 2026 00:05

github-actions Bot added the docs and feature labels and removed the feature and docs labels May 9, 2026

github-actions Bot added the docs label and removed the feature and docs labels May 9, 2026

github-actions Bot added the feature label May 9, 2026

khvn26 force-pushed the feat/segment-membership-counts branch from c6e464b to bff85b4 Compare May 9, 2026 00:18

github-actions Bot added the docs and feature labels and removed the feature and docs labels May 9, 2026

fix(segment_membership): Derive IDENTITIES.id from UUID bytes, not MD5

aa45090

CodeQL flagged the MD5 truncation as a sensitive-data hashing risk. UUIDv4 already gives us the random bits we need for a dedup key, so take the high 64 bits directly via int.from_bytes and drop the hash. beep boop
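The derivation described in this commit can be sketched as below. The function name, big-endian byte order, and unsigned interpretation are assumptions; the commit message only states that the high 64 bits are taken via `int.from_bytes`.

```python
import uuid


def identity_id_from_uuid(identity_uuid: str) -> int:
    """Return the high 64 bits of a UUID as an unsigned integer."""
    raw = uuid.UUID(identity_uuid).bytes  # 16 bytes, most significant first
    # Assumption: big-endian and unsigned; a signed 64-bit column would need
    # signed=True instead.
    return int.from_bytes(raw[:8], byteorder="big")
```

With these choices the result equals `uuid.UUID(identity_uuid).int >> 64`. Six of the 64 retained bits are effectively fixed or absent for a UUIDv4 (the version nibble sits in the high half), so the key carries slightly fewer than 64 random bits.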

khvn26 force-pushed the feat/segment-membership-counts branch from 0bd838f to aa45090 Compare May 9, 2026 00:37

github-actions Bot added the docs and feature labels and removed the feature and docs labels May 9, 2026

khvn26 mentioned this pull request May 9, 2026

feat(segment_membership): Surface identity counts in segments UI #7467

Open

4 tasks

khvn26 requested a review from a team as a code owner May 9, 2026 03:05

github-actions Bot added the docs label May 9, 2026

vercel Bot deployed to Preview – docs May 9, 2026 03:06 View deployment

github-actions Bot added the feature label and removed the feature and docs labels May 9, 2026

khvn26 wants to merge 6 commits into main from feat/segment-membership-counts

2 participants
