Trt 1989 migration queries#3542
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
WalkthroughDenormalize release and timestamp into child Prow tables, add model fields/indexes, thread those fields through ingestion and seed data, update SQL functions/views/queries and API callers to use per-run/per-test fields, and add planning docs for phased partitioning and indexing. ChangesTRT-1989 Denormalization for PostgreSQL Partitioning
Sequence Diagram(s)sequenceDiagram
participant GCS as processGCSBucketJobRun
participant Persister as DB transaction
participant PRJoin as ProwJobRunProwPullRequest
participant Extractor as prowJobRunTestsFromGCS
participant TestCase as extractTestCases
GCS->>Persister: create ProwJobRun (ProwJobRelease, Timestamp)
Persister->>PRJoin: insert per-PR join rows
GCS->>Extractor: call with prowJobID, prowJobRelease
Extractor->>TestCase: extractTestCases(prowJobRelease, prowJobStartTime)
TestCase->>Persister: create ProwJobRunTest (ProwJobID, ProwJobRunRelease, ProwJobRunTimestamp)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 3 warnings)
✅ Passed checks (13 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: neisw The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
docs/plans/trt-1989-phase2-indexes.md (1)
14-16: ⚡ Quick winClarify the intentional index-order exception.
The doc says indexes mirror
(release, timestamp), butprow_job_run_testsis documented as(timestamp, release). Add an explicit “except forprow_job_run_tests” qualifier where the mirror rule is introduced.Also applies to: 41-43, 127-128
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/trt-1989-phase2-indexes.md` around lines 14 - 16, Update the description that says indexes mirror the future partition key "(release, timestamp)" to explicitly note the intentional exception for the prow_job_run_tests index (which is documented as "(timestamp, release)"); add a parenthetical or sentence like "except for prow_job_run_tests, which intentionally uses (timestamp, release)" at the point where the mirror rule is introduced and make the same clarification in the other repeated occurrences of this rule later in the doc (the subsequent paragraph and the later section that repeats the mirror-index guidance).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/plans/trt-1989-partitioning-prep.md`:
- Around line 86-88: The sentence incorrectly describes FK direction: change the
wording that currently reads "Tables with FKs from `prow_job_runs` to
non-partitioned tables (annotations, pull request join table)..." to clarify
that annotations and the pull-request join table hold FKs referencing
`prow_job_runs` (i.e., FKs to `prow_job_runs`), and therefore any tables that
reference `prow_job_runs` must be co-partitioned with it or have their FKs
dropped; update the line and any examples to use "tables with FKs to
`prow_job_runs` (annotations, pull request join table) must either be
co-partitioned or have their FKs dropped" and ensure surrounding text
consistently treats `prow_job_runs` as the referenced table.
In `@docs/plans/trt-1989-phase2-indexes.md`:
- Around line 36-37: Update the doc line that incorrectly references
ProwJobRunTest.ProwJobID: it should point to the prow_job_run_tests table /
ProwJobRunTests model field instead of prow_job_runs; replace the reference with
either "prow_job_run_tests" or "ProwJobRunTests.ProwJobID" so the standalone
index is correctly attributed to the prow_job_run_tests model.
In `@docs/plans/trt-1989-phase3-query-optimization.md`:
- Around line 28-31: The document's principle "Add filters, don't replace"
contradicts later rollout rules that replace/drop joins; update the principle
heading and first paragraph (the "Add filters, don't replace" paragraph) to
explicitly match the rollout: state that join-based filters are retained until
local columns are validated and tested, then replaced/dropped when safe, and
mirror this clarified wording in the other affected sections referenced
(sections covering the same guidance around lines 73-80, 81-86, and 126-132) so
the high-level principle and the detailed rollout rules are consistent.
---
Nitpick comments:
In `@docs/plans/trt-1989-phase2-indexes.md`:
- Around line 14-16: Update the description that says indexes mirror the future
partition key "(release, timestamp)" to explicitly note the intentional
exception for the prow_job_run_tests index (which is documented as "(timestamp,
release)"); add a parenthetical or sentence like "except for prow_job_run_tests,
which intentionally uses (timestamp, release)" at the point where the mirror
rule is introduced and make the same clarification in the other repeated
occurrences of this rule later in the doc (the subsequent paragraph and the
later section that repeats the mirror-index guidance).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 44fd03d2-a25c-47af-9a09-37386bb12690
📒 Files selected for processing (16)
cmd/sippy/seed_data.godocs/plans/trt-1989-partitioning-prep.mddocs/plans/trt-1989-phase2-indexes.mddocs/plans/trt-1989-phase3-query-optimization.mdpkg/api/componentreadiness/dataprovider/postgres/provider.gopkg/api/job_runs_test.gopkg/api/recent_test_failures.gopkg/dataloader/prowloader/prow.gopkg/db/db.gopkg/db/functions.gopkg/db/models/prow.gopkg/db/query/build_clusters.gopkg/db/query/job_queries.gopkg/db/query/test_queries.gopkg/db/views.gopkg/sippyserver/pr_new_tests_worker_test.go
| Also added a standalone index on `ProwJobRunTest.ProwJobID` to support | ||
| variant queries that previously required joining through `prow_job_runs`. |
There was a problem hiding this comment.
Correct the model/table reference for the standalone index.
The text points to ProwJobRunTest.ProwJobID, but this index is on prow_job_run_tests (ProwJobRunTests model field), not prow_job_runs.
Proposed doc fix
-Also added a standalone index on `ProwJobRunTest.ProwJobID` to support
+Also added a standalone index on `ProwJobRunTests.ProwJobID` to support📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Also added a standalone index on `ProwJobRunTest.ProwJobID` to support | |
| variant queries that previously required joining through `prow_job_runs`. | |
| Also added a standalone index on `ProwJobRunTests.ProwJobID` to support | |
| variant queries that previously required joining through `prow_job_runs`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/plans/trt-1989-phase2-indexes.md` around lines 36 - 37, Update the doc
line that incorrectly references ProwJobRunTest.ProwJobID: it should point to
the prow_job_run_tests table / ProwJobRunTests model field instead of
prow_job_runs; replace the reference with either "prow_job_run_tests" or
"ProwJobRunTests.ProwJobID" so the standalone index is correctly attributed to
the prow_job_run_tests model.
|
Scheduling required tests: |
|
/hold |
9cefc7d to
66753c4
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
docs/plans/trt-1989-phase3-query-optimization.md (1)
28-31:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAlign the guiding principle with the rollout rules.
This principle conflicts with later sections that replace filters and drop joins after local-column validation. Please reword it to: keep join-based filters initially, then replace/drop when validated safe.
Proposed doc fix
-1. **Add filters, don't replace** — keep existing join-based filters - alongside the new local filters. The planner uses whichever is cheaper. - Removing them risks behavior change if denormalized data has gaps. +1. **Add filters first, then replace when validated** — keep existing + join-based filters alongside new local filters during rollout. + After local denormalized columns are validated, replace old filters + and drop no-longer-needed joins where safe.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/trt-1989-phase3-query-optimization.md` around lines 28 - 31, Update the principle titled "Add filters, don't replace" so it matches the rollout rules: reword the sentence to state that join-based filters should be retained initially alongside new local filters and only replaced/dropped after local-column validation proves it safe; ensure the new wording explicitly says "keep join-based filters initially, then replace/drop when validated safe" and that it aligns with later sections that describe replacing filters and dropping joins post-validation.docs/plans/trt-1989-partitioning-prep.md (1)
86-88:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix inverted FK direction in migration constraints.
This still states FKs are from
prow_job_runsto annotations/PR join rows, but those tables referenceprow_job_runs. Please flip the direction to avoid incorrect partitioning/FK planning.Proposed doc fix
-3. Tables with FKs **from** `prow_job_runs` to non-partitioned tables - (annotations, pull request join table) must either be co-partitioned - or have their FKs dropped. +3. Tables with FKs **to** `prow_job_runs` from non-partitioned tables + (annotations, pull request join table) must either be co-partitioned + or have their FKs dropped.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/trt-1989-partitioning-prep.md` around lines 86 - 88, The sentence incorrectly states foreign keys are "from prow_job_runs to annotations/pull request join table"; update the wording to say the annotations and pull request join table have FKs referencing prow_job_runs (i.e., FKs are to prow_job_runs from those tables) and therefore those referencing tables must be co-partitioned with prow_job_runs or have their FKs dropped; locate and change the phrase mentioning prow_job_runs, annotations, and pull request join table so the FK direction is flipped to "annotations/PR join rows -> prow_job_runs" and clarify the co-partition/drop requirement.
🧹 Nitpick comments (1)
pkg/db/query/test_queries.go (1)
281-291: ⚡ Quick winAdd short inline comments for major query sections in changed builders.
The modified JOIN/WHERE/grouping sections are non-trivial and currently undocumented in-place; please add brief “why”-focused comments for the main sections.
As per coding guidelines, "BigQuery and SQL query-building code should have inline comments explaining the purpose of each major query section (CTEs, JOINs, window functions, WHERE clauses)."
Also applies to: 321-325, 336-339
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/db/query/test_queries.go` around lines 281 - 291, The query builder chains (the q := dbc.DB.Table(...) blocks that join prow_job_run_test_outputs -> prow_job_run_tests -> prow_job_runs -> prow_jobs and the grouped WHERE clauses using current_date - interval '14' day and release/test filters) need short inline “why” comments: add brief comments above the JOIN group explaining why these tables are joined and what relationship/intent each JOIN captures, and above the WHERE block explaining the purpose of the 14-day time filters and why release/test filters are applied; do the same for the other similar q := dbc.DB.Table(...) builders in this file (the other blocks that start the same chain and contain the WHERE/grouping logic) so each major section (JOINs, time-window WHEREs, release/test filters, and any grouping/aggregation sections) has a succinct purpose-focused comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/db/query/test_queries.go`:
- Around line 285-291: Remove the legacy parent-table predicates that
over-constrain the denormalized TestOutputs path: drop the
Where("prow_job_runs.timestamp > current_date - interval '14' day") and
Where("prow_jobs.release = ?", release) clauses from the query so that filtering
relies on the child-table fields
(prow_job_run_test_outputs.prow_job_run_test_timestamp,
prow_job_run_test_outputs.prow_job_run_test_release,
prow_job_run_tests.prow_job_run_timestamp,
prow_job_run_tests.prow_job_run_release, and prow_job_run_tests.test_id = (?))
and thus preserves the denormalized/partition/index benefits.
---
Duplicate comments:
In `@docs/plans/trt-1989-partitioning-prep.md`:
- Around line 86-88: The sentence incorrectly states foreign keys are "from
prow_job_runs to annotations/pull request join table"; update the wording to say
the annotations and pull request join table have FKs referencing prow_job_runs
(i.e., FKs are to prow_job_runs from those tables) and therefore those
referencing tables must be co-partitioned with prow_job_runs or have their FKs
dropped; locate and change the phrase mentioning prow_job_runs, annotations, and
pull request join table so the FK direction is flipped to "annotations/PR join
rows -> prow_job_runs" and clarify the co-partition/drop requirement.
In `@docs/plans/trt-1989-phase3-query-optimization.md`:
- Around line 28-31: Update the principle titled "Add filters, don't replace" so
it matches the rollout rules: reword the sentence to state that join-based
filters should be retained initially alongside new local filters and only
replaced/dropped after local-column validation proves it safe; ensure the new
wording explicitly says "keep join-based filters initially, then replace/drop
when validated safe" and that it aligns with later sections that describe
replacing filters and dropping joins post-validation.
---
Nitpick comments:
In `@pkg/db/query/test_queries.go`:
- Around line 281-291: The query builder chains (the q := dbc.DB.Table(...)
blocks that join prow_job_run_test_outputs -> prow_job_run_tests ->
prow_job_runs -> prow_jobs and the grouped WHERE clauses using current_date -
interval '14' day and release/test filters) need short inline “why” comments:
add brief comments above the JOIN group explaining why these tables are joined
and what relationship/intent each JOIN captures, and above the WHERE block
explaining the purpose of the 14-day time filters and why release/test filters
are applied; do the same for the other similar q := dbc.DB.Table(...) builders
in this file (the other blocks that start the same chain and contain the
WHERE/grouping logic) so each major section (JOINs, time-window WHEREs,
release/test filters, and any grouping/aggregation sections) has a succinct
purpose-focused comment.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 1d655e29-3ac0-40c3-bada-f206e2d07cf1
📒 Files selected for processing (16)
cmd/sippy/seed_data.godocs/plans/trt-1989-partitioning-prep.mddocs/plans/trt-1989-phase2-indexes.mddocs/plans/trt-1989-phase3-query-optimization.mdpkg/api/componentreadiness/dataprovider/postgres/provider.gopkg/api/job_runs_test.gopkg/api/recent_test_failures.gopkg/dataloader/prowloader/prow.gopkg/db/db.gopkg/db/functions.gopkg/db/models/prow.gopkg/db/query/build_clusters.gopkg/db/query/job_queries.gopkg/db/query/test_queries.gopkg/db/views.gopkg/sippyserver/pr_new_tests_worker_test.go
✅ Files skipped from review due to trivial changes (1)
- docs/plans/trt-1989-phase2-indexes.md
🚧 Files skipped from review as they are similar to previous changes (12)
- pkg/api/job_runs_test.go
- pkg/db/query/job_queries.go
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
- pkg/db/functions.go
- pkg/db/db.go
- pkg/api/recent_test_failures.go
- cmd/sippy/seed_data.go
- pkg/sippyserver/pr_new_tests_worker_test.go
- pkg/db/query/build_clusters.go
- pkg/db/views.go
- pkg/db/models/prow.go
- pkg/dataloader/prowloader/prow.go
|
Scheduling required tests: |
|
Scheduling required tests: |
|
@neisw: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Depends on #3541, updates queries to use new indexes.
Summary by CodeRabbit
Chores
Bug Fixes
Tests