ci(SP-4166): add reusable schema sync check workflow #5

Open
isasmendiagus wants to merge 1 commit into main from feat/SP-4166/schema-sync-ci

isasmendiagus (Contributor) commented Mar 19, 2026

Summary

  • Adds a reusable workflow schema-sync-check.yml that consumer repos can call to verify their vendored schema is in sync with the source of truth
  • Inputs: source-file, target-file, fail-on-diff
  • On drift: shows diff and provides a curl command to fix

Test plan

  • Consumer repo (scanoss.py) calls this workflow and CI passes when schemas are in sync
  • Consumer repo CI fails with diff when schemas are out of sync
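
For context, a consumer repo might call the workflow along these lines. This is a minimal sketch, not the actual scanoss.py configuration: the `example-org/schema` location, the `@main` ref, and both file paths are placeholders, and it assumes `fail-on-diff` is declared as a boolean input.

```yaml
# Hypothetical caller workflow in a consumer repo (all names below are placeholders).
name: Schema sync

on:
  pull_request:
  push:
    branches: [main]

jobs:
  schema-sync:
    # Assumes the reusable workflow lives in example-org/schema; adjust owner, repo, and ref.
    uses: example-org/schema/.github/workflows/schema-sync-check.yml@main
    with:
      source-file: path/in/schema-repo/schema.json   # placeholder path in the source of truth
      target-file: path/in/consumer/schema.json      # placeholder path of the vendored copy
      fail-on-diff: true                             # assumed boolean; false would only warn
```

With `fail-on-diff: true`, the second test-plan case (schemas out of sync) would be expected to fail the job.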

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Added a reusable CI workflow that validates vendored schema files remain in sync with source schemas. It accepts configurable inputs for source/target paths and a fail-on-diff option. When differences are detected it reports status, provides guidance to update the vendored file, and can optionally fail the run to enforce compliance.

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94446814-41f3-4f6f-8b71-ff71c4a59654

📥 Commits

Reviewing files that changed from the base of the PR and between 014db52 and 5719594.

📒 Files selected for processing (1)
  • .github/workflows/schema-sync-check.yml
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/schema-sync-check.yml

📝 Walkthrough


Introduces a reusable GitHub Actions workflow that checks whether a source schema file and a vendored target schema file are identical. It accepts source-file, target-file, and fail-on-diff inputs, runs a unified diff, and reports results via GitHub annotations and outputs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| New Schema Sync Workflow (.github/workflows/schema-sync-check.yml) | Adds a reusable workflow_call workflow with inputs source-file, target-file, fail-on-diff. Checks out caller and an external schema repo, validates file existence, runs diff -u, sets result output, and emits error or warning depending on fail-on-diff. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Caller Workflow
  participant Runner as GitHub Runner
  participant RepoA as Caller Repo (working dir)
  participant SchemaRepo as External Schema Repo

  Caller->>Runner: invoke schema-sync-check with inputs
  Runner->>RepoA: checkout caller repo
  Runner->>SchemaRepo: checkout external schema repo into .schema-source
  Runner->>Runner: resolve absolute paths to source & target files
  Runner->>Runner: validate both files exist (error if missing)
  Runner->>Runner: run `diff -u source target`
  alt no differences
    Runner->>Caller: log "schemas in sync" and set result=ok
  else differences
    Runner->>Caller: log error, print example curl, set result=out-of-sync
    alt fail-on-diff = true
      Runner->>Caller: emit failing ::error:: and exit 1
    else fail-on-diff = false
      Runner->>Caller: emit ::warning:: (non-failing)
    end
  end
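
The branch on `fail-on-diff` in the diagram could be expressed as a follow-up step like the one below. It is a sketch under assumptions: the step id `compare` and the `result` output appear elsewhere in this review, while the step name, the `FAIL_ON_DIFF` and `TARGET_FILE` variable names, and the boolean type of `fail-on-diff` are illustrative guesses rather than the PR's actual code.

```yaml
# Fragment: sits under `steps:` in the reusable workflow's job, after the compare step.
- name: Report drift            # hypothetical step name
  if: steps.compare.outputs.result == 'out-of-sync'
  shell: bash
  env:
    FAIL_ON_DIFF: ${{ inputs.fail-on-diff }}   # a boolean input arrives as "true" or "false"
    TARGET_FILE: ${{ inputs.target-file }}
  run: |
    if [ "$FAIL_ON_DIFF" = "true" ]; then
      # Failing annotation: marks the run as failed.
      echo "::error::Schema out of sync: ${TARGET_FILE}"
      exit 1
    else
      # Non-failing annotation: the run stays green but the drift is visible.
      echo "::warning::Schema out of sync: ${TARGET_FILE}"
    fi
```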

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 I nibbled lines of schema bright,
Tuned the diff by lantern light,
Source and vendor side by side,
I sniffed the mismatch, gave a sigh—
Hop! Sync restored, I bound with pride.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically identifies the main change: addition of a reusable schema sync check workflow, with proper CI scope prefix and ticket reference. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
.github/workflows/schema-sync-check.yml (2)

28-32: Consider making the schema repository name configurable.

The workflow assumes the schema repository is always named schema under the same owner. This works for your current use case but limits reusability. If you want broader adoption, consider adding an optional input for the repository name.

Additionally, no ref is specified, so it defaults to the repository's default branch. This is likely intentional for "source of truth" semantics, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/schema-sync-check.yml around lines 28 - 32, Update the
"Checkout schema repo" step to accept a configurable repository name instead of
hardcoding repository: ${{ github.repository_owner }}/schema; add a workflow
input (e.g. inputs.schema_repo or INPUT_SCHEMA_REPO) and use it in the checkout
action so the value can be overridden, and optionally add an input for ref
(branch/tag) so actions/checkout@v4 can use a specific ref; make sure to keep
the checkout target path as .schema-source to preserve downstream references.
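
The configurable-repository suggestion above might look roughly like this. It is a sketch only: `schema-repo` is a hypothetical input name, the job id `check` is assumed, and the workflow's existing inputs are omitted for brevity.

```yaml
# Sketch only: `schema-repo` is a hypothetical input name, not part of the PR.
on:
  workflow_call:
    inputs:
      schema-repo:
        description: "owner/name of the schema repository; empty means <caller owner>/schema"
        type: string
        required: false
        default: ""

jobs:
  check:                         # hypothetical job id
    runs-on: ubuntu-latest
    steps:
      - name: Checkout schema repo
        uses: actions/checkout@v4
        with:
          # Fall back to the current hardcoded behaviour when the input is empty.
          repository: ${{ inputs.schema-repo || format('{0}/schema', github.repository_owner) }}
          path: .schema-source   # unchanged so later steps keep working
```

Defaulting to an empty string keeps existing callers working unchanged, since an empty value is falsy in workflow expressions and the `||` falls through to the owner-based name.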

37-60: Avoid direct interpolation of inputs in shell scripts.

Directly interpolating ${{ inputs.target-file }} into the shell script (line 39) is risky if inputs contain shell metacharacters, because the expression is substituted into the script text before the shell parses it. While workflow_call inputs typically come from trusted workflow files, it's safer to pass inputs via environment variables.

♻️ Proposed fix using environment variables
       - name: Compare schemas
         id: compare
         shell: bash
+        env:
+          SOURCE_FILE: ${{ inputs.source-file }}
+          TARGET_FILE: ${{ inputs.target-file }}
+          REPO_OWNER: ${{ github.repository_owner }}
         run: |
-          source_path=".schema-source/${{ inputs.source-file }}"
-          target_path="${{ inputs.target-file }}"
+          source_path=".schema-source/${SOURCE_FILE}"
+          target_path="${TARGET_FILE}"
 
           if [ ! -f "$source_path" ]; then
-            echo "::error::Source schema not found: ${{ inputs.source-file }}"
+            echo "::error::Source schema not found: ${SOURCE_FILE}"
             exit 1
           fi
 
           if [ ! -f "$target_path" ]; then
             echo "::error::Local vendored schema not found: $target_path"
             exit 1
           fi
 
           if diff -u "$source_path" "$target_path"; then
             echo "Schema is in sync."
           else
             echo ""
             echo "::error::Schema out of sync: $target_path"
             echo ""
             echo "To fix, run:"
-            echo "  curl -sL https://raw.githubusercontent.com/${{ github.repository_owner }}/schema/main/${{ inputs.source-file }} -o $target_path"
+            echo "  curl -sL https://raw.githubusercontent.com/${REPO_OWNER}/schema/main/${SOURCE_FILE} -o $target_path"
             echo "result=out-of-sync" >> "$GITHUB_OUTPUT"
           fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/schema-sync-check.yml around lines 37 - 60, The workflow
directly interpolates inputs like `${{ inputs.target-file }}` into the shell
script (used to build `source_path` and `target_path`), which can allow shell
metacharacters to be interpreted; change the job step to pass inputs into the
shell via environment variables (e.g., set SOURCE_FILE and TARGET_FILE in the
step's env) and then reference those env vars inside the script when
constructing `source_path` and `target_path`, and ensure any values written to
`GITHUB_OUTPUT` (e.g., `result=out-of-sync`) continue to be appended safely;
update the occurrences that reference `${{ inputs.source-file }}` and `${{
inputs.target-file }}` in the script to use the new env vars instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/schema-sync-check.yml:
- Line 58: The curl line hardcodes "main"; add a workflow input (e.g.,
source-ref with default "main") and use that input instead of the literal "main"
in the echo/curl command (the line containing curl -sL
https://raw.githubusercontent.com/${{ github.repository_owner }}/schema/main/${{
inputs.source-file }} -o $target_path) and also use the same input when running
the checkout step (the checkout action's ref) so both checkout and the curl
fetch the same branch/ref; update references to use ${{ inputs.source-ref }} (or
similarly named input) throughout the job.
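
One way the `source-ref` suggestion could be wired up is sketched below. The input name follows the comment above; the job id, the separate "Print fix hint" step, and the environment variable names are assumptions for illustration, and the existing `source-file` and `target-file` inputs are omitted for brevity.

```yaml
# Sketch only: `source-ref` follows the name suggested in the comment above.
on:
  workflow_call:
    inputs:
      source-ref:
        description: "Branch, tag, or SHA of the schema repo to compare against"
        type: string
        required: false
        default: main

jobs:
  check:                         # hypothetical job id
    runs-on: ubuntu-latest
    steps:
      - name: Checkout schema repo
        uses: actions/checkout@v4
        with:
          repository: ${{ github.repository_owner }}/schema
          ref: ${{ inputs.source-ref }}        # same ref as the curl hint below
          path: .schema-source

      # In the PR the hint is printed inside the compare step; it is split out here for brevity.
      - name: Print fix hint
        shell: bash
        env:
          REPO_OWNER: ${{ github.repository_owner }}
          SOURCE_REF: ${{ inputs.source-ref }}
          SOURCE_FILE: ${{ inputs.source-file }}
          TARGET_FILE: ${{ inputs.target-file }}
        run: |
          echo "To fix, run:"
          echo "  curl -sL https://raw.githubusercontent.com/${REPO_OWNER}/schema/${SOURCE_REF}/${SOURCE_FILE} -o ${TARGET_FILE}"
```

Using a single input for both places means the file that CI diffs against and the file the suggested curl command downloads come from the same ref.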


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: baf837e7-af36-4486-a70f-ee934025a02a

📥 Commits

Reviewing files that changed from the base of the PR and between 6891ed6 and 014db52.

📒 Files selected for processing (1)
  • .github/workflows/schema-sync-check.yml

isasmendiagus force-pushed the feat/SP-4166/schema-sync-ci branch from 014db52 to 5719594 on March 19, 2026 at 10:47