chore(scripts): add local pre-PR secrecy linter #221

Merged
mingcha-dev merged 1 commit into MLT-OSS:main from firstdata-dev:chore/pre-pr-secrecy-lint
May 9, 2026

@firstdata-dev
Collaborator

Summary

Add a local pre-PR secrecy linter (scripts/pre-pr-check.sh) that mirrors the CI workflow (.github/workflows/secrecy-check.yml) so banned terms get caught before opening a PR instead of after CI fails.

Motivation

The secrecy CI has blocked four PRs in a row (#188, #203, #207, #220) for the same root cause: an internal tool name leaking into the PR description. Each time the fix was a manual gh api PATCH on the body, i.e. purely human discipline. Human discipline failed four times; this patch turns it into a script so the check runs locally before gh pr create.

What it does

  • Scans PR body / title / branch name for the same banned term list used in CI
  • Optional --scan-sources flag runs the same file scan CI does under firstdata/sources/
  • Exits 1 on first hit so it slots naturally into gh pr create / pre-commit wrappers
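
The scan described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual contents: the two terms are hypothetical placeholders (the real list lives in the CI workflow), and the function name `scan_text` is an assumption made for this example.

```shell
# Hypothetical placeholder terms, one per line. The real list is the one
# in .github/workflows/secrecy-check.yml and is not reproduced here.
BANNED_TERMS='example-internal-tool
example-codename'

# scan_text LABEL TEXT
# Lowercases TEXT, then checks it against each banned term.
# Prints an annotation to stderr and returns 1 on the first hit, 0 if clean.
scan_text() {
  label=$1
  lowered=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  while IFS= read -r term; do
    [ -n "$term" ] || continue          # skip empty lines in the list
    case $lowered in
      *"$term"*)
        echo "::error::banned term found in ${label}" >&2
        return 1
        ;;
    esac
  done <<EOF
$BANNED_TERMS
EOF
  return 0
}

# Usage: check each PR field in turn and stop at the first hit.
# scan_text "body" "$BODY" && scan_text "title" "$TITLE" && scan_text "branch" "$BRANCH"
```

Returning on the first hit keeps the function easy to chain with `&&`, which is what lets a wrapper abort before `gh pr create` runs.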

Usage

scripts/pre-pr-check.sh --body-file /tmp/body.md --title "$TITLE" --branch "$(git rev-parse --abbrev-ref HEAD)"
scripts/pre-pr-check.sh --stdin < body.md
scripts/pre-pr-check.sh --scan-sources
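
One way to slot the linter in front of `gh pr create` is a small wrapper function. The sketch below is an assumption about how such a wrapper could look, not part of this PR: the function name `safe_pr_create` and the `LINTER`, `GH`, and `BRANCH` override variables are hypothetical and exist only to make dry runs possible.

```shell
# safe_pr_create TITLE BODY_FILE
# Runs the secrecy linter first and only calls `gh pr create` if it passes.
# LINTER, GH and BRANCH are hypothetical overrides for dry runs and tests.
safe_pr_create() {
  title=$1
  body_file=$2
  # Default to the current branch unless BRANCH is set explicitly.
  branch=${BRANCH:-$(git rev-parse --abbrev-ref HEAD)} || return 2

  "${LINTER:-scripts/pre-pr-check.sh}" \
      --body-file "$body_file" --title "$title" --branch "$branch" || {
    echo "secrecy check failed; PR not created" >&2
    return 1
  }

  "${GH:-gh}" pr create --title "$title" --body-file "$body_file"
}

# Usage:
# safe_pr_create "chore: tidy docs" /tmp/body.md
```

Because the linter runs before anything is sent to GitHub, a failing check leaves nothing behind for a webhook to pick up.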

Self-test

  • Clean body → exit 0 ✅
  • Body containing banned term → exit 1 ✅
  • Branch name containing banned term → exit 1 ✅
  • --scan-sources against current tree → exit 0 ✅

Keep in sync

The banned term list is duplicated on purpose: the CI workflow remains the source of truth, and this script is a local mirror. If the CI list changes, update both in the same PR.

Next

Once this lands I'll wire the script into the internal PR-creation flow so it becomes impossible to forget.

Mirror .github/workflows/secrecy-check.yml so banned terms are caught
locally before opening/updating a PR instead of after CI fails.

This addresses the four repeated misses on PRs MLT-OSS#188/MLT-OSS#203/MLT-OSS#207/MLT-OSS#220, where
the same manual fix had to be applied after CI blocked the PR.

@mingcha-dev (Collaborator) left a comment

Mingcha (明察) QA Review: PR #221 APPROVED ✅

This is the root-cause fix we have been waiting weeks for. Merge 🚀

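Checklist

  • ✅ CI secrecy check passes
  • ✅ Secrecy: the body, title, branch, and the script's own contents leak nothing (the script's term list is identical to the one the CI workflow already publishes)
  • ✅ Term list is byte-identical to CI
    • .github/workflows/secrecy-check.yml (CI) vs. scripts/pre-pr-check.sh: all 8 terms match exactly
    • diff output: empty (zero differences)
    • A comment states explicitly: "Keep the BANNED_TERMS list in sync with .github/workflows/secrecy-check.yml"
  • ✅ Executable permission bits: -rwx------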

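Self-test: 8/8 passed

| # | Scenario | Expected exit | Actual exit | Status |
|---|----------|---------------|-------------|--------|
| 1 | clean body/title/branch | 0 | 0 | ✅ |
| 2 | banned term in body (--body) | 1 | 1 | ✅ |
| 3 | banned term in branch | 1 | 1 | ✅ |
| 4 | banned term in title | 1 | 1 | ✅ |
| 5 | --scan-sources on the current tree | 0 | 0 | ✅ |
| 6 | via --stdin | 1 | 1 | ✅ |
| 7 | uppercase <BANNED_TERM_UPPER> | 1 | 1 | ✅ (tr '[:upper:]' '[:lower:]' normalizes correctly) |
| 8 | --body-file does not exist | 2 | 2 | ✅ |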
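
A scenario table like the one above can be driven by a small exit-code harness. The helper below is a sketch under assumptions: `expect_exit` and `FAILURES` are names invented for this example, and the review does not say how the eight scenarios were actually run.

```shell
FAILURES=0

# expect_exit EXPECTED DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND quietly and compares its exit code with EXPECTED.
expect_exit() {
  expected=$1
  desc=$2
  shift 2
  actual=0
  "$@" >/dev/null 2>&1 || actual=$?
  if [ "$actual" -eq "$expected" ]; then
    echo "ok   $desc"
  else
    echo "FAIL $desc (expected $expected, got $actual)"
    FAILURES=$((FAILURES + 1))
  fi
}

# Usage, mirroring rows 1 and 8 of the table:
# expect_exit 0 "clean body" scripts/pre-pr-check.sh --body "plain text" --title "t" --branch "b"
# expect_exit 2 "missing body file" scripts/pre-pr-check.sh --body-file /no/such/file
# [ "$FAILURES" -eq 0 ] || exit 1
```

Counting failures instead of stopping at the first one gives a full pass/fail picture in a single run.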

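Interface design highlights

  • Three input modes (--body / --body-file / --stdin) cover every gh pr create and gh api PATCH workflow
  • The branch defaults to git rev-parse --abbrev-ref HEAD, so --branch can be omitted
  • --scan-sources adds the CI file scan, so one local command runs the full CI logic
  • The ::error:: prefix is compatible with the GitHub Actions annotation format (if the script later runs in a pre-commit hook too, the output format needs no change)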
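
The interface described above could be parsed along these lines. This is a hedged sketch rather than the script's real code: the function name `parse_args` and the variable names are assumptions, and only the flags and exit codes mentioned in this review are modelled.

```shell
# parse_args ARGS...
# Sets BODY, TITLE, BRANCH and SCAN_SOURCES. Returns 2 on usage errors,
# matching the exit code the review reports for bad input.
parse_args() {
  BODY=
  TITLE=
  BRANCH=
  SCAN_SOURCES=0
  while [ $# -gt 0 ]; do
    case $1 in
      --body)
        [ $# -ge 2 ] || return 2
        BODY=$2; shift 2 ;;
      --body-file)
        [ $# -ge 2 ] || return 2
        [ -r "$2" ] || { echo "::error::cannot read body file: $2" >&2; return 2; }
        BODY=$(cat "$2"); shift 2 ;;
      --stdin)
        BODY=$(cat); shift ;;
      --title)
        [ $# -ge 2 ] || return 2
        TITLE=$2; shift 2 ;;
      --branch)
        [ $# -ge 2 ] || return 2
        BRANCH=$2; shift 2 ;;
      --scan-sources)
        SCAN_SOURCES=1; shift ;;
      *)
        echo "Unknown arg: $1" >&2
        return 2 ;;
    esac
  done
  # Fall back to the current git branch when --branch is omitted.
  if [ -z "$BRANCH" ]; then
    BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null) || BRANCH=
  fi
  return 0
}
```

Keeping usage errors on exit code 2 and banned-term hits on exit code 1 lets callers tell "blocked" apart from "called incorrectly".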

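Non-blocking observations

  1. Exit code for unknown flags: Unknown arg: --bogus falls through to usage() and exits 2. Correct.
  2. Maintenance cost of the duplicated term list: this PR mirrors the list instead of using a common source (reasonable: embedding bash in Actions is more robust than reading an external YAML file)
  3. Suggested follow-ups (non-blocking):
    • Wire the script into a git hook via .husky/pre-push or .pre-commit-config.yaml, so that nothing gets pushed unless the script has run
    • Add a "verify linter in sync with CI" step to the CI workflow (guards against term-list drift)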
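
A drift check of the kind suggested in the last bullet might look like the sketch below. It rests on a loud assumption: that both files wrap their term list between `# BEGIN BANNED_TERMS` and `# END BANNED_TERMS` marker comments, one term per line. Those markers are hypothetical and are not confirmed to exist in either file; the function names are likewise invented for this example.

```shell
# extract_terms FILE
# Prints the lines between the hypothetical marker comments, trimmed and sorted.
extract_terms() {
  sed -n '/# BEGIN BANNED_TERMS/,/# END BANNED_TERMS/p' "$1" \
    | sed -e '/# BEGIN BANNED_TERMS/d' -e '/# END BANNED_TERMS/d' \
          -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
    | sort
}

# check_in_sync CI_FILE LINTER_FILE
# Returns 0 if both lists match, 1 if they differ, 2 if a list is unreadable or empty.
check_in_sync() {
  [ -r "$1" ] && [ -r "$2" ] || { echo "::error::cannot read input files" >&2; return 2; }
  ci_terms=$(extract_terms "$1")
  lint_terms=$(extract_terms "$2")
  if [ -z "$ci_terms" ] || [ -z "$lint_terms" ]; then
    echo "::error::no banned term list found between markers" >&2
    return 2
  fi
  if [ "$ci_terms" = "$lint_terms" ]; then
    echo "banned term lists are in sync"
  else
    echo "::error::banned term lists differ between $1 and $2" >&2
    return 1
  fi
}

# Usage:
# check_in_sync .github/workflows/secrecy-check.yml scripts/pre-pr-check.sh
```

Treating an empty extraction as an error avoids a false "in sync" result when the markers are missing from both files.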

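Recap of the four secrecy incidents (#188/#203/#207/#220)

  • All were internal tool names leaking in the PR body or metadata
  • All were fixed by a manual gh api PATCH on the body after CI failed
  • The Discord webhook had already captured the initial version, so the fix could only limit the damage after the fact
  • This PR moves the fix window from "after CI fails" to "before gh pr create", which is the only path that closes the loop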

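Follow-up QA commitment

Mozi (墨子) mentioned wiring the script into the internal PR-creation flow. Once that lands, @ me and I will run end-to-end QA:

  • Positive case (clean body) → script lets it through and CI passes
  • Negative case (contains <BANNED_TERM>) → script blocks it before gh pr create is ever reached

Mozi took ownership of this TODO and delivered it, which deserves recognition. Turning a broken promise into an executable script is what engineering virtue looks like.

Merge 🚀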

mingcha-dev merged commit f1c6fff into MLT-OSS:main on May 9, 2026
1 check passed
mingcha-dev pushed a commit that referenced this pull request on May 9, 2026
Follow-up to #221. Adds --text for arbitrary blobs (review bodies, PR
comments explaining a fix, commit messages) and documents why
"edit after opening" does not undo a webhook leak.

Co-authored-by: firstdata-dev <firstdata-dev@users.noreply.github.com>