ImpactGuard is a lightweight API impact analyzer that supports Python, TypeScript, Java, Go, Rust, C, C++, and Ruby. It is designed to maintain API stability by tracking function signatures across commits, detecting breaking changes, and analyzing call-site impact using both static and runtime techniques.
It provides a quantitative risk framework to help developers understand the consequences of code changes before they are merged.
- Multi-language Extraction: Automatically extracts function signatures from Python (`ast`), TypeScript, Java, Go, Rust, C, C++, and Ruby via tree-sitter grammars (with regex fallback)
- Semantic API Diffing: Classifies changes into a taxonomy of breaking (e.g., removing positional arguments) vs. non-breaking (e.g., adding optional keyword-only arguments)
- Impact Analysis: Correlates signature changes with static call-site extraction and optional runtime tracing to identify affected downstream code
- Risk Assessment: Quantifies the danger of a change using the S × E × C × λ (Severity × Exposure × Confidence × Lambda) model
- Automated Remediation: Generates format-preserving patches using LibCST to fix broken call sites
| Component | Module | Description |
|---|---|---|
| Signature Extraction | `extract_signatures.py` | AST-based extraction of function metadata |
| Signature Comparison | `compare_signatures.py` | Semantic diffing of API changes |
| Call-Site Analysis | `extract_calls.py`, `analyze_module.py` | Static call-site extraction and resolution |
| Impact Analysis | `impact_analysis.py` | Correlates changes with call sites |
| Risk Model | `risk_model.py` | S × E × C × λ risk scoring |
| Risk Gate | `risk_gate.py`, `enforce_gate.py` | CI enforcement engine |
| Runtime Tracing | `trace_calls.py`, `trace_calls_prod.py` | Development and production tracers |
| Patch Generation | `cst_patch.py`, `patch_generator.py` | Format-preserving automated fixes |
| Reporting | `generate_report.py` | Static HTML report generation |
| Robustness Evaluation | `tools/robustness_evaluator.py` | Composite robustness score, fragility index, diversity |
| CLI | `cli.py` | Command-line interface |
| Language | Extensions | Extraction Backend | Signature Extraction | Call-Site Extraction | Type Annotations | Optional Dependency | Status |
|---|---|---|---|---|---|---|---|
| Python | `.py` | `ast` (stdlib) | Yes | Yes | Yes | — | Stable |
| TypeScript | `.ts`, `.tsx` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| Java | `.java` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| Go | `.go` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| Rust | `.rs` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| C | `.c`, `.h` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| C++ | `.cpp`, `.hpp`, `.cc`, `.cxx`, `.hxx` | tree-sitter (preferred) / regex fallback | Yes | Yes | Yes / partial | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
| Ruby | `.rb` | tree-sitter (preferred) / regex fallback | Yes | Yes | No (no native annotations) | `pip install "impactguard[languages]"` | Stable (tree-sitter) · Best-effort (regex) |
Note: All tree-sitter backends require `tree-sitter>=0.23` plus the corresponding grammar package (e.g. `tree-sitter-java>=0.23`), installed together via `pip install "impactguard[languages]"`. When those packages are absent, ImpactGuard automatically falls back to regex-based extraction and emits a `UserWarning`. To add support for a new language, implement the `LanguageExtractor` protocol and register the extractor with `register()`.
Prerequisites:
- Python 3.11 or higher
- Dependencies: `libcst` for concrete syntax tree manipulation
- For git hooks: `pre-commit>=4.6.0`, `pyyaml>=6.0`
```bash
# Install from PyPI
pip install impactguard

# Install with tree-sitter language support (TypeScript, Java, Go, Rust, C, C++, Ruby)
pip install "impactguard[languages]"

# Or install for development
git clone https://github.com/daedalus/ImpactGuard.git
cd ImpactGuard
pip install -e ".[test]"
```

| Path | Description |
|---|---|
| `src/impactguard/` | Core package containing the analysis logic, risk model, and CLI |
| `extract_signatures.py` | Utility for extracting function metadata into JSON/text |
| `extract_calls.py` | AST-based call-site extractor |
| `impact_analysis.py` | Logic for correlating signatures with call sites |
| `risk_gate.py` | The CI-ready enforcement engine |
| `trace_calls.py` | Runtime instrumentation for capturing live execution data |
| `SPEC.md` | Technical specification and public API |
The full pipeline can be executed with the `impactguard` CLI:

```bash
# 1. Extract signatures and calls (Python — uses stdlib ast)
impactguard extract $(git ls-files '*.py') > .signatures.json
impactguard extract-calls $(git ls-files '*.py') > .calls.json

# Extract from other supported languages (requires impactguard[languages])
impactguard extract $(git ls-files '*.java' '*.go' '*.rs') > .signatures.json
impactguard extract-calls $(git ls-files '*.java' '*.go' '*.rs') > .calls.json

# 2. Capture runtime exposure (optional)
impactguard trace dump .runtime_calls.json

# 3. Compare and analyze risk
impactguard risk diff.txt .runtime_calls.json report.json

# 4. Generate the report
impactguard report report.json api_report.html
```

Or use the Python API:
```python
from impactguard import run_pipeline

result = run_pipeline(
    old_path="old_version/",
    new_path="new_version/",
    runtime_path="runtime.json",  # optional
    output_path="report.html"     # optional
)
print(f"Breaking changes: {len(result['comparison']['breaking'])}")
```

ImpactGuard operates as a pipe-and-filter architecture in which artifacts from one stage inform the next.
The first stage involves deep inspection of source files. For Python, the stdlib `ast` module is used to walk the Abstract Syntax Tree. For all other supported languages (TypeScript, Java, Go, Rust, C, C++, Ruby), tree-sitter grammars provide accurate, battle-tested AST parsing, with a regex fallback when the tree-sitter packages are absent. Every supported language produces the same schema: Fully Qualified Name (FQN), parameters, defaults, and decorators/annotations.
- Key Component: `extract_signatures.py` (Python) · `src/impactguard/languages/` (all languages)
- Output: `.signatures.json`
- Role: Establishes the baseline of the API surface
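To make the extraction stage concrete, here is a minimal stdlib-`ast` sketch of the idea. It is illustrative only: the real `extract_signatures.py` produces a richer schema, and the JSON keys below are assumptions, not the shipped format.

```python
import ast
import json
import sys

def extract_signatures(path: str) -> list[dict]:
    """Walk a Python file's AST and emit one record per function."""
    with open(path, encoding="utf-8") as fh:
        tree = ast.parse(fh.read(), filename=path)
    records = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            records.append({
                "fqn": f"{path}:{node.name}",  # FQN convention from the glossary
                "params": [a.arg for a in node.args.args],
                "kwonly": [a.arg for a in node.args.kwonlyargs],
                "n_defaults": len(node.args.defaults),
                "decorators": [ast.unparse(d) for d in node.decorator_list],
                "lineno": node.lineno,
            })
    return records

if __name__ == "__main__":
    sigs = [s for p in sys.argv[1:] for s in extract_signatures(p)]
    print(json.dumps(sigs, indent=2))
```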
Once two snapshots of a codebase exist (e.g., HEAD vs main), the compare utility performs a semantic diff. Unlike a standard text-based diff, this stage understands Python's parameter rules. It categorizes changes into Breaking (e.g., removing a parameter, reordering positional arguments) and Non-breaking (e.g., adding an optional keyword argument).
- Key Component: `compare_signatures.py`
- Output: A structured list of semantic changes
- Role: Identifies exactly how the API contract has evolved
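A toy classifier over positional parameter names conveys the semantics; the shipped taxonomy in `compare_signatures.py` is far finer-grained, and the labels below are illustrative:

```python
def classify_change(old_params: list[str], new_params: list[str]) -> str:
    """Classify a positional-parameter change as breaking or non-breaking."""
    removed = [p for p in old_params if p not in new_params]
    if removed:
        return f"BREAKING: removed parameter(s): {', '.join(removed)}"
    # Same shared parameters in a different order → positional reorder
    shared_old = [p for p in old_params if p in new_params]
    shared_new = [p for p in new_params if p in old_params]
    if shared_old != shared_new:
        return "BREAKING: positional reorder"
    added = [p for p in new_params if p not in old_params]
    if added:
        # Breaking only if the new parameters lack defaults; a full
        # implementation would consult the defaults recorded per signature.
        return f"POTENTIALLY BREAKING: added parameter(s): {', '.join(added)}"
    return "NO CHANGE"

print(classify_change(["a", "b"], ["b", "a"]))  # BREAKING: positional reorder
```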
To understand the "blast radius" of a change, ImpactGuard must find where the modified functions are actually used. This is achieved through two complementary approaches:
- Lightweight Extraction: Rapidly finding call nodes in the AST
- Deep Module Analysis: Tracking imports and assignments to resolve method calls to their actual definitions (FQN resolution)
- Key Components: `extract_calls.py` and `analyze_module.py`
- Output: `.calls.json`
- Role: Maps the internal dependency graph of the codebase
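The lightweight path amounts to a walk over `ast.Call` nodes; a hedged sketch follows (the real `extract_calls.py` records more context, and `analyze_module.py` performs the FQN resolution):

```python
import ast

def extract_calls(path: str) -> list[dict]:
    """Record every call site in a Python file with its argument shape."""
    with open(path, encoding="utf-8") as fh:
        tree = ast.parse(fh.read(), filename=path)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):          # foo(...)
                name = func.id
            elif isinstance(func, ast.Attribute):   # obj.foo(...)
                name = func.attr
            else:
                continue  # lambdas, subscripts, etc.
            calls.append({
                "callee": name,
                "lineno": node.lineno,
                "n_positional": len(node.args),
                "keywords": [k.arg for k in node.keywords if k.arg],
            })
    return calls
```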
The final stage of the core pipeline, analyze, correlates the detected API changes with the discovered call sites. It validates whether the arguments passed at a specific call site still satisfy the requirements of the new function signature. If runtime data is available, it is integrated here to provide context on how often a specific impacted path is actually executed.
- Key Component: `impact_analysis.py`
- Input: Signature diffs, call-site data, and optional runtime traces
- Role: Pinpoints exactly which lines of code are broken by a change
These changes do NOT break existing callers:
- Adding optional parameters: `def foo(a, b=1)` → `def foo(a, b=1, c=0)` (no callers need to change)
- Adding keyword-only arguments: `def foo(a)` → `def foo(a, *, debug=False)` (existing callers unaffected)
- Adding new functions/classes: Entirely new APIs that don't affect existing code
- Adding `*args` or `**kwargs`: `def foo(a)` → `def foo(a, *args)` (backward compatible)
These changes WILL break existing callers:
- Removing required parameters: `def foo(a, b)` → `def foo(a)` (callers passing `b` will fail)
- Reordering positional arguments: `def foo(a, b)` → `def foo(b, a)` (callers' positional args swap)
- Removing functions/methods: Any callable that's removed entirely
- Changing parameter types: `def foo(a: int)` → `def foo(a: str)` (type safety breaks)
The Risk Model and Enforcement subsystem is the decision-making engine of ImpactGuard. It transforms raw signature changes and runtime telemetry into actionable risk levels (HIGH, MEDIUM, LOW, or UNKNOWN). These levels are then used to automatically block or permit CI/CD pipelines based on the potential impact on consumers.
The core logic resides in `risk_model.py`. It quantifies risk by evaluating three distinct dimensions, scaled by a tunable sensitivity multiplier λ:
| Component | Code Entity | Description |
|---|---|---|
| Severity (S) | `get_severity()` | Score (0.1 to 1.0) based on change type (e.g., REMOVED = 1.0, ADDED = 0.1) |
| Exposure (E) | `exposure()` | Logarithmic scale mapping call counts to a 0.0-1.0 range |
| Confidence (C) | `confidence()` | Measures data reliability based on sample size against a threshold |
| Lambda (λ) | `--lambda` / `lambda_` | Sensitivity multiplier (default 1.0). Values >1 increase sensitivity; values <1 decrease it |
| Classification | `classify()` | Uses a decision tree to assign the final risk label |
Exposure Calculation: `min(1.0, log(1 + count) / log(1 + max_count))`
Sensitivity Tuning:
- `--lambda=2` — doubles effective severity, making ImpactGuard more sensitive (more changes flagged HIGH/MEDIUM)
- `--lambda=0.5` — halves effective severity, making ImpactGuard less sensitive (fewer changes flagged HIGH/MEDIUM)
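Putting the pieces together, here is a minimal sketch of how the four factors compose, using the exposure formula above. Function names mirror the table; the exact thresholds and plumbing live in `risk_model.py`.

```python
import math

def exposure(count: int, max_count: int) -> float:
    """Documented log scale: min(1.0, log(1 + count) / log(1 + max_count))."""
    if max_count <= 0:
        return 0.0
    return min(1.0, math.log(1 + count) / math.log(1 + max_count))

def risk_score(severity: float, count: int, max_count: int,
               confidence: float, lambda_: float = 1.0) -> float:
    """Combine S x E x C, scaled by the lambda sensitivity multiplier."""
    return severity * exposure(count, max_count) * confidence * lambda_

# A removed function (S = 1.0) called 500 times against a 1000-call maximum,
# with fully reliable runtime data (C = 1.0) and default sensitivity:
print(f"{risk_score(1.0, 500, 1000, 1.0):.3f}")  # ~0.900
```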
The risk assessment is operationalized through `risk_gate.py` and `enforce_gate.py`:
- Risk Gate Execution: `risk_gate.py` contains the `run()` function, which parses the diff and runtime data to generate a comprehensive `report.json`
- Gate Enforcement: `enforce_gate.py` consumes this report:
  - If any item is flagged as `HIGH` risk → exits with code `1` (blocks build)
  - If `UNKNOWN` risks are detected → issues a warning but allows the build (exit code `0`)
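In CI, enforcement reduces to an exit-code contract. A minimal sketch of that contract follows; the `report.json` keys used here (`items`, `risk`) are assumptions, not the actual schema:

```python
import json
import sys

def enforce(report_path: str) -> int:
    """Exit 1 on any HIGH-risk item; warn (but pass) on UNKNOWN."""
    with open(report_path, encoding="utf-8") as fh:
        report = json.load(fh)
    levels = {item.get("risk") for item in report.get("items", [])}  # assumed schema
    if "HIGH" in levels:
        print("HIGH risk change detected - blocking build", file=sys.stderr)
        return 1
    if "UNKNOWN" in levels:
        print("warning: UNKNOWN-risk items present", file=sys.stderr)
    return 0

if __name__ == "__main__":
    sys.exit(enforce("report.json"))
```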
The Runtime Tracing subsystem provides dynamic analysis capabilities that complement ImpactGuard's static analysis pipeline. By observing actual execution patterns, the system captures "Exposure" data, which `risk_model.py` uses to weight the impact of breaking changes.
Designed for test suites and local execution, where performance is less critical than data accuracy, the development tracer uses an `@trace` decorator to capture not just call counts but also signature metadata such as argument counts and keyword-argument names.
- Key Mechanism: Uses `inspect.signature(func).bind_partial(*args, **kwargs)` to validate and record invocations
- Integration: Commonly used via `install_tracer()` in test fixtures
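A minimal sketch of the decorator mechanism, assuming a module-level call store (the shipped `trace_calls.py` records richer metadata and is wired up via `install_tracer()`):

```python
import inspect
from collections import defaultdict
from functools import wraps

CALL_LOG: dict[str, list[dict]] = defaultdict(list)

def trace(func):
    """Record argument shape on every call, then delegate to the wrapped function."""
    sig = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        # bind_partial validates the invocation against the signature and
        # tells us exactly which parameters were supplied.
        bound = sig.bind_partial(*args, **kwargs)
        CALL_LOG[func.__qualname__].append({
            "n_positional": len(args),
            "keywords": sorted(kwargs),
            "bound_params": sorted(bound.arguments),
        })
        return func(*args, **kwargs)

    return wrapper

@trace
def login(user, *, remember=False):
    return user

login("alice", remember=True)
print(CALL_LOG["login"])  # [{'n_positional': 1, 'keywords': ['remember'], ...}]
```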
Optimized for minimal overhead in live environments. It employs a probabilistic sampling strategy (default 1%) to capture a representative subset of traffic.
- Sampling Logic: Only records data if `random.random() < SAMPLE_RATE`
- Background Flushing: Periodically flushes captured counts to disk (default every 10 seconds)
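The sampling-plus-flush loop is simple enough to sketch. The constants mirror the defaults above; the output path and function names are illustrative, not the module's actual ones:

```python
import json
import random
import threading
from collections import Counter

SAMPLE_RATE = 0.01       # record roughly 1% of calls
FLUSH_INTERVAL = 10.0    # seconds between background flushes
OUT_PATH = ".runtime_calls.json"

_counts: Counter = Counter()

def record(fqn: str) -> None:
    """Probabilistic sampling keeps the hot path to one RNG call and a compare."""
    if random.random() < SAMPLE_RATE:
        _counts[fqn] += 1

def _flush_loop() -> None:
    with open(OUT_PATH, "w", encoding="utf-8") as fh:
        json.dump(dict(_counts), fh)
    timer = threading.Timer(FLUSH_INTERVAL, _flush_loop)
    timer.daemon = True  # never keep the host process alive
    timer.start()

_flush_loop()  # kick off the periodic flush
```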
| Feature | Development Tracer | Production Sampler |
|---|---|---|
| Primary Goal | Deep visibility / test coverage | Low-overhead monitoring |
| Data Captured | Counts + argument structure | Call counts only |
| Sampling | 100% (no sampling) | 1% (adjustable) |
| Storage Trigger | Manual `dump()` call | Periodic `flush()` (10 s interval) |
The Patch Generation subsystem transforms identified impact risks into actionable code fixes. It provides a multi-tiered approach to remediation, ranging from high-level suggestions to precise, format-preserving code transformations using Concrete Syntax Trees (CST).
The system first generates high-level suggestions based on the nature of the breaking change. For simple scenarios, it employs a naive line-based patching strategy using Python's `difflib`.
- Logic Location: `suggest_fixes.py` analyzes issues to recommend actions
- Naive Patching: `patch_generator.py` uses `difflib.unified_diff` for simple string replacement
To handle complex code structures, ImpactGuard utilizes LibCST. Unlike a standard AST, a Concrete Syntax Tree preserves formatting, comments, and whitespace.
- Transformers: Uses `AddDefaultTransformer` to modify function signatures and `FixCallTransformer` to inject missing arguments into call sites
- Safety: Gracefully falls back to simpler methods if `libcst` is not installed
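To make the CST approach concrete, here is a hedged, self-contained transformer in the spirit of `FixCallTransformer` (the class below is illustrative, not the shipped implementation):

```python
import libcst as cst

class InjectKeywordArg(cst.CSTTransformer):
    """Append a keyword argument to every call of a target function,
    leaving all surrounding formatting and comments untouched."""

    def __init__(self, target: str, keyword: str, value: str) -> None:
        self.target, self.keyword, self.value = target, keyword, value

    def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:
        func = updated_node.func
        if isinstance(func, cst.Name) and func.value == self.target:
            new_arg = cst.Arg(
                keyword=cst.Name(self.keyword),
                value=cst.parse_expression(self.value),
            )
            return updated_node.with_changes(args=[*updated_node.args, new_arg])
        return updated_node

source = "result = foo(1, 2)  # important comment\n"
patched = cst.parse_module(source).visit(InjectKeywordArg("foo", "debug", "False"))
print(patched.code)  # result = foo(1, 2, debug=False)  # important comment
```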
Every generated patch is assigned a confidence score (0.0 to 1.0) to determine whether it can be auto-applied; the score weighs four factors (one possible composition is sketched after this list):
- Target Certainty (T): How sure we are that we found the correct line
- Structural Safety (S): Is the change a simple default addition or a risky positional reorder?
- Semantic Risk (R): Does the change affect required parameters?
- Complexity Penalty (C): Is the code heavily decorated or nested?
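The source does not spell out how these four factors combine, so the following sketch is only one plausible composition; the names and the multiplicative weighting are assumptions:

```python
def patch_confidence(target_certainty: float, structural_safety: float,
                     semantic_risk: float, complexity_penalty: float) -> float:
    """Fold the four documented factors into [0.0, 1.0].
    Multiplicative composition is an assumption; the shipped formula
    in the patch subsystem may weight the factors differently."""
    score = target_certainty * structural_safety
    score *= (1.0 - semantic_risk)       # required-parameter changes hurt
    score *= (1.0 - complexity_penalty)  # heavy decoration/nesting hurts
    return max(0.0, min(1.0, score))

# A well-located, simple default addition on plain code scores high:
print(patch_confidence(0.95, 0.9, 0.0, 0.1))  # ~0.77
```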
The `impactguard` command-line tool is the primary entry point for developers and automation scripts.
```text
impactguard [-h] [--version]
            {extract,compare,analyze,risk,report,enforce,suggest,patch,extract-calls,trace,check,check-diff,check-commit,check-commits,install-hooks,generate-changelog,baseline,semver,report-markdown,feedback,history} ...

ImpactGuard - API impact analyzer for Python

positional arguments:
  {extract,compare,analyze,risk,report,enforce,suggest,patch,extract-calls,trace,check,check-diff,check-commit,check-commits,install-hooks,generate-changelog,baseline,semver,report-markdown,feedback,history}
                        Available commands
    extract             Extract function signatures from source files
    compare             Compare signature snapshots
    analyze             Analyze impact on call sites
    risk                Run risk analysis
    report              Generate HTML report
    enforce             Enforce gate - block on HIGH risk
    suggest             Generate fix suggestions from risk report
    patch               Generate CST-based patches
    extract-calls       Extract call sites from source files
    trace               Runtime tracing
    check               Run full ImpactGuard pipeline check
    check-diff          Run full pipeline on a unified diff / patch file
    check-commit        Run full pipeline on a single git commit vs its parent
    check-commits       Compare two git commits
    install-hooks       Install git hooks for ImpactGuard
    generate-changelog  Generate changelog from signature diffs
    baseline            Manage ImpactGuard signature baselines
    semver              Suggest semver bump from two signature snapshots
    report-markdown     Generate markdown PR comment from risk report JSON
    feedback            Manage patch-outcome feedback for confidence calibration
    history             Manage tagged release-history baselines

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
```
```bash
# Compare two versions of your code
impactguard old_version/ new_version/

# Compare two git commits directly
impactguard check-commits HEAD~1 HEAD

# Compare specific files between commits
impactguard check-commits HEAD~1 HEAD --files src/module.py src/utils.py
```

```bash
impactguard extract file1.py file2.py
impactguard compare old_sigs.json new_sigs.json
impactguard analyze signatures.json calls.json runtime.json
impactguard risk diff.txt runtime.json output.json
impactguard report risk.json output.html
impactguard trace install mypackage
impactguard trace dump runtime.json
impactguard install-hooks . --both   # Install git hooks
```

ImpactGuard uses the pre-commit framework to manage git hooks with proper YAML configuration.
```bash
# Install both pre-commit and post-commit hooks
impactguard install-hooks .

# Install only the pre-commit hook
impactguard install-hooks . --pre

# Install only the post-commit hook
impactguard install-hooks . --post

# Install hooks + GitHub Actions workflow
impactguard install-hooks . --install-github-workflow
```

The `install-hooks` command:
- Creates/updates `.pre-commit-config.yaml` with ImpactGuard hooks (using PyYAML for proper formatting)
- Runs `pre-commit install` and `pre-commit install --hook-type post-commit`
- Optionally generates `.github/workflows/impactguard.yml` for CI/CD
Hook behavior:
- Pre-commit: Runs the full ImpactGuard pipeline (`check-diff --pipe`) on staged changes
- Post-commit: Runs `check-commit HEAD` and updates signature tracking
The `impactguard` package exports its core functionality for programmatic integration.

```python
from impactguard import run_pipeline, quick_check, run_pipeline_git, ImpactGuard

# Full pipeline - extract, compare, analyze, risk, report
result = run_pipeline(
    old_path="src/",
    new_path="src/",
    runtime_path="runtime.json",
    output_path="report.html"
)

# Quick comparison only (extract + compare)
changes = quick_check("old/", "new/")
print(f"Breaking: {len(changes['comparison']['breaking'])}")

# Compare git commits
result = run_pipeline_git(
    old_ref="HEAD~1",
    new_ref="HEAD",
    files=["src/module.py"]
)

# Use the ImpactGuard class for more control
guard = ImpactGuard()
report = guard.check("old/", "new/", output="report.html")
```

```python
from impactguard import extract, compare, analyze_impact

# Extract signatures from Python files
signatures = extract(["src/module.py", "src/other.py"])

# Extract from other supported languages (tree-sitter backend)
signatures = extract(["src/main.go", "src/utils.go"])
signatures = extract(["src/lib.rs", "src/main.rs"])

# Compare two signature snapshots
result = compare("old_sigs.json", "new_sigs.json")
print(f"Breaking changes: {len(result['breaking'])}")

# Analyze impact on call sites
issues = analyze_impact("signatures.json", "calls.json", "runtime.json")
```

ImpactGuard integrates deeply into the standard Git development workflow using the pre-commit framework.
Runs the complete ImpactGuard pipeline on staged changes before allowing a commit:
```bash
impactguard check-diff --pipe --runtime .runtime_calls.json
```

This catches breaking changes early, before they enter the commit history.
After each commit, the post-commit hook:
- Runs `check-commit HEAD` to analyze the committed changes
- Updates `.signatures.txt` with current function signatures
Generate a ready-to-use CI workflow with:
```bash
impactguard install-hooks . --install-github-workflow
```

This creates `.github/workflows/impactguard.yml`, which:
- Triggers on push/PR to `main`/`master`
- Runs `check-commits` for pull requests
- Runs `check-commit` for direct pushes
- Uses `impactguard[all]` for full language support
The hooks use these entry points (automatically configured):
- `impactguard-check-staged` → runs the pipeline on the staged diff
- `impactguard-post-commit-hook` → runs post-commit analysis
The CI pipeline is defined in `.github/workflows/ci.yml` and executes on all pushes and pull requests targeting the `master` branch. It consists of three parallel jobs:
- Test Matrix: Executes `pytest` across Python versions 3.11, 3.12, and 3.13
- Static Analysis (Linting): Runs `ruff`, `prospector`, `semgrep`, and `mypy`
- Build Verification: Ensures the package builds successfully and passes `twine check`
ImpactGuard uses modern Python packaging standards, with `hatchling` as the build backend.
Dependency Groups:
| Group | Purpose | Key Tools |
|---|---|---|
| `dev` | General development | `hatch`, `pip-api` |
| `test` | Automated testing | `pytest`, `hypothesis` |
| `lint` | Static analysis | `ruff`, `mypy`, `semgrep` |
Release Automation:
- Version Management: Uses `bumpversion` to maintain consistency across `pyproject.toml` and `src/impactguard/__init__.py`
- Automated Publishing: The `pypi-publish.yml` workflow triggers on GitHub Release events to build and publish to PyPI using Trusted Publishers (OIDC)
The ImpactGuard test suite ensures the reliability of the signature extraction pipeline, the accuracy of the risk model, and the stability of the CLI. The project maintains a strict quality gate, requiring a minimum of 80% code coverage.
- Unit Tests: Isolated testing of individual modules (extraction, comparison, patching)
- Integration Tests: End-to-end CLI flows and public API surface validation
- Coverage Enforcement: Automated checks to ensure the codebase meets the 80% threshold
| Fixture | Description |
|---|---|
| `sample_signature_data` | Returns a list of dictionaries representing serialized function signatures |
| `sample_signatures_file` | Creates a temporary `.json` file containing signature data |
| `sample_python_file` | Generates a temporary `.py` file with functions and classes |
| `runtime_data_file` | Provides a temporary JSON file simulating tracer output |
- Signature Extraction — Parses Python AST (stdlib) or tree-sitter grammars (TypeScript, Java, Go, Rust, C, C++, Ruby) to extract function signatures with full structural information
- API Diff — Compares signature snapshots to detect removed functions, added required args, positional reordering, and other breaking changes
- Call-Site Analysis — Combines signature data with call-site extraction to predict which callers will break
- Runtime Validation — Instruments functions during test runs to record actual call patterns
- Pipeline Orchestrator — Connects all components in one unified workflow (`run_pipeline()`)
- Git Integration — Compare any two git commits directly (`run_pipeline_git()`)
The pipeline relies on standardized JSON schemas to pass data between filters:
| Artifact | Producer | Consumer | Description |
|---|---|---|---|
| `.signatures.json` | `extract_signatures.py` | `compare_signatures.py`, `impact_analysis.py` | Function metadata including arguments, defaults, and line numbers |
| `.calls.json` | `extract_calls.py` | `impact_analysis.py` | Static call sites mapped by caller and callee |
| `.runtime_calls.json` | `trace_calls.py` | `impact_analysis.py`, `risk_gate.py` | Frequency and argument data from execution |
| `report.json` | `risk_gate.py` | `generate_report.py`, `suggest_fixes.py` | Final risk classifications (HIGH/MEDIUM/LOW) |
ImpactGuard has been tested on itself to validate its own API changes:
```bash
# Extract signatures from own codebase
$ impactguard extract src/impactguard/*.py
✓ Extracted 98 function signatures

# Detect non-breaking change (added optional parameter)
✓ Correctly classified as "Non-breaking changes: 1"

# Detect breaking change (removed required parameter)
✓ Correctly classified as "Breaking changes: 1"

# Run full pipeline on itself
$ impactguard check-commits HEAD~5 HEAD
✓ Pipeline orchestrator completed successfully
✓ Generated HTML report with risk analysis
```

ImpactGuard follows strict quality gates:
- Ruff — 0 issues (formatting + linting)
- MyPy — 0 errors (strict mode)
- Prospector — 0 warnings
- Semgrep — 0 findings
- Coverage — ≥80% (target)
- Tests — All passing
- Signature: A structured representation of a callable's interface, including positional arguments, keyword-only arguments, variadic arguments, and return type. Supported for Python, TypeScript, Java, Go, Rust, C, C++, and Ruby.
- FQName (Fully Qualified Name): A unique identifier in `file_path:function_name` format (e.g., `src/auth.py:login`)
- Breaking Change: A modification that prevents existing callers from executing correctly (e.g., `REMOVED`, `REQUIRED_POSITIONAL_ADDED`, `POSITIONAL_REORDER`)
- Severity (S): The technical impact of the change type (0.1 to 1.0)
- Exposure (E): How often the function is called, calculated logarithmically
- Confidence (C): The reliability of runtime data based on sample size
- Lambda (λ): Sensitivity multiplier (default 1.0); tune via `--lambda`
- CST (Concrete Syntax Tree): Unlike AST, preserves formatting, comments, and whitespace
- Patch Confidence: A score from 0.0 to 1.0 representing the likelihood that an automated fix is correct
The table below compares ImpactGuard against the tools most commonly used for Python API change management, static analysis, and release automation, as of 2026-05 and to our knowledge:
| Feature | ImpactGuard | griffe | python-semantic-release | commitizen | pyright / mypy |
|---|---|---|---|---|---|
| AST-based signature extraction | ✅ Full — Python (`ast`), TypeScript/Java/Go/Rust/C/C++/Ruby (tree-sitter) | ✅ Full (Python) | ❌ | ❌ | ✅ (internal only) |
| Breaking-change detection | ✅ Semantic diff (added / removed / modified) | ✅ | ❌ Code-unaware | ❌ Code-unaware | |
| Call-site impact analysis | ✅ Static call-site traversal | ❌ | ❌ | ❌ | ❌ |
| Runtime call tracing | ✅ (test + production sampler) | ❌ | ❌ | ❌ | ❌ |
| Risk scoring (S × E × C × λ model) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Transitive impact tracking | ✅ | ❌ | ❌ | ❌ | ❌ |
| Semver bump recommendation | ✅ From code diff | ✅ From commit msgs | ✅ From commit msgs | ❌ | |
| Changelog generation | ✅ From signature diff | ✅ From commit msgs | ✅ From commit msgs | ❌ | |
| HTML report | ✅ | ❌ | ❌ | ❌ | ❌ |
| Patch generation (CST-based) | ✅ Formatting-preserving | ❌ | ❌ | ❌ | |
| Patch confidence scoring | ✅ | ❌ | ❌ | ❌ | ❌ |
| Baseline management | ✅ Save / compare / diff | ❌ | ❌ | ❌ | |
| CI enforcement gate | ✅ Blocks on HIGH / UNKNOWN | ❌ | ✅ (release gate) | ✅ (lint gate) | ✅ (type gate) |
| Git hook integration | ✅ Pre + post commit | ❌ | ❌ | ✅ | ❌ |
| Config file (TOML) | ✅ `impactguard.toml` | ✅ | ✅ | ✅ | ✅ |
| Watch mode (live re-run) | ✅ `--watch` | ❌ | ❌ | ❌ | ✅ |
| No network required | ✅ | ✅ | ❌ (PyPI / git) | ❌ (git) | ✅ |
| Tool | Domain | Overlap with ImpactGuard | What ImpactGuard adds |
|---|---|---|---|
| griffe | Python API docs + diff | Closest alternative — extracts signatures, detects breaking changes | Call-site analysis, runtime tracing, risk model, patch generation |
| python-semantic-release | Automated releases + semver | Semver bumps from conventional commits | Code-level proof, not just commit message convention |
| commitizen | Conventional commits + changelog | Changelog generation, git hooks | Actual API-level analysis and enforcement |
| bump2version / bumpversion | Version string management | Version bumping | All analysis features |
| mypy / pyright | Static type checking | Detects type-incompatible changes | Call-site impact, risk scoring, runtime data integration |
| japicmp / apidiff (Go/Java) | API compatibility in Java / Go | Direct conceptual analog in other languages | Python + TypeScript + Java + Go + Rust + C/C++ + Ruby support, runtime tracing, patch generation |
- Risk scoring (S × E × C × λ) — No competitor combines severity, exposure (call count), and confidence into a single risk score.
- Runtime + static fusion — Merges static call-site analysis with actual runtime call counts from test runs to give empirically grounded risk levels.
- Transitive impact — Tracks callers of callers, not just direct call sites.
- CST-based patch generation — Suggests and previews source patches that preserve original formatting; no competitor does this in the API-change domain.
- Patch confidence scoring — Quantifies how safe an automated fix is before applying it.
- Fully offline — No network access, no database; embeds entirely in a Python project.
The Robustness Evaluator computes a composite project-level Robustness Score (R) from test-suite metrics, placing extra emphasis on adversarial test performance. It also reports an Adversarial Fragility Index (F) that isolates how much adversarial inputs specifically degrade the system.
| Metric | Formula | Description |
|---|---|---|
| R | `C × (α × P_a + (1−α) × P_n)` | Composite Robustness Score — overall health in [0, 1] |
| R_d | `C × D × (α × P_a + (1−α) × P_n)` | R with category diversity penalty |
| F | `1 − (P_a / P_n)` | Fragility Index — how much adversarial conditions hurt you |
| D | `categories_with_≥1_pass / total_categories` | Diversity ratio |
Input symbols:

| Symbol | Meaning |
|---|---|
| `C` | Coverage ratio (0–1) |
| `α` | Adversarial weight; recommended 0.5 (general), 0.65 (security), 0.75 (red-team) |
| `P_a` | Adversarial pass rate (`passing_adv / n_adversarial`) |
| `P_n` | Normal pass rate (`passing_norm / n_normal`) |
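As a worked example using the empirical run reported below: with `C = 0.57`, `α = 0.65`, `P_a = 424/425 ≈ 0.998`, and `P_n = 1.000`, R = 0.57 × (0.65 × 0.998 + 0.35 × 1.000) ≈ 0.5691, and F = 1 − 0.998/1.000 ≈ 0.0024.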
Robustness labels: EXCELLENT (≥ 0.80) · GOOD (≥ 0.65) · FAIR (≥ 0.45) · POOR (< 0.45)
Fragility labels: ROBUST (F ≤ 0.10) · MODERATE (≤ 0.25) · BRITTLE (≤ 0.50) · VERY_BRITTLE (> 0.50)
The tool enforces a minimum 25% adversarial coverage requirement (exits with code 1 when unmet).
| Category | Target % of adversarial budget | Example |
|---|---|---|
| Boundary/edge cases | 30% | Inputs at decision boundaries |
| Semantic perturbation | 25% | Same meaning, different form |
| Evasion/obfuscation | 25% | Encoding tricks, reformulation |
| Compositional attacks | 20% | Multi-step, chained inputs |
Python API:

```python
from tools.robustness_evaluator import evaluate_robustness, CategoryStats

result = evaluate_robustness(
    n_total=1054,
    n_adversarial=425,
    passing_adv=424,
    passing_norm=629,
    coverage=0.57,
    alpha=0.65,  # security context
    categories=[
        CategoryStats("boundary", 28, 28),
        CategoryStats("semantic", 22, 22),
        CategoryStats("evasion", 24, 24),
        CategoryStats("compositional", 19, 19),
    ],
)
print(f"R = {result.robustness_score:.4f} [{result.robustness_label}]")
print(f"F = {result.fragility_index:.4f} [{result.fragility_label}]")
print(f"R_d = {result.robustness_score_with_diversity:.4f} (with diversity)")
```

CLI (human-readable report) — empirical run from the current test suite:
```bash
python tools/robustness_evaluator.py \
  --n-total 1054 \
  --n-adversarial 425 \
  --passing-adv 424 \
  --passing-norm 629 \
  --coverage 0.57 \
  --alpha 0.65 \
  --categories '[{"name":"boundary","total":28,"passing":28},
                 {"name":"semantic","total":22,"passing":22},
                 {"name":"evasion","total":24,"passing":24},
                 {"name":"compositional","total":19,"passing":19}]'
```

CLI (JSON output for CI pipelines):
```bash
python tools/robustness_evaluator.py --n-total 1054 --n-adversarial 425 \
  --passing-adv 424 --passing-norm 629 --coverage 0.57 --json
```

Empirical output (measured from actual test runs):
```text
============================================================
ImpactGuard — Robustness Evaluation Report
============================================================
── Test Composition ──────────────────────────────────────
Total tests : 1054
Adversarial tests : 425
Normal tests : 629
Adversarial ratio : 40.3% ✓
── Pass Rates ────────────────────────────────────────────
P_adversarial (P_a): 0.998
P_normal (P_n): 1.000
Coverage (C) : 0.570
Alpha (α) : 0.65
Diversity (D) : 1.000
── Primary Metrics ───────────────────────────────────────
Robustness Score (R) : 0.5691 [FAIR]
Robustness + Diversity (R_d) : 0.5691
Fragility Index (F) : 0.0024 [ROBUST]
── Category Breakdown ────────────────────────────────────
boundary 28/28 (100%) ●●●●●●●●●●●●●●●●●●●●●●●●●●●●
semantic 22/22 (100%) ●●●●●●●●●●●●●●●●●●●●●●
evasion 24/24 (100%) ●●●●●●●●●●●●●●●●●●●●●●●●
compositional 19/19 (100%) ●●●●●●●●●●●●●●●●●●●
============================================================
```
For deeper exploration of specific subsystems, refer to the DeepWiki documentation:
