batfish · dhalperi · May 20, 2026 · May 20, 2026
diff --git a/.gitignore b/.gitignore
@@ -9,6 +9,10 @@ __pycache__/
 # Distribution / packaging
 .Python
 build/
+# Per-lab build inputs (manifests, scripts, bootstrap configs) under
+# snapshots/<lab>/source/build/ are NOT Python build artifacts.
+!snapshots/*/source/build/
+!snapshots/*/source/build/**
 develop-eggs/
 dist/
 downloads/
@@ -144,7 +148,7 @@ networks/
 
 # Working directory for temporary scripts and analysis
 working/
-CLAUDE.md
+CLAUDE.local.md
 .claude/
 
 # VM images (large, user-provided, not distributed)

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,152 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Development Commands
+
+### Environment Setup
+
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install -e .
+pip install -r requirements-dev.txt
+```
+
+### Core Testing Commands
+
+```bash
+# Run all tests
+pytest
+
+# Run with coverage reporting
+pytest --cov=lab_validation --cov-report=xml --cov-report=term-missing
+
+# Run specific validator tests
+pytest tests/test_arista_validator.py -v
+pytest tests/test_nxos_validator.py -v
+
+# Run specific lab integration test
+pytest lab_tests/test_labs.py --labname=eos_bgp_aggregate -v
+
+# Run single test method
+pytest tests/test_arista_validator.py::TestAristaValidator::test_get_runtime_data -v
+```
+
+### Code Quality Commands
+
+```bash
+# Run pre-commit hooks (includes formatting, isort, prettier)
+pre-commit run --all-files
+
+# Type checking with mypy
+mypy src/
+```
+
+## Architecture Overview
+
+### Core Components
+
+**Validation Framework**: Abstract `VendorValidator` base class with vendor-specific implementations. Each validator implements device data parsing, Batfish comparison, and specialized validation for interfaces, routing tables, BGP, and EVPN.
+
+**Multi-Vendor Parsers**: Parsing systems in `src/lab_validation/parsers/` for each vendor (Arista EOS JSON, Cisco IOS-XE, NX-OS with EVPN support, JunOS, etc.) using grammar-based parsing and vendor-specific logic.
+
+**Cost-Based Matching Algorithm**: Located in `src/lab_validation/utils/validation_utils.py`, implements pairing of device operational data against Batfish analysis results, handling next-hop analysis and route attribute comparison.
+
+**Lab Test Framework**: Network labs in `snapshots/` directory, each containing device configurations, show command outputs, and optional validation rules. Labs follow consistent structure with `configs/`, `show/`, and `validation/` directories.
+
+**Sickbay System**: Expected failure management using YAML files (`sickbay.yaml`) that reference GitHub issues for tracking known validation problems. Integrates with pytest to mark expected failures as `xfail`.
+
+### Data Flow
+
+1. **Lab Discovery**: Framework automatically discovers available labs from `snapshots/` directory
+2. **Batfish Analysis**: Builds network snapshot and runs Batfish validation questions
+3. **Device Data Parsing**: Vendor-specific parsers extract operational data from show commands
+4. **Validation Engine**: Cost-based matching compares device state against Batfish predictions
+5. **Result Processing**: Sickbay system handles expected failures and generates test report
+
+## Code Structure
+
+### Vendor Validators (`src/lab_validation/validators/`)
+
+Each validator implements:
+
+- `get_runtime_data()`: Parse device show command outputs
+- `validate_interface_properties()`: Interface state validation
+- `validate_main_rib_routes()`: Main routing table validation
+- `validate_bgp_rib_routes()`: BGP table validation
+- `validate_evpn_rib_routes()`: EVPN route validation (vendor-specific)
+
+### Parser Architecture (`src/lab_validation/parsers/`)
+
+- JSON parsing for modern APIs (Arista EOS)
+- Text parsing with regex for traditional CLI (IOS, NX-OS)
+- Structured parsing for complex outputs (JunOS, EVPN routes)
+
+### Data Models (`src/lab_validation/models/`)
+
+- `routes.py`: Route models with AS-path conversion and next-hop analysis
+- `interface_properties.py`: Interface state models
+- `runtime_data.py`: Node operational data structures
+
+## Lab Data Structure
+
+Each lab in `snapshots/` follows this pattern:
+
+```
+<lab_name>/
+├── source/                   # Hand-authored build inputs (for labs built via lab_builder)
+│   ├── topology.clab.yml    # Containerlab topology
+│   ├── configs/             # Startup configs pushed to devices
+│   ├── checks.yaml          # Lab-state preconditions
+│   └── README.md            # Lab documentation and findings
+├── configs/                  # Device configuration files (collected from running lab)
+├── show/                     # Device show command outputs
+│   └── host_nos.txt         # Device-to-vendor mapping
+├── batfish/                  # Optional Batfish-specific inputs (e.g., layer1_topology.json)
+└── validation/               # Optional validation rules
+    └── sickbay.yaml         # Expected failure definitions with GitHub issues
+```
+
+`source/` keeps each lab self-contained: the hand-authored inputs sit
+next to the collected outputs. Labs built via `lab_builder` (see
+`infra/README.md`) put their topology/configs/checks here. Older labs
+without `source/` were authored before the lab builder existed.
+
+## Testing
+
+### Unit Tests (`tests/`)
+
+Parser and validator testing. Each vendor validator has test coverage for parsing logic and validation algorithms.
+
+### Lab Integration Tests (`lab_tests/`)
+
+End-to-end validation using real network lab data. Each lab runs independently with Batfish analysis, device data parsing, validation comparison, and sickbay handling.
+
+Requires a running Batfish instance on localhost:9996.
+
+### CI/CD Pipeline
+
+GitHub Actions runs matrix strategy testing all labs in parallel, with Batfish JAR caching and coverage reporting.
+
+## Coding Conventions
+
+- **Python 3.10+** required (supports 3.10, 3.11, 3.12)
+- **Pre-commit hooks** required for consistent code quality
+- **Parser error handling**: parsers should crash (assert, KeyError) on
+  unexpected data shapes rather than silently degrading with `.get()`
+  defaults. Only use `.get()` when a key is genuinely optional per the
+  schema.
+- **Unit tests for parser changes**: any parser fix or extension must
+  include a unit test covering the new code path.
+
+## Lab Builder Infrastructure
+
+See `infra/README.md` for detailed documentation on creating new labs
+using containerlab on EC2.
+
+## Project Configuration
+
+- **Claude-facing documentation and project status should go in the
+  working/ folder**. This is everything related to LLM-driven projects
+  and tasks that does not face end users and contributors.
diff --git a/infra/README.md b/infra/README.md
@@ -238,6 +238,32 @@ For each test failure, determine the cause:
 - **Expected difference**: management interfaces, pseudo-interfaces, etc. that
   Batfish intentionally doesn't model. Update the validator's exclusion logic.
 
+## Convergence Expectations
+
+Once a vJunos node is booted and SSH-reachable, routing protocol
+convergence and config operations are fast:
+
+- **BGP convergence** after config push: < 30 seconds for small labs
+  (2-4 nodes, < 200 prefixes).
+- **`commit check`**: < 5 seconds for any reasonable policy.
+- **`commit`**: < 15 seconds for labs of this size.
+- **Health-check after deploy**: SSH comes up ~3-5 min after
+  `containerlab deploy`; BGP converges within seconds of SSH.
+
+If a command hasn't completed within 2× its expected time, something
+is wrong — investigate rather than retry with a longer timeout.
+
+### Operational principles
+
+- **Verify the tool works before looping.** Run the check command once
+  interactively before wrapping it in a polling loop. If it fails on
+  the first try, fix the command — don't blindly retry.
+- **Prefer bulk operations.** Push 100 config lines via `load set
+terminal` (< 5 seconds) rather than per-line `send_command_timing`
+  (minutes). Use binary search to isolate failures in a batch.
+- **No absurd timeouts.** A 60-minute timeout on a command expected to
+  finish in 10 seconds masks bugs. Set timeouts at 2× expected runtime.
+
 ## Iterating on a Lab
 
 To modify configs without full redeploy (saves 5-10 min boot time):
@@ -254,6 +280,27 @@ ssh -i $KEY ubuntu@$IP \
 
 Then re-download and re-validate locally.
 
+### Pushing config to running Junos devices
+
+When pushing config to a running vJunos device programmatically:
+
+- **Use `load set terminal`** + paste all lines + `^D`. This is fast
+  even for hundreds of lines. Do NOT push lines one at a time via
+  netmiko's `send_command_timing` (~2s/line overhead).
+- **Binary search on commit failure** — if bulk `commit check` fails,
+  bisect the config into halves and re-check each. Isolates rejections
+  in O(log N) commits instead of O(N).
+- **Bracket syntax `[ A B ]`** cannot be reliably delivered through
+  netmiko/SSH interactive sessions (`[` triggers Junos CLI multi-value
+  entry mode). Use the desugared equivalent: two separate `set` lines
+  (e.g., `community set A` + `community set B`). Junos stores them
+  identically — `show configuration | display set` always emits the
+  desugared form.
+- **Large configs as startup**: if a config is too large or complex for
+  interactive push, use it as the containerlab `startup-config`. If
+  commit fails at boot, SSH never comes up — use a minimal startup
+  config and push terms post-boot.
+
 **Always re-collect after changing the lab.** The snapshot's
 `configs/<node>/show_running-config.txt` and every file under
 `show/<node>/` must come from the actual running device after your

diff --git a/...shots/junos_commit_check/source/README.md → snapshots/junos_commit_check/README.md b/...shots/junos_commit_check/source/README.md → snapshots/junos_commit_check/README.md
diff --git a/...s/junos_community_delete/source/README.md → snapshots/junos_community_delete/README.md b/...s/junos_community_delete/source/README.md → snapshots/junos_community_delete/README.md
diff --git a/...s/junos_cross_term_match/source/README.md → snapshots/junos_cross_term_match/README.md b/...s/junos_cross_term_match/source/README.md → snapshots/junos_cross_term_match/README.md
diff --git a/..._policy_chain_visibility/source/README.md → ...s/junos_policy_chain_visibility/README.md b/..._policy_chain_visibility/source/README.md → ...s/junos_policy_chain_visibility/README.md
diff --git a/snapshots/junos_rmw_localpref/README.md b/snapshots/junos_rmw_localpref/README.md
@@ -0,0 +1,131 @@
+# Junos Read-Modify-Write Local-Preference Lab
+
+When a Junos `then` block contains both a `set` and an arithmetic
+(`add`/`subtract`) action on the same attribute, what does the device
+actually do? The original framing of this lab — "does a later action
+read the running value or the original?" — turns out to be the wrong
+question. Junos resolves the conflict at config time, not at policy
+evaluation time.
+
+## Topology
+
+```
+sender (AS 65001) --- dut (AS 65002)
+       ge-0/0/0  <->  ge-0/0/0
+       10.0.12.0      10.0.12.1
+```
+
+No collector is needed: local-preference and metric are both either
+non-transitive across eBGP or not the question we're asking. The
+observation is on `dut` only.
+
+## Test Routes
+
+`sender` originates these prefixes via static and exports all to `dut`
+without modification (so they all arrive with localpref=100, metric=0,
+AS-path `65001`).
+
+| Prefix       | dut import term     | Compound action(s)                                    |
+| ------------ | ------------------- | ----------------------------------------------------- |
+| 10.20.0.0/24 | `BASELINE`          | (none — baseline)                                     |
+| 10.20.1.0/24 | `TWO-ADDS`          | `local-preference add 50; local-preference add 50;`   |
+| 10.20.2.0/24 | `SET-THEN-ADD`      | `local-preference 300; local-preference add 50;`      |
+| 10.20.3.0/24 | `SET-THEN-SUBTRACT` | `local-preference 300; local-preference subtract 50;` |
+| 10.20.4.0/24 | `METRIC-SET-ADD`    | `metric 200; metric add 50;`                          |
+| 10.20.5.0/24 | `DOUBLE-PREPEND`    | `as-path-prepend "65002 65002";`                      |
+
+## Running
+
+```
+lab_builder validate topology.clab.yml --checks checks.yaml
+```
+
+## Results
+
+Deployed 2026-05-19 on vJunos-router 25.4R1.12 (containerlab on EC2).
+All 8 checks passed.
+
+### What Junos kept after commit
+
+The first thing to notice is the diff between the configured policy
+(`source/configs/dut.cfg`) and what the device's
+`show configuration | display set` actually retained
+(`configs/dut/show_configuration_|_display_set.txt`).
+For each term that combined a `set` with an arithmetic action, the
+`set` is gone:
+
+| Term                | Configured                                            | Retained after commit           |
+| ------------------- | ----------------------------------------------------- | ------------------------------- |
+| `BASELINE`          | (accept only)                                         | (accept only)                   |
+| `TWO-ADDS`          | `local-preference add 50; local-preference add 50;`   | `local-preference add 50`       |
+| `SET-THEN-ADD`      | `local-preference 300; local-preference add 50;`      | `local-preference add 50`       |
+| `SET-THEN-SUBTRACT` | `local-preference 300; local-preference subtract 50;` | `local-preference subtract 50`  |
+| `METRIC-SET-ADD`    | `metric 200; metric add 50;`                          | `metric add 50`                 |
+| `DOUBLE-PREPEND`    | `as-path-prepend "65002 65002";`                      | `as-path-prepend "65002 65002"` |
+
+`TWO-ADDS` collapses two identical statements into one (standard
+Junos config dedup). The other three rows are the substantive
+observation: when an `add`/`subtract` action and a plain `set` action
+target the same attribute in the same `then` block, **Junos silently
+drops the `set` at commit time and keeps only the arithmetic
+action**. The configured `local-preference 300` and `metric 200`
+never make it into the running config; the policy evaluator never
+sees them.
+
+### Resulting BGP attributes
+
+With those configs in place, observations from `show route <prefix>
+extensive | display json` on `dut`:
+
+| Prefix       | Effective action  | localpref | metric | AS path                                            |
+| ------------ | ----------------- | --------- | ------ | -------------------------------------------------- |
+| 10.20.0.0/24 | (none)            | 100       | —      | 65001 I                                            |
+| 10.20.1.0/24 | `add 50`          | **150**   | —      | 65001 I                                            |
+| 10.20.2.0/24 | `add 50`          | **150**   | —      | 65001 I                                            |
+| 10.20.3.0/24 | `subtract 50`     | **50**    | —      | 65001 I                                            |
+| 10.20.4.0/24 | `metric add 50`   | 100       | **50** | 65001 I                                            |
+| 10.20.5.0/24 | `as-path-prepend` | 100       | —      | 65002 65002 65001 I (Looped: 65002 — still active) |
+
+Each result is exactly what you'd expect from applying the _retained_
+action to the original attribute value (localpref=100, metric=0).
+There's no compound semantics to investigate at policy-evaluation
+time, because the conflicting actions never coexist by the time the
+policy runs.
+
+### Implication for the original question
+
+The "Suspicious Omission" framing in
+`working/intermediateAttributes.md` asked whether Junos chains reads
+across compound `set`/`add` actions or reads the original each time.
+This lab's answer: **the question is moot for Junos**. The device's
+config system enforces that the two action types can't coexist on the
+same attribute in the same `then` block — only the arithmetic action
+survives — so there is no runtime semantics to disagree about.
+
+(Whether the IOS-side bug Todd identified for compound writes still
+exists is a separate question; that's a different vendor with a
+different config syntax that does not have this commit-time
+collapsing behavior.)
+
+### Multi-ASN as-path-prepend
+
+The `DOUBLE-PREPEND` term is unaffected by the collapse — it has only
+one action, `as-path-prepend "65002 65002"`. The prepend produced
+`65002 65002 65001` as expected. (Junos flagged the loop but kept the
+route active since the prepend was self-applied at import.)
+
+### Batfish triage
+
+`pytest lab_tests/test_labs.py --labname=junos_rmw_localpref` against
+local Batfish: **13 passed, 0 skipped, no sickbay**. The BGP-rib-routes
+test compares Batfish-predicted local_preference / MED / as-path
+against the device's actual values per prefix; all match. Batfish
+parses the `display set` output, which is post-commit, so it never
+sees the dropped `set` statements either — both Batfish and the
+device agree because they're operating on the same retained config.
+
+### Companion data
+
+- Source policy: `source/configs/dut.cfg`
+- Retained-config snapshot:
+  `configs/dut/show_configuration_|_display_set.txt`