Summary
Scenario validation checks expression syntax and unknown attribute names, but does not validate that compared string literals are valid categorical options for the referenced attribute.
This allows case/value mismatches (e.g., urban_rural == 'urban' when the defined option is Urban) to pass validation and silently no-op at runtime.
Why This Matters
This is a major source of "looks valid but behavior is wrong" bugs. Incorrect conditions don’t throw hard errors during simulation; they just fail to match and flatten dynamics.
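A minimal sketch of the failure mode, assuming conditions are evaluated as Python-style expressions against an agent's attributes. The `matches` helper and the `agent` record are hypothetical stand-ins, not the project's actual evaluator:

```python
# Hypothetical agent record; attribute name and option come from the example above.
agent = {"urban_rural": "Urban"}


def matches(when: str, attrs: dict) -> bool:
    """Evaluate a condition against an agent's attributes.

    Illustrative stand-in for the runtime evaluator (an assumption).
    """
    return bool(eval(when, {"__builtins__": {}}, dict(attrs)))


# The mis-cased literal raises nothing; the rule simply never matches.
wrong = matches("urban_rural == 'urban'", agent)
right = matches("urban_rural == 'Urban'", agent)
```

Both calls complete without an exception. Only the second evaluates to true, so nothing at runtime signals that the first rule is broken.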
Current Behavior (Code)
In /Users/adithyasrinivasan/Projects/extropy/extropy/scenario/validator.py:
- syntax check: validate_expression_syntax(...)
- reference check: extract_names_from_expression(...) vs known attrs
- no check that literals used in comparisons are present in attribute option domains
By contrast, population semantic validation already has this concept via AST comparison extraction.
Proposed Fix
- Add AST-based comparison extraction for scenario when clauses:
  - seed_exposure.rules[].when
  - timeline exposure rules: timeline[].exposure_rules[].when (if present)
  - spread.share_modifiers[].when
- For each (attribute, compared_string_values) pair:
  - if attribute is categorical with known options, require literal values to match one of those options
  - invalid literals should be ERROR (not warning) because rule is effectively broken
- Handle list membership checks (in [...]) and single comparisons (==, !=).
- Add tests covering:
  - exact match pass
  - case mismatch fail
  - nonexistent value fail
  - non-categorical attributes skipped
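The extraction and check described above could be sketched roughly as follows, using only the standard `ast` module. The function names (`extract_string_comparisons`, `find_invalid_literals`) and the shape of the `options` mapping are assumptions for illustration and do not reflect the existing validator's API:

```python
import ast
from typing import Dict, List, Sequence, Set, Tuple


def _is_str(node: ast.AST) -> bool:
    return isinstance(node, ast.Constant) and isinstance(node.value, str)


def extract_string_comparisons(expr: str) -> List[Tuple[str, Set[str]]]:
    """Return (attribute, string literals) pairs for ==, !=, in, not in."""
    pairs: List[Tuple[str, Set[str]]] = []
    for node in ast.walk(ast.parse(expr, mode="eval")):
        if not isinstance(node, ast.Compare):
            continue
        operands = [node.left, *node.comparators]
        for op, lhs, rhs in zip(node.ops, operands, operands[1:]):
            if isinstance(op, (ast.Eq, ast.NotEq)):
                # Accept both `attr == 'x'` and `'x' == attr`.
                for name, lit in ((lhs, rhs), (rhs, lhs)):
                    if isinstance(name, ast.Name) and _is_str(lit):
                        pairs.append((name.id, {lit.value}))
            elif isinstance(op, (ast.In, ast.NotIn)):
                if isinstance(lhs, ast.Name) and isinstance(
                    rhs, (ast.List, ast.Tuple, ast.Set)
                ):
                    values = {e.value for e in rhs.elts if _is_str(e)}
                    if values:
                        pairs.append((lhs.id, values))
    return pairs


def find_invalid_literals(
    expr: str, options: Dict[str, Sequence[str]]
) -> List[Tuple[str, str]]:
    """Return (attribute, literal) pairs not present in the option domain.

    `options` maps categorical attribute names to their allowed options
    (hypothetical shape); attributes missing from it are skipped.
    """
    invalid: List[Tuple[str, str]] = []
    for attr, values in extract_string_comparisons(expr):
        allowed = options.get(attr)
        if allowed is None:
            continue  # non-categorical or unknown domain: nothing to check
        invalid.extend((attr, v) for v in sorted(values - set(allowed)))
    return invalid
```

The four test cases listed above map directly onto calls to `find_invalid_literals`: an exact match returns an empty list, a case mismatch or nonexistent value returns the offending pair, and an attribute absent from `options` is ignored.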
Acceptance Criteria
- Invalid categorical literals in scenario conditions are caught at validation time.
- Common mismatch classes (case/style/legacy tokens) no longer survive to runtime.
- Validation message includes valid option set for quick fix.
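One possible shape for the validation message, sketched under the assumption that the validator knows the rule location, attribute name, offending literal, and option list. The function name and message wording are hypothetical:

```python
from typing import Sequence


def format_invalid_literal_error(
    location: str, attribute: str, literal: str, options: Sequence[str]
) -> str:
    """Build an error message that lists the valid options (hypothetical format)."""
    valid = ", ".join(repr(o) for o in options)
    msg = (
        f"{location}: '{literal}' is not a valid option for "
        f"'{attribute}'. Valid options: {valid}."
    )
    # Case-only mismatches are the common class, so point at the exact fix.
    suggestion = next(
        (o for o in options if o.casefold() == literal.casefold()), None
    )
    if suggestion is not None:
        msg += f" Did you mean {suggestion!r}?"
    return msg
```

Listing the full option set covers legacy or renamed tokens, while the case-insensitive suggestion handles the `urban` vs `Urban` class without requiring fuzzy matching.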
Pipeline Impact
Reduces recursive debug loops by catching scenario-domain mismatches before sampling/simulation.