Skip to content

Expand VTS coverage for TLC soundness-critical qualification gaps #8

@lemmy

Description

@lemmy

Expand the current Validation Test Suite (VTS) so it can detect recently discovered TLC soundness bugs that may cause TLC to miss a real violation of a safety property (most severe), or report a bogus counterexample (less severe, but still qualification-relevant). These bugs are subtle and are not exposed by the current VTS strategy, which combines TLA+ language features pairwise. In practice, soundness bugs appear to require interactions among 3, 4, or more features. Exhaustively testing such higher-order combinations is generally infeasible due to combinatorial explosion.

This means the current VTS approach, while useful, is insufficient on its own for covering certain soundness-critical TLC behaviors relevant to safety-critical qualification (e.g., ISO 26262).

Critical soundness issues to drive coverage expansion:

Required work

Characterize the bug-triggering interactions

  • Identify which combinations of language features / semantic conditions are implicated in each issue.
  • Document why pairwise composition is insufficient in each case.

Design targeted, non-exhaustive test strategies

  • Develop test-generation strategies that go beyond pairwise coverage without requiring full n-wise enumeration.
  • Candidate approaches may include:
    • bug-class regression tests
    • targeted higher-order feature combinations
    • metamorphic tests
    • constraint-guided test generation
    • risk-based prioritization of combinations

Promote critical issues into permanent regression coverage

  • Add minimized reproducer specs (where possible) to VTS.
  • Ensure each reproducer is tied to a documented bug class and expected behavior.

Update qualification claims and coverage rationale

  • Explicitly describe what kinds of soundness risks are now covered.
  • Document residual risks from untested higher-order feature interactions.

Outcome needed: a VTS expansion plan that prioritizes detection of soundness-critical TLC failures beyond pairwise feature interaction testing, starting from the known issue set above.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions