Bug: parse_alignment_target_id fails with multiple KDMA names containing underscores #28

@PaulHax

Problem

The current fix in the parse_alignment_target_id function fails when an alignment target ID contains multiple KDMA names and at least one of those names contains an underscore.

Current Logic Issue

The function uses the number of values to determine parsing strategy:

  • 1 value: treat entire KDMA part as single name (fixes personal_safety-0.0)
  • Multiple values: split KDMA part by underscores

Failing Case

Input: personal_safety_merit-0.0_1.0

  • Values: [0.0, 1.0] (2 values)
  • KDMA names after underscore split: ["personal", "safety", "merit"] (3 names)
  • 3 names ≠ 2 values → returns empty list ❌

Should parse as:

  • personal_safety with value 0.0
  • merit with value 1.0
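
To make the failure easy to reproduce, here is a minimal sketch of the heuristic as described above. It is not the actual code from experiment_models.py: the function name parse_by_value_count is hypothetical, and it assumes the ID has the form `<names>-<values>` with the last hyphen separating the two parts.

```python
def parse_by_value_count(target_id: str) -> list[tuple[str, float]]:
    """Hypothetical reconstruction of the value-count heuristic (not the real code)."""
    # Assumption: the last hyphen separates the KDMA names from the values.
    kdma_part, _, value_part = target_id.rpartition("-")
    values = [float(v) for v in value_part.split("_")]

    if len(values) == 1:
        # Single value: treat the whole KDMA part as one name.
        return [(kdma_part, values[0])]

    # Multiple values: split the KDMA part on every underscore.
    names = kdma_part.split("_")
    if len(names) != len(values):
        # Name/value count mismatch: give up.
        return []
    return list(zip(names, values))


print(parse_by_value_count("personal_safety-0.0"))
print(parse_by_value_count("personal_safety_merit-0.0_1.0"))
```

The first call yields a single (personal_safety, 0.0) pair, while the second returns an empty list, which matches the failing case above.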

Root Cause

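Using the value count as a heuristic is unreliable because:

  1. A single KDMA name can contain underscores (personal_safety)
  2. The same underscore also separates multiple KDMA names, and any of those names may itself contain underscores
  3. The parser therefore has no way to tell whether an underscore separates two names or is part of one name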
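
The ambiguity can be shown by listing every way to split the KDMA part into the expected number of names. This is only an illustrative sketch, and candidate_splits is a hypothetical helper, not something in the codebase.

```python
from itertools import combinations


def candidate_splits(kdma_part: str, n_names: int) -> list[list[str]]:
    """List every way to group underscore-separated tokens into n_names names."""
    tokens = kdma_part.split("_")
    results = []
    # Choose n_names - 1 cut points among the gaps between tokens.
    for cuts in combinations(range(1, len(tokens)), n_names - 1):
        bounds = [0, *cuts, len(tokens)]
        results.append(
            ["_".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]
        )
    return results


for split in candidate_splits("personal_safety_merit", 2):
    print(split)
```

For two values there are two candidate splits, personal + safety_merit and personal_safety + merit, and nothing in the ID itself says which one is intended.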

Potential Solutions

  1. Delimiter approach: Use a different delimiter between KDMA names (e.g., double underscore __)
  2. Length-based parsing: Use known KDMA name lengths/patterns
  3. Registry approach: Maintain a list of valid KDMA names and match against them
  4. Format change: Restructure alignment target ID format to avoid ambiguity

Location

File: align_browser/experiment_models.py:42-98
Function: parse_alignment_target_id

Priority

High - alignment target IDs that combine multiple KDMAs can parse to an empty list instead of their KDMA values
