
EPDMS score is 0.0 for some tokens, even when all subscores are 1.0. #172

@Mashiroln


I am using navsim==2.2 to evaluate the EPDMS metric on the navtest split. This evaluation setting is not the two-stage pseudo-simulation, but it has been used in several recent papers.

1. Description

I've hit a reproducible issue: certain tokens receive a final EPDMS score of 0.0 even though every individual subscore (NC, DAC, DDC, TLC, EP, TTC, etc.) for that token is 1.0.

According to the EPDMS formula (a product of penalty terms and a weighted average of other terms), if all subscores are 1.0, the final EPDMS should also be 1.0. This 0.0 result seems to be an error. Is there anything I might have missed?
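The expectation above can be illustrated with a minimal sketch of a "product of penalty terms times weighted average" aggregation. The helper `aggregate_epdms`, the split of subscores into the two groups, and the weights are all assumptions made for illustration; none of them are read from the navsim source. For any positive weights, all-ones input should produce 1.0.

```python
from math import prod

# Hypothetical grouping and weights, chosen only for illustration;
# they are NOT taken from the navsim source.
MULTIPLIER_KEYS = ("NC", "DAC", "DDC", "TLC")
WEIGHTS = {"EP": 5.0, "TTC": 5.0}


def aggregate_epdms(subscores):
    """Product of the penalty terms times a weighted average of the rest."""
    penalty = prod(subscores[k] for k in MULTIPLIER_KEYS)
    weighted = sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS) / sum(WEIGHTS.values())
    return penalty * weighted


# Every subscore set to 1.0, as reported for the affected tokens.
all_ones = {k: 1.0 for k in (*MULTIPLIER_KEYS, *WEIGHTS)}
epdms_all_ones = aggregate_epdms(all_ones)
print(epdms_all_ones)
```

Under this form, a final score of 0.0 requires at least one penalty term to be 0.0 or every weighted term to be 0.0, which is why an all-ones row with a 0.0 result looks inconsistent.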

2. Other info

Hardware: The issue is reproducible on both NVIDIA H100 and Ascend 910B hardware.
Consistency: The exact same tokens fail (score 0.0) consistently across different models and multiple test runs.
Dependencies: I am aware of potential issues with older numpy versions (such as 1.23.4). I upgraded numpy to 1.26.4, deleted the entire metric_cache, and regenerated it, but the problem persists.
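One way to confirm that the same tokens are affected across runs is to flag the inconsistent rows in each run's results and intersect the sets. The sketch below assumes each run produces a CSV with one row per token; the column names (`token`, `score`, and the subscore abbreviations) and the helper `inconsistent_tokens` are hypothetical, and the two in-memory "files" contain made-up data purely to keep the example runnable.

```python
import csv
import io

# Hypothetical column names; adjust them to match the actual results CSV.
SUBSCORE_COLUMNS = ("NC", "DAC", "DDC", "TLC", "EP", "TTC")


def inconsistent_tokens(csv_file, score_column="score", token_column="token"):
    """Return tokens whose subscores are all 1.0 while the final score is 0.0."""
    flagged = set()
    for row in csv.DictReader(csv_file):
        all_ones = all(float(row[c]) == 1.0 for c in SUBSCORE_COLUMNS)
        if all_ones and float(row[score_column]) == 0.0:
            flagged.add(row[token_column])
    return flagged


# Made-up stand-ins for two per-run result files.
header = "token,NC,DAC,DDC,TLC,EP,TTC,score\n"
run_a = io.StringIO(
    header
    + "tok_a,1.0,1.0,1.0,1.0,1.0,1.0,0.0\n"  # inconsistent
    + "tok_b,1.0,1.0,1.0,1.0,1.0,1.0,1.0\n"  # consistent
    + "tok_c,1.0,0.0,1.0,1.0,1.0,1.0,0.0\n"  # legitimately zero (DAC failed)
)
run_b = io.StringIO(
    header
    + "tok_a,1.0,1.0,1.0,1.0,1.0,1.0,0.0\n"  # inconsistent in both runs
    + "tok_d,1.0,1.0,1.0,1.0,1.0,1.0,0.0\n"  # inconsistent only here
)

# Tokens that are inconsistent in every run.
common = inconsistent_tokens(run_a) & inconsistent_tokens(run_b)
print(sorted(common))
```

A stable, non-empty intersection across models and hardware would point toward something token-specific (for example, data or cache contents) rather than numerical noise.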
