This PR introduces real-time error percentage tracking during benchmarking and surfaces it in the UI, along with updated documentation. #2

Open: trcjr wants to merge 2 commits into mkhnsn:main from trcjr:track-error-percentage
trcjr commented Apr 16, 2026

Summary

This PR introduces real-time error percentage tracking during benchmarking and surfaces it in the UI, along with updated documentation.

What’s Changed

  • Backend

    • Added error_percent to SampleProgress
    • Enables real-time calculation and reporting of error rates during runs
  • Frontend

    • Displays live error percentage in the benchmarking UI
    • Updates dynamically as samples are processed
  • Types / Models

    • Updated shared types to support error_percent across the stack
  • Documentation

    • README updated to document:
      • Live error percentage behavior
      • Configurability
      • Test failure thresholds
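
As a rough illustration of the changes listed above, here is a minimal sketch of how a progress record could carry an error percentage and be compared against a failure threshold. Only the names `SampleProgress` and `error_percent` come from this PR. The other fields, both helper functions, and the threshold parameter are assumptions made for the example and may not match the actual code.

```typescript
// Hypothetical shape: only SampleProgress and error_percent are named in the PR.
interface SampleProgress {
  completed: number;     // samples processed so far (assumed field)
  errors: number;        // samples that failed (assumed field)
  error_percent: number; // share of processed samples that failed, 0 to 100
}

// Hypothetical helper: build a progress record with the error percentage filled in.
function withErrorPercent(completed: number, errors: number): SampleProgress {
  // Report 0 before the first sample completes to avoid dividing by zero.
  const error_percent = completed === 0 ? 0 : (errors * 100) / completed;
  return { completed, errors, error_percent };
}

// Hypothetical threshold check: true when the run should count as failed.
function exceedsThreshold(progress: SampleProgress, maxErrorPercent: number): boolean {
  return progress.error_percent > maxErrorPercent;
}

const progress = withErrorPercent(200, 5);
console.log(progress.error_percent);          // 5 errors out of 200 samples
console.log(exceedsThreshold(progress, 1.0)); // compared against an assumed 1% limit
```

Recomputing the value from running counts on every update keeps the displayed percentage consistent with the samples processed so far, which is one plausible way to get the live behavior described above.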

Why

  • Provides immediate visibility into benchmark quality
  • Helps identify instability or failure conditions earlier
  • Makes failure thresholds more transparent and actionable

Notes

  • No breaking changes expected
  • Existing workflows continue to function as before, with additional visibility into error rates
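
One way a consumer could stay compatible with progress payloads produced before this change is to treat `error_percent` as optional when rendering. The sketch below is illustrative only: the `ProgressPayload` type, the `formatErrorPercent` helper, and the "n/a" placeholder are assumptions, not code from this PR.

```typescript
// Hypothetical payload type: error_percent is optional to model older payloads.
type ProgressPayload = { error_percent?: number };

// Hypothetical display helper: format the value, or fall back to a placeholder.
function formatErrorPercent(payload: ProgressPayload): string {
  const value = payload.error_percent;
  // Missing or non-finite values are shown as "n/a" instead of a misleading number.
  if (value === undefined || !Number.isFinite(value)) {
    return "n/a";
  }
  return `${value.toFixed(1)}%`;
}

console.log(formatErrorPercent({ error_percent: 12.34 })); // payload with the new field
console.log(formatErrorPercent({}));                       // older payload without it
```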

trcjr added 2 commits April 15, 2026 19:26
Backend now includes error_percent in SampleProgress, and the frontend displays it in real time during benchmarking. Types and models updated for full-stack support. Documentation updated in README to cover live error percentage, configurability, and test failure thresholds, clarifying new UI and backend behavior.