feat: evalmonkey web ui and benchmark stability fixes by himmi-01 · Pull Request #3 · Corbell-AI/evalmonkey

himmi-01 · 2026-05-06T05:49:35Z

- Added a Next.js & FastAPI Web UI for live benchmarking
- Fixed the macOS torch shared-memory permission crash
- Improved HuggingFace datasets loading logic (trust_remote_code=True)
- Fixed the HellaSwag and MMLU strict LLM judge options mapping
- Updated the UI to auto-detect the LLM judge from the environment
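
For readers without the diff open, here is a minimal sketch of what the last two items could look like. It is not the code from this PR. The environment variable names, the lookup order, and the helper names `detect_judge`, `gold_letter`, and `format_options` are all assumptions made for illustration. The only dataset detail relied on is that HellaSwag stores its gold `label` as a digit string while MMLU stores `answer` as an integer index, so both need normalizing to the same option letter before a strict judge compares them.

```python
import os
from typing import Mapping, Optional, Sequence, Union

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Hypothetical lookup order (environment variable, judge name); not taken from the PR.
JUDGE_ENV_VARS = (
    ("OPENAI_API_KEY", "openai"),
    ("ANTHROPIC_API_KEY", "anthropic"),
)


def detect_judge(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first judge whose API key is set and non-empty, else None."""
    env = os.environ if env is None else env
    for var, judge in JUDGE_ENV_VARS:
        if env.get(var, "").strip():
            return judge
    return None


def gold_letter(label: Union[int, str], options: Sequence[str]) -> str:
    """Convert a gold label (int index or digit string) to an option letter.

    int() raises ValueError for an empty label, which is how unlabeled rows
    are surfaced instead of being silently scored.
    """
    index = int(label)
    if not 0 <= index < min(len(options), len(LETTERS)):
        raise ValueError(f"label {label!r} is out of range for {len(options)} options")
    return LETTERS[index]


def format_options(options: Sequence[str]) -> str:
    """Render options as 'A. ...' lines so the judge sees the same letters."""
    return "\n".join(f"{LETTERS[i]}. {text}" for i, text in enumerate(options))
```

With this shape, a HellaSwag row would go through `gold_letter(row["label"], row["endings"])` and an MMLU row through `gold_letter(row["answer"], row["choices"])`, so the judge prompt and the gold answer share one lettering scheme.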

Issue: #2

himmi-01 merged commit 1ad2b0e into Corbell-AI:main May 6, 2026
1 check passed

himmi-01 merged 1 commit into Corbell-AI:main from himmi-01:feature/evalmonkey-web-ui

1 participant