Add EQ-Bench3 synthetic training environment by poofeth · Pull Request #1335 · PrimeIntellect-ai/verifiers

poofeth · 2026-05-11T05:29:14Z

Summary

add an eq_bench3 environment for EQ-Bench-style emotional intensity prediction
include a deterministic generator for uncontaminated synthetic dialogue/emotion prompts
commit a 64-row HF Dataset-compatible JSONL sample with question, answer, and info fields
add deterministic scoring for JSON emotion-intensity predictions
add focused tests covering generator determinism, dataset loading, reward scoring, and environment construction without external API calls
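
For readers unfamiliar with the row layout, here is a minimal sketch of what a JSONL row with question, answer, and info fields could look like and how it round-trips through the standard library. Only the three field names and the source tag appear in this PR; the row values, the `index` key, and the helper names `to_jsonl` / `from_jsonl` are assumptions made for illustration, not the committed implementation.

```python
import json

# Hypothetical row: the field names (question, answer, info) and the source
# tag come from the PR text; every value below is invented for illustration.
ROWS = [
    {
        "question": "Sam cancels dinner with Ana at the last minute. "
                    "Rate Ana's anger and relief from 0 to 10 as JSON.",
        "answer": {"anger": 6, "relief": 1},
        "info": {"source": "synthetic-uncontaminated-v1", "index": 0},
    },
]


def to_jsonl(rows):
    """Serialize rows as one JSON object per line (keys sorted for stable diffs)."""
    return "".join(json.dumps(row, sort_keys=True) + "\n" for row in rows)


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    restored = from_jsonl(to_jsonl(ROWS))
    print(sorted(restored[0].keys()))
```

Sorting keys on write is one way to keep a committed sample byte-stable across regenerations; whether the PR's generator does this is not stated.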

Bounty

Algora: https://algora.io/PrimeIntellect-ai/bounties/UZGP8mWodUUMpUM8
Claiming the EQ-Bench3 + train set bounty.

Validation

uv run pytest tests/test_eq_bench3_environment.py -q
uv run ruff check environments/eq_bench3 tests/test_eq_bench3_environment.py
uv run ruff format --check environments/eq_bench3 tests/test_eq_bench3_environment.py
git diff --check

Note

Low Risk
Low risk: changes are additive (new eq_bench3 environment, sample data, and tests) with no modifications to core framework or security-sensitive code paths.

Overview
Adds a new environments/eq_bench3 SingleTurn environment for EQ-Bench-style emotion-intensity prediction, including JSON parsing + a continuous reward (emotion_score_reward) that scores per-emotion intensity error.
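
The overview mentions JSON parsing plus a continuous per-emotion error reward but does not show the formula. The sketch below illustrates one plausible shape for such a reward. It is not the PR's `emotion_score_reward`: the function name `sketch_emotion_reward`, the 0 to 10 intensity scale, the mean-absolute-error formula, and the choice to treat missing or non-numeric emotions as maximum error are all assumptions.

```python
import json

MAX_INTENSITY = 10.0  # assumed intensity scale; the PR does not state it


def parse_prediction(text):
    """Return the parsed JSON object, or None if the text is not a JSON dict."""
    try:
        data = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None
    return data if isinstance(data, dict) else None


def sketch_emotion_reward(completion, target):
    """Hypothetical reward: 1 minus the mean normalized absolute error.

    `target` maps emotion names to reference intensities. Emotions missing
    from the prediction (or given non-numeric values) count as maximum error.
    """
    pred = parse_prediction(completion)
    if pred is None or not target:
        return 0.0
    total = 0.0
    for emotion, truth in target.items():
        value = pred.get(emotion)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            error = MAX_INTENSITY
        else:
            clamped = min(max(float(value), 0.0), MAX_INTENSITY)
            error = abs(clamped - truth)
        total += error / MAX_INTENSITY
    return 1.0 - total / len(target)


if __name__ == "__main__":
    reference = {"anger": 2, "relief": 8}
    print(sketch_emotion_reward('{"anger": 2, "relief": 3}', reference))
```

A continuous score of this kind gives partial credit for near misses, which is the usual motivation for preferring it over exact-match scoring in a training environment.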

Includes a deterministic prompt generator script and commits a 64-row synthetic JSONL sample dataset, plus packaging metadata (pyproject.toml) and focused tests for generator determinism, dataset loading, reward correctness, and load_environment construction.

Updates environments/README.md to list eq_bench3 among available SingleTurn examples.

Reviewed by Cursor Bugbot for commit 32cf899. Bugbot is set up for automated code reviews on this repo.

poofeth · 2026-05-11T05:29:29Z

Validation evidence for the EQ-Bench3 bounty PR:

uv run pytest tests/test_eq_bench3_environment.py -q passed: 4 tests.
uv run ruff check environments/eq_bench3 tests/test_eq_bench3_environment.py passed.
uv run ruff format --check environments/eq_bench3 tests/test_eq_bench3_environment.py passed.
git diff --check passed.

The sample dataset is generated from local deterministic templates (source=synthetic-uncontaminated-v1) rather than copied from the upstream EQ-Bench question set.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 1698173.

poofeth · 2026-05-11T05:43:35Z

Addressed the Bugbot review in commit 32cf899:

added eq_bench3 to environments/README.md
added an upper-bound guard for synthetic generation (num_examples <= 96) to avoid duplicate-exhaustion loops
added regression coverage for the generator bound
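
To illustrate why an upper bound prevents a duplicate-exhaustion loop, here is a minimal sketch of a seeded generator that rejects repeated template combinations. The template lists, the function name `generate_combos`, and the assumption that 96 equals the number of unique combinations (8 x 4 x 3 here) are hypothetical; the PR only states the `num_examples <= 96` guard.

```python
import random

# Hypothetical template pools; 8 * 4 * 3 = 96 unique combinations.
SCENARIOS = [f"scenario-{i}" for i in range(8)]
RELATIONSHIPS = ["friend", "sibling", "coworker", "neighbor"]
TONES = ["calm", "tense", "playful"]
MAX_EXAMPLES = len(SCENARIOS) * len(RELATIONSHIPS) * len(TONES)


def generate_combos(num_examples, seed=0):
    """Return `num_examples` unique template combinations, deterministically.

    Without the bound check, asking for more rows than there are unique
    combinations would make the rejection loop below spin forever.
    """
    if not 0 <= num_examples <= MAX_EXAMPLES:
        raise ValueError(
            f"num_examples must be between 0 and {MAX_EXAMPLES}, got {num_examples}"
        )
    rng = random.Random(seed)  # local RNG, so output depends only on the seed
    seen = set()
    rows = []
    while len(rows) < num_examples:
        combo = (rng.choice(SCENARIOS), rng.choice(RELATIONSHIPS), rng.choice(TONES))
        if combo in seen:
            continue  # duplicate draw: retry
        seen.add(combo)
        rows.append(combo)
    return rows


if __name__ == "__main__":
    sample = generate_combos(64, seed=1)
    print(len(sample), len(set(sample)))
```

An alternative design would enumerate all combinations with `itertools.product` and shuffle them with the seeded RNG, which removes the retry loop entirely.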

Validation after the fixes:

$ uv run pytest tests/test_eq_bench3_environment.py -q
.....                                                                    [100%]

$ uv run ruff check environments/eq_bench3 tests/test_eq_bench3_environment.py
All checks passed!

$ uv run ruff format --check environments/eq_bench3 tests/test_eq_bench3_environment.py
3 files already formatted

$ git diff --check
# no output

poofeth · 2026-05-11T09:53:20Z

/claim https://algora.io/PrimeIntellect-ai/bounties/UZGP8mWodUUMpUM8

Commit 1698173: Add EQ-Bench3 synthetic training environment

cursor Bot reviewed May 11, 2026

Comment thread environments/eq_bench3/README.md

Comment thread environments/eq_bench3/generate_eq_bench3_prompts.py

Commit 32cf899: Address EQ-Bench3 review feedback

poofeth wants to merge 2 commits into PrimeIntellect-ai:main from poofeth:bounty/eq-bench3-train-set
