RLMEnv: Simplify constructor and internals by snimu · Pull Request #966 · PrimeIntellect-ai/verifiers

snimu · 2026-02-27T14:23:24Z

Description

Remove 14 unused/niche constructor args that were silently swallowed via **kwargs or had no
remaining use case (interception_host, interception_port, interception_url, execution_backend,
context_key, sandbox_start_command, sandbox_client_max_workers, root_tool_serialization,
stagger/jitter params, etc.)
Remove _InterceptionPool singleton and all shared-pool branching — each RLMEnv instance now owns
its own interception server and tunnel (this undoes a recent change of mine that was poorly motivated and not thought through)
Add explicit max_turns: int = 50 constructor param (previously inherited a default of 10 from
StatefulToolEnv, easily lost via **kwargs)
Rename sub_tool_max_turns → sub_llm_max_turns for consistency with max_sub_llm_parallelism and the
sub_llm_* metric names
Hardcode interception_port=0 (OS-assigned) and bind_host="127.0.0.1" — the old configurability only
mattered for the now-removed pool
Update docs and docstring to remove outdated claims
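
The max_turns item above can be illustrated with a minimal sketch of how a default gets lost behind **kwargs. BaseEnv, OldStyleEnv and NewStyleEnv are hypothetical stand-ins, not the real StatefulToolEnv or RLMEnv; only the defaults of 10 and 50 come from the description.

```python
class BaseEnv:
    """Hypothetical stand-in for a base class with max_turns defaulting to 10."""

    def __init__(self, max_turns: int = 10, **kwargs):
        # Unknown keyword arguments are accepted and ignored (silently swallowed).
        self.max_turns = max_turns


class OldStyleEnv(BaseEnv):
    """Forwards everything through **kwargs, so the base default applies."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)


class NewStyleEnv(BaseEnv):
    """Declares max_turns explicitly, so the default is visible in the signature."""

    def __init__(self, max_turns: int = 50, **kwargs):
        super().__init__(max_turns=max_turns, **kwargs)


old_env = OldStyleEnv()
new_env = NewStyleEnv()
# A misspelled keyword is swallowed without an error and the default of 10 stays.
typo_env = OldStyleEnv(max_turn=50)

print(old_env.max_turns, new_env.max_turns, typo_env.max_turns)
```

In this sketch the explicit parameter makes the intended default part of the subclass signature instead of an inherited detail.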

Note: requires small changes to the -rlm environments.
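
As a rough illustration of the kind of change a downstream environment might need, the sketch below rewrites an old keyword-argument dict to the new names. The helper migrate_rlm_kwargs is hypothetical and not part of verifiers. The removed-argument set contains only names that appear in this PR text and is not exhaustive (the stagger/jitter parameters are not named individually here).

```python
# Rename stated in the PR description.
RENAMED = {"sub_tool_max_turns": "sub_llm_max_turns"}

# Removed arguments that are named in the PR text (assumed non-exhaustive).
REMOVED = frozenset(
    {
        "interception_host",
        "interception_port",
        "interception_url",
        "execution_backend",
        "context_key",
        "sandbox_start_command",
        "sandbox_client_max_workers",
        "root_tool_serialization",
        "max_iterations",
        "filesystem_copy_max_bytes",
    }
)


def migrate_rlm_kwargs(old_kwargs: dict) -> tuple[dict, list]:
    """Return (new_kwargs, dropped_names) for a dict of old constructor kwargs."""
    new_kwargs = {}
    dropped = []
    for name, value in old_kwargs.items():
        if name in REMOVED:
            dropped.append(name)  # no replacement exists after this PR
            continue
        new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs, dropped


migrated, dropped = migrate_rlm_kwargs(
    {"sub_tool_max_turns": 3, "interception_port": 8080, "max_turns": 20}
)
print(migrated)
print(dropped)
```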

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Note

High Risk
Breaking API changes remove/rename multiple RLMEnv constructor args and change interception/tunnel lifecycle to be per-instance (no shared pool), which can affect rollout networking and resource usage. Touches sandbox execution, proxy routing, and sub-LLM timeouts, so regressions could impact core execution paths.

Overview
Simplifies RLMEnv configuration and removes shared interception infrastructure. The constructor is pared down (adds explicit max_turns, renames sub_tool_max_turns to sub_llm_max_turns, removes max_iterations and many other knobs), and context ingestion is now fixed to info["context_dir"]/info["context"] (drops configurable keys).

Interception/tunnel logic is simplified by deleting _InterceptionPool and related branching: each RLMEnv now owns its own aiohttp interception server and Prime Tunnel, with interception_port always ephemeral and bind host fixed to 127.0.0.1 (tests use a private _interception_url_override to skip tunneling).
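
To make the ephemeral-port behaviour concrete, here is a minimal sketch using the standard-library socket module rather than the aiohttp server the PR refers to. Binding to port 0 on 127.0.0.1 lets the OS pick a free port, so two independently owned servers do not need to coordinate. The function name bind_ephemeral_loopback is an assumption made for illustration.

```python
import socket


def bind_ephemeral_loopback() -> tuple[socket.socket, int]:
    """Bind a TCP listener to 127.0.0.1 with an OS-assigned port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    sock.listen()
    port = sock.getsockname()[1]  # read back the port that was actually assigned
    return sock, port


# Two listeners standing in for two env instances, each owning its own server.
server_a, port_a = bind_ephemeral_loopback()
server_b, port_b = bind_ephemeral_loopback()
print(port_a, port_b)

server_a.close()
server_b.close()
```

Because both listeners are open at the same time, the OS hands out two different ports, which is why a fixed, configurable port is no longer needed once nothing is shared.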

Sub-LLM execution is streamlined: removes stagger/jitter delays, collapses sub-LLM timeouts to a single sub_llm_timeout derived from code_execution_timeout, and hardcodes root-tool serialization to pickle (removes serialization handling). Context directory copy now enforces a fixed 1GB limit internally (removes configurable filesystem_copy_max_bytes). Docs and tests are updated accordingly, including removal of _InterceptionPool tests and updated fixtures/expectations.
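
A fixed copy limit of this kind could look roughly like the sketch below. It is not the code from rlm_env.py: the function names are hypothetical, and treating "1GB" as 1024**3 bytes is an assumption, since the summary does not define it.

```python
import os
from pathlib import Path

# Assumption: "1GB" is interpreted as 1 GiB; the PR text does not say which.
ONE_GIB = 1024 ** 3


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def ensure_within_copy_limit(root: Path, limit_bytes: int = ONE_GIB) -> int:
    """Return the directory size, or raise ValueError if it exceeds the limit."""
    size = directory_size_bytes(root)
    if size > limit_bytes:
        raise ValueError(
            f"context directory is {size} bytes, limit is {limit_bytes} bytes"
        )
    return size
```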

Written by Cursor Bugbot for commit 9e27e18. This will update automatically on new commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-02-27T14:32:09Z

verifiers/envs/experimental/rlm_env.py

            sandbox_id,
            cmd,
-            timeout=self.env._compute_install_wait_seconds(),
+            timeout=self.env.max_startup_wait_seconds,


Pip install timeout no longer scales with packages

Low Severity

The removed _compute_install_wait_seconds() scaled the pip install timeout based on the number of packages (30s per package, minimum max_startup_wait_seconds). Now using the flat max_startup_wait_seconds (default 120s) means environments with many pip_install_packages (5+) may time out during installation where they previously succeeded.

If somebody installs that many packages, they know what they're doing and should be able to simply increase max_startup_wait_seconds.
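
For reference, the scaling rule described in the Bugbot finding can be reconstructed roughly as follows. This is a sketch based only on the wording of the finding (30s per package, floor of max_startup_wait_seconds, default 120s), not the removed _compute_install_wait_seconds() itself; the function names are made up for illustration.

```python
def scaled_install_timeout(
    num_packages: int,
    max_startup_wait_seconds: int = 120,
    seconds_per_package: int = 30,
) -> int:
    """Old behaviour as described: per-package scaling with a floor."""
    return max(max_startup_wait_seconds, seconds_per_package * num_packages)


def flat_install_timeout(max_startup_wait_seconds: int = 120) -> int:
    """New behaviour as described: the flat startup wait, independent of packages."""
    return max_startup_wait_seconds


for n in (1, 4, 5, 10):
    print(n, scaled_install_timeout(n), flat_install_timeout())
```

With the defaults, the two rules first diverge at 5 packages (150s versus 120s), which matches the "5+" in the finding. Passing a larger max_startup_wait_seconds restores the extra headroom, as the reply suggests.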

snimu and others added 4 commits February 27, 2026 13:53

simplify rlm args

3e764de

rename sub_tool_max_turns → sub_llm_max_turns

b252f53

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simplify RLMEnv docs

d060b11

fix outdated claims in RLMEnv docstring

9e27e18

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor bot reviewed Feb 27, 2026

snimu mentioned this pull request Feb 27, 2026

Update RLMEnv args to align with verifiers PR#966 PrimeIntellect-ai/research-environments#186

Open

snimu wants to merge 4 commits into main from sebastian/rlm-args-reduction-2026-02-27
