
AgentExit: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments

📄 Paper | 📚 Twitter/X

🎉 This work has been accepted by Findings of EMNLP 2025. 🎉


This repository presents AgentExit, a framework that enables large language model (LLM)-based agents to exit early when necessary.
It is built on AgentBoard and provides both the agent code to replicate the study and a plug-and-play tool for agent manipulation.

Abstract

Agents powered by large language models (LLMs) have demonstrated strong planning and decision-making capabilities in complex embodied environments. However, such agents often suffer from inefficiencies in multi-turn interactions, frequently becoming trapped in repetitive loops or issuing ineffective commands, which leads to redundant computational overhead. Instead of relying solely on learning from trajectories, we take a first step toward exploring the early-exit behavior of LLM-based agents. We propose two complementary approaches (a minimal sketch follows the list below):

  1. an intrinsic method that injects exit instructions during generation,
  2. an extrinsic method that verifies task completion to determine when to halt an agent's trial.
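
A minimal, framework-agnostic sketch of how the two mechanisms could be wired into an agent loop is shown below. The interfaces (llm, env, verifier) and the exact exit instruction are illustrative assumptions, not the repository's actual AgentBoard-based API.

```python
# Illustrative sketch only -- `llm`, `env`, and `verifier` are hypothetical
# interfaces, not the repository's actual AgentBoard-based implementation.

EXIT_INSTRUCTION = (
    "If you believe the task is complete or no further progress is possible, "
    "respond with the single action: EXIT"
)

def build_prompt(history, obs):
    # Hypothetical ReAct-style prompt assembly from past actions/observations.
    turns = "\n".join(f"Action: {a}\nObservation: {o}" for a, o in history)
    return f"{turns}\nObservation: {obs}\nAction:"

def run_episode(llm, env, max_steps=40, intrinsic=True, verifier=None):
    """Roll out one trial, halting early via either mechanism."""
    history, obs = [], env.reset()
    done = False
    for _ in range(max_steps):
        prompt = build_prompt(history, obs)
        if intrinsic:
            # Intrinsic: inject the exit instruction during generation.
            prompt = EXIT_INSTRUCTION + "\n" + prompt
        action = llm.generate(prompt)
        if intrinsic and action.strip().upper() == "EXIT":
            break  # the agent chose to end its own trial
        obs, done = env.step(action)
        history.append((action, obs))
        if done:
            break
        # Extrinsic: an outside verifier checks completion and halts the trial.
        if verifier is not None and verifier.task_complete(history, obs):
            break
    return history, done
```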

To evaluate early-exit mechanisms, we introduce two metrics (sketched in code after the list):

  1. Redundancy Steps (RS): counts redundant steps; reducing them is the positive effect of early exit.
  2. Progress Degradation (PD): measures task-progress loss via reduced subgoal completion, the negative effect.
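
To make the metrics concrete, the sketch below shows one plausible way to compute them from a per-step subgoal-progress trace; the exact definitions are given in the paper, and both functions here are illustrative assumptions.

```python
def redundancy_steps(subgoal_progress):
    """Assumed RS: number of trailing steps after subgoal progress last
    changed, i.e. steps that added no progress (redundant steps)."""
    last_change = 0
    for t in range(1, len(subgoal_progress)):
        if subgoal_progress[t] != subgoal_progress[t - 1]:
            last_change = t
    return len(subgoal_progress) - 1 - last_change

def progress_degradation(baseline_progress, early_exit_progress):
    """Assumed PD: subgoal completion lost by exiting early, relative to
    the same agent run without early exit."""
    return max(0.0, baseline_progress - early_exit_progress)

# Example: progress plateaus at 0.5 after step 3 of a 7-step trajectory.
trace = [0.0, 0.25, 0.25, 0.5, 0.5, 0.5, 0.5]
print(redundancy_steps(trace))          # 3 redundant steps
print(progress_degradation(0.75, 0.5))  # 0.25 progress lost
```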

Experiments with four LLMs across five embodied environments show significant efficiency improvements with only minor drops in agent performance. We also validate a practical strategy in which a stronger agent assists after an early-exit agent, achieving better performance within the same total step budget.

Usage

We provide several setups, each with its agent implementation and configuration:

| Intrinsic Early Exit | Extrinsic Early Exit | Agent | Configuration |
|---|---|---|---|
| | | React Style Agent | React Baseline Config |
| ✔️ | | React Style Agent | Intrinsic Config |
| | ✔️ | React Style Verify Agent | Intrinsic Config |
| ✔️ | ✔️ | React Style Extrinsic+Intrinsic Agent | Extrinsic Intrinsic Config |

We also adjust the task workflows of the datasets (ALFWorld, ScienceWorld, BabyAI, Jericho, PDDL) to permit actions such as "EXIT" that end the interaction. We also provide a ReAct-style example.
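
As a rough illustration of what such an adjustment involves, the hypothetical wrapper below shows one way an environment could admit an "EXIT" action; the repository instead patches each dataset's task workflow directly.

```python
class EarlyExitWrapper:
    """Hypothetical wrapper admitting an "EXIT" action; illustrative only."""

    def __init__(self, env):
        self.env = env
        self.exited = False

    def step(self, action):
        if action.strip().upper() == "EXIT":
            self.exited = True
            # End the interaction without stepping the underlying env.
            return "You exit the environment.", True
        return self.env.step(action)
```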

Explanation of Metrics

Three scenarios characterize the early-exit behavior of LLM-based agents:

  1. Perfect Early-Exit Scenario: the ideal case, where both RS and PD are zero, meaning no redundant steps and no progress loss.
  2. Too-Early Scenarios: redundant steps are reduced, but progress is significantly impaired (high PD).
  3. Too-Late Scenarios: PD remains low, but RS stays high.

Main Results & Findings

Statistical Results in Embodied Environments


We can see that:

  1. Early-exit mechanisms significantly reduce redundant steps.
  2. Drops in success and progress rates are minor.
  3. LLMs show varying preferences for early-exit strategies.
  4. Combining intrinsic and extrinsic early exit maximizes performance retention.
  5. The early-exit strategy generalizes to gaming environments.

Sample-wise Visualization


The results indicate that the early-exit strategy has potential but remains imperfect: redundant steps persist, and premature termination can still harm task progress. The PD metric provides a clearer view of these negative impacts, complementing RS as an efficiency measure.

Practical Implications

Stronger Agent's Assistance after Early-Exit

We simulate a practical scenario in embodied environments where a weaker agent exits early from challenging environments and requests assistance from a stronger agent. We use ALFWorld as the test set, Mistral-24B-Instruct as the weak agent, and Llama3.1-70B-Instruct as the strong agent.


We find that early exit followed by strong-agent assistance yields over a 10% improvement in success rate within the same 40-step budget. The implementation of the assisting agent, which continues from the first agent's interaction history, can be seen in React Style Agent History.
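
A compact sketch of this handoff under a single shared step budget follows; the weak_agent.run and strong_agent.resume interfaces are hypothetical, introduced only for illustration.

```python
def assisted_trial(weak_agent, strong_agent, env, budget=40):
    """Hypothetical weak-then-strong handoff sharing one step budget."""
    # The weak agent runs with early exit enabled...
    history, done, used = weak_agent.run(env, max_steps=budget)
    if not done and used < budget:
        # ...and the strong agent resumes from the weak agent's full
        # interaction history, spending only the leftover steps.
        history, done, _ = strong_agent.resume(env, history,
                                               max_steps=budget - used)
    return history, done
```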

Adaptation to Reflexion Framework


We adapt the Reflexion framework to a one-trial scenario: if an agent terminates early, it reflects on its trajectory and resumes the interaction within the same trial. Using Llama3.1-70B-Instruct as the agent backbone, we find that combining early exit with Reflexion significantly boosts both success and progress rates by over 10%, while keeping average steps and token costs nearly identical to the baseline.
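
One plausible shape for this one-trial loop is sketched below; agent.run and agent.reflect are assumed interfaces, not the repository's actual Reflexion integration.

```python
def reflexion_one_trial(agent, env, budget=40):
    """Hypothetical one-trial loop: exit early, reflect, resume in-trial."""
    history, done = agent.run(env, max_steps=budget, early_exit=True)
    used = len(history)
    if not done and used < budget:
        # The agent critiques its own trajectory...
        reflection = agent.reflect(history)
        # ...then resumes the same trial with the reflection in context.
        history, done = agent.run(env, max_steps=budget - used,
                                  history=history, reflection=reflection)
    return history, done
```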

Citation

If you find this work helpful, please consider citing as follows:

@article{Lu2025AgentExit,
  title={Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments},
  author={Lu, Qingyu and Ding, Liang and Cao, Siyi and Liu, Xuebo and Zhang, Kanjian and Zhang, Jinxia and Tao, Dacheng},
  journal={arXiv preprint},
  url={https://arxiv.org/pdf/2505.17616},
  year={2025}
}
