feat: Interactive breakpoint and step-through debugging (AgentStepper) #183

@acailic

Paper Reference

  • Title: AgentStepper: Interactive Debugging of Software Development Agents
  • Authors: Robert Hutter, Michael Pradel
  • Year: 2026
  • URL: https://arxiv.org/abs/2602.06593
  • Venue: arXiv preprint

Paper Summary

AgentStepper is presented as the first interactive debugger for LLM-based software engineering agents. It represents agent trajectories as structured conversations among the LLM, the agent program, and tools, and it supports breakpoints, stepwise execution, and live editing of prompts and tool invocations. The user study reports a 60% bug identification rate versus 17% with conventional tools, with reported frustration falling from 5.4/7.0 to 2.4/7.0.

Proposed Feature

Implement breakpoint and step-through debugging for agent sessions in Peaky Peek:

Core Capabilities

  • Breakpoints: Allow users to set breakpoints on specific agent decisions, tool calls, or event types. When replaying a session, execution pauses at each breakpoint.
  • Step Controls: Step into (next decision), step over (skip tool internals), step out (return to parent context).
  • Live Prompt Editing: At any breakpoint, allow inline editing of the agent's prompt/context to test alternative reasoning paths.
  • Variable Inspection: Show current agent state (context window, memory, tool results) at each breakpoint.
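
The breakpoint and step controls above could be modeled over a recorded event list roughly as follows. This is a minimal sketch, not Peaky Peek's actual API: `Event`, `Breakpoint`, `ReplayStepper`, and the `depth` field are hypothetical names, and "step into" is interpreted here as "advance to the very next recorded event".

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Event:
    """One recorded step in a session (hypothetical shape)."""
    index: int
    kind: str    # e.g. "decision", "tool_call", "tool_internal"
    depth: int   # nesting level; tool internals sit deeper than the call
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass
class Breakpoint:
    """Matches events of one kind, optionally narrowed by a predicate."""
    kind: str
    predicate: Optional[Callable[[Event], bool]] = None

    def matches(self, event: Event) -> bool:
        if event.kind != self.kind:
            return False
        return self.predicate is None or self.predicate(event)


class ReplayStepper:
    """Walks a recorded session and pauses according to the step command."""

    def __init__(self, events, breakpoints=()):
        self.events = list(events)
        self.breakpoints = list(breakpoints)
        self.pos = -1  # index of the paused event; -1 means not started

    def _current_depth(self) -> int:
        if 0 <= self.pos < len(self.events):
            return self.events[self.pos].depth
        return 0

    def _advance_until(self, stop) -> Optional[Event]:
        # Move forward to the first later event for which stop(event) is true.
        for i in range(self.pos + 1, len(self.events)):
            if stop(self.events[i]):
                self.pos = i
                return self.events[i]
        self.pos = len(self.events)  # ran off the end of the session
        return None

    def resume(self) -> Optional[Event]:
        """Run until the next event that matches any breakpoint."""
        return self._advance_until(
            lambda e: any(b.matches(e) for b in self.breakpoints)
        )

    def step_into(self) -> Optional[Event]:
        """Pause on the very next event, including nested ones."""
        return self._advance_until(lambda e: True)

    def step_over(self) -> Optional[Event]:
        """Skip deeper (nested) events and pause at the same level or above."""
        depth = self._current_depth()
        return self._advance_until(lambda e: e.depth <= depth)

    def step_out(self) -> Optional[Event]:
        """Pause at the first event in a shallower (parent) context."""
        depth = self._current_depth()
        return self._advance_until(lambda e: e.depth < depth)

    def inspect(self) -> dict[str, Any]:
        """Return the payload of the paused event (empty if not paused)."""
        if 0 <= self.pos < len(self.events):
            return self.events[self.pos].payload
        return {}
```

A breakpoint on a specific tool would then be `Breakpoint("tool_call", lambda e: e.payload.get("name") == "edit")`, and calling `resume()` pauses replay on the first matching event. Tracking nesting with a single `depth` integer is one design choice among several; a parent-pointer tree would work equally well.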

Technical Approach

  • Extend the existing checkpoint system to support breakpoint markers
  • Add step-through controls to the session replay UI
  • Implement state inspection panel showing agent context at each step
  • Add "branch and replay" capability from any breakpoint
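
One way the "branch and replay" step might look, assuming checkpoints store a copy of the message context: the sketch below forks a checkpoint into an independent branch and optionally swaps in an edited prompt. `Checkpoint`, `Branch`, `branch_from`, and the `is_breakpoint` marker are hypothetical names for illustration; the existing checkpoint system may be shaped differently.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Checkpoint:
    """Snapshot of agent context at one step (hypothetical shape)."""
    step: int
    messages: list[dict]         # context window contents at this step
    is_breakpoint: bool = False  # marker added on top of the checkpoint


@dataclass
class Branch:
    """An alternative run forked from a checkpoint."""
    parent_step: int
    messages: list[dict]


def branch_from(checkpoint: Checkpoint,
                edited_prompt: Optional[str] = None) -> Branch:
    """Fork a checkpoint, optionally replacing the last message's content."""
    # Deep copy so edits in the branch never leak into the recorded session.
    messages = copy.deepcopy(checkpoint.messages)
    if edited_prompt is not None and messages:
        messages[-1]["content"] = edited_prompt
    return Branch(parent_step=checkpoint.step, messages=messages)
```

The branch keeps `parent_step` so the UI can show where it diverged from the recorded session. Re-running the agent from the branched context is left out here because it depends on how the agent is invoked.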

Impact

This would make Peaky Peek the first open-source agent debugger with full interactive debugging capabilities, clearly differentiating it from LangSmith, which offers read-only tracing.

Labels

enhancement, paper-inspired, high-priority
