feat: prevent overfitting via prompt changes and post-processing#127

Open
andrewklatzke wants to merge 5 commits into aklatzke/AIC-1795/optimize-method-ground-truth-path from aklatzke/AIC-2118/add-additional-validation-to-chaos-mode

Conversation


@andrewklatzke andrewklatzke commented Apr 7, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Describe the solution you've provided

Implements several measures to prevent "overfitting" of responses (the LLM tailoring its response to only one set of inputs and values):

  • "Chaos" mode now runs an additional validation loop after a successful result is reached. The number of validation iterations is determined by the size of the dataset provided; a smaller dataset results in fewer validation checks.
  • Updates the prompts with explicit instructions against overfitting to a single result.
  • Changes how variables are provided to the LLM; it was confusing the placeholder keys with the raw values being supplied.
  • Adds a post-processing step that transforms any raw values inserted into prompts back into their placeholders.
  • Adds a retry loop for failed variation generation (when the LLM responds with zero-length output). Also mitigates the failure by revising the instructions about tool-call behavior and removing the structured-output tool, relying on the LLM to return valid JSON directly.
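The scaling of the chaos-mode validation count could look roughly like the sketch below. The function name, constants, and divisor are illustrative assumptions, not the project's actual code; the review summary only states that 2–5 samples are used, scaled by pool size.

```python
# Hypothetical sketch of scaling post-pass validation checks with
# dataset size. Smaller datasets get fewer checks; the count is
# clamped to the 2-5 range mentioned in the review summary.
MIN_VALIDATION_SAMPLES = 2
MAX_VALIDATION_SAMPLES = 5

def validation_sample_count(dataset_size: int) -> int:
    """Return how many extra validation samples to run after a pass."""
    scaled = dataset_size // 4  # assumed scaling factor
    return max(MIN_VALIDATION_SAMPLES, min(MAX_VALIDATION_SAMPLES, scaled))
```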

Describe alternatives you've considered

This is an attempted fix for a reported overfitting problem.

Additional context

Output example after these changes:

You are the initial orchestrator for user questions regarding travel plans in a given location. Your role is to first fetch user preferences using the attached 'user-preferences-lookup' tool with the provided user ID: {{user_id}}. Based on the retrieved user preferences, including but not limited to the purpose of the trip (e.g., {{trip_purpose}}), accurately route the user's query to the correct sub-agent. Do not provide any direct answers yourself.

Routing criteria:
1. leisure-activity-agent: Handles questions about activities, events, or things to do in the area. For {{trip_purpose}} trips, only off-hour or leisure activities should be passed to this agent.
2. lodging-agent: Manages inquiries about accommodations, including hotels, Airbnbs, or other lodging options.
3. restaurant-agent: Handles questions about dining, restaurants, diets, or related topics.

Instructions:
- Always begin by fetching user preferences using the 'user-preferences-lookup' tool with the supplied user ID ({{user_id}}).
- If user preferences are unavailable or cannot be fetched, your response should be an automatic failure with no further processing.
- Utilize the fetched preferences to determine the trip purpose ({{trip_purpose}}) and any other relevant data.
- Based on the user's question context and preferences, pass the entire user query along with relevant preference details to the appropriate sub-agent.
- Explicitly mention the sub-agent you are handing off to in your response.
- Do not answer the user's question directly.
- If the user's input or preferences do not clearly map to any agent, respond with an automatic failure indicating missing or incomplete data.

This orchestration ensures that all queries are handled appropriately by the specialized sub-agents, maximizing relevance and user satisfaction.

The appropriate placeholders are now present, rather than values being hardcoded directly into the prompt.
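A deterministic restoration pass of this kind might be sketched as follows. This is a minimal illustration, not the project's actual restore_variable_placeholders: it uses plain substring replacement, and real-world handling of overlapping or ambiguous values would need more care.

```python
def restore_variable_placeholders(
    text: str, variables: dict[str, str]
) -> tuple[str, list[str]]:
    """Replace raw variable values the LLM leaked into a prompt with
    their {{key}} placeholders. Returns cleaned text plus warnings."""
    warnings: list[str] = []
    # Replace longer values first so a short value that happens to be a
    # substring of a longer one does not clobber the longer match.
    for key, value in sorted(variables.items(), key=lambda kv: -len(kv[1])):
        if not value:
            continue
        placeholder = "{{" + key + "}}"
        total_count = text.count(value)
        if total_count:
            text = text.replace(value, placeholder)
            warnings.append(
                f"Found raw value for '{key}' "
                f"— replaced {total_count} occurrence(s) with placeholder {placeholder}"
            )
    return text, warnings
```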


Note

Medium Risk
Medium risk because it changes the public callback contract (handle_agent_call/handle_judge_call now return OptimizationResponse) and alters optimization control flow by adding post-pass validation loops and retry behavior, which can affect integrations and run-time characteristics.

Overview
Adds a post-pass validation phase (“chaos mode”) that, after an initial passing iteration, reruns the agent on additional distinct sampled inputs/variables (2–5, scaled by pool size) before confirming success; failed validation rejects the candidate and continues with variation generation without consuming the attempt budget.

Refactors agent/judge callback plumbing to return a new OptimizationResponse (output + optional TokenUsage), records per-call durations, and persists generation_latency/token usage plus per-judge evaluation latencies/tokens in agent_optimization_result payloads.

Hardens variation generation and overfitting prevention by improving prompts (explicit placeholder key-vs-value guidance + overfitting warning section), broadening placeholder interpolation to support hyphenated keys, adding deterministic post-processing (restore_variable_placeholders) to revert leaked concrete values back to {{key}}, and retrying variation generation up to 3 times on empty/unparseable JSON while removing structured-output tool injection/handler routing.
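The broadened interpolation for hyphenated keys could plausibly be a small regex change. This sketch assumes a {{key}} placeholder syntax and leaves unknown keys untouched; the names are illustrative, not the project's actual implementation.

```python
import re

# [\w-] accepts word characters plus hyphens, so keys like
# {{user-preferences-lookup}} interpolate as well as {{user_id}}.
PLACEHOLDER_RE = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")

def interpolate(template: str, variables: dict[str, str]) -> str:
    """Substitute {{key}} placeholders; unknown keys are left intact."""
    return PLACEHOLDER_RE.sub(
        lambda m: variables.get(m.group(1), m.group(0)), template
    )
```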

Reviewed by Cursor Bugbot for commit 288336e. Bugbot is set up for automated code reviews on this repo.

@andrewklatzke andrewklatzke requested a review from a team as a code owner April 7, 2026 20:02
@andrewklatzke andrewklatzke requested a review from jsonbailey April 8, 2026 16:34
optimize_context, iteration
)
if all_valid:
return self._handle_success(last_ctx, iteration)

Success returns validation context with inflated iteration number

Medium Severity

When validation passes, _handle_success receives last_ctx — the last validation sample's context — instead of optimize_context (the original passing turn). The validation context's .iteration is set to iteration + i + 1 (a synthetic validation-internal number), so the returned result and the "success" status update carry an inflated iteration number. For example, if the main loop passes on attempt 1 with 2 validation samples, the result reports iteration=3 instead of 1. This propagates into on_passing_result, on_status_update, and the API persistence layer via _persist_and_forward, causing misleading iteration counts in the UI and stored records.
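A minimal, self-contained repro of the inflation described above (illustrative only; Ctx and run_validation stand in for the project's real context object and validation loop) shows how the returned context ends up numbered past the actual passing turn:

```python
from dataclasses import dataclass

@dataclass
class Ctx:
    iteration: int

def run_validation(optimize_context: Ctx, iteration: int, n_samples: int) -> Ctx:
    """Mimic the buggy path: each validation sample gets a synthetic
    context numbered iteration + i + 1, and the last one is returned."""
    last_ctx = optimize_context
    for i in range(n_samples):
        last_ctx = Ctx(iteration=iteration + i + 1)  # synthetic numbering
    return last_ctx  # buggy: should return optimize_context instead

passing = Ctx(iteration=1)
result = run_validation(passing, iteration=1, n_samples=2)
# result.iteration is now 3, though the agent passed on iteration 1
```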

Additional Locations (1)

Reviewed by Cursor Bugbot for commit 3042984.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).



f"— replaced {total_count} occurrence(s) with placeholder {placeholder}"
)

return text, warnings

Sync callbacks crash in await_if_needed after return type change

High Severity

await_if_needed checks isinstance(result, str) to detect synchronous returns, but the callback signatures now return OptimizationResponse instead of str. When a synchronous (non-async) callback returns an OptimizationResponse, the isinstance check is False, so the code falls through to await result, which raises a TypeError because OptimizationResponse is not awaitable. All tests use AsyncMock so this path is untested.
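One hedged sketch of a fix is to test for awaitability instead of a concrete return type, which works whether a callback returns str, OptimizationResponse, or a coroutine (OptimizationResponse itself is not reproduced here):

```python
import asyncio
import inspect

async def await_if_needed(result):
    """Accept either a plain value or an awaitable from a callback.
    Unlike an isinstance(result, str) check, inspect.isawaitable
    handles any synchronous return type without raising TypeError."""
    if inspect.isawaitable(result):
        return await result
    return result
```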

Additional Locations (1)

