fix(run): return structured conversation history and preserve multi-artifact tool results by mike-inkeep · Pull Request #2743 · inkeep/agents

mike-inkeep · 2026-03-18T14:48:01Z

Summary

getConversationHistoryWithCompression now returns MessageSelect[] instead of a pre-formatted string. Callers that need prompt text use the shared formatMessagesAsConversationHistory helper (wired from buildConversationHistory).
Tool-result rewriting for the ledger no longer keeps only one artifact per toolCallId: all matching artifacts are inlined as compact references, with a single Tool call args: line when present.
Message content.parts handling treats kind and legacy type as equivalent in reconstructMessageText and when persisting streamed assistant turns in ExecutionHandler.
Runtime DAL: getConversationHistory / context-window truncation are typed as MessageSelect[]; synthetic truncation summary rows are fully shaped MessageSelect records. MessageMetadata gains optional compressionType.

Key decisions

Structured return vs string at the API boundary: Compression and artifact substitution mutate or filter MessageSelect[]; formatting to <conversation_history> XML is explicit and reusable (formatMessagesAsConversationHistory) instead of being baked into getConversationHistoryWithCompression.
Multi-artifact tool calls: toolCallId maps to an array of artifacts so none are dropped when the ledger has multiple rows for the same id.

Manual QA

No manual QA performed.

Test plan

Unit/integration tests updated: Agent.test, conversations.test, conversations.artifact-replacement.test, ledgerArtifacts.test

…l sites

changeset-bot · 2026-03-18T14:48:06Z

⚠️ No Changeset found

Latest commit: 9ad1a65

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

vercel · 2026-03-18T14:48:14Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Mar 25, 2026 7:54pm
agents-docs	Ready	Preview, Comment	Mar 25, 2026 7:54pm
agents-manage-ui	Ready	Preview, Comment	Mar 25, 2026 7:54pm

pullfrog

The return-type refactor from string to MessageSelect[] is well-structured — separation of fetching/compressing vs. formatting is a good direction, and the multi-artifact grouping fix is correct. Three areas need attention before merging: a behavioral inconsistency between getFormattedConversationHistory and the new formatMessagesAsConversationHistory, a createdAt regression in the summary message, and missing test coverage for two new code paths.

^{｜ Fix all ➔ ｜ Fix 👍s ➔ ｜ View workflow run ｜ pullfrog.com ｜ 𝕏}

pullfrog · 2026-03-18T14:51:33Z

@@ -937,8 +947,13 @@ function formatMessagesAsConversationHistory(messages: any[]): string {
        roleLabel = msg.role || 'system';
      }

-      return `${roleLabel}: """${reconstructMessageText(msg)}"""`;
+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)
    .join('\n');

  return `<conversation_history>\n${formattedHistory}\n</conversation_history>\n`;


The newly exported formatMessagesAsConversationHistory differs from the inline formatting in getFormattedConversationHistory (around line 376) in two meaningful ways:

This function uses reconstructMessageText(msg) (handles multi-part content with artifact refs), while the other uses msg.content.text directly — silently dropping artifact reference tags from multi-part messages.

This function filters out messages with empty reconstructed text, while the other includes them (producing role: """""" entries).

Since getFormattedConversationHistory is still actively called from AgentSession.ts, these inconsistencies produce different conversation history formats depending on the code path. Consider refactoring getFormattedConversationHistory to delegate to this exported function.

pullfrog · 2026-03-18T14:51:33Z

+        const referenceMessage = messageHistory[0];
+        const summaryMessage: MessageSelect = {
          id: `summary-${getConversationId()}`,
+          tenantId: referenceMessage.tenantId,
+          projectId: referenceMessage.projectId,
+          conversationId: referenceMessage.conversationId,
          role: 'system',
+          fromSubAgentId: null,
+          toSubAgentId: null,
+          fromExternalAgentId: null,
+          toExternalAgentId: null,
+          fromTeamAgentId: null,
+          toTeamAgentId: null,
          content: {
            text: `[Previous conversation history truncated - ${i + 1} earlier messages]`,
          },
          visibility: 'system',
          messageType: 'chat',
-          createdAt: messageHistory[0].createdAt,
+          taskId: null,
+          parentMessageId: null,
+          a2aTaskId: null,
+          a2aSessionId: null,
+          metadata: null,
+          createdAt: new Date().toISOString(),
+          updatedAt: new Date().toISOString(),


createdAt changed from messageHistory[0].createdAt to new Date().toISOString(). The summary message now has a timestamp after all the messages it summarizes, rather than being anchored to the start of the conversation. The array ordering is still correct (it's unshifted), but if any downstream code sorts or filters by createdAt, the summary would sort to the end. The previous behavior was more semantically correct — consider reverting to referenceMessage.createdAt.

pullfrog · 2026-03-18T14:51:33Z

+    const result = await getConversationHistoryWithCompression(baseParams);
+    const toolResult = result.find((msg) => msg.messageType === 'tool-result');
+    const toolResultText = toolResult?.content?.text ?? '';
+
+    expect(toolResultText).toContain('id: art-1');
+    expect(toolResultText).toContain('id: art-2');


This test verifies both artifact IDs are present but doesn't assert the \n join between them. If the join separator were accidentally changed to empty string or space, this test would still pass. Consider:

expect(toolResultText).toMatch(/id: art-1[\s\S]*\n[\s\S]*id: art-2/);

pullfrog · 2026-03-18T14:51:33Z

  return parts
    .map((part: any) => {
-      if (part.type === 'text') {
+      const partKind = part.kind ?? part.type;


The part.kind ?? part.type fallback handles both A2A protocol shape ({ kind: 'text' }) and legacy shape ({ type: 'text' }). The existing reconstructMessageText tests only cover { type: ... } parts — there's no test coverage for the { kind: ... } shape. Add a test case to verify the fallback works.

pullfrog · 2026-03-18T14:51:33Z

+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)


formatMessagesAsConversationHistory now filters out messages where reconstructMessageText returns empty string — this is new behavior (previously all messages were formatted). A message with only non-text/non-artifact parts (e.g. file parts) would be silently dropped. This needs a direct test, and it's worth confirming this is the desired behavior for all message types.

claude

PR Review Summary

(1) Total Issues | Risk: Low

🟡 Minor (1) 🟡

Inline Comments:

🟡 Minor: conversations.ts:510 Args duplicated for each artifact sharing a toolCallId

💭 Consider (2) 💭

💭 1) conversations.ts:903 Defensive fallback for part.kind ?? part.type lacks documentation

Issue: The change from part.type to part.kind ?? part.type suggests two different part schemas exist, but there's no documentation of when each occurs.
Why: The canonical types use kind. If type is a legacy format, a comment would clarify intent.
Fix: Add inline comment explaining the variants, or log occurrences if type is unexpected.
Refs: utility.ts:91-96 (canonical kind-based schema)

💭 2) conversations.ts:950-956 Behavioral change - empty messages now filtered out

Issue: New implementation filters out messages with empty reconstructed text. Previously these appeared as roleLabel: """""".
Why: This is arguably an improvement but goes beyond the stated refactoring goal.
Fix: Document this as intentional and consider adding a test case.

Inline Comments:

💭 Consider: conversations.ts:903 Defensive kind ?? type fallback
💭 Consider: conversations.ts:950-956 Empty message filtering behavior change

💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-structured refactoring that improves type safety by returning MessageSelect[] from getConversationHistoryWithCompression and moving formatting to call sites. The multi-artifact support for shared toolCallId values is a valuable fix. The one Minor issue (args duplication) is a small optimization opportunity, and the Consider items are documentation/testing suggestions rather than blocking concerns.

The PR is ready to merge. Nice work on improving the type boundaries! 🎉

Discarded (3)

Location	Issue	Reason Discarded
`conversations.artifact-replacement.test.ts:147`	Test doesn't verify newline-joining format	Nitpick - the core assertion (both artifacts present) is correct
`conversations.test.ts`	Missing tests for `kind` property in `reconstructMessageText`	Valid but downgraded - defensive code for compatibility, not a bug risk
`conversations.artifact-replacement.test.ts:35`	Test helper missing `MessageSelect` fields	Pre-existing pattern in test fixtures, low impact

Reviewers (5)

Reviewer	Returned	Consider	Inline Comments	Discarded
`pr-review-tests`	4	0	0	3
`pr-review-precision`	3	2	1	0
`pr-review-standards`	0	0	0	0
`pr-review-types`	0	0	0	0
`pr-review-errors`	0	0	0	0
Total	7	2	1	3

claude · 2026-03-18T14:54:00Z

+            const artifactParts = [
+              `Artifact: "${artifact.name ?? artifact.artifactId}" (id: ${artifact.artifactId})`,
+            ];
+            if (argsStr) artifactParts.push(`args: ${argsStr}`);


🟡 Minor: Args duplicated for each artifact sharing a toolCallId

Issue: When multiple artifacts share the same toolCallId, the argsStr (computed once from message metadata at lines 492-495) gets appended to every artifact's reference string inside the map. Since all artifacts for the same tool call came from the same invocation, they have identical toolArgs, resulting in the same args: {...} appearing N times in the output.

Why: This unnecessarily bloats token count in the conversation history. For tool calls that produce multiple artifacts (e.g., multiple images or documents), the args string could be repeated 2-10x.

Fix: Consider moving args to a single prefix line before the artifact list:

const argsLine = argsStr ? `Tool call args: ${argsStr}\n` : ''; const artifactRefs = relatedArtifacts.map((artifact) => { // ... build artifact-specific parts WITHOUT args }); return { ...msg, content: { text: argsLine + artifactRefs.join('\n') }, };

Refs:

Test showing multi-artifact case

claude · 2026-03-18T14:54:05Z

  return parts
    .map((part: any) => {
-      if (part.type === 'text') {
+      const partKind = part.kind ?? part.type;


💭 Consider: Defensive fallback for part.kind ?? part.type

Issue: This fallback suggests there are two different part schemas in play: the canonical MessageContent.parts type (which uses kind per the schema definitions) and some other format that uses type. The fallback is reasonable defensive code, but there's no documentation explaining when each variant occurs.

Why: Without clarity on the source of type-keyed parts, this could mask upstream inconsistencies rather than solve a real polymorphism need. The canonical types in utility.ts and a2a.ts both use kind.

Fix: If both are legitimate variants (e.g., legacy data or external formats), add a brief inline comment:

// Support both 'kind' (canonical schema) and 'type' (legacy/external format) const partKind = part.kind ?? part.type;

If type is never expected, consider logging when encountered to track occurrences.

claude · 2026-03-18T14:54:08Z

+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)


💭 Consider: Behavioral change - empty messages now filtered out

Issue: The new implementation filters out messages with empty reconstructed text (returning null then filtering). The previous implementation would include these as roleLabel: """""". This is arguably an improvement, but goes beyond the stated refactoring goal.

Why: This could affect downstream consumers that rely on message boundaries (e.g., compression logic counting messages) or debugging traces. If intentional, it's worth documenting.

Fix: If intentional, consider adding a test case validating the filtering behavior and noting this in the PR description as a deliberate improvement.

@tim-inkeep does this seem reasonable? It's now filtering out empty messages, whereas before it would sometimes have messages like user: "". I did this because a user a submitting an image without text would sometimes confuse the model, which only saw the empty user text and replied with something like "how can I help you?"

i think this is fine

pullfrog · 2026-03-23T23:10:01Z

TL;DR — Shifts getConversationHistoryWithCompression from returning a pre-formatted string to returning raw MessageSelect[], moving formatting responsibility to call sites via the now-exported formatMessagesAsConversationHistory. Also fixes a data-loss bug where multiple artifacts sharing a toolCallId collapsed to a single reference, and tightens any → MessageSelect typing across the history stack.

Key changes

getConversationHistoryWithCompression returns MessageSelect[] instead of string — call sites in buildConversationHistory now explicitly call formatMessagesAsConversationHistory to produce the final string, decoupling data retrieval from presentation.
Fix multi-artifact toolCallId grouping — the artifact-replacement logic now collects all artifacts per toolCallId into an array instead of keeping only the last one, preserving every reference in the compressed output.
Export formatMessagesAsConversationHistory — promoted from a private function to a public export so call sites can format MessageSelect[] on demand; filters out messages that reconstruct to empty strings and short-circuits on empty input.
Replace any with MessageSelect across history APIs — getScopedHistory, applyContextWindowManagement, getConversationHistory, and related locals are now properly typed, including a fully-typed summary placeholder message.
Normalize kind/type part detection — reconstructMessageText and executionHandler now use part.kind ?? part.type to support both the current field name and legacy shapes.
Deduplicate formatting in getFormattedConversationHistory — replaced 29 lines of inline role-label formatting logic with a single call to formatMessagesAsConversationHistory.
Add compressionType to MessageMetadata — new optional field on the metadata type to support compression-related annotations.

_{Summary ｜ 9 files ｜ 5 commits ｜ base: main ← stack/history_shape}

Return structured messages instead of pre-formatted strings

Before: getConversationHistoryWithCompression called formatMessagesAsConversationHistory internally and returned a string.
After: It returns MessageSelect[]; buildConversationHistory calls the formatter explicitly.

This decouples history retrieval from serialization, letting future consumers of the history data (e.g. token-counting, streaming, or alternate formats) operate on structured messages without re-parsing a string. The summary placeholder created by applyContextWindowManagement is now a fully-typed MessageSelect object populated from a referenceMessage, satisfying the complete type contract instead of using a loosely-typed bag.

conversations.ts · conversation-history.ts · runtime/conversations.ts

Multi-artifact grouping per `toolCallId`

Before: artifactsByToolCallId was a Map<string, Artifact> — when multiple artifacts shared a toolCallId, only the last one survived.
After: It is a Map<string, Artifact[]>, and the replacement template iterates all related artifacts, joining their references with newlines.

A shared Tool call args: header line is emitted once per tool call, followed by separate [Artifact: ...] blocks for each artifact. New test cases in both conversations.artifact-replacement.test.ts and ledgerArtifacts.test.ts validate the 1:N relationship end-to-end.

conversations.ts · conversations.artifact-replacement.test.ts · ledgerArtifacts.test.ts

`kind`/`type` normalization in message part detection

Before: Code checked part.kind or part.type inconsistently — some paths only checked one field.
After: Both reconstructMessageText and executionHandler use part.kind ?? part.type, ensuring legacy data with type and current data with kind are handled uniformly.

The executionHandler extracts this into a getResponsePartKind() helper for reuse across text/data detection. A dedicated test validates the backward-compatible fallback for legacy type-based parts.

conversations.ts · executionHandler.ts · conversations.test.ts

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ pullfrog.com ｜ 𝕏}

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: Changes since last review (8eea5bf88a06). The delta consists of a merge commit bringing in PDF attachment support from origin/main (PR #2709).

Delta Review Summary

Files with delta changes:

conversation-history.ts: Added mapFileToAiSdkContentPart helper for PDF/image handling, renamed imageParts → fileParts
conversations.ts: Updated createMessage API to use { scopes, data } structure
Agent.test.ts: Added PDF test + mock path rename

Findings: No new issues meeting the severity/confidence thresholds were identified in the delta changes. The PDF support code is well-structured with proper type handling, defensive null returns for unsupported file types, and appropriate warning logging.

🕐 Pending Recommendations (5)

Prior review feedback that remains applicable to the original PR changes:

🟡 conversations.ts:510 Args duplicated for each artifact sharing a toolCallId
💭 conversations.ts:903 Defensive kind ?? type fallback lacks documentation
💭 conversations.ts:950-956 Empty messages now filtered out (behavioral change)
🟠 runtime/conversations.ts:268 createdAt regression in summary message (pullfrog)
🟠 conversations.ts:961 Behavioral inconsistency between formatting functions (pullfrog)

✅ APPROVE

Summary: The delta changes from the merge commit are clean. PDF attachment support is correctly integrated with the existing file handling infrastructure. Prior review feedback from the first review run remains applicable to the original PR changes — see Pending Recommendations above. 🎉

Reviewers (2)

Reviewer	Returned	Discarded
`pr-review-standards`	0	0
`pr-review-tests`	3	3
Total	3	3

Note: Test coverage findings (unsupported file types, filename metadata, URI path) were assessed as MEDIUM/LOW confidence and discarded per review criteria.

itoqa · 2026-03-24T01:02:20Z

Ito Test Report ❌

7 test cases ran. 1 failed, 6 passed.

Overall, the unified run executed 7 test cases with 6 passes and 1 failure (plus several verification runs with no executable cases), indicating generally solid behavior with one confirmed production defect. Key findings were that auth setup and run-domain gates behaved correctly (web_client app creation returned 201, PoW was required and successfully minted anon tokens when solved, and unauthorized origins were blocked with 403), while chat edge checks for 10 parallel requests and a ~50k prompt were stable, but first-turn requests incorrectly inject an empty <conversation_history> wrapper (medium severity, likely introduced by this PR) causing no-history semantics drift.

❌ Failed (1)

Category	Summary	Screenshot
Edge	🟠 First-turn requests can still inject an empty <conversation_history> wrapper when prior history is empty.	N/A

🟠 Empty conversation history wrapper injected on first turn

What failed: Expected no conversation-history payload on first turn, but the formatter still emits <conversation_history>\n\n</conversation_history>\n, which is treated as non-empty and injected.
Impact: First-turn prompts can carry an empty synthetic wrapper that should not exist, adding avoidable prompt noise and behavior drift. This can cause brittle instruction-following in edge flows that depend on exact no-history semantics.
Introduced by this PR: Yes – this PR modified the relevant code
Steps to reproduce:
1. Create a brand-new conversation with no prior messages.
2. Call the chat endpoint for the first turn and inspect how conversation history is built.
3. Observe that an empty <conversation_history> wrapper is still added to initial model messages.
Code analysis: Inspected conversation-history retrieval/formatting and message assembly paths. getConversationHistoryWithCompression correctly returns an empty array when there is no history, but formatMessagesAsConversationHistory always wraps output, and buildInitialMessages only checks trim() !== '', so the wrapper is injected even when there are zero historical messages.
Why this is likely a bug: The production path composes [] history into a non-empty wrapper string and then unconditionally injects it, which contradicts intended first-turn no-history behavior.

Relevant code:

agents-api/src/domains/run/data/conversations.ts (lines 462-464)

if (!messagesToFormat.length) {
  return [];
}

agents-api/src/domains/run/data/conversations.ts (lines 925-962)

export function formatMessagesAsConversationHistory(messages: MessageSelect[]): string {
  const formattedHistory = messages
    .map((msg: MessageSelect) => {
      // ...role formatting...
      const reconstructedMessage = reconstructMessageText(msg);
      if (!reconstructedMessage) {
        return null;
      }
      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
    .filter((line): line is string => line !== null)
    .join('\n');

  return `<conversation_history>\n${formattedHistory}\n</conversation_history>\n`;
}

agents-api/src/domains/run/agents/generation/conversation-history.ts (lines 124-126)

if (conversationHistory.trim() !== '') {
  messages.push({ role: 'user', content: conversationHistory });
}

✅ Passed (6)

Category	Summary	Screenshot
Adversarial	Valid PoW from an unauthorized origin is denied, preventing token issuance.
Auth	Created prerequisite project/app setup and successfully created a web_client app via manage API (HTTP 201).
Auth	Anonymous-session creation without a PoW solution is rejected with the expected bad-request error.
Auth	Solved PoW challenge mints an anonymous session token successfully, with expected JWT claims.
Edge	10 parallel non-stream requests succeeded with HTTP 200 and conversation remained readable.
Edge	A ~50k-character prompt completed with HTTP 200 and no uncaught 500/internal error.

Commit: a96c949

View Full Run

Tell us how we did: Give Ito Feedback

claude

PR Review Summary

(1) Total Issues | Risk: Medium

Re-review scope: Changes since last review (a96c94951ea5). The delta includes 6 commits with improvements that address prior feedback.

Delta Review Summary

Prior Feedback Addressed in This Delta:

✅ Args duplication fixed — argsStr is now output once as a prefix argsLine, not per-artifact
✅ createdAt regression fixed — Summary message now uses referenceMessage.createdAt instead of new Date()
✅ Type safety improved — toolCallId now has typeof === 'string' guard; reconstructMessageText properly typed
✅ Test coverage for kind ?? type — Added test for legacy part.type shape fallback

Delta Code Quality:
The delta changes are clean improvements that tighten types and fix prior issues. The getResponsePartKind helper in executionHandler.ts nicely encapsulates the kind ?? type fallback pattern.

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

🟠 Major: conversations.ts:901 Empty array produces non-empty wrapper (Ito test failure root cause)

💭 Consider (2) 💭

💭 1) conversations.test.ts Missing test for content.text fallback when parts yield empty

Issue: The implementation at reconstructMessageText (line 898) returns fromParts || textFallback, but there's no test verifying the fallback path when parts exist but produce empty output.
Why: This behavioral change was noted in prior reviews but lacks explicit test coverage documenting the contract.
Fix: Add test: it('falls back to content.text when parts yield empty') with fixture { content: { text: 'fallback', parts: [{ kind: 'text' }] } }.

💭 2) conversations.test.ts Missing test for mixed kind/type parts in same message

Issue: No test verifies behavior when a message contains BOTH legacy type and modern kind parts.
Why: Edge case during data migration; implementation handles it per-part but it's untested.
Fix: Add test with mixed fixture: parts: [{ kind: 'text', text: 'a' }, { type: 'text', text: 'b' }].

🕐 Pending Recommendations (4)

Prior review feedback that may still apply:

💭 conversations.ts:879 Defensive kind ?? type fallback — consider adding inline comment explaining the two schemas
💭 conversations.ts:929-934 Empty message filtering behavior change — document as intentional (relates to mike-inkeep's question to tim-inkeep)
🟠 conversations.ts:937 Behavioral inconsistency between getFormattedConversationHistory and formatMessagesAsConversationHistory (pullfrog) — now unified, but empty array handling differs
🟠 conversations.artifact-replacement.test.ts:147 Test doesn't verify newline join between artifacts (pullfrog) — now includes /\]\s*\n\n\[/ assertion

🚫 REQUEST CHANGES

Summary: The delta changes are solid improvements that address prior feedback. However, there's one Major issue that needs fixing before merge: formatMessagesAsConversationHistory([]) returns a non-empty wrapper string instead of empty string, causing the Ito test failure where first-turn prompts incorrectly include an empty <conversation_history> block. This is a simple fix (early return for empty array).

Discarded (2)

Location	Issue	Reason Discarded
`executionHandler.ts:37`	`getResponsePartKind` lacks direct unit test	Low priority - trivial helper, integration coverage sufficient
`utility.ts:119`	`compressionType` addition undocumented	Additive type change, no behavioral impact

Reviewers (2)

Reviewer	Returned	Main Findings	Consider	Inline Comments	Discarded
`pr-review-standards`	0	0	0	0	0
`pr-review-tests`	4	0	2	0	2
Orchestrator	1	1	0	1	0
Total	5	1	2	1	2

Note: The Major finding was identified by cross-referencing the Ito test failure with the code path analysis.

claude · 2026-03-24T15:43:04Z

 }

-function formatMessagesAsConversationHistory(messages: any[]): string {
+export function formatMessagesAsConversationHistory(messages: MessageSelect[]): string {


🟠 MAJOR: Empty array produces non-empty wrapper (Ito test failure root cause)

Issue: When messages is empty, this function returns <conversation_history>\n\n</conversation_history>\n instead of an empty string. This wrapper passes the trim() !== '' check in buildInitialMessages, causing an empty history wrapper to be injected into first-turn prompts.

Why: This was flagged by the Ito test suite: "First-turn requests can still inject an empty <conversation_history> wrapper when prior history is empty." This adds noise to first-turn prompts and can cause behavioral drift in edge cases that depend on no-history semantics.

Fix: Return empty string for empty input:

export function formatMessagesAsConversationHistory(messages: MessageSelect[]): string { if (messages.length === 0) { return ''; } // ... rest of function }

Refs:

Ito test report — documents the failure

buildInitialMessages:124 — the check that passes erroneously

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: Changes since last review (37a910e7169a). The delta includes 2 commits with fixes that address the blocking issue from the prior review.

Delta Review Summary

Prior Blocking Issue Resolved:

✅ Empty wrapper injection FIXED — formatMessagesAsConversationHistory([]) now returns '' instead of <conversation_history>\n\n</conversation_history>\n. Two guards added: early return for empty array (line 903-905) and check for empty formattedHistory after filtering (lines 942-944). This directly addresses the Ito test failure.

Additional Improvements in Delta:

✅ Inline comment added — Line 879-880 now documents the kind ?? type fallback: "Canonical MessageContent.parts use kind; older persisted rows and some external payloads might still use type."
✅ Test coverage added — New describe('formatMessagesAsConversationHistory') block with 3 tests covering empty array, all-empty-reconstruction, and normal wrapping behavior.

Code Quality:
The delta changes are minimal and focused. The fix is correctly placed — both the early return and the post-filter check are necessary to handle (1) empty input and (2) input that produces no reconstructable text.

🕐 Pending Recommendations (2)

Lower-priority items from prior reviews that remain applicable but are non-blocking:

💭 conversations.test.ts Missing test for content.text fallback when parts exist but yield empty — would document the fromParts || textFallback contract
💭 conversations.test.ts Missing test for mixed kind/type parts in same message — edge case during data migration

✅ APPROVE

Summary: The delta resolves the blocking Ito test failure. The empty-wrapper injection bug is fixed with proper guards at both the array-level and post-reconstruction-filtering levels. The implementation is clean, the fix is well-tested, and the inline documentation for the kind ?? type fallback addresses prior feedback. Ship it! 🚀

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	2	0	0	0	0	2	0
Total	2	0	0	0	0	2	0

Note: Delta was small enough to review directly without dispatching sub-reviewers.

itoqa · 2026-03-24T19:30:30Z

Ito Test Report ✅

15 test cases ran. 15 passed.

Across the unified QA run, 15 executed test cases passed with 0 failures, and two additional sessions reported no includable verification results. The most important confirmed behaviors were stable conversation continuity and persistence across /run/api/chat and OpenAI-compatible /run/v1/chat/completions (including SSE [DONE]), clean first-turn handling without conversation-history leakage, correct and safe history/tool-result formatting (artifact substitution, no-artifact preservation, truncation limits, and wrapper-text injection resistance), robust non-500 handling for malformed payloads and header abuse, resilience through refresh/navigation/mobile/abort/deep-link flows without duplication or corruption, and cross-context authorization isolation with no secret leakage.

✅ Passed (15)

Category	Summary	Screenshot
Adversarial	Confirmed truncation guards limit tool args/artifact summaries and prevent oversized raw payload inclusion.
Adversarial	Confirmed injection-like wrapper text is treated as literal message content, not structural history markup.
Adversarial	Replayed a context-A conversationId using context-B token; no DELTA-77 leakage appeared in context-B response.
Edge	Malformed data-part input returned controlled handling and conversation remained usable on follow-up.
Edge	Re-test showed refresh-mid-stream resume preserved conversation continuity without corruption.
Edge	Repeated submit plus back/forward churn persisted one logical user turn per submit.
Edge	Mobile viewport execution preserved two-turn continuity and conversation persistence.
Journey	Deep-link continuation with an existing conversationId remained stable after refresh.
Logic	First-turn `/run/api/chat` on a fresh conversationId returned 200 with clean output and no `<conversation_history>` leakage.
Logic	Confirmed multi-artifact substitution preserves all artifact references for a shared toolCallId.
Logic	Confirmed tool-result content remains unchanged when no matching artifact exists.
Rapid	Abort-then-resubmit sequence completed with stable history and no retry duplication.
Happy-path	Two sequential `/run/api/chat` calls with the same conversationId returned 200 and preserved `ALPHA-17` context on turn 2.
Happy-path	Two-turn `/run/v1/chat/completions` flow returned 200 with OpenAI-style SSE + `[DONE]` and recalled `BRAVO-29` on turn 2.
Sec	Invalid client headers and malformed header-map input were handled safely without 500 responses.

Commit: 393705c

View Full Run

Tell us how we did: Give Ito Feedback

tim-inkeep · 2026-03-24T20:00:19Z

+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)


i think this is fine

tim-inkeep · 2026-03-24T20:02:18Z

@mike-inkeep I would just want to make sure that all functionality still works changing to parts, but i think it will

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: Changes since last review (393705cdb17a). The delta consists of 1 merge commit (b38c54045) bringing main into the PR branch — no changes to the PR's implementation files.

Delta Review Summary

Delta contains no PR-relevant changes. The commit history since the last automated review shows only:

b38c54045 — Merge branch 'main' into stack/history_shape
Various unrelated commits from main: UI component updates (#2829, #2830), skill additions (#2824, #2825, #2826), package upgrades (#2809, #2812), and documentation fixes (#2831)

PR Implementation Files Unchanged:

conversations.ts, conversation-history.ts, executionHandler.ts — no changes
runtime/conversations.ts, utility.ts — no changes
All test files — no changes

Prior Review Status:

✅ All blocking issues resolved in prior commits (empty wrapper injection, args duplication, createdAt regression)
✅ Human approval received from tim-inkeep (2026-03-24)
✅ Ito tests: 15/15 passed (2026-03-24)
✅ Prior automated review approved (2026-03-24T17:28:49Z)

🕐 Pending Recommendations (2)

Lower-priority items from prior reviews that remain applicable but non-blocking:

💭 conversations.test.ts Missing test for content.text fallback when parts exist but yield empty
💭 conversations.test.ts Missing test for mixed kind/type parts in same message

✅ APPROVE

Summary: No substantive changes since the last review — the delta is purely a merge from main. All prior blocking issues were resolved, tests pass, and human approval was received. Ship it! 🚀

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	0	0	0	0	0	2	0
Total	0	0	0	0	0	2	0

Note: Delta was merge-only with no PR-relevant changes; no sub-reviewers dispatched.

vercel · 2026-03-25T16:37:25Z

Deployment failed with the following error:

Failed to create deployment for team_um5iJbF9BOPOQPqiPlXKmPeN in project prj_0WiiMeKIgrGu2CrWXvNtCL3A5iBD: FetchError: request to https://76.76.21.112/v13/now/deployments?ownerId=team_um5iJbF9BOPOQPqiPlXKmPeN&projectId=prj_0WiiMeKIgrGu2CrWXvNtCL3A5iBD&skipAutoDetectionConfirmation=1&teamId=team_um5iJbF9BOPOQPqiPlXKmPeN&traceCarrier=%7B%22ot-baggage-webhookAt%22%3A%221774454931648%22%2C%22ot-baggage-senderUsername%22%3A%22gh.mike-inkeep%22%2C%22x-datadog-trace-id%22%3A%229058888615371130173%22%2C%22x-datadog-parent-id%22%3A%227238391149465214436%22%2C%22x-datadog-sampling-priority%22%3A%222%22%2C%22x-datadog-tags%22%3A%22_dd.p.tid%3D69c4089300000000%2C_dd.p.dm%3D-3%22%2C%22traceparent%22%3A%2200-69c40893000000007db7a3331b06c53d-6473ee6659349de4-01%22%2C%22tracestate%22%3A%22dd%3Dt.tid%3A69c4089300000000%3Bt.dm%3A-3%3Bs%3A2%3Bp%3A6473ee6659349de4%22%7D failed, reason: socket hang up

vercel · 2026-03-25T16:37:28Z

Deployment failed with the following error:

Failed to create deployment for team_um5iJbF9BOPOQPqiPlXKmPeN in project prj_OMMC7n1DAjz3wtZVufMISb3ZlXwb: FetchError: request to https://76.76.21.112/v13/now/deployments?ownerId=team_um5iJbF9BOPOQPqiPlXKmPeN&projectId=prj_OMMC7n1DAjz3wtZVufMISb3ZlXwb&skipAutoDetectionConfirmation=1&teamId=team_um5iJbF9BOPOQPqiPlXKmPeN&traceCarrier=%7B%22ot-baggage-webhookAt%22%3A%221774454931648%22%2C%22ot-baggage-senderUsername%22%3A%22gh.mike-inkeep%22%2C%22x-datadog-trace-id%22%3A%229058888615371130173%22%2C%22x-datadog-parent-id%22%3A%227238391149465214436%22%2C%22x-datadog-sampling-priority%22%3A%222%22%2C%22x-datadog-tags%22%3A%22_dd.p.tid%3D69c4089300000000%2C_dd.p.dm%3D-3%22%2C%22traceparent%22%3A%2200-69c40893000000007db7a3331b06c53d-6473ee6659349de4-01%22%2C%22tracestate%22%3A%22dd%3Dt.tid%3A69c4089300000000%3Bt.dm%3A-3%3Bs%3A2%3Bp%3A6473ee6659349de4%22%7D failed, reason: socket hang up

vercel · 2026-03-25T16:37:30Z

Deployment failed with the following error:

Failed to create deployment for team_um5iJbF9BOPOQPqiPlXKmPeN in project prj_FSIKo7PgY96y3XYJpsnqD4Geyajw: FetchError: request to https://76.76.21.112/v13/now/deployments?ownerId=team_um5iJbF9BOPOQPqiPlXKmPeN&projectId=prj_FSIKo7PgY96y3XYJpsnqD4Geyajw&skipAutoDetectionConfirmation=1&teamId=team_um5iJbF9BOPOQPqiPlXKmPeN&traceCarrier=%7B%22ot-baggage-webhookAt%22%3A%221774454931648%22%2C%22ot-baggage-senderUsername%22%3A%22gh.mike-inkeep%22%2C%22x-datadog-trace-id%22%3A%229058888615371130173%22%2C%22x-datadog-parent-id%22%3A%22975793348311734464%22%2C%22x-datadog-sampling-priority%22%3A%222%22%2C%22x-datadog-tags%22%3A%22_dd.p.tid%3D69c4089300000000%2C_dd.p.dm%3D-3%22%2C%22traceparent%22%3A%2200-69c40893000000007db7a3331b06c53d-0d8ab6e15c4f58c0-01%22%2C%22tracestate%22%3A%22dd%3Dt.tid%3A69c4089300000000%3Bt.dm%3A-3%3Bs%3A2%3Bp%3A0d8ab6e15c4f58c0%22%7D failed, reason: socket hang up

vercel · 2026-03-25T16:38:15Z

Deployment failed with the following error:

Creating the Deployment Timed Out.

itoqa · 2026-03-25T18:48:35Z

Ito Test Report ❌

12 test cases ran. 1 failed, 11 passed.

Overall, 11 of 12 test cases passed, confirming stable baseline playground and API behavior across friendly-agent and activities-planner flows, including successful 200 chat responses, correct SSE streaming with terminal [DONE], proper 400 validation on malformed content-part input, controlled handling of ~40k-character prompts without UI freeze, solid mobile usability (390x844), and working tool/timeline rendering (with observed instability attributed to environment/login harness issues rather than product logic). Security/adversarial coverage was strong—unauthorized access denied, conversation-ID probing blocked, XSS payloads rendered inert, concurrent same-conversation writes preserved, and malformed custom-header JSON blocked client-side—but one medium issue remains where mismatched spoofed tenant/project/agent headers can trigger a 500 “Context validation failed” instead of a deterministic 4xx due to pre-existing context-validation error handling.

❌ Failed (1)

Category	Summary	Screenshot
Adversarial	🟠 Mismatched agent header can return `500 Context validation failed` instead of a scoped auth/validation denial.

🟠 Header spoofing with mismatched tenant/project/agent

What failed: The invalid-agent spoof path returns a 500 internal error (Context validation failed) rather than a controlled client/authz-style denial (4xx).
Impact: Invalid scope/header requests can trigger avoidable 500s, which obscures the real client error and degrades API reliability signals. Security-denial paths should fail deterministically with 4xx semantics.
Introduced by this PR: No – pre-existing bug (code not changed in this PR)
Steps to reproduce:
1. Capture a valid bearer token from an authenticated local playground session.
2. Send POST /run/api/chat with mismatched spoofed headers, especially an invalid x-inkeep-agent-id.
3. Observe that one spoof path responds with 500 Context validation failed instead of a controlled 4xx denial.
Code analysis: I reviewed the run context validation flow and found two compounding issues: (1) validateHeaders dereferences agent.contextConfig without guarding missing agents, and (2) middleware throws a bad_request for invalid validation but then catches and rewrites all thrown errors to internal_server_error.
Why this is likely a bug: The production middleware currently transforms expected validation failures into 500 responses, so the observed behavior follows directly from code paths rather than test setup.

Relevant code:

agents-api/src/domains/run/context/validation.ts (lines 282-287)

const { tenantId, projectId, agentId, project } = executionContext;
logger.info({ tenantId, projectId, agentId }, 'Validating headers');
const agent = project.agents[agentId];

const contextConfig = agent.contextConfig;
logger.info({ contextConfig }, 'Context config found');

agents-api/src/domains/run/context/validation.ts (lines 417-421)

const errorMessage = `Invalid headers: ${validationResult.errors.map((e) => `${e.field}: ${e.message}`).join(', ')}`;
throw createApiError({
  code: 'bad_request',
  message: errorMessage,
});

agents-api/src/domains/run/context/validation.ts (lines 436-446)

} catch (error) {
  logger.error(
    {
      error: error instanceof Error ? error.message : 'Unknown error',
    },
    'Context validation middleware error'
  );
  throw createApiError({
    code: 'internal_server_error',
    message: 'Context validation failed',
  });
}

✅ Passed (11)

Category	Summary	Screenshot
Adversarial	Unauthenticated POST to `/run/api/chat` was denied with 401 and no assistant payload.
Adversarial	Unknown conversation probe was denied (401) with no foreign conversation data.
Adversarial	Script payload rendered as inert text and no `window.__xss_probe` execution signal appeared.
Adversarial	Near-simultaneous writes to one conversation persisted correctly with no corruption or dropped turns.
Adversarial	Invalid custom-header JSON is rejected client-side and prevents chat send/request dispatch.
Edge	Mobile viewport 390x844 remained usable, with input/send visible and readable assistant output without horizontal overflow.
Edge	Submitted a ~40,000-character prompt in the friendly-agent playground. The app handled it in a controlled manner by returning a visible error response and remained interactive (no freeze/silent stall).
Edge	Malformed payload with unsupported content-part type returned HTTP 400 validation failure with no stream output.
Happy-path	Baseline playground chat request succeeded (200); response content matched the configured mock-model behavior for this environment.
Happy-path	Authorized streaming request returned HTTP 200 SSE with assistant content deltas and terminal [DONE].
Happy-path	Tool-heavy activities-planner behavior is implemented; observed instability was environment-related, not a product defect.

Commit: b38c540

View Full Run

Tell us how we did: Give Ito Feedback

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: Changes since last review (b38c54045a1e). The delta includes 2 commits — both merge commits bringing main into the PR branch.

Delta Review Summary

Delta contains no substantive PR-relevant changes. The commit history since the last automated review shows:

9ad1a65de — Merge remote-tracking branch 'origin/main' into stack/history_shape
1199d452f — feat(agents-api): Add PDF URL attachments and Vercel-compatible file parts in chat (#2804)

PR Implementation Files:
The only change to PR files is a trivial export keyword addition to ExecutionHandlerParams in executionHandler.ts (line 41). This is a non-behavioral change that simply exposes the interface for external consumption — likely needed by the merged PDF attachment feature.

All other PR files unchanged:

conversations.ts, conversation-history.ts — no changes
runtime/conversations.ts, utility.ts — no changes
All test files — no changes

Prior Review Status:

✅ All blocking issues resolved in prior commits (empty wrapper injection, args duplication, createdAt regression)
✅ Human approval received from tim-inkeep (2026-03-24)
✅ Ito tests: 15/15 passed (2026-03-24), 11/12 passed (2026-03-25, 1 failure is pre-existing)
✅ Prior automated reviews approved

🕐 Pending Recommendations (2)

Lower-priority items from prior reviews that remain applicable but non-blocking:

💭 conversations.test.ts Missing test for content.text fallback when parts exist but yield empty
💭 conversations.test.ts Missing test for mixed kind/type parts in same message

✅ APPROVE

Summary: No substantive changes since the last review — the delta is purely merge commits from main. The only PR file change is exposing ExecutionHandlerParams as a public export (1 word change). All prior blocking issues were resolved, tests pass, and human approval was received. Ready to ship! 🚀

Reviewers (1)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
`orchestrator`	0	0	0	0	0	2	0
Total	0	0	0	0	0	2	0

Note: Delta was merge-only with no substantive PR-relevant changes; no sub-reviewers dispatched.

itoqa · 2026-03-25T21:09:13Z

Ito Test Report ✅

15 test cases ran. 15 passed.

The unified run passed completely with 15/15 executed test cases successful (0 failures), and additional verification-only passes found no code-supported production defects. Key findings were stable chat streaming and conversation retrieval under correct end-user auth/PoW with enforced ownership boundaries, resilient history/serialization behavior (including normalized parts, duplicate messages, legacy part types, artifact substitution/fallback, and safe truncation on long threads), and robust edge/adversarial handling across mid-stream refresh recovery, mobile and back/forward continuity, injection/malformed inputs, and burst concurrent submissions.

✅ Passed (15)

Category	Summary	Screenshot
Adversarial	Injection-like content and follow-up both succeeded without framing corruption or server errors.
Adversarial	Malformed payload returned a clean validation error, and follow-up requests remained healthy.
Adversarial	Concurrent submissions completed successfully and conversation retrieval remained coherent under corrected auth flow.
Adversarial	Cross-session conversation access was denied and conversation listing stayed isolated per session.
Edge	Two identical consecutive user messages completed successfully and were preserved in order on conversation retrieval.
Edge	Mid-stream interruption recovered by resuming on the same conversationId, and conversation retrieval remained valid.
Edge	Mobile viewport (390x844) remained usable, with no horizontal overflow and successful rendering of a new turn.
Edge	After rapid back/forward churn, follow-up messaging stayed on the same conversation and chronology remained intact.
Logic	Follow-up on a seeded legacy conversation completed successfully with stable streaming and no server error.
Logic	Multi-artifact substitution behavior is stable and retains all artifact references for a shared toolCallId.
Logic	No-artifact fixture retrieval preserved original tool-result content without placeholder injection.
Logic	Oversized tool arguments and summaries are truncated safely with deterministic markers.
Logic	Long-conversation retrieval and serializer handling remained stable during truncation-summary flow.
Happy-path	Re-execution with correct end-user app auth + PoW succeeded for stream and conversation retrieval.
Happy-path	End-user-authenticated conversation retrieval returned 200 with assistant content, normalized parts, and pagination metadata.

Commit: 9ad1a65

View Full Run

Tell us how we did: Give Ito Feedback

Return MessageSelect[] from history APIs and format at generation cal…

8eea5bf

…l sites

vercel Bot deployed to Preview – agents-api March 18, 2026 14:50 View deployment

vercel Bot deployed to Preview – agents-docs March 18, 2026 14:50 View deployment

pullfrog Bot reviewed Mar 18, 2026

View reviewed changes

vercel Bot deployed to Preview – agents-manage-ui March 18, 2026 14:51 View deployment

claude Bot reviewed Mar 18, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Mar 18, 2026

This was referenced Mar 18, 2026

fix(agents-api): sanitize artifact binary data before persistence #2680

Closed

feat(images): enable multi-turn conversations about image context parts #2604

Closed

Merge remote-tracking branch 'origin/main' into stack/history_shape

a96c949

vercel Bot deployed to Preview – agents-api March 23, 2026 23:11 View deployment

vercel Bot deployed to Preview – agents-manage-ui March 23, 2026 23:11 View deployment

vercel Bot deployed to Preview – agents-docs March 23, 2026 23:11 View deployment

claude Bot reviewed Mar 23, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Mar 23, 2026

mike-inkeep added 2 commits March 24, 2026 11:33

Further review feedback and tightening of types

4e1416a

Merge remote-tracking branch 'origin/main' into stack/history_shape

37a910e

vercel Bot deployed to Preview – agents-manage-ui March 24, 2026 15:39 View deployment

vercel Bot deployed to Preview – agents-docs March 24, 2026 15:40 View deployment

vercel Bot deployed to Preview – agents-api March 24, 2026 15:40 View deployment

claude Bot reviewed Mar 24, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Mar 24, 2026

mike-inkeep added 2 commits March 24, 2026 13:14

More cleanup

513ed69

Merge remote-tracking branch 'origin/main' into stack/history_shape

393705c

vercel Bot deployed to Preview – agents-manage-ui March 24, 2026 17:27 View deployment

vercel Bot deployed to Preview – agents-docs March 24, 2026 17:27 View deployment

claude Bot reviewed Mar 24, 2026

View reviewed changes

mike-inkeep changed the title ~~Conversation history returns MessageSelect[] and formats at call site~~ fix(run): Conversation history as structured messages + artifact / part-shape fixes Mar 24, 2026

github-actions Bot deleted a comment from claude Bot Mar 24, 2026

mike-inkeep changed the title ~~fix(run): Conversation history as structured messages + artifact / part-shape fixes~~ fix(run): return structured conversation history and preserve multi-artifact tool results Mar 24, 2026

tim-inkeep approved these changes Mar 24, 2026

View reviewed changes

Merge branch 'main' into stack/history_shape

b38c540

mike-inkeep enabled auto-merge March 25, 2026 16:09

claude Bot reviewed Mar 25, 2026

View reviewed changes

github-actions Bot deleted a comment from claude Bot Mar 25, 2026

mike-inkeep disabled auto-merge March 25, 2026 19:30

Merge remote-tracking branch 'origin/main' into stack/history_shape

9ad1a65

vercel Bot deployed to Preview – agents-manage-ui March 25, 2026 19:51 View deployment

claude Bot reviewed Mar 25, 2026

View reviewed changes

vercel Bot deployed to Preview – agents-docs March 25, 2026 19:52 View deployment

mike-inkeep enabled auto-merge March 25, 2026 19:53

github-actions Bot deleted a comment from claude Bot Mar 25, 2026

vercel Bot deployed to Preview – agents-api March 25, 2026 19:54 View deployment

mike-inkeep added this pull request to the merge queue Mar 25, 2026

Merged via the queue into main with commit d2b5368 Mar 25, 2026
18 checks passed

mike-inkeep deleted the stack/history_shape branch March 25, 2026 20:07

claude Bot mentioned this pull request Mar 25, 2026

Bugfix/compressor bug #2833

Merged

Conversation

mike-inkeep commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key decisions

Manual QA

Test plan

Uh oh!

changeset-bot Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

vercel Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pullfrog Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟡 Minor (1) 🟡

💭 Consider (2) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

claude Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

mike-inkeep Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

tim-inkeep Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key changes

Return structured messages instead of pre-formatted strings

Multi-artifact grouping per toolCallId

kind/type normalization in message part detection

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Review Summary

🕐 Pending Recommendations (5)

✅ APPROVE

Uh oh!

itoqa Bot commented Mar 24, 2026

Ito Test Report ❌

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Review Summary

🟠⚠️ Major (1) 🟠⚠️

💭 Consider (2) 💭

🕐 Pending Recommendations (4)

mike-inkeep commented Mar 18, 2026 •

edited

Loading

changeset-bot Bot commented Mar 18, 2026 •

edited

Loading

vercel Bot commented Mar 18, 2026 •

edited

Loading

pullfrog Bot left a comment •

edited

Loading

pullfrog Bot commented Mar 23, 2026 •

edited

Loading

Multi-artifact grouping per `toolCallId`

`kind`/`type` normalization in message part detection