Commit 1a835a8

base2-gpt-5!
1 parent 0ea9d07 commit 1a835a8

6 files changed, with 188 additions and 17 deletions

.agents/base2/base2-gpt-5.ts

Lines changed: 1 addition & 2 deletions
@@ -1,12 +1,11 @@
 import { createBase2 } from './base2'
 import type { SecretAgentDefinition } from '../types/secret-agent-definition'
 
-const base2 = createBase2('fast')
+const base2 = createBase2('fast', { isGpt5: true })
 
 const definition: SecretAgentDefinition = {
   ...base2,
   id: 'base2-gpt-5',
-  model: 'openai/gpt-5',
 }
 
 export default definition

.agents/base2/base2.ts

Lines changed: 29 additions & 11 deletions
@@ -10,15 +10,16 @@ export const createBase2: (
   mode: 'fast' | 'max',
   options?: {
     hasNoValidation?: boolean
+    isGpt5?: boolean
   },
 ) => Omit<SecretAgentDefinition, 'id'> = (mode, options) => {
-  const { hasNoValidation = false } = options ?? {}
+  const { hasNoValidation = false, isGpt5 = false } = options ?? {}
   const isFast = mode === 'fast'
   const isMax = mode === 'max'
 
   return {
     publisher,
-    model: 'anthropic/claude-sonnet-4.5',
+    model: isGpt5 ? 'openai/gpt-5' : 'anthropic/claude-sonnet-4.5',
     displayName: 'Buffy the Orchestrator',
     spawnerPrompt:
       'Advanced base agent that orchestrates planning, editing, and reviewing for complex coding tasks',
@@ -48,15 +49,18 @@ export const createBase2: (
       'write_file',
     ),
     spawnableAgents: buildArray(
-      'file-researcher',
+      !isGpt5 && 'file-researcher',
       'file-picker-max',
       'code-searcher',
       'directory-lister',
       'glob-matcher',
       'researcher-web',
       'researcher-docs',
       'commander',
-      isFast && 'best-of-n-orchestrator-fast',
+      isFast &&
+        (isGpt5
+          ? 'best-of-n-orchestrator-gpt-5'
+          : 'best-of-n-orchestrator-fast'),
       isMax && 'base2-gpt-5-worker',
       'context-pruner',
     ),
@@ -74,13 +78,14 @@ Continue to spawn layers of agents until have completed the user's request or re
 ## Spawning agents guidelines
 
 - **Sequence agents properly:** Keep in mind dependencies when spawning different agents. Don't spawn agents in parallel that depend on each other. Be conservative sequencing agents so they can build on each other's insights:
-  - Spawn file pickers, code-searcher, directory-lister, glob-matcher, commanders, and researchers before making edits.
+  - Spawn ${isGpt5 ? 'file pickers, code-searcher, directory-lister, glob-matcher, commanders, and researchers' : 'the file researcher and optionally the web researcher and docs researcher'} before making edits.${isGpt5 ? '' : ' After that, spawn further agents to gather context as needed (e.g. the code-searcher, directory-lister, glob-matcher, commanders, and researchers).'}
   ${buildArray(
     isFast &&
-      '- Spawn a best-of-n-orchestrator-fast agent to implement the changes after you have gathered all the context you need (and not before!).',
+      `- Spawn a ${isGpt5 ? 'best-of-n-orchestrator-gpt-5' : 'best-of-n-orchestrator-fast'} agent to implement the changes after you have gathered all the context you need (and not before!).`,
     isMax &&
       '- Spawn a base2-gpt-5-worker agent inline after you have gathered all the context you need (and not before!).',
   ).join('\n ')}
+- **Spawn with the correct prompt and/or params:** Each agent has a schema for the input it expects. The prompt is an optional string, and the params is a json object. Note that some agents don't take any input prompt or params.
 - **No need to include context:** When prompting an agent, realize that many agents can already see the entire conversation history, so you can be brief in prompting them without needing to include context.
 
 # Core Mandates
@@ -143,20 +148,33 @@ ${PLACEHOLDER.GIT_CHANGES_PROMPT}
 The user asks you to implement a new feature. You respond in multiple steps:
 
 ${buildArray(
-  '- First, you must spawn a file-researcher to find relevant files; consider also spawning a web and/or docs researcher to find relevant information online. (Note: For the first layer, only spawn researchers, not other agents. Do not spawn a code-searcher yet!)',
-  '- Read **ALL** the files that the file-researcher found using the read_files tool. It is important that you read every single file that the file-researcher found. This is the only time you should use read_files on a long list of files -- it is expensive to do this more than once!',
-  `- Consider spawning other agents or reading more files as needed to gather comprehensive context to answer the user's request.`,
+  !isGpt5 &&
+    '- First, for a new task, you must spawn a file-researcher to find relevant files; consider also spawning a web and/or docs researcher to find relevant information online. (Note: For the first layer, only spawn researchers, not other agents. Do not spawn a code-searcher yet!)',
+  !isGpt5 &&
+    '- Read **ALL** the files that the file-researcher found using the read_files tool. It is important that you read every single file that the file-researcher found. This is the only time you should use read_files on a long list of files -- it is expensive to do this more than once!',
+  `- Consider spawning other agents or reading more files as needed to gather comprehensive context to answer the user's request. When in doubt, read more files!`,
+  isGpt5 &&
+    `- Spawn file pickers, code-searcher, directory-lister, glob-matcher, commanders, and researchers to gather context as needed. Read all the relevant files using the read_files tool. Read as many files as possible so that you have a comprehensive context on the user's request.`,
   isFast &&
     `- Use the write_todos tool to write out your step-by-step implementation plan.${hasNoValidation ? '' : ' You should include at least one step to validate/test your changes: be specific about whether to typecheck, run tests, run lints, etc.'}`,
   isFast &&
-    `- You must spawn the best-of-n-orchestrator-fast agent to implement the code changes, since it will generate the best code changes from multiple implementation proposals, which the user wants you to do.`,
+    `- You must spawn the ${isGpt5 ? 'best-of-n-orchestrator-gpt-5' : 'best-of-n-orchestrator-fast'} agent to implement the code changes, since it will generate the best code changes from multiple implementation proposals, which the user wants you to do.`,
   isMax &&
     `- IMPORTANT: You must spawn a base2-gpt-5-worker agent inline (with spawn_agent_inline tool) to do the planning and editing.`,
   !hasNoValidation &&
     `- Test your changes${isFast ? ' briefly' : ''} by running appropriate validation commands for the project (e.g. typechecks, tests, lints, etc.). You may have to explore the project to find the appropriate commands. Don't skip this step!`,
   `- Inform the user that you have completed the task in one sentence or a few short bullet points. Don't create any markdown summary files or example documentation files, unless asked by the user. If you already finished the user request and said you're done, then don't say anything else.`,
 ).join('\n')}`,
-    stepPrompt: `${isMax ? "Keep working until the user's request is completely satisfied. " : ''}${isFast ? "You must spawn the best-of-n-orchestrator-fast agent to implement any code changes. Don't forget to do this! " : ''}After completing the user request, summarize your changes in a sentence or a few short bullet points. Do not create any summary markdown files or example documentation files, unless asked by the user. If you already summarized your changes, then end turn and don't say anything else.`,
+    stepPrompt: buildArray(
+      (isMax || isGpt5) &&
+        `Keep working until the user's request is completely satisfied${!hasNoValidation ? ' and validated' : ''}. `,
+      isFast &&
+        `You must spawn the ${isGpt5 ? 'best-of-n-orchestrator-gpt-5' : 'best-of-n-orchestrator-fast'} agent to implement any code changes. Don't forget to do this! `,
+      `After completing the user request, summarize your changes in a sentence or a few short bullet points. Do not create any summary markdown files or example documentation files, unless asked by the user. If you already summarized your changes, then end turn and don't say anything else.`,
+      isGpt5 &&
+        `IMPORTANT: every response *must* include at least one tool call (using "<codebuff_tool_call>" tags), unless you are done with the task. If you don't include at least one tool call, your response will be cut off and the task will be ended prematurely, which is very bad for fulfilling the user's request. When completely done, you can respond without a tool call.`,
+    ).join('\n'),
+
     handleSteps: function* ({ params }) {
       let steps = 0
       while (true) {
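
Note on the conditional entries in base2.ts: expressions such as `!isGpt5 && 'file-researcher'` only work if `buildArray` discards falsy values. `buildArray` itself is not part of this commit, so the version below is a hypothetical stand-in written under that assumption, and `spawnableFor` is an illustrative helper rather than a function from the repository. It is a sketch of how the `spawnableAgents` list would resolve for each mode, not the actual implementation.

```typescript
// Hypothetical stand-in for buildArray (its real definition is not shown in this commit).
// Assumption: it keeps real entries and drops false, null, and undefined.
function buildArray<T>(...items: (T | false | null | undefined)[]): T[] {
  return items.filter(
    (item): item is T => item !== false && item !== null && item !== undefined,
  )
}

// Illustrative helper that mirrors the spawnableAgents expression in the diff.
function spawnableFor(mode: 'fast' | 'max', isGpt5: boolean): string[] {
  const isFast = mode === 'fast'
  const isMax = mode === 'max'
  return buildArray<string>(
    !isGpt5 && 'file-researcher',
    'file-picker-max',
    'code-searcher',
    'directory-lister',
    'glob-matcher',
    'researcher-web',
    'researcher-docs',
    'commander',
    isFast &&
      (isGpt5
        ? 'best-of-n-orchestrator-gpt-5'
        : 'best-of-n-orchestrator-fast'),
    isMax && 'base2-gpt-5-worker',
    'context-pruner',
  )
}

// Print the list the GPT-5 fast variant would be allowed to spawn.
console.log(spawnableFor('fast', true))
```

Under this assumption the GPT-5 fast variant loses `file-researcher` and gains `best-of-n-orchestrator-gpt-5`, while the default fast variant keeps `file-researcher` as its first entry.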

.agents/base2/best-of-n/best-of-n-implementor.ts

Lines changed: 6 additions & 2 deletions
@@ -60,7 +60,10 @@ OR for new files or major rewrites:
   "content": "Complete file content or edit snippet"
 }
 </codebuff_tool_call>
-
+${
+  isGpt5
+    ? ``
+    : `
 You can also use <think> tags interspersed between tool calls to think about the best way to implement the changes. Keep these thoughts very brief. You may not need to use think tags at all.
 
 <example>
@@ -85,7 +88,8 @@ You can also use <think> tags interspersed between tool calls to think about the
 [ Third tool call to implement the feature ]
 </codebuff_tool_call>
 
-</example>
+</example>`
+}
 
 Your implementation should:
 - Be complete and comprehensive
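
Note on the implementor change: the `<think>` guidance is wrapped in a conditional template expression so that the GPT-5 prompt omits it. A minimal sketch of that pattern follows; `buildInstructions` and the shortened prompt text are placeholders invented for illustration and are not taken from the repository.

```typescript
// Hypothetical function showing an optional prompt section inside a template literal.
function buildInstructions(isGpt5: boolean): string {
  // For GPT-5 the section collapses to an empty string; otherwise it is included.
  const thinkSection = isGpt5
    ? ''
    : `
You can also use <think> tags between tool calls. Keep these thoughts very brief.
`
  return `Write out your edits as tool calls.
${thinkSection}
Your implementation should:
- Be complete and comprehensive`
}

// Compare the two variants side by side.
console.log(buildInstructions(true))
console.log(buildInstructions(false))
```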

.agents/base2/best-of-n/best-of-n-orchestrator-fast.ts

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ const definition: SecretAgentDefinition = {
   model: 'anthropic/claude-sonnet-4.5',
   displayName: 'Best-of-N Fast Implementation Orchestrator',
   spawnerPrompt:
-    'Orchestrates multiple implementor agents to generate implementation proposals and selects the best one',
+    'Orchestrates multiple implementor agents to generate implementation proposals, selects the best one, and applies the changes.',
 
   includeMessageHistory: true,
   inheritParentSystemPrompt: true,
Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
+import type { SecretAgentDefinition } from '../../types/secret-agent-definition'
+import { publisher } from '../../constants'
+import { StepText, ToolCall } from 'types/agent-definition'
+
+const definition: SecretAgentDefinition = {
+  id: 'best-of-n-orchestrator-gpt-5',
+  publisher,
+  model: 'openai/gpt-5',
+  displayName: 'Best-of-N GPT-5 Implementation Orchestrator',
+  spawnerPrompt:
+    'Orchestrates multiple implementor agents to generate implementation proposals, selects the best one, and applies the changes (no need to make the edits yourself).',
+
+  includeMessageHistory: true,
+  inheritParentSystemPrompt: true,
+
+  toolNames: [
+    'spawn_agents',
+    'str_replace',
+    'write_file',
+    'set_messages',
+    'set_output',
+  ],
+  spawnableAgents: ['best-of-n-implementor-gpt-5', 'best-of-n-selector-gpt-5'],
+
+  inputSchema: {},
+  outputMode: 'structured_output',
+
+  handleSteps: function* ({ agentState }) {
+    // Remove userInstruction message for this agent.
+    const messages = agentState.messageHistory.concat()
+    messages.pop()
+    yield {
+      toolName: 'set_messages',
+      input: {
+        messages,
+      },
+      includeToolCall: false,
+    } satisfies ToolCall<'set_messages'>
+
+    // Spawn 1 of each model for easy prompt caching
+    const { toolResult: implementorsResult1 } = yield {
+      toolName: 'spawn_agents',
+      input: {
+        agents: [
+          { agent_type: 'best-of-n-implementor-gpt-5' },
+          { agent_type: 'best-of-n-implementor-gpt-5' },
+          { agent_type: 'best-of-n-implementor-gpt-5' },
+          { agent_type: 'best-of-n-implementor-gpt-5' },
+          { agent_type: 'best-of-n-implementor-gpt-5' },
+        ],
+      },
+      includeToolCall: false,
+    }
+
+    const implementorsResult = extractSpawnResults<string>(implementorsResult1)
+
+    // Extract all the plans from the structured outputs
+    const letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
+    // Parse implementations from tool results
+    const implementations = implementorsResult.map((content, index) => ({
+      id: letters[index],
+      content,
+    }))
+
+    // Spawn selector with implementations as params
+    const { toolResult: selectorResult } = yield {
+      toolName: 'spawn_agents',
+      input: {
+        agents: [
+          {
+            agent_type: 'best-of-n-selector-gpt-5',
+            params: { implementations },
+          },
+        ],
+      },
+      includeToolCall: false,
+    } satisfies ToolCall<'spawn_agents'>
+
+    const selectorOutput = extractSpawnResults<{
+      implementationId: string
+      reasoning: string
+    }>(selectorResult)[0]
+
+    if ('errorMessage' in selectorOutput) {
+      yield {
+        toolName: 'set_output',
+        input: { error: selectorOutput.errorMessage },
+      } satisfies ToolCall<'set_output'>
+      return
+    }
+    const { implementationId } = selectorOutput
+    const chosenImplementation = implementations.find(
+      (implementation) => implementation.id === implementationId,
+    )
+    if (!chosenImplementation) {
+      yield {
+        toolName: 'set_output',
+        input: { error: 'Failed to find chosen implementation.' },
+      } satisfies ToolCall<'set_output'>
+      return
+    }
+
+    // Apply the chosen implementation using STEP_TEXT
+    const { agentState: postEditsAgentState } = yield {
+      type: 'STEP_TEXT',
+      text: chosenImplementation.content,
+    } as StepText
+    const { messageHistory } = postEditsAgentState
+    const lastAssistantMessageIndex = messageHistory.findLastIndex(
+      (message) => message.role === 'assistant',
+    )
+    const editToolResults = messageHistory
+      .slice(lastAssistantMessageIndex)
+      .filter((message) => message.role === 'tool')
+      .flatMap((message) => message.content.output)
+      .filter((output) => output.type === 'json')
+      .map((output) => output.value)
+
+    // Set output with the chosen implementation and reasoning
+    yield {
+      toolName: 'set_output',
+      input: {
+        response: chosenImplementation.content,
+        toolResults: editToolResults,
+      },
+    } satisfies ToolCall<'set_output'>
+
+    function extractSpawnResults<T>(
+      results: any[] | undefined,
+    ): (T | { errorMessage: string })[] {
+      if (!results) return []
+      const spawnedResults = results
+        .filter((result) => result.type === 'json')
+        .map((result) => result.value)
+        .flat() as {
+        agentType: string
+        value: { value?: T; errorMessage?: string }
+      }[]
+      return spawnedResults.map(
+        (result) =>
+          result.value.value ?? {
+            errorMessage:
+              result.value.errorMessage ?? 'Error extracting spawn results',
+          },
+      )
+    }
+  },
+}
+
+export default definition
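
Note on the result handling in the new orchestrator: the unwrapping and selection steps in `handleSteps` can be exercised outside the agent runtime. The sketch below is adapted from `extractSpawnResults` (rewritten as a loop, with the same fallback message) and runs it on a made-up `spawn_agents` result. The payload shape is inferred only from the type cast in the diff, and `SpawnedEntry`, `SpawnOutput`, `pickImplementation`, and the sample data are assumptions for illustration, not a documented format.

```typescript
// Assumed shapes, inferred from the cast inside extractSpawnResults in the diff.
type SpawnedEntry<T> = {
  agentType: string
  value: { value?: T; errorMessage?: string }
}
type SpawnOutput<T> = T | { errorMessage: string }

// Adapted from the diff: keep 'json' parts, flatten them, and unwrap each entry.
function extractSpawnResults<T>(
  results: { type: string; value: unknown }[] | undefined,
): SpawnOutput<T>[] {
  if (!results) return []
  const outputs: SpawnOutput<T>[] = []
  for (const result of results) {
    if (result.type !== 'json') continue
    const entries = (
      Array.isArray(result.value) ? result.value : [result.value]
    ) as SpawnedEntry<T>[]
    for (const entry of entries) {
      outputs.push(
        entry.value.value ?? {
          errorMessage:
            entry.value.errorMessage ?? 'Error extracting spawn results',
        },
      )
    }
  }
  return outputs
}

// Made-up tool result: two implementors succeeded and one failed.
const implementorsToolResult = [
  {
    type: 'json',
    value: [
      { agentType: 'best-of-n-implementor-gpt-5', value: { value: 'edit proposal one' } },
      { agentType: 'best-of-n-implementor-gpt-5', value: { errorMessage: 'timed out' } },
      { agentType: 'best-of-n-implementor-gpt-5', value: { value: 'edit proposal three' } },
    ],
  },
]

// Label each result with a letter, as the orchestrator does.
const letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
const implementations = extractSpawnResults<string>(implementorsToolResult).map(
  (content, index) => ({ id: letters[index], content }),
)

// Hypothetical helper: look up the implementation the selector chose.
function pickImplementation(implementationId: string) {
  return implementations.find(
    (implementation) => implementation.id === implementationId,
  )
}

console.log(pickImplementation('C'))
```

In this sketch a failed implementor still occupies a letter, and its `content` is an error object rather than a string. Whether such entries should be filtered out before they reach the selector is a design question the commit does not address.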

.agents/base2/best-of-n/best-of-n-orchestrator.ts

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ const definition: SecretAgentDefinition = {
   model: 'anthropic/claude-sonnet-4.5',
   displayName: 'Best-of-N Implementation Orchestrator',
   spawnerPrompt:
-    'Orchestrates multiple implementor agents to generate implementation proposals and selects the best one',
+    'Orchestrates multiple implementor agents to generate implementation proposals, selects the best one, and applies the changes.',
 
   includeMessageHistory: true,
   inheritParentSystemPrompt: true,
