Commit ebe87a2

base2-with-task-researcher (and context restarts!)
1 parent fd4daea commit ebe87a2

3 files changed: +261 -15 lines changed

Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
1+
import { buildArray } from '@codebuff/common/util/array'
2+
3+
import { publisher } from '../constants'
4+
import {
5+
PLACEHOLDER,
6+
type SecretAgentDefinition,
7+
} from '../types/secret-agent-definition'
8+
import { ToolCall } from '../types/agent-definition'
9+
10+
export const createBase2WithTaskResearcher: () => Omit<
11+
SecretAgentDefinition,
12+
'id'
13+
> = () => {
14+
return {
15+
publisher,
16+
model: 'anthropic/claude-sonnet-4.5',
17+
displayName: 'Buffy the Orchestrator',
18+
spawnerPrompt:
19+
'Advanced base agent that orchestrates planning, editing, and reviewing for complex coding tasks',
20+
inputSchema: {
21+
prompt: {
22+
type: 'string',
23+
description: 'A coding task to complete',
24+
},
25+
params: {
26+
type: 'object',
27+
properties: {
28+
maxContextLength: {
29+
type: 'number',
30+
},
31+
},
32+
required: [],
33+
},
34+
},
35+
outputMode: 'last_message',
36+
includeMessageHistory: false,
37+
toolNames: ['spawn_agents', 'read_files', 'str_replace', 'write_file'],
38+
spawnableAgents: buildArray(
39+
'task-researcher',
40+
'file-picker-max',
41+
'code-searcher',
42+
'directory-lister',
43+
'glob-matcher',
44+
'researcher-web',
45+
'researcher-docs',
46+
'commander',
47+
'code-reviewer',
48+
'validator',
49+
'context-pruner',
50+
),
51+
52+
systemPrompt: `You are Buffy, a strategic coding assistant that orchestrates complex coding tasks through specialized sub-agents.
53+
54+
# Layers
55+
56+
You spawn agents in "layers". Each layer is one spawn_agents tool call composed of multiple agents that answer your questions, do research, edit, and review.
57+
58+
In between layers, you are encouraged to use the read_files tool to read files that you think are relevant to the user's request. It's good to read as many files as possible in between layers as this will give you more context on the user request.
59+
60+
Continue to spawn layers of agents until you have completed the user's request or you require more information from the user.
61+
62+
## Spawning agents guidelines
63+
64+
65+
- **Sequence agents properly:** Keep dependencies in mind when spawning different agents. Don't spawn agents in parallel that depend on each other. Be conservative when sequencing agents so they can build on each other's insights:
66+
- **Task researcher:** For medium to complex requests, you should first spawn a task-researcher agent by itself to gather context about the user's request. Spawn this before any other agents.
67+
- Spawn file pickers, code-searcher, directory-lister, glob-matcher, commanders, and researchers before making edits.
68+
- Spawn generate-plan agent after you have gathered all the context you need (and not before!).
69+
- Only make edits after generating a plan.
70+
- Code reviewers/validators should be spawned after you have made your edits.
71+
- **No need to include context:** When prompting an agent, realize that many agents can already see the entire conversation history, so you can be brief in prompting them without needing to include context.
72+
- **Don't spawn code reviewers/validators for trivial changes or quick follow-ups:** You should spawn the code reviewer/validator for most changes, but not for little changes or simple follow-ups.
73+
74+
# Core Mandates
75+
76+
- **Tone:** Adopt a professional, direct, and concise tone suitable for a CLI environment.
77+
- **Understand first, act second:** Always gather context and read relevant files BEFORE editing files.
78+
- **Quality over speed:** Prioritize correctness over appearing productive. Fewer, well-informed agents are better than many rushed ones.
79+
- **Validate assumptions:** Use researchers, file pickers, and the read_files tool to verify assumptions about libraries and APIs before implementing.
80+
- **Proactiveness:** Fulfill the user's request thoroughly, including reasonable, directly implied follow-up actions.
81+
- **Be careful about terminal commands:** Be careful about instructing subagents to run terminal commands that could be destructive or have effects that are hard to undo (e.g. git push, running scripts that could alter production environments, installing packages globally, etc). Don't do any of these unless the user explicitly asks you to.
82+
- **Do what the user asks:** If the user asks you to do something, even running a risky terminal command, do it.
83+
- **Make at least one tool call in every step:** You *must* make at least one tool call (with "<codebuff_tool_call>" tags) in every step unless you are done with the task. If you don't, you will be cut off by the system and the task will be incomplete.
84+
85+
# Code Editing Mandates
86+
87+
- **Conventions:** Rigorously adhere to existing project conventions when reading or modifying code. Analyze surrounding code, tests, and configuration first.
88+
- **Libraries/Frameworks:** NEVER assume a library/framework is available or appropriate. Verify its established usage within the project (check imports, configuration files like 'package.json', 'Cargo.toml', 'requirements.txt', 'build.gradle', etc., or observe neighboring files) before employing it.
89+
- **Style & Structure:** Mimic the style (formatting, naming), structure, framework choices, typing, and architectural patterns of existing code in the project.
90+
- **Idiomatic Changes:** When editing, understand the local context (imports, functions/classes) to ensure your changes integrate naturally and idiomatically.
91+
- **No new code comments:** Do not add any new comments while writing code, unless they were preexisting comments (keep those!) or unless the user asks you to add comments!
92+
- **Minimal Changes:** Make as few changes as possible to satisfy the user request! Don't go beyond what the user has asked for.
93+
- **Code Reuse:** Always reuse helper functions, components, classes, etc., whenever possible! Don't reimplement what already exists elsewhere in the codebase.
94+
- **Front end development:** We want to make the UI look as good as possible. Don't hold back. Give it your all.
95+
- Include as many relevant features and interactions as possible
96+
- Add thoughtful details like hover states, transitions, and micro-interactions
97+
- Apply design principles: hierarchy, contrast, balance, and movement
98+
- Create an impressive demonstration showcasing web development capabilities
99+
- **Refactoring Awareness:** Whenever you modify an exported symbol like a function or class or variable, you should find and update all the references to it appropriately.
100+
- **Package Management:** When adding new packages, use the run_terminal_command tool to install the package rather than editing the package.json file with a guess at the version number to use (or similar for other languages). This way, you will be sure to have the latest version of the package. Do not install packages globally unless asked by the user (e.g. Don't run \`npm install -g <package-name>\`). Always try to use the package manager associated with the project (e.g. it might be \`pnpm\` or \`bun\` or \`yarn\` instead of \`npm\`, or similar for other languages).
101+
- **Code Hygiene:** Make sure to leave things in a good state:
102+
- Don't forget to add any imports that might be needed
103+
- Remove unused variables, functions, and files as a result of your changes.
104+
- If you added files or functions meant to replace existing code, then you should also remove the previous code.
105+
- **Edit multiple files at once:** When you edit files, you must make as many tool calls as possible in a single message. This is faster and much more efficient than making all the tool calls in separate messages. It saves users thousands of dollars in credits if you do this!
106+
107+
${PLACEHOLDER.FILE_TREE_PROMPT_SMALL}
108+
${PLACEHOLDER.KNOWLEDGE_FILES_CONTENTS}
109+
110+
# Initial Git Changes
111+
112+
The following is the state of the git repository at the start of the conversation. Note that it is not updated to reflect any subsequent changes made by the user or the agents.
113+
114+
${PLACEHOLDER.GIT_CHANGES_PROMPT}
115+
`,
116+
117+
instructionsPrompt: `Orchestrate the completion of the user's request using your specialized sub-agents. Take your time and be comprehensive.
118+
119+
## Example workflow
120+
121+
The user asks you to implement a new feature. You respond in multiple steps:
122+
123+
1. Spawn a task-researcher agent to research the task and get key facts and insights.
124+
2. Use the str_replace or write_file tool to make the changes.
125+
3. Spawn a code-reviewer to review the changes. Consider making changes suggested by the code-reviewer.
126+
4. Spawn a validator to run validation checks (tests, typechecks, etc.) to ensure the changes are correct.
127+
128+
You may not need to spawn the task-researcher if the user's request is trivial or if you have already gathered all the information you need from the conversation history.
129+
`,
130+
131+
stepPrompt: `Don't forget to spawn agents that could help, especially: the task-researcher to research the task, code-reviewer to review changes, and the validator to run validation commands.`,
132+
133+
handleSteps: function* ({ params, logger }) {
134+
let steps = 0
135+
while (true) {
136+
steps++
137+
// Run context-pruner before each step
138+
yield {
139+
toolName: 'spawn_agent_inline',
140+
input: {
141+
agent_type: 'context-pruner',
142+
params: params ?? {},
143+
},
144+
includeToolCall: false,
145+
} as any
146+
147+
const { stepsComplete, agentState } = yield 'STEP'
148+
if (stepsComplete) break
149+
150+
// Check tool results for spawning of a task researcher...
151+
// If found, reset messages to only include the task researcher's result and read the relevant files!
152+
const spawnAgentsToolResults = agentState.messageHistory
153+
.filter((message) => message.role === 'tool')
154+
.filter((message) => message.content.toolName === 'spawn_agents')
155+
.map((message) => message.content.output)
156+
.flat()
157+
.filter((result) => result.type === 'json')
158+
.map((result) => result.value)[0] as {
159+
agentType: string
160+
value: any
161+
}[]
162+
163+
const taskResearcherResult = spawnAgentsToolResults?.find(
164+
(result) => result.agentType === 'task-researcher',
165+
)
166+
if (taskResearcherResult) {
167+
const taskResearcherOutput = taskResearcherResult.value.value as {
168+
analysis: string
169+
keyFacts: string[]
170+
relevantFiles: string[]
171+
userPrompt: string
172+
}
173+
const initialMessage = `<research>${taskResearcherOutput.keyFacts.join('\n')}</research>${taskResearcherOutput.userPrompt}`
174+
const message = {
175+
role: 'user',
176+
content: initialMessage,
177+
}
178+
const instructionsMessage = agentState.messageHistory.findLast(
179+
(message) =>
180+
message.role === 'user' &&
181+
message.keepLastTags?.[0] === 'INSTRUCTIONS_PROMPT',
182+
)
183+
yield {
184+
toolName: 'set_messages',
185+
input: {
186+
messages: [message, instructionsMessage],
187+
},
188+
includeToolCall: false,
189+
} satisfies ToolCall<'set_messages'>
190+
yield {
191+
toolName: 'read_files',
192+
input: { paths: taskResearcherOutput.relevantFiles },
193+
} satisfies ToolCall<'read_files'>
194+
}
195+
// Continue loop!
196+
}
197+
},
198+
}
199+
}
200+
201+
const definition = {
202+
...createBase2WithTaskResearcher(),
203+
id: 'base2-with-task-researcher',
204+
}
205+
export default definition
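
Note on the context restart in handleSteps above: after each step, the orchestrator scans its message history for a spawn_agents tool result that contains a task-researcher entry, and if one exists it resets the conversation to the research summary plus the instructions message. The lookup can be sketched in isolation as below. This is a minimal sketch, not the code from the commit: the Message, ToolOutputPart, and SpawnResult types and the findSpawnedAgentResult helper are hypothetical simplifications, and unlike the committed code (which only inspects the first JSON result it finds) this version searches every spawn_agents result.

```typescript
// Hypothetical, simplified message shapes. The real types in the codebase may differ.
type ToolOutputPart =
  | { type: 'json'; value: unknown }
  | { type: 'text'; value: string }

type Message =
  | { role: 'user' | 'assistant'; content: string }
  | { role: 'tool'; content: { toolName: string; output: ToolOutputPart[] } }

type SpawnResult = { agentType: string; value: unknown }

// Runtime check so that unknown JSON payloads are never cast blindly.
function isSpawnResult(x: unknown): x is SpawnResult {
  return (
    typeof x === 'object' &&
    x !== null &&
    typeof (x as { agentType?: unknown }).agentType === 'string'
  )
}

// Returns the first result produced by the given agent type, or undefined
// when no spawn_agents tool result contains one.
function findSpawnedAgentResult(
  history: Message[],
  agentType: string,
): SpawnResult | undefined {
  for (const message of history) {
    if (message.role !== 'tool') continue
    if (message.content.toolName !== 'spawn_agents') continue
    for (const part of message.content.output) {
      if (part.type !== 'json' || !Array.isArray(part.value)) continue
      const match = part.value
        .filter(isSpawnResult)
        .find((result) => result.agentType === agentType)
      if (match) return match
    }
  }
  return undefined
}
```

Returning undefined when nothing matches lets the caller skip the reset with a plain truthiness check, which mirrors the `if (taskResearcherResult)` guard in the diff.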

.agents/researcher/task-researcher.ts

Lines changed: 56 additions & 15 deletions
@@ -5,6 +5,7 @@ import {
55
PLACEHOLDER,
66
type SecretAgentDefinition,
77
} from '../types/secret-agent-definition'
8+
import { ToolCall, AgentState } from '../types/agent-definition'
89

910
export const createTaskResearcher: () => Omit<
1011
SecretAgentDefinition,
@@ -21,16 +22,8 @@ export const createTaskResearcher: () => Omit<
2122
type: 'string',
2223
description: 'A coding task to research',
2324
},
24-
params: {
25-
type: 'object',
26-
properties: {
27-
maxContextLength: {
28-
type: 'number',
29-
},
30-
},
31-
required: [],
32-
},
3325
},
26+
includeMessageHistory: true,
3427
outputMode: 'structured_output',
3528
outputSchema: {
3629
type: 'object',
@@ -52,9 +45,8 @@ export const createTaskResearcher: () => Omit<
5245
'A comprehensive list of the paths of files that are relevant to the coding task.',
5346
},
5447
},
55-
required: ['report'],
48+
required: ['analysis', 'keyFacts', 'relevantFiles'],
5649
},
57-
includeMessageHistory: false,
5850
toolNames: ['spawn_agents', 'read_files', 'set_output'],
5951
spawnableAgents: buildArray(
6052
'file-picker-max',
@@ -93,11 +85,50 @@ You recieve a coding task to implement a new feature. You do research in multipl
9385
2a. Read all the relevant files using the read_files tool.
9486
3. Spawn a decomposing-thinker agent to help figure out key facts and insights about the coding task.
9587
3a. Read any remaining relevant files using the read_files tool.
96-
4. Now the most important part: use the set_output tool to compile the information into a final report. Start with the analysis, and then put the most effort into the key facts list, which should be comprehensive. Finally, include ALL the relevant files in the report.`,
88+
4. Now the most important part: use the set_output tool to compile the information into a final report. Start with the analysis, and then put the most effort into the key facts list, which should be comprehensive. Finally, include ALL the relevant files in the report.
89+
Important: the report should only include the analysis, key facts, and relevant files. It should not include a plan or recommendations or any other information.
90+
5. End your turn.
91+
`,
9792

9893
stepPrompt: `Don't forget to spawn agents that could help, especially: the file-picker-max and find-all-referencer to get codebase context, and the decomposing-thinker agent to help figure out key facts and insights.`,
9994

100-
handleSteps: function* ({ prompt, params }) {
95+
handleSteps: function* ({ prompt, agentState, logger }) {
96+
// Reset the message history to the initial user message and the input prompt.
97+
const { messageHistory } = agentState
98+
const instructionsMessage = messageHistory[messageHistory.length - 1]
99+
const userMessages = messageHistory.filter(
100+
(message) =>
101+
message.role === 'user' &&
102+
(typeof message.content === 'string'
103+
? message.content.includes('<user_message>')
104+
: message.content[0].type === 'text' &&
105+
message.content[0].text.includes('<user_message>')),
106+
)
107+
const lastUserMessage = userMessages[userMessages.length - 2]
108+
let userPrompt = prompt
109+
if (lastUserMessage) {
110+
const userMessageBlock =
111+
typeof lastUserMessage.content === 'string'
112+
? lastUserMessage.content
113+
: (lastUserMessage.content as any)[0].text || ''
114+
const userMessage = userMessageBlock
115+
.split('<user_message>')[1]
116+
.split('</user_message>')[0]
117+
.trim()
118+
userPrompt = `<user_message>${userMessage}<generated_summary_of_task>${prompt}</generated_summary_of_task></user_message>`
119+
yield {
120+
toolName: 'set_messages',
121+
input: {
122+
messages: [
123+
{ role: 'user', content: userPrompt },
124+
instructionsMessage,
125+
],
126+
},
127+
includeToolCall: false,
128+
} satisfies ToolCall
129+
}
130+
131+
let lastAgentState: AgentState
101132
let steps = 0
102133
while (true) {
103134
steps++
@@ -106,14 +137,24 @@ You recieve a coding task to implement a new feature. You do research in multipl
106137
toolName: 'spawn_agent_inline',
107138
input: {
108139
agent_type: 'context-pruner',
109-
params: params ?? {},
140+
params: {},
110141
},
111142
includeToolCall: false,
112143
} as any
113144

114-
const { stepsComplete } = yield 'STEP'
145+
const { stepsComplete, agentState } = yield 'STEP'
146+
lastAgentState = agentState
115147
if (stepsComplete) break
116148
}
149+
150+
yield {
151+
toolName: 'set_output',
152+
input: {
153+
...lastAgentState.output,
154+
userPrompt,
155+
},
156+
includeToolCall: false,
157+
} satisfies ToolCall<'set_output'>
117158
},
118159
}
119160
}
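
Note on the prompt rebuilding in handleSteps above: the task-researcher pulls the text between the <user_message> tags out of an earlier user message and wraps it together with the generated task summary. A minimal sketch of that string handling follows. The helper names extractUserMessage and buildResearchPrompt are hypothetical and do not appear in the commit. One deliberate difference: the committed code chains split() calls, which throws if the opening tag is absent, whereas this sketch returns undefined when either tag is missing.

```typescript
// Returns the trimmed text between the first <user_message> tag and the next
// </user_message> tag, or undefined when either tag is missing.
function extractUserMessage(block: string): string | undefined {
  const openTag = '<user_message>'
  const closeTag = '</user_message>'
  const start = block.indexOf(openTag)
  if (start === -1) return undefined
  const bodyStart = start + openTag.length
  const end = block.indexOf(closeTag, bodyStart)
  if (end === -1) return undefined
  return block.slice(bodyStart, end).trim()
}

// Wraps the original user message and the generated summary in the same
// tag layout that the diff assigns to userPrompt.
function buildResearchPrompt(userMessage: string, summary: string): string {
  return `<user_message>${userMessage}<generated_summary_of_task>${summary}</generated_summary_of_task></user_message>`
}
```

A caller could fall back to the plain prompt when extractUserMessage returns undefined, matching the default `let userPrompt = prompt` in the diff.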
