Open
Labels: bug (Something isn't working)
Description
System Info
Environment
- Llama Stack Version: 0.3.3
- API Endpoint: /v1/responses
- MCP Server: Kubernetes MCP Server
- Model: gemini-llm/models/gemini-2.5-flash
Information
- The official example scripts
- My own modified scripts
🐛 Describe the bug
Bug Description
When using the OpenAI-compatible Responses API (/v1/responses) with MCP tools that trigger parallel tool calls, Llama Stack concatenates the arguments of multiple tool calls into a single arguments field, producing invalid JSON.
This violates the OpenAI Responses API specification, under which each parallel tool call is a separate object in the response output array, with its own valid JSON arguments and a unique call_id.
Steps to Reproduce
- Configure Llama Stack with an MCP server (e.g., Kubernetes MCP server)
- Send a request to /v1/responses that triggers parallel tool calls
- Example prompt: "What resources are there in namespace aap?"
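The steps above can be sketched as a small Python script. This is a minimal sketch, not the reporter's actual client: the base URL, the MCP server URL, and the server label are assumptions chosen for illustration, and only the model name and prompt come from this report. The `tools` entry follows the OpenAI Responses API shape for MCP tools (`type`, `server_label`, `server_url`).

```python
import json
import urllib.request

# Assumed values for illustration; none of these come from the report.
BASE_URL = "http://localhost:8321"  # hypothetical Llama Stack address
MCP_SERVER_URL = "http://localhost:8080/sse"  # hypothetical Kubernetes MCP server endpoint


def build_request_body(prompt: str) -> dict:
    """Build a Responses API request body that exposes one MCP server as a tool."""
    return {
        "model": "gemini-llm/models/gemini-2.5-flash",
        "input": prompt,
        "tools": [
            {
                "type": "mcp",
                "server_label": "kubernetes",  # hypothetical label
                "server_url": MCP_SERVER_URL,
            }
        ],
    }


def send(body: dict) -> dict:
    """POST the body to /v1/responses and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request_body("What resources are there in namespace aap?"))` against a running stack should return a response whose `output` array can then be inspected for `function_call` items.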
Impact
- Parsing failures: JSON.parse() fails on the malformed arguments
- Infinite loops: When used with HITL (Human-in-the-Loop) approval workflows, the malformed arguments cause 500 errors when sent back to Llama Stack, triggering retry loops
- Lost tool calls: Only the first tool call can be extracted via regex workaround; subsequent parallel calls are lost
The regex workaround is not ideal, as it loses all but the first parallel tool call.
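As a possible client-side stopgap until the server is fixed, the concatenated arguments can be split with Python's standard `json.JSONDecoder.raw_decode`, which decodes one JSON value and reports where it ended. This is a sketch only; `split_concatenated_json` is a hypothetical helper name, not part of Llama Stack. It recovers every argument object, but it cannot restore the separate `call_id` values, because the server only returned one.

```python
import json


def split_concatenated_json(raw: str) -> list:
    """Decode every JSON value in a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    values = []
    position = 0
    while position < len(raw):
        # raw_decode does not accept leading whitespace, so skip it manually.
        if raw[position].isspace():
            position += 1
            continue
        # raw_decode returns the decoded value and the index where it ended.
        value, position = decoder.raw_decode(raw, position)
        values.append(value)
    return values


calls = split_concatenated_json('{"namespace":"aap"}{"namespace":"aap"}')
print(len(calls))
```

Unlike a regex, this approach handles nested objects and braces inside string values, since it relies on the real JSON decoder to find each boundary.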
Related
This may be related to how Llama Stack handles the parallel_tool_calls behavior internally when interfacing with MCP servers.
Error logs
Actual Behavior
Llama Stack returns a single tool call object with concatenated JSON in the arguments field:
{
  "call_id": "single_call_id",
  "type": "function_call",
  "name": "get_resources",
  "arguments": "{\"namespace\":\"aap\"}{\"namespace\":\"aap\"}"
}
This is **invalid JSON**: multiple JSON objects concatenated without an array wrapper or separators.
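The parsing failure can be reproduced without a running server by feeding the arguments string from the object above to a strict JSON parser. The sketch below uses Python's `json.loads` in place of the `JSON.parse()` call mentioned in this report; `try_parse` is a hypothetical helper name.

```python
import json


def try_parse(arguments: str):
    """Return (True, value) if arguments is one valid JSON document, else (False, error)."""
    try:
        return True, json.loads(arguments)
    except json.JSONDecodeError as error:
        # A strict parser reads the first object, then rejects the leftover text.
        return False, error


ok, result = try_parse('{"namespace":"aap"}{"namespace":"aap"}')
if not ok:
    print(f"{result.msg} at character {result.pos}")
```

The error position points at the start of the second object, which is where the first tool call's arguments end and the next one's begin.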
Expected behavior
Expected Behavior
Per the OpenAI Function Calling documentation, parallel tool calls should return separate objects:
[
  {
    "id": "fc_12345xyz",
    "call_id": "call_12345xyz",
    "type": "function_call",
    "name": "get_resources",
    "arguments": "{\"namespace\":\"aap\"}"
  },
  {
    "id": "fc_67890abc",
    "call_id": "call_67890abc",
    "type": "function_call",
    "name": "get_resources",
    "arguments": "{\"namespace\":\"aap\"}"
  }
]
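The expected shape can be turned into a small regression check. The following is a sketch under the assumption that the response `output` is available as a list of dictionaries; `check_function_calls` is a hypothetical helper, and it checks only the two properties named in this report: each `arguments` string parses as JSON, and each `call_id` is unique.

```python
import json


def check_function_calls(output: list) -> list:
    """Return a list of problems found in the function_call items of a response output."""
    problems = []
    seen_call_ids = set()
    for index, item in enumerate(output):
        if item.get("type") != "function_call":
            continue  # ignore messages and other output item types
        call_id = item.get("call_id")
        if call_id in seen_call_ids:
            problems.append(f"item {index}: duplicate call_id {call_id!r}")
        seen_call_ids.add(call_id)
        try:
            json.loads(item.get("arguments") or "")
        except json.JSONDecodeError as error:
            problems.append(f"item {index}: arguments are not valid JSON ({error.msg})")
    return problems


expected = [
    {"call_id": "call_12345xyz", "type": "function_call",
     "name": "get_resources", "arguments": '{"namespace":"aap"}'},
    {"call_id": "call_67890abc", "type": "function_call",
     "name": "get_resources", "arguments": '{"namespace":"aap"}'},
]
actual = [
    {"call_id": "single_call_id", "type": "function_call",
     "name": "get_resources", "arguments": '{"namespace":"aap"}{"namespace":"aap"}'},
]
print(check_function_calls(expected))
print(check_function_calls(actual))
```

An empty list means the output matches the expected behavior; the concatenated response from the actual behavior produces one problem entry.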