
Custom prompt config handling in API tool calling #431

Open
nuwangeek wants to merge 87 commits into buerokratt:wip from rootcodelabs:llm-430

Conversation

@nuwangeek
Collaborator

No description provided.

nuwangeek and others added 30 commits February 20, 2026 16:06
Get update from wip into llm-316
get update from wip into llm-304
Service layer validation in tool classifier (buerokratt#321)
Pulling changes from BYK wip to LLM-Module WIP
Get update from wip into optimization/data-enrichment
Get update from optimization/data-enrichment into optimization/vector-indexer
nuwangeek and others added 25 commits April 22, 2026 06:56
Get update from llm-394 into llm-345-dev
Get update from llm-394 into llm-403
Get update from llm-345-dev into llm-403
Get update from llm-403 into llm-408
Get update from llm-408 into llm-348
Integrate agentic loop with semantic searcher and streaming (buerokratt#420)
Implemented the API caller module (buerokratt#421)
CKB API integration for agency data sync (buerokratt#392)
Integrate CKB and RAG changelogs with schema updates for RAG (buerokratt#422)

Copilot AI left a comment


Pull request overview

This PR extends the API tool-calling workflow to respect organization-specific prompt configuration (“custom instructions”) during both parameter collection and API response formatting, and adds schema sanitization to reduce format-hint leakage into clarifying questions.

Changes:

  • Fetch custom prompt instructions from the orchestration service’s prompt_config_loader and pass them into ParamExtractionModule and APIResponseFormatterModule.
  • Sanitize parameter schema descriptions before sending them to the LLM to avoid propagating format hints (e.g., YYYY-MM-DD) into user-facing questions.
  • Expand streaming support and tests for stream_forward() / stream_run_turn() paths, and adjust tests to reflect “new extraction overrides prior value” semantics.
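
A minimal sanitization pass of the kind the second bullet describes could look like the sketch below. This is an illustration only: the function name and the regex are assumptions, since the PR's actual pattern set is not shown in this summary.

```python
import re

def sanitize_description(description: str) -> str:
    """Strip format hints such as 'in the format YYYY-MM-DD' from a schema
    description so they do not leak into LLM-generated clarifying questions.
    Hypothetical sketch; the PR's real sanitizer may cover more phrasings."""
    # Drop "in the format <hint>" (optionally parenthesized), stopping at
    # the next sentence or clause boundary.
    cleaned = re.sub(
        r"\s*\(?in the format\s+[^).,]+\)?",
        "",
        description,
        flags=re.IGNORECASE,
    )
    return cleaned.strip()
```

Per the review excerpt further down, the sanitized text would feed question generation only, while the original description (hints intact) stays available for extraction context.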

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file

  • src/tool_classifier/workflows/api_tool_workflow.py: Loads custom instructions via the prompt config loader; applies them to extractor/formatter and derives an effective session language from instructions.
  • src/tool_classifier/param_extractor.py: Adds a custom_instructions input, schema-description sanitization, streaming cleanup, and same-type required-param reassignment logic.
  • src/tool_classifier/api_response_formatter.py: Adds custom_instructions propagation to the formatter predictor (blocking + streaming) and stream cleanup.
  • src/tool_classifier/agentic_loop.py: Adds an optional continuation_language override for the hardcoded continuation question (run + stream).
  • tests/test_param_extractor.py: Updates the override behavior expectation and adds tests for custom instructions, schema sanitization, and streaming extraction.
  • tests/test_api_response_formatter.py: Adds tests for custom instructions and streaming behavior (including stream token yielding / fallback paths).
  • tests/test_agentic_loop.py: Switches imports to src.tool_classifier..., updates the override expectation, and adds streaming-path tests for stream_run_turn().
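
The workflow wiring described for api_tool_workflow.py can be sketched roughly as follows. The class and loader names below are placeholders standing in for the real modules; only the file names appear in this summary, so treat every signature here as an assumption.

```python
from dataclasses import dataclass

@dataclass
class ParamExtractionModule:
    # Placeholder stand-in for src/tool_classifier/param_extractor.py
    custom_instructions: str = ""

@dataclass
class APIResponseFormatterModule:
    # Placeholder stand-in for src/tool_classifier/api_response_formatter.py
    custom_instructions: str = ""

def build_modules(load_prompt_config):
    """Load org-specific instructions once, then thread the same string
    into both the extractor and the formatter (hypothetical wiring)."""
    instructions = load_prompt_config().get("custom_instructions", "")
    return (
        ParamExtractionModule(custom_instructions=instructions),
        APIResponseFormatterModule(custom_instructions=instructions),
    )

extractor, formatter = build_modules(
    lambda: {"custom_instructions": "Answer in Estonian."}
)
```

The point of the single load is that both parameter collection and response formatting see an identical view of the org's prompt configuration within one turn.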

Comment on lines +38 to +40
``in the format YYYY-MM-DD`` phrases. The sanitised description is used
only for LLM question generation; the original description (with format
hints intact) is still used for extraction context.
Comment on lines +537 to +561
# SINGLE-VALUE REASSIGNMENT: if the LLM assigned a value to a later same-type
# param while an earlier same-type param is still missing, move the value forward.
# This fixes the common case where a lone date like "2026-04-01" is extracted as
# endDate when startDate is still missing.
combined_after_extraction = {**already_collected, **validated_params}
required_schema_order = [
    p for p in params_schema if isinstance(p, dict) and p.get("required", False)
]
for idx, missing_entry in enumerate(required_schema_order):
    m_name = missing_entry["name"]
    m_type = missing_entry.get("type", "string")
    if m_name in combined_after_extraction:
        continue  # already satisfied
    # Find the first later param with the same type that was just extracted
    for later_entry in required_schema_order[idx + 1 :]:
        l_name = later_entry["name"]
        l_type = later_entry.get("type", "string")
        if l_type == m_type and l_name in validated_params:
            logger.debug(
                f"ParamExtractor: reassigning '{l_name}' → '{m_name}' "
                f"(single {m_type} value assigned to wrong param by LLM)"
            )
            validated_params[m_name] = validated_params.pop(l_name)
            break
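
Extracted into a standalone function, the pass above behaves as follows. The schema and parameter names are illustrative only; the body mirrors the inline logic, including the detail that the satisfied-set is computed once up front.

```python
def reassign_same_type(params_schema, already_collected, validated_params):
    """Move a value the LLM assigned to a later same-type required param
    back to an earlier, still-missing one (simplified for illustration)."""
    # Snapshot taken once, as in the PR: later iterations still see the
    # originally extracted names even after a pop().
    combined = {**already_collected, **validated_params}
    required = [
        p for p in params_schema if isinstance(p, dict) and p.get("required", False)
    ]
    for idx, missing_entry in enumerate(required):
        m_name = missing_entry["name"]
        m_type = missing_entry.get("type", "string")
        if m_name in combined:
            continue  # already satisfied
        for later_entry in required[idx + 1 :]:
            if (
                later_entry.get("type", "string") == m_type
                and later_entry["name"] in validated_params
            ):
                validated_params[m_name] = validated_params.pop(later_entry["name"])
                break
    return validated_params

# A lone date wrongly extracted as endDate moves to the missing startDate:
schema = [
    {"name": "startDate", "type": "date", "required": True},
    {"name": "endDate", "type": "date", "required": True},
]
result = reassign_same_type(schema, {}, {"endDate": "2026-04-01"})
```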

Comment on lines +392 to +394
custom_instructions = await self._get_custom_instructions()
loop = self._build_agentic_loop(session_store, custom_instructions) # type: ignore[arg-type]

Comment on lines +423 to +433
def _make_async_iter(*chunks: Any) -> AsyncMock:
    """Return an async context manager that yields the given chunks then closes cleanly."""

    async def _gen() -> AsyncGenerator[Any, None]:
        for chunk in chunks:
            yield chunk

    mock_stream = AsyncMock()
    mock_stream.__aiter__ = lambda self: _gen()
    mock_stream.aclose = AsyncMock()
    return mock_stream
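
For context, a test can drain the helper like any async iterator. This sketch repeats the helper so it is self-contained; the chunk values and the _collect wrapper are illustrative, not from the PR.

```python
import asyncio
from typing import Any, AsyncGenerator
from unittest.mock import AsyncMock

def _make_async_iter(*chunks: Any) -> AsyncMock:
    """Copy of the test helper above, repeated here for a runnable sketch."""
    async def _gen() -> AsyncGenerator[Any, None]:
        for chunk in chunks:
            yield chunk
    mock_stream = AsyncMock()
    # MagicMock/AsyncMock allow magic methods to be configured per instance.
    mock_stream.__aiter__ = lambda self: _gen()
    mock_stream.aclose = AsyncMock()
    return mock_stream

async def _collect() -> list:
    stream = _make_async_iter("Hel", "lo")
    tokens = [chunk async for chunk in stream]  # consume the fake stream
    await stream.aclose()  # AsyncMock records the close for later asserts
    return tokens

tokens = asyncio.run(_collect())
```

Because aclose is itself an AsyncMock, a streaming test can assert both the yielded tokens and that the stream was closed cleanly.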
Comment on lines +8 to +10
from src.tool_classifier.agentic_loop import AgenticLoop
from src.tool_classifier.enums import AgenticLoopStatus
from src.tool_classifier.param_extractor import ParamExtractionResult


Development

Successfully merging this pull request may close these issues.

Implement custom prompt changes to responses in the API tool calling

4 participants