feat(base): add message_tool_names field for per-message tool attribution#74
Merged
Merged
Conversation
…bution
Adds ``RenderedTokens.message_tool_names: list[str | None]`` — a
sidecar parallel to ``message_roles`` that carries the tool function
name for each tool-role message in the rendered slice. For each
tool message the name is taken from ``msg["name"]`` (caller-
provided) or recovered by joining ``msg["tool_call_id"]`` against
any prior assistant's ``tool_calls[i].function.name`` in the same
list. Tool messages whose issuing assistant lives outside the slice
(e.g. on a ``bridge_to_next_turn`` call where ``new_messages``
covers only the new turn) resolve to ``None``.
Pure metadata: ``extract_message_tool_names`` runs independently of
the render path, never mutates the caller's messages, and has no
effect on the rendered token stream — HF chat-template byte parity
is preserved on every renderer. Callers that want the function name
to appear in the rendered scaffold (e.g. GPT-OSS Harmony's
``functions.{name}`` prefix) continue to attach ``name`` themselves
before calling ``render`` — that responsibility stays with the
caller (verifiers does this in ``_attach_tool_call_names``).
Trainers (prime-rl) join this list with ``message_indices`` to
recover per-token tool attribution — the canonical use case is SFT
on tool response bodies of a specific tool while RL acts on
assistant tokens.
Wired into every concrete renderer's ``RenderedTokens(...)``
construction site (render + ``bridge_to_next_turn``).
``extract_message_tool_names`` is exported at package level.
Tests: 5 unit tests covering the case matrix (empty, caller-
provided wins, resolves from prior assistant, orphan tool message,
non-mutation invariant) + 1 integration test that runs across
every renderer in the conftest matrix to catch missed wire-up at
any of the ~25 ``RenderedTokens(...)`` sites.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ApprovabilityVerdict: Approved Additive metadata field with default value and pure extraction function. Follows established pattern for per-message analytics, doesn't affect render output, and includes comprehensive tests. Author has prior experience with similar features in this codebase. You can customize Macroscope's approvability policy. Learn more. |
hallerite
approved these changes
May 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds
RenderedTokens.message_tool_names: list[str | None]— a sidecar parallel tomessage_rolesthat carries the tool function name for each tool-role message in the rendered slice. For each tool message the name is taken frommsg["name"](caller- provided) or recovered by joiningmsg["tool_call_id"]against any prior assistant'stool_calls[i].function.namein the same list. Tool messages whose issuing assistant lives outside the slice (e.g. on abridge_to_next_turncall wherenew_messagescovers only the new turn) resolve toNone.Pure metadata:
extract_message_tool_namesruns independently of the render path, never mutates the caller's messages, and has no effect on the rendered token stream — HF chat-template byte parity is preserved on every renderer. Callers that want the function name to appear in the rendered scaffold (e.g. GPT-OSS Harmony'sfunctions.{name}prefix) continue to attachnamethemselves before callingrender— that responsibility stays with the caller (verifiers does this in_attach_tool_call_names).Trainers (prime-rl) join this list with
message_indicesto recover per-token tool attribution — the canonical use case is SFT on tool response bodies of a specific tool while RL acts on assistant tokens.Wired into every concrete renderer's
RenderedTokens(...)construction site (render +bridge_to_next_turn).extract_message_tool_namesis exported at package level.Tests: 5 unit tests covering the case matrix (empty, caller- provided wins, resolves from prior assistant, orphan tool message, non-mutation invariant) + 1 integration test that runs across every renderer in the conftest matrix to catch missed wire-up at any of the ~25
RenderedTokens(...)sites.Note
Low Risk
Additive metadata and read-only helper; token streams unchanged. Risk is mainly consumers mis-joining with
message_indiceson partial slices (e.g. bridge-onlynew_messages).Overview
Adds
RenderedTokens.message_tool_names— a per-message sidecar parallel tomessage_roles— so trainers can attribute tool-response tokens to a specific function without changing rendered bytes.extract_message_tool_namesinrenderers/base.pybuilds the list: tool messages usemsg["name"]when present, otherwise jointool_call_idto the prior assistant’stool_calls[].function.namein the same message slice; other roles areNone. It does not mutate input messages and does not affect tokenization (HF parity for tool messages withoutnameis unchanged).Every concrete renderer’s
render()andbridge_to_next_turn()now setsmessage_tool_namesonRenderedTokens; the helper is exported fromrenderers.tests/test_message_tool_names.pycovers the join cases and a matrix integration test across renderers.Reviewed by Cursor Bugbot for commit d7c160b. Bugbot is set up for automated code reviews on this repo. Configure here.
Note
Add
message_tool_namesfield toRenderedTokensfor per-message tool attributionextract_message_tool_names(messages)in renderers/base.py that returns a list parallel to the input messages, yielding the tool function name for each tool message andNonefor non-tool messages. Names are resolved frommsg['name']first, then by matchingtool_call_idagainst prior assistanttool_calls.message_tool_names: list[str | None]field to theRenderedTokensdataclass, defaulting to an empty list.message_tool_namesin all rendererrenderandrender_idsmethods across 13 renderer modules.extract_message_tool_namesas a top-level symbol from therendererspackage.Macroscope summarized d7c160b.