Conversation
Please address the comments from the above PR on this.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f94efcd44
```diff
-resolved_messages = list(messages(state))
+if schema_context:
+    resolved_messages = list(
+        messages(state, additional_context=schema_context)
+    )
```
Preserve one-arg message callables when injecting DF context
When Data Fabric schemas are available, INIT invokes the message factory as messages(state, additional_context=...). Existing callable message generators commonly take only one positional parameter (e.g., create_messages(state)), so this raises a TypeError at startup for agents with enabled Data Fabric contexts. This is a runtime-breaking regression for previously valid message callables and should be guarded by signature detection or a fallback call without the extra keyword.
```python
    else:
        resolved_messages = list(messages(state))
else:
    resolved_messages = list(messages)
```
Inject schema context for static message definitions
If messages is provided as a static sequence (which create_agent supports), the INIT node ignores schema_context and returns list(messages) unchanged. In that path, Data Fabric schemas are fetched but never reach the system prompt, so query_datafabric runs without table/column guidance and the feature silently underperforms or fails to generate valid SQL. The static-message branch should also append/update a system message with the generated schema context.
Addressed the comments @UIPath-Harshit. Changes addressing review feedback:

Couldn't find the refined SQL tool, though.
```python
from uipath.agent.models.agent import BaseAgentResourceConfig


logger = logging.getLogger(__name__)
```
Redundant whitespace. Can you run ruff with a whitespace check if there is a rule for that? I don't like these extra spaces that agents generate.
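There is such a rule set: a minimal sketch of what enabling it could look like in `pyproject.toml`, assuming the repo configures ruff there (W291/W293 are ruff's pycodestyle-derived trailing-whitespace and blank-line-whitespace rules):

```toml
[tool.ruff.lint]
# W291: trailing whitespace, W293: whitespace on blank lines
extend-select = ["W291", "W293"]
```

Both rules are auto-fixable, so `ruff check --fix` would strip the generated spaces.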
Pull request overview
Adds Data Fabric (DF) tool support to the ReAct agent runtime by registering a generic query_datafabric tool and injecting DF entity schema/context into the system prompt at INIT via a pluggable init-context registry.
Changes:
- Introduces a datafabric_tool module (schema fetching, prompt/context formatting, and a query_datafabric tool).
- Adds an init-time context registry and wires the INIT node to gather provider context (and plumbs resources into agent creation).
- Updates context/tool factory behavior and adjusts tests for new required context fields and the async init node.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/agent/tools/test_tool_factory.py | Updates test context resource config to include context_type. |
| tests/agent/tools/test_context_tool.py | Updates helper to include context_type in context resources. |
| tests/agent/react/test_init_node.py | Updates INIT node tests for async init-node execution and the callable signature change. |
| tests/agent/react/test_create_agent.py | Updates expectations for create_init_node(..., resources_for_init=None). |
| src/uipath_langchain/agent/tools/tool_factory.py | Adds tool deduping behavior when building from resources. |
| src/uipath_langchain/agent/tools/datafabric_tool/system_prompt.txt | Adds SQL generation guidelines for DF. |
| src/uipath_langchain/agent/tools/datafabric_tool/sql_constraints.txt | Adds explicit SQL constraints reference text. |
| src/uipath_langchain/agent/tools/datafabric_tool/schema_context.py | Builds/derives DF schema and query pattern context for system prompt injection. |
| src/uipath_langchain/agent/tools/datafabric_tool/models.py | Adds Pydantic models for schema/context and tool input. |
| src/uipath_langchain/agent/tools/datafabric_tool/datafabric_tool.py | Implements DF schema fetch and query_datafabric tool creation. |
| src/uipath_langchain/agent/tools/datafabric_tool/__init__.py | Registers the DF init-context provider at import time. |
| src/uipath_langchain/agent/tools/context_tool.py | Routes DF context resources to the query_datafabric tool. |
| src/uipath_langchain/agent/tools/__init__.py | Exposes DF helpers and ensures DF module import (provider registration). |
| src/uipath_langchain/agent/react/types.py | Adds the AgentResources type alias. |
| src/uipath_langchain/agent/react/init_node.py | Makes the INIT node async and gathers init-context from registered providers. |
| src/uipath_langchain/agent/react/init_context_registry.py | Adds the registry and gather_init_context mechanism. |
| src/uipath_langchain/agent/react/agent.py | Plumbs resources into INIT node creation. |
| src/uipath_langchain/agent/react/__init__.py | Exports AgentResources. |
```python
async def _datafabric_init_context_provider(
    resources: Sequence[BaseAgentResourceConfig],
) -> str | None:
    """Fetch and format DataFabric entity schemas for system prompt injection."""
    entity_identifiers = get_datafabric_entity_identifiers_from_resources(resources)
    if not entity_identifiers:
        return None

    _logger.info(
        "Fetching Data Fabric schemas for %d identifier(s)",
        len(entity_identifiers),
    )
    entities = await fetch_entity_schemas(entity_identifiers)
    return format_schemas_for_context(entities)


register_init_context_provider("datafabric", _datafabric_init_context_provider)
```
New Data Fabric init-time context registration and schema formatting logic is introduced here, but there are no unit tests covering (a) provider registration + gather_init_context integration, and (b) that fetched entity schemas are formatted and injected as expected. Given the repo’s existing comprehensive tool/init tests, adding focused tests for the provider behavior would help prevent regressions.
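A focused test along these lines could use a small stand-in registry that mirrors the provider API this PR describes (register_init_context_provider / gather_init_context); the signatures below are assumptions, not the repo's actual ones:

```python
import asyncio

# Stand-in registry mirroring the PR's init-context provider API
# (names taken from the PR; exact signatures are assumptions).
_PROVIDERS: dict[str, object] = {}

def register_init_context_provider(name, provider):
    _PROVIDERS[name] = provider

async def gather_init_context(resources):
    parts = []
    for name, provider in _PROVIDERS.items():
        try:
            result = await provider(resources)
            if result:
                parts.append(result)
        except Exception:
            continue  # a failing provider is skipped, as in the PR
    return "\n\n".join(parts) if parts else None

# (a) registration + gathering, (b) formatted schemas reach the context.
async def _fake_df_provider(resources):
    return "## Table: invoices\n- id (integer)" if resources else None

async def _broken_provider(resources):
    raise RuntimeError("boom")

register_init_context_provider("datafabric", _fake_df_provider)
register_init_context_provider("broken", _broken_provider)

context = asyncio.run(gather_init_context(["entity-1"]))
assert context is not None and "## Table: invoices" in context
assert asyncio.run(gather_init_context([])) is None
```

The same assertions, pointed at the real registry module instead of the stand-in, would cover both registration and the skip-on-failure path.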
```python
    else:
        resolved_messages = list(messages(state))
else:
    resolved_messages = list(messages)
```
additional_context is computed from resources_for_init, but it is only passed into a callable messages generator. When messages is a static list/sequence, the additional context is silently ignored, so Data Fabric schemas (and any other provider output) won’t be injected into the system prompt in that common usage. Consider appending additional_context to the SystemMessage content (or inserting a new SystemMessage) when messages is a sequence, so init-time context is applied consistently.
Suggested change:
```python
resolved_messages = list(messages)
# When using a static sequence of messages, inject any init-time context
# into the system prompt so provider output (e.g., Data Fabric schemas)
# is not silently ignored.
if additional_context:
    # Try to append the additional context to the first SystemMessage.
    system_msg_index = next(
        (i for i, m in enumerate(resolved_messages) if isinstance(m, SystemMessage)),
        None,
    )
    if system_msg_index is not None:
        system_msg = resolved_messages[system_msg_index]
        # Safely append to existing content, assuming string content.
        existing_content = str(system_msg.content)
        updated_content = f"{existing_content}\n\n{additional_context}"
        resolved_messages[system_msg_index] = SystemMessage(
            content=updated_content, additional_kwargs=system_msg.additional_kwargs
        )
    else:
        # No SystemMessage present; prepend a new one with the additional context.
        resolved_messages.insert(0, SystemMessage(content=additional_context))
```
```diff
 if callable(messages):
-    resolved_messages = list(messages(state))
+    if additional_context:
+        resolved_messages = list(
+            messages(state, additional_context=additional_context)
+        )
+    else:
+        resolved_messages = list(messages(state))
```
When additional_context is present, the init node calls messages(state, additional_context=...). This will raise TypeError for existing message generator callables that only accept a single positional state argument (no **kwargs), making Data Fabric/resources injection a breaking change for those users. Consider detecting whether the callable accepts additional_context (or **kwargs) and falling back to messages(state) if not.
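A sketch of that fallback using `inspect.signature`; the `messages`/`state` shapes here are assumptions, not the repo's actual types:

```python
import inspect

def call_message_factory(messages, state, additional_context=None):
    """Invoke a user-supplied message factory, passing additional_context
    only when the callable can accept it (named keyword or **kwargs)."""
    if additional_context is not None:
        try:
            params = inspect.signature(messages).parameters
            accepts_kwarg = "additional_context" in params or any(
                p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
            )
        except (TypeError, ValueError):
            accepts_kwarg = False  # uninspectable callables get the safe path
        if accepts_kwarg:
            return list(messages(state, additional_context=additional_context))
    return list(messages(state))

# One-arg legacy factories keep working; new-style factories get the context.
legacy = lambda state: [("system", "base prompt")]
modern = lambda state, additional_context=None: [
    ("system", f"base prompt\n\n{additional_context or ''}")
]
assert call_message_factory(legacy, {}, "SCHEMAS") == [("system", "base prompt")]
assert "SCHEMAS" in call_message_factory(modern, {}, "SCHEMAS")[0][1]
```

This keeps the injection opt-in per callable, so existing single-argument generators are not a runtime-breaking change.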
```python
group_field = text_field or (field_names[0] if field_names else "Category")
agg_field = numeric_field or (field_names[1] if len(field_names) > 1 else "Amount")
filter_field = text_field or (field_names[0] if field_names else "Name")
fields_sample = ", ".join(field_names[:5]) if field_names else "*"
count_col = field_names[0] if field_names else "id"

query_patterns = [
    QueryPattern(
        intent="Show all",
        sql=f"SELECT {fields_sample} FROM {table} LIMIT 100",
    ),
    QueryPattern(
        intent="Find by X",
        sql=f"SELECT {fields_sample} FROM {table} WHERE {filter_field} = 'value' LIMIT 100",
    ),
```
fields_sample falls back to "*" when no non-hidden fields are present. This contradicts the documented constraints (NO SELECT *) and will produce example patterns that are guaranteed to violate the tool’s own SQL rules. Prefer omitting the “Show all / Find by X / Top N” patterns when there are no fields, or choose a safe default like the primary key field when available.
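A sketch of that fallback, with a hypothetical `primary_key` parameter (per the suggestion to prefer the primary key when available and skip SELECT-based patterns otherwise):

```python
def build_fields_sample(field_names, primary_key=None):
    """Return an explicit column list for example queries, or None when no
    safe column exists so the caller can omit SELECT-based patterns.

    Hypothetical helper illustrating the review suggestion; not repo code.
    """
    if field_names:
        return ", ".join(field_names[:5])
    if primary_key:
        return primary_key  # safe single-column fallback, never "*"
    return None

assert build_fields_sample(["Id", "Name"]) == "Id, Name"
assert build_fields_sample([], primary_key="Id") == "Id"
assert build_fields_sample([]) is None
```

A `None` return would let the pattern builder drop the "Show all / Find by X / Top N" examples instead of emitting SQL that violates the NO SELECT * rule.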
```python
    QueryPattern(
        intent="Top N by Y",
        sql=f"SELECT {fields_sample} FROM {table} ORDER BY {agg_field} DESC LIMIT N",
    ),
    QueryPattern(
        intent="Count by X",
        sql=f"SELECT {group_field}, COUNT({count_col}) as count FROM {table} GROUP BY {group_field}",
    ),
    QueryPattern(
        intent="Top N segments",
        sql=f"SELECT {group_field}, COUNT({count_col}) as count FROM {table} GROUP BY {group_field} ORDER BY count DESC LIMIT N",
    ),
```
The query pattern examples use LIMIT N, which is not valid SQLite syntax and conflicts with the prompt rule “Ensure the query is syntactically correct”. Since these examples are intended for LLM copying, they should be executable SQL (e.g., LIMIT 10) or use a clearly marked comment-style placeholder that won’t be emitted verbatim.
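A small sketch of the fix, substituting a concrete default for the literal `LIMIT N` (the function and table names here are illustrative):

```python
TOP_N_DEFAULT = 10  # concrete default so the example SQL is executable as-is

def top_n_pattern(table: str, fields_sample: str, agg_field: str,
                  n: int = TOP_N_DEFAULT) -> str:
    """Build a 'Top N by Y' example query without placeholder syntax."""
    return f"SELECT {fields_sample} FROM {table} ORDER BY {agg_field} DESC LIMIT {n}"

sql = top_n_pattern("invoices", "id, amount", "amount")
assert sql == "SELECT id, amount FROM invoices ORDER BY amount DESC LIMIT 10"
```

The intent text can still say "Top N", but the emitted SQL stays syntactically valid SQLite.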
```python
async def _query_datafabric(sql_query: str) -> dict[str, Any]:
    from uipath.platform import UiPath

    logger.debug(f"query_datafabric called with SQL: {sql_query}")

    sdk = UiPath()
```
create_datafabric_query_tool() is cached as a singleton, but _query_datafabric creates a new UiPath() SDK instance on every call. This adds avoidable overhead on each SQL query and can impact latency. Consider instantiating the SDK once at tool-creation time (outside the inner coroutine) so calls reuse the same client/config.
Suggested change:
```python
from uipath.platform import UiPath

# Instantiate the SDK once per tool instance to avoid per-call overhead.
sdk = UiPath()

async def _query_datafabric(sql_query: str) -> dict[str, Any]:
    logger.debug(f"query_datafabric called with SQL: {sql_query}")
```
```python
            )
        except Exception:
            logger.exception("Init context provider '%s' failed; skipping", name)
    return "\n\n".join(parts) if parts else None
```
Make it structured rather than free-form text. Expose it as a Pydantic model to be consumed by the caller.
UIPath-Harshit left a comment:
Left a few comments for @cristian-groza to chime in. Others, please address.
```python
# --- Init-time context self-registration ---


async def _datafabric_init_context_provider(
```
Move this to datafabric_tool rather than exposing it as a module method.
```python
# --- Generic Tool Creation ---

_MAX_RECORDS_IN_RESPONSE = 50
```
We can remove this check now that queries are free-form and carry a LIMIT clause.
```python
def display_type(self) -> str:
    """Type string with PK/required modifiers for markdown display."""
    modifiers = []
    if self.is_primary_key:
```
The concept of a PK will be removed with the VDOs integration. You need is_required and the other attributes.
```python
query_patterns: list[QueryPattern]


class SQLContext(BaseModel):
```
```python
@lru_cache(maxsize=1)
def _load_system_prompt() -> str:
    """Load SQL generation strategy from system_prompt.txt."""
    prompt_path = _PROMPTS_DIR / "system_prompt.txt"
```
@cristian-groza this needs a discussion.
```python
    return create_datafabric_query_tool()


assert resource.settings is not None
```
Put this behind an isIndexTool flag.
```python
async def create_tools_from_resources(
    agent: LowCodeAgentDefinition, llm: BaseChatModel
) -> list[BaseTool]:
    """Create tools from agent resources including Data Fabric tools.
```
Let's remove this comment. It doesn't make sense from an overall agent perspective.
```python
    messages,
    None,  # input schema
    True,  # is_conversational
    resources_for_init=None,
```
Can we add a test for resources_for_init as well? Also, this involves an async network call. @cristian-groza, do we see any implication of this when using the "convert to coded agent" flow?
Summary
Adds Data Fabric tool support. Enables agents to query Data Fabric entities using SQL by injecting entity schemas into the system prompt at INIT time and providing a generic query_datafabric tool.
What it does: When an agent has Data Fabric entity contexts configured in Studio Web, the INIT node fetches entity metadata (table names, column names, types) from the Data Fabric API at startup. This metadata is formatted as markdown with SQL guidelines and constraints, then appended to the system prompt. A generic query_datafabric tool is registered that accepts raw SQL and dispatches it to the Data Fabric API.

Changes
- react/agent.py - Add resources param, pass to create_init_node()
- react/init_node.py - Fetch DF schemas at INIT, inject into system prompt via additional_context
- react/types.py - Define AgentResources type alias
- tools/datafabric_tool/ - New module: schema fetching, formatting, entity detection, query_datafabric tool, SQL guidelines and constraints
- tools/tool_factory.py - Register DF tool, skip DF contexts in generic resource loop
- tools/context_tool.py - Guard: raise an error if a DF context is accidentally routed here

Companion PR
https://github.com/UiPath/uipath-agents-python/pull/383
Testing
Screen.Recording.2026-03-24.at.4.48.53.PM.mov