Conversation
Pull request overview
Updates the Python sample validation workflow and one sample to align with breaking changes from the GitHub Copilot dependency upgrade, including updated result/status reporting.
Changes:
- Replace TIMEOUT/ERROR result handling with a new `MISSING_SETUP` status across report generation and CLI exit code logic.
- Update Copilot permission request type import/usage and add a retry loop around Copilot agent execution.
- Fix the built-in chat clients sample to pass `Sequence[Message]` into `get_response`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| python/scripts/sample_validation/report.py | Updates report aggregation/sorting/printing to use MISSING_SETUP. |
| python/scripts/sample_validation/models.py | Updates RunStatus and Report shape to remove TIMEOUT/ERROR and add missing_setup_count. |
| python/scripts/sample_validation/create_dynamic_workflow_executor.py | Updates Copilot permission request type usage and adds retry/session handling for agent execution. |
| python/scripts/sample_validation/main.py | Updates failure exit code calculation to include missing_setup_count. |
| python/scripts/sample_validation/README.md | Updates documented status codes table to include MISSING_SETUP. |
| python/samples/02-agents/chat_client/built_in_chat_clients.py | Updates sample to send Message objects into get_response. |
```python
print(f"User: {message.text}")

# 2. Run with context-managed clients.
if isinstance(client, OpenAIAssistantsClient | AzureOpenAIAssistantsClient | AzureAIAgentClient):
```
`isinstance(client, OpenAIAssistantsClient | AzureOpenAIAssistantsClient | AzureAIAgentClient)` raises `TypeError` at runtime on Python versions earlier than 3.10, where `isinstance` does not accept PEP 604 union types as its second argument. Use a tuple of classes instead (e.g., `isinstance(client, (OpenAIAssistantsClient, AzureOpenAIAssistantsClient, AzureAIAgentClient))`) so the check works on all supported versions.
Suggested change:

```diff
-if isinstance(client, OpenAIAssistantsClient | AzureOpenAIAssistantsClient | AzureAIAgentClient):
+if isinstance(client, (OpenAIAssistantsClient, AzureOpenAIAssistantsClient, AzureAIAgentClient)):
```
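A minimal, self-contained sketch of the difference; the client classes here are hypothetical stand-ins for the sample's Assistants/Agent client types:

```python
# Hypothetical stand-ins for the real client classes in the sample.
class OpenAIAssistantsClient: ...
class AzureOpenAIAssistantsClient: ...
class AzureAIAgentClient: ...

client = AzureAIAgentClient()

# Tuple form: accepted by isinstance on every Python 3 version.
print(isinstance(client, (OpenAIAssistantsClient, AzureOpenAIAssistantsClient, AzureAIAgentClient)))  # True

# Union form: isinstance(client, A | B | C) only works on Python 3.10+,
# where PEP 604 made unions valid as the second argument; on older
# versions it raises TypeError.
```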
```diff
 FAILURE = "failure"
-TIMEOUT = "timeout"
-ERROR = "error"
+MISSING_SETUP = "missing_setup"
```
`RunStatus` no longer defines `ERROR`/`TIMEOUT`, but `sample_validation/run_dynamic_validation_workflow_executor.py` still references `RunStatus.ERROR` in its fallback path. This will raise `AttributeError` at runtime (or fail type checking) and breaks the fallback behavior; update that executor to use an existing status (e.g., `FAILURE` or `MISSING_SETUP`) or reintroduce an `ERROR` status if you still need to distinguish internal workflow failures from sample failures.
Suggested change:

```diff
 MISSING_SETUP = "missing_setup"
+ERROR = "error"
+TIMEOUT = "timeout"
```
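The failure mode is easy to reproduce with a minimal stand-in for the trimmed enum (a sketch, not the actual `models.py`):

```python
from enum import Enum

# Minimal stand-in for RunStatus after TIMEOUT/ERROR were removed.
class RunStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    MISSING_SETUP = "missing_setup"

# A leftover reference such as RunStatus.ERROR now fails at attribute access:
try:
    RunStatus.ERROR
except AttributeError as ex:
    print(f"AttributeError: {ex}")
```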
```python
    RunStatus.MISSING_SETUP: "[MISSING_SETUP]",
}

lines.append(f"### {status_label[status]} {status.value.title()} ({len(status_results)})")
```
In the markdown report headers, `status.value.title()` will render `missing_setup` as `Missing_Setup` (underscore preserved). Consider formatting status values for display (e.g., replace underscores with spaces before title-casing) so the section header is readable.
Suggested change:

```diff
-lines.append(f"### {status_label[status]} {status.value.title()} ({len(status_results)})")
+status_display = status.value.replace("_", " ").title()
+lines.append(f"### {status_label[status]} {status_display} ({len(status_results)})")
```
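The behavior difference can be checked directly in the REPL:

```python
status_value = "missing_setup"

# str.title() capitalizes each alphabetic run but keeps the underscore:
print(status_value.title())                    # Missing_Setup

# Replacing underscores before title-casing yields a readable header:
print(status_value.replace("_", " ").title())  # Missing Setup
```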
```python
current_retry += 1
await self.agent.stop()
await self.agent.start()
self._session = self.agent.create_session()  # Reset session for retry
continue
```
The retry path calls `await self.agent.start()` without guarding against failures. If `start()` raises (it wraps errors in `AgentException`), `handle_task` will exit without sending a `RunResult`/`WorkerFreed`, which can wedge the coordinator/collector flow. Catch exceptions from stop/start/session reset and convert them into a `RunResult` (`FAILURE`/`MISSING_SETUP`) so the workflow always makes forward progress.
Suggested change:

```diff
-current_retry += 1
-await self.agent.stop()
-await self.agent.start()
-self._session = self.agent.create_session()  # Reset session for retry
-continue
+try:
+    await self.agent.stop()
+    await self.agent.start()
+    self._session = self.agent.create_session()  # Reset session for retry
+    current_retry += 1
+    continue
+except Exception as reset_ex:
+    logger.error(
+        f"Error resetting agent {self.agent.id} during retry after error {ex}: {reset_ex}"
+    )
+    result = RunResult(
+        sample=sample,
+        status=RunStatus.MISSING_SETUP,
+        output="",
+        error=str(reset_ex),
+    )
+    break
```
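The shape of the fix can be sketched independently of the workflow types. `FlakyAgent` and the result dicts below are hypothetical stand-ins for the real `Agent`/`RunResult` objects; the point is only that the reset itself is guarded, so the caller always receives a terminal result:

```python
import asyncio

class FlakyAgent:
    """Hypothetical agent that fails once, then succeeds after a reset."""
    def __init__(self):
        self.calls = 0

    async def run(self):
        self.calls += 1
        if self.calls < 2:
            raise RuntimeError("transient failure")
        return "ok"

    async def stop(self):
        pass

    async def start(self):
        pass

async def run_with_guarded_retry(agent, max_retries=2):
    """Always return a terminal result, even if resetting the agent fails."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return {"status": "success", "output": await agent.run()}
        except Exception as ex:
            last_error = ex
            try:
                # Guard the reset too: start() can raise, and an unhandled
                # exception here would leave the caller without any result.
                await agent.stop()
                await agent.start()
            except Exception as reset_ex:
                return {"status": "failure", "error": f"reset failed: {reset_ex}"}
    return {"status": "failure", "error": str(last_error)}

print(asyncio.run(run_with_guarded_retry(FlakyAgent())))  # {'status': 'success', 'output': 'ok'}
```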
```diff
@@ -108,39 +108,58 @@ class CustomAgentExecutor(Executor):
     returned as error responses, otherwise an exception in one agent could crash the entire workflow.
```
This docstring says exceptions are "returned as error responses", but the report model no longer has an ERROR status and exceptions are mapped to FAILURE. Update the wording to reflect the current statuses/behavior so future maintainers don’t assume an ERROR bucket exists.
Suggested change:

```diff
-returned as error responses, otherwise an exception in one agent could crash the entire workflow.
+reported as failure results, otherwise an exception in one agent could crash the entire workflow.
```
| Status | Label | Description |
| ------------- | --------------- | --------------------------------------------------------- |
| SUCCESS | [PASS] | Sample ran to completion with exit code 0 |
| FAILURE | [FAIL] | Sample did not complete successfully (non-zero exit code) |
| MISSING_SETUP | [MISSING_SETUP] | Sample skipped due to missing setup |
Documentation is inconsistent: the status code table only lists SUCCESS/FAILURE/MISSING_SETUP, but the Troubleshooting section below still says samples can be marked as ERROR. Either reintroduce an ERROR status in the models/reporting, or update the Troubleshooting guidance to match the new status taxonomy.
Motivation and Context
Update the sample validation scripts due to breaking changes in GH Copilot dependency upgrade.
Description
Contribution Checklist