Fix CodeQL alert #28 in Azure endpoint validation#4455
Fix CodeQL alert #28 in Azure endpoint validation#4455theCyberTech wants to merge 2 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses CodeQL alert #28 by replacing substring-based Azure OpenAI endpoint detection with URL parsing + hostname/path validation to prevent spoofed endpoints from being treated as Azure OpenAI deployment URLs.
Changes:
- Implement parsed-URL validation for Azure OpenAI endpoints (hostname labels +
/openai/deployments/<name>path). - Preserve endpoint auto-construction behavior while avoiding query/host/path substring spoofing.
- Add regression tests covering spoofed query/hostname endpoint inputs.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
lib/crewai/src/crewai/llms/providers/azure/completion.py |
Replaces substring checks with urlparse-based hostname/path segment validation and updates endpoint fixing logic accordingly. |
lib/crewai/tests/llms/azure/test_azure.py |
Adds tests ensuring spoofed URLs don’t set is_azure_openai_endpoint and don’t suppress the model parameter. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| hostname = parsed_endpoint.hostname or "" | ||
| labels = [label for label in hostname.lower().split(".") if label] | ||
|
|
||
| return len(labels) >= 3 and labels[-3:] == ["openai", "azure", "com"] |
There was a problem hiding this comment.
_is_azure_openai_hostname currently returns true for openai.azure.com (no resource label) because it only requires len(labels) >= 3. That contradicts the documented expected format https://<resource-name>.openai.azure.com/... and could misclassify an invalid/placeholder hostname as an Azure OpenAI endpoint. Consider requiring at least one subdomain label before openai.azure.com (e.g., len(labels) >= 4).
| return len(labels) >= 3 and labels[-3:] == ["openai", "azure", "com"] | |
| return len(labels) >= 4 and labels[-3:] == ["openai", "azure", "com"] |
| parsed_endpoint = urlparse(endpoint) | ||
| if parsed_endpoint.hostname: | ||
| return parsed_endpoint | ||
|
|
||
| # Support endpoint values without a URL scheme. | ||
| return urlparse(f"https://{endpoint}") | ||
|
|
There was a problem hiding this comment.
_parse_endpoint_url adds an https:// prefix only for parsing, but _validate_and_fix_endpoint returns the original endpoint string unchanged (aside from appending path). If a user provides an endpoint without a scheme, this logic can still produce a scheme-less self.endpoint, which is likely not a valid URL for the Azure SDK. Either normalize the stored/returned endpoint to include a scheme when missing, or drop the “support without a URL scheme” behavior/comment to avoid implying support that isn’t actually provided end-to-end.
| parsed_endpoint = urlparse(endpoint) | |
| if parsed_endpoint.hostname: | |
| return parsed_endpoint | |
| # Support endpoint values without a URL scheme. | |
| return urlparse(f"https://{endpoint}") | |
| return urlparse(endpoint) |
Summary
/openai/deployments/<name>)CodeQL
py/incomplete-url-substring-sanitizationValidation
python -m compileall lib/crewai/src/crewai/llms/providers/azure/completion.py lib/crewai/tests/llms/azure/test_azure.pyuv run pytest lib/crewai/tests/llms/azure/test_azure.py -k "endpoint"Evidence