Conversation
WalkthroughTelemetry payloads were simplified to omit raw file paths and URL identifiers. Two helper builders, Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/telemetry.py`:
- Around line 544-551: The code coerces issue.get("location", "") to str,
turning None into the string "None" and causing _hash_identifier to produce
synthetic IDs; change issue_location to keep the original value (e.g.,
issue_location = issue.get("location")) and only convert to string or call
_hash_identifier / _extract_model_name / _classify_path when the value is
non-null/non-empty (e.g., if issue_location: location_str = str(issue_location);
use location_str for classification but call _hash_identifier(location_str) only
when issue_location is truthy, otherwise set location_identifier/model_name to
None). Apply the same fix to the other block around the references to
_hash_identifier/_classify_path/_extract_model_name noted at 572-576.
In `@tests/test_telemetry.py`:
- Around line 370-372: Add a return type annotation "-> None" to the test method
signature for test_path_and_url_fields_use_hashed_identifiers_only so the method
reads "def test_path_and_url_fields_use_hashed_identifiers_only(self) -> None:";
update only the function signature to include the annotation to satisfy the
repository typing rules for tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6352dee5-7c86-42e4-86fb-b473b01282c1
📒 Files selected for processing (3)
CHANGELOG.mdmodelaudit/telemetry.pytests/test_telemetry.py
| issue_location = str(issue.get("location", "")) | ||
| issue_details.append( | ||
| { | ||
| "type": issue_type, | ||
| "severity": severity, | ||
| "location_type": self._classify_path(issue_location), | ||
| "file_path": issue_location, | ||
| "location_identifier": self._hash_identifier(issue_location) if issue_location else None, | ||
| "model_name": self._extract_model_name(issue_location) if issue_location else None, |
There was a problem hiding this comment.
Avoid hashing "None" for missing locations/paths.
Current str(...) coercion turns None into "None" (truthy), so missing location/path values get a synthetic identifier instead of being omitted.
🔧 Proposed fix
- issue_location = str(issue.get("location", ""))
+ raw_issue_location = issue.get("location")
+ issue_location = raw_issue_location if isinstance(raw_issue_location, str) else ""
issue_details.append(
{
"type": issue_type,
"severity": severity,
"location_type": self._classify_path(issue_location),
"location_identifier": self._hash_identifier(issue_location) if issue_location else None,
"model_name": self._extract_model_name(issue_location) if issue_location else None,
}
)
@@
- "file_identifiers": [
- self._hash_identifier(asset_path)
- for asset in assets
- if (asset_path := str(asset.get("path", "")))
- ],
+ "file_identifiers": [
+ self._hash_identifier(asset_path)
+ for asset in assets
+ if isinstance((asset_path := asset.get("path")), str) and asset_path
+ ],Also applies to: 572-576
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelaudit/telemetry.py` around lines 544 - 551, The code coerces
issue.get("location", "") to str, turning None into the string "None" and
causing _hash_identifier to produce synthetic IDs; change issue_location to keep
the original value (e.g., issue_location = issue.get("location")) and only
convert to string or call _hash_identifier / _extract_model_name /
_classify_path when the value is non-null/non-empty (e.g., if issue_location:
location_str = str(issue_location); use location_str for classification but call
_hash_identifier(location_str) only when issue_location is truthy, otherwise set
location_identifier/model_name to None). Apply the same fix to the other block
around the references to _hash_identifier/_classify_path/_extract_model_name
noted at 572-576.
tests/test_telemetry.py
Outdated
| def test_path_and_url_fields_use_hashed_identifiers_only(self): | ||
| """Path and URL fields should not include raw sensitive values.""" | ||
| mock_posthog = MagicMock() |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add -> None return annotation on this test method.
This keeps test typing consistent with repository rules.
✏️ Proposed fix
- def test_path_and_url_fields_use_hashed_identifiers_only(self):
+ def test_path_and_url_fields_use_hashed_identifiers_only(self) -> None:As per coding guidelines: "tests/**/*.py: Use type hints -> None on all test methods and tmp_path: Path / monkeypatch: pytest.MonkeyPatch on test parameters".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_telemetry.py` around lines 370 - 372, Add a return type annotation
"-> None" to the test method signature for
test_path_and_url_fields_use_hashed_identifiers_only so the method reads "def
test_path_and_url_fields_use_hashed_identifiers_only(self) -> None:"; update
only the function signature to include the annotation to satisfy the repository
typing rules for tests.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_telemetry.py`:
- Around line 273-274: The test method
test_scan_completed_uses_top_level_results_schema is missing a return type
annotation; update its definition to include "-> None" (i.e., change the
function signature for test_scan_completed_uses_top_level_results_schema to
declare a None return type) to comply with the tests/**/*.py typing guideline.
- Around line 335-336: The test method definition
test_scan_started_includes_per_path_fields is missing a return type annotation;
update its signature to include the explicit return annotation "-> None" (i.e.,
change "def test_scan_started_includes_per_path_fields(self):" to "def
test_scan_started_includes_per_path_fields(self) -> None:") so the test follows
the project's typing guideline for tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c05982f2-472b-459c-82de-24148c75fb41
📒 Files selected for processing (3)
CHANGELOG.mdmodelaudit/telemetry.pytests/test_telemetry.py
| def test_scan_completed_uses_top_level_results_schema(self): | ||
| """Test that scan completion telemetry aggregates from top-level issues/assets.""" | ||
| """Test that scan completion telemetry aggregates without file identifiers.""" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add -> None return annotation on this test method.
- def test_scan_completed_uses_top_level_results_schema(self):
+ def test_scan_completed_uses_top_level_results_schema(self) -> None:As per coding guidelines: "tests/**/*.py: Use type hints -> None on all test methods".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_scan_completed_uses_top_level_results_schema(self): | |
| """Test that scan completion telemetry aggregates from top-level issues/assets.""" | |
| """Test that scan completion telemetry aggregates without file identifiers.""" | |
| def test_scan_completed_uses_top_level_results_schema(self) -> None: | |
| """Test that scan completion telemetry aggregates without file identifiers.""" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_telemetry.py` around lines 273 - 274, The test method
test_scan_completed_uses_top_level_results_schema is missing a return type
annotation; update its definition to include "-> None" (i.e., change the
function signature for test_scan_completed_uses_top_level_results_schema to
declare a None return type) to comply with the tests/**/*.py typing guideline.
| def test_scan_started_includes_per_path_fields(self): | ||
| """Scan started events include per-path arrays and model names.""" | ||
| """Scan started events include coarse path metadata but no identifiers.""" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add -> None return annotation on this test method.
- def test_scan_started_includes_per_path_fields(self):
+ def test_scan_started_includes_per_path_fields(self) -> None:As per coding guidelines: "tests/**/*.py: Use type hints -> None on all test methods".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_scan_started_includes_per_path_fields(self): | |
| """Scan started events include per-path arrays and model names.""" | |
| """Scan started events include coarse path metadata but no identifiers.""" | |
| def test_scan_started_includes_per_path_fields(self) -> None: | |
| """Scan started events include coarse path metadata but no identifiers.""" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_telemetry.py` around lines 335 - 336, The test method definition
test_scan_started_includes_per_path_fields is missing a return type annotation;
update its signature to include the explicit return annotation "-> None" (i.e.,
change "def test_scan_started_includes_per_path_fields(self):" to "def
test_scan_started_includes_per_path_fields(self) -> None:") so the test follows
the project's typing guideline for tests.
# Conflicts: # CHANGELOG.md
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 84: Update the CHANGELOG entry that currently reads "- **telemetry:**
simplify telemetry payloads to omit file path identifiers" to accurately reflect
the implementation: if the code replaces paths with hashed values, change the
wording to "hash file path identifiers in telemetry payloads" or "replace raw
file path identifiers with hashed values"; if the code actually removes them,
keep "omit file path identifiers"; ensure the changelog text matches the
implementation and mentions "telemetry" and "file path identifiers" so it's easy
to locate and verify.
|
|
||
| ### Fixed | ||
|
|
||
| - **telemetry:** simplify telemetry payloads to omit file path identifiers |
There was a problem hiding this comment.
Clarify whether file paths are hashed or omitted.
The changelog entry states that file path identifiers are "omitted," but the PR title says "hash file paths in payloads." These describe different approaches:
- Hashing means raw values are replaced with hashed identifiers (data is transformed but still present)
- Omitting means the values are completely removed (data is absent)
For clarity and accuracy, align the changelog wording with what the implementation actually does. If paths are being hashed, consider: "replace raw file path identifiers with hashed values" or "hash file path identifiers in telemetry payloads."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` at line 84, Update the CHANGELOG entry that currently reads "-
**telemetry:** simplify telemetry payloads to omit file path identifiers" to
accurately reflect the implementation: if the code replaces paths with hashed
values, change the wording to "hash file path identifiers in telemetry payloads"
or "replace raw file path identifiers with hashed values"; if the code actually
removes them, keep "omit file path identifiers"; ensure the changelog text
matches the implementation and mentions "telemetry" and "file path identifiers"
so it's easy to locate and verify.
Use hashed identifiers for telemetry file and URL fields instead of raw values.
Update the telemetry regression tests and changelog wording.
Summary by CodeRabbit
Bug Fixes
Tests
Documentation