-
Notifications
You must be signed in to change notification settings - Fork 1
Fix test fixtures: resolve Pydantic validation and pickle errors in chunker tests #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix test fixtures: resolve Pydantic validation and pickle errors in chunker tests #36
Conversation
Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
…ation issues Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
…with SgNode Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
bashandbone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work tackling some of the issues
|
@claude please address the issues you identified in your review and make the suggested changes. Your review: Claude Code Report🚀 System InitializationAvailable Tools: 20 tools loaded I'll review PR #36 for bugs, security issues, and performance problems. Let me first fetch the pull request details. Token usage: 18450 input, 3 output 🔧
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors fixture usage and method naming for improved consistency across the codebase. The changes primarily focus on:
- Consolidating test fixtures to use shared
chunk_governorfixture from conftest.py - Renaming method calls from
set_batch_idtoset_batch_keysfor consistency with the data model - Updating model rebuilding in conftest to include
CodeChunk - Adjusting
model_copybehavior to use shallow copying
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/engine/chunker/test_delimiter_basic.py | Removed duplicate governor fixture and updated delimiter_chunker fixture to use shared chunk_governor fixture |
| tests/unit/engine/chunker/conftest.py | Added CodeChunk.model_rebuild() call and imported EmbeddingBatchInfo to resolve forward references |
| src/codeweaver/engine/chunker/semantic.py | Updated all calls from set_batch_id() to set_batch_keys() to match the actual method name |
| src/codeweaver/core/chunks.py | Changed model_copy from deep=True to deep=False with explanatory comment about avoiding pickling issues |
| src/codeweaver/_version.py | Updated version string from rc306 to rc1 |
| coverage.xml | Auto-generated coverage report changes |
Comments suppressed due to low confidence (4)
src/codeweaver/core/chunks.py:261
- This test may raise an Exception as the target may be of non-container class EmbeddingRegistry.
return registry[self.chunk_id].dense if self.chunk_id in registry else None
src/codeweaver/core/chunks.py:269
- This test may raise an Exception as the target may be of non-container class EmbeddingRegistry.
return registry[self.chunk_id].sparse if self.chunk_id in registry else None
src/codeweaver/core/chunks.py:383
- This test may raise an Exception as the target may be of non-container class EmbeddingRegistry.
return registry[self.chunk_id].dense if self.chunk_id in registry else None
src/codeweaver/engine/chunker/semantic.py:406
- Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.
def _parse_file(self, content: str, file_path: Path | None) -> FileThing[SgRoot]:
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from codeweaver.engine.chunker.base import ChunkGovernor | ||
| from codeweaver.engine.chunker.governance import ResourceGovernor | ||
| from codeweaver.providers.embedding.capabilities.base import EmbeddingModelCapabilities | ||
| from codeweaver.providers.embedding.registry import EmbeddingBatchInfo # noqa: F401 |
Copilot
AI
Oct 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'EmbeddingBatchInfo' is not used.
| from codeweaver.providers.embedding.registry import EmbeddingBatchInfo # noqa: F401 |
Addresses code review issues from PR #36: 1. **Bug Fix**: Refactored set_batch_id() to follow immutability pattern - Changed return type from None to Self - Uses model_copy() to return new instance instead of mutating - Updated all 4 call sites in semantic.py to capture return value 2. **Security Fix**: Add explicit metadata copying in set_batch_id() - Prevents shared reference issues between CodeChunk instances - Shallow copies metadata dict while keeping SgNode references - Uses deep=False to avoid pickling unpicklable SgNode objects 3. **Performance**: Simplified _finalize_chunks() method - Removed redundant iteration over chunks - batch_id setting now handled in _deduplicate_chunks() 4. **Type Safety**: Fixed batch_id type annotation - Changed from Any to UUID7 in _finalize_chunks() This ensures CodeChunk immutability is properly maintained and prevents potential race conditions from shared metadata references. Co-authored-by: Adam Poulemanos <bashandbone@users.noreply.github.com>
Tests on
003-our-aim-tobranch were failing due to fixture configuration and model serialization issues. Fixed validation errors and pickle incompatibilities blocking test execution.Changes
Test Fixtures
governorfixture intest_delimiter_basic.py- now useschunk_governorfromconftest.pywith properEmbeddingModelCapabilitiesCodeChunk.model_rebuild()inconftest.pywithEmbeddingBatchInfoimport to resolve forward referencesMethod Name Updates
set_batch_id()→set_batch_keys()insemantic.py(5 locations) - aligns with currentCodeChunkAPIPickle Compatibility
model_copy(deep=True)→model_copy(deep=False)inCodeChunk.set_batch_keys()- ast-grepSgNodeobjects in metadata cannot be pickledResults
Outstanding Issues
Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
astral.shcurl -LsSf REDACTED(dns block)If you need me to access, download, or install something from one of these locations, you can either:
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.