Migrate gene_cnv() to AnophelesCnvData#1214
Migrate gene_cnv() to AnophelesCnvData#1214kunal-10-cloud wants to merge 5 commits intomalariagen:masterfrom
Conversation
Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to AnophelesCnvData class to achieve architectural consistency with the modular CNV data-layer mixin pattern used throughout the codebase. Changes: - Add gene_cnv() and _gene_cnv() methods to AnophelesCnvData class - Move _cn_mode() and _cn_mode_1d() numba-compiled helper functions to cnv_data.py - Add required imports (bisect, dask, numba) to cnv_data.py - Remove moved methods and functions from AnophelesCnvFrequencyAnalysis in cnv_frq.py - Clean up unused imports from cnv_frq.py - Remove TODO block and type:ignore comment from dipclust.py - Update AnophelesCnvFrequencyAnalysis to inherit gene_cnv() via AnophelesCnvData Benefits: - Fixes architectural inconsistency: data access methods now in correct layer - Improves type checking: removes need for type:ignore comment in dipclust.py - Enhances discoverability: gene_cnv() no Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to AnophelesCnvData class to achieve arorkAnophelesCnvData class to achieve architectural consistency with the mod tCNV data-layer mixin pattern used throughout the codebase. Changes: - Add gfo Changes: - Add gene_cnv() and _gene_cnv() methods to Anos d- Add g
There was a problem hiding this comment.
Pull request overview
This PR migrates the gene_cnv() CNV gene-level data access method from the CNV frequency analysis layer into the AnophelesCnvData data-layer mixin, aligning CNV access patterns with the existing modular architecture and unblocking cleaner inheritance usage in dipclust.
Changes:
- Moved
gene_cnv()and_gene_cnv()fromcnv_frq.pyintocnv_data.py(including required helpers/imports). - Removed now-unneeded imports and numba helper definitions from
cnv_frq.py. - Cleaned up
dipclust.pyby removing the migration TODO and the# type: ignoreonself.gene_cnv(...).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
malariagen_data/anoph/cnv_data.py |
Adds gene_cnv() implementation to the CNV data-layer mixin and relocates numba mode helpers. |
malariagen_data/anoph/cnv_frq.py |
Removes the migrated gene_cnv() implementation and associated unused imports/helpers. |
malariagen_data/anoph/dipclust.py |
Removes TODO and type ignore now that gene_cnv() exists in the inherited data layer. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Raise clear ValueError when no genes found in requested region - Prevents confusing downstream errors (NaN in min/max, empty vstack) - Improves user experience with descriptive error message - All tests passing (96/96) - All quality checks passing (ruff, pre-commit)
|
Hi @jonbrenas can you please take a look at this PR i have also stated my approach in the issue(#1208) |
Fixes: #1208
Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to AnophelesCnvData class to achieve architectural consistency with the modular CNV data-layer mixin pattern used throughout the codebase.
Changes Made
Core Migration
Moved
gene_cnv()method fromcnv_frq.pytocnv_data.py@_check_types,@doc(...)xr.DatasetMoved
_gene_cnv()implementation fromcnv_frq.pytocnv_data.pybisect,dask, and numba helpersMoved numba helpers to
cnv_data.py_cn_mode_1d(): 1D modal copy number computation_cn_mode(): 2D modal copy number computation (calls 1D per column)Cleanup
cnv_frq.py(bisect,dask,numba)dipclust.pytype:ignorecomment fromdipclust.py(type checking now passes)Architecture
Before: Analysis layer method → Data access mixed with frequency analysis
After: Data layer method → Frequency analysis inherits from data layer
Inheritance Chain:
Testing & Validation
Breaking Changes
None. The method signature, return type, and public API remain unchanged. All existing code continues to work through inheritance.
Code Quality