Skip to content

Migrate gene_cnv() to AnophelesCnvData#1214

Open
kunal-10-cloud wants to merge 5 commits intomalariagen:masterfrom
kunal-10-cloud:issue-1208/migrate-gene-cnv-to-data-layer
Open

Migrate gene_cnv() to AnophelesCnvData#1214
kunal-10-cloud wants to merge 5 commits intomalariagen:masterfrom
kunal-10-cloud:issue-1208/migrate-gene-cnv-to-data-layer

Conversation

@kunal-10-cloud
Copy link
Contributor

Fixes: #1208
Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to AnophelesCnvData class to achieve architectural consistency with the modular CNV data-layer mixin pattern used throughout the codebase.

Changes Made

Core Migration

  • Moved gene_cnv() method from cnv_frq.py to cnv_data.py

    • Public API method for computing modal copy number per gene
    • Decorator: @_check_types, @doc(...)
    • Return type: xr.Dataset
  • Moved _gene_cnv() implementation from cnv_frq.py to cnv_data.py

    • Private implementation method with bitwise mode computation
    • Accesses genome features, HMM data, and region parsing
    • Uses bisect, dask, and numba helpers
  • Moved numba helpers to cnv_data.py

    • _cn_mode_1d(): 1D modal copy number computation
    • _cn_mode(): 2D modal copy number computation (calls 1D per column)

Cleanup

  • Removed unused imports from cnv_frq.py (bisect, dask, numba)
  • Cleaned TODO block from dipclust.py
  • Removed type:ignore comment from dipclust.py (type checking now passes)

Architecture

Before: Analysis layer method → Data access mixed with frequency analysis
After: Data layer method → Frequency analysis inherits from data layer

Inheritance Chain:

AnophelesDipClustAnalysis
  ↓ inherits from
AnophelesCnvFrequencyAnalysis
  ↓ inherits from
AnophelesCnvData (now includes gene_cnv)

Testing & Validation

  • Unit Tests: 96/96 passed (68 cnv_frq + 28 dipclust)
  • Linting: ruff check passed (0 issues)
  • Formatting: ruff format validated
  • Pre-commit Hooks: All 5 hooks passed
    • trim-trailing-whitespace
    • fix-end-of-files
    • ruff (lint)
    • ruff-format
  • Type Checking: mypy validation (0 errors)

Breaking Changes

None. The method signature, return type, and public API remain unchanged. All existing code continues to work through inheritance.

Code Quality

  • Lines Added: 198 (gene_cnv, _gene_cnv, helpers in cnv_data.py)
  • Lines Removed: 202 (moved functions from cnv_frq.py)
  • Net Change: -4 LOC (improved organization)
  • Files Modified: 3 (cnv_data.py, cnv_frq.py, dipclust.py)

Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to
AnophelesCnvData class to achieve architectural consistency with the modular
CNV data-layer mixin pattern used throughout the codebase.

Changes:
- Add gene_cnv() and _gene_cnv() methods to AnophelesCnvData class
- Move _cn_mode() and _cn_mode_1d() numba-compiled helper functions to cnv_data.py
- Add required imports (bisect, dask, numba) to cnv_data.py
- Remove moved methods and functions from AnophelesCnvFrequencyAnalysis in cnv_frq.py
- Clean up unused imports from cnv_frq.py
- Remove TODO block and type:ignore comment from dipclust.py
- Update AnophelesCnvFrequencyAnalysis to inherit gene_cnv() via AnophelesCnvData

Benefits:
- Fixes architectural inconsistency: data access methods now in correct layer
- Improves type checking: removes need for type:ignore comment in dipclust.py
- Enhances discoverability: gene_cnv() no
Move gene_cnv() data access method from AnophelesCnvFrequencyAnalysis to
AnophelesCnvData class to achieve arorkAnophelesCnvData class to achieve architectural consistency with the mod tCNV data-layer mixin pattern used throughout the codebase.

Changes:
- Add gfo
Changes:
- Add gene_cnv() and _gene_cnv() methods to Anos d- Add g
Copilot AI review requested due to automatic review settings March 23, 2026 19:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the gene_cnv() CNV gene-level data access method from the CNV frequency analysis layer into the AnophelesCnvData data-layer mixin, aligning CNV access patterns with the existing modular architecture and unblocking cleaner inheritance usage in dipclust.

Changes:

  • Moved gene_cnv() and _gene_cnv() from cnv_frq.py into cnv_data.py (including required helpers/imports).
  • Removed now-unneeded imports and numba helper definitions from cnv_frq.py.
  • Cleaned up dipclust.py by removing the migration TODO and the # type: ignore on self.gene_cnv(...).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
malariagen_data/anoph/cnv_data.py Adds gene_cnv() implementation to the CNV data-layer mixin and relocates numba mode helpers.
malariagen_data/anoph/cnv_frq.py Removes the migrated gene_cnv() implementation and associated unused imports/helpers.
malariagen_data/anoph/dipclust.py Removes TODO and type ignore now that gene_cnv() exists in the inherited data layer.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Raise clear ValueError when no genes found in requested region
- Prevents confusing downstream errors (NaN in min/max, empty vstack)
- Improves user experience with descriptive error message
- All tests passing (96/96)
- All quality checks passing (ruff, pre-commit)
@kunal-10-cloud
Copy link
Contributor Author

kunal-10-cloud commented Mar 24, 2026

Hi @jonbrenas can you please take a look at this PR i have also stated my approach in the issue(#1208)
please let me know if any changes are required
Also i have made the changes you requrested to in the PR(#1212, #1199), whenever you are available please take a look at them as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate gene_cnv() from AnophelesCnvFrequencyAnalysis to AnophelesCnvData (resolves dipclust.py TODO)

2 participants