Fix: implement classification for in-frame complex variants#1201
Fix: implement classification for in-frame complex variants#1201kunal-10-cloud wants to merge 4 commits intomalariagen:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses Issue #1200 by replacing the in-frame complex-variant “TODO” placeholder in veff with a helper-based classification path, and adding tests to validate the new behavior.
Changes:
- Replaced placeholder effect/impact assignment for mixed MNP+INDEL in-frame variants with
_classify_inframe_complex_effect(). - Added a new unit test module covering the helper’s effect/impact outputs and basic invariants (e.g., no “TODO”, field preservation).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
malariagen_data/veff.py |
Routes in-frame complex (mixed MNP+INDEL) variants through a new helper to assign non-UNKNOWN effect/impact. |
tests/test_veff.py |
Adds tests for the helper and basic output integrity checks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…sification - Update classification logic with position-aware checks: * START_LOST: check first amino acid only (ref_aa[0]) * STOP mutations: check presence anywhere in string - Add 4 new test cases for multi-codon scenarios (MLR, Q*R, QR*, etc.) - Rename TestInframeComplexEffectIntegration to TestClassifyInframeComplexEffectEdgeCases - Test coverage: 16/16 passing
|
Hi @kunal-10-cloud, as I explained in #1181 (which solves the exact same issue you are addressing here), the situation is more complex. Your solution is more complex than what could be found in #1181, granted, but it is also not accurate. |
Replace TODO with INFRAME_COMPLEX_VARIANT/MODERATE placeholder. Code path unreachable in production (SNP-only), proper classification requires frame-shift analysis per PR malariagen#1181 discussion.
|
hi @jonbrenas, I have tried to simplify the approach. Can you please take a look at it and let me know how can i proceed |
|
Hi @kunal-10-cloud, I don't think you understood what I was saying. You basically reverted your PR to what was in #1181. That's not what I was asking at all. What I am saying is that:
All that to say, this is not an issue that you can or should try to solve and so this PR has no reason to exist. |
Hi @jonbrenas i understood your point and i was aware of this thats why in the comments i mentioned the same thing as well and asked how to proceed irrespective of that really thanks for the description and taking out your time. |
Summary
Replaces TODO placeholder code with a classification helper function that accurately determines effect and impact for in-frame complex variants by comparing amino acid sequences. Includes position-aware detection for START/STOP codon changes and comprehensive multi-codon test coverage.
Fixes: #1198
What Changed
Added
_classify_inframe_complex_effect()helper function that:Added comprehensive test suite with 16 tests validating:
Validation
Technical Approach
In-frame complex variants have complete amino acid information available, enabling SNP-style classification. Position-aware logic ensures: