Gene symbols patch #76
Review Summary by Qodo
Enhance gene symbol validation and ID detection
Walkthroughs
Description
• Improve gene ID detection to handle float-stringified Entrez IDs (e.g., "7157.0")
• Add validation to reject gene symbols columns containing >50% gene IDs with informative error
• Remove unused gene_symbols_col parameter from validate_adata function
• Add comprehensive test suite for gene ID detection and percentage calculation
Diagram
flowchart LR
A["Gene symbols column"] --> B["materialize_canonical_gene_symbols_column"]
B --> C["clean_gene_names"]
C --> D["_id_like_percentage check"]
D --> E{">50% IDs?"}
E -->|Yes| F["Raise ValueError"]
E -->|No| G["Store canonical column"]
H["_is_gene_id_like"] --> D
I["Tests"] --> H
I --> D
File Changes
1. cytetype/main.py
Code Review by Qodo
@parashardhapola

Catch Entrez IDs
Raise an error if the auto-detected gene symbols column contains more than 50% gene IDs
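
As a rough illustration of the behavior described above, the sketch below rejects a column when more than 50% of its entries look like gene IDs. The helper name `check_gene_symbols_column`, its parameters, the ID pattern, and the error wording are all hypothetical and are not taken from the PR.

```python
import re

# Assumed ID pattern: Ensembl gene IDs, or integer Entrez IDs that may be
# float-stringified ("7157.0"). The PR's actual rules may differ.
_ID_RE = re.compile(r"^(ENS[A-Z]*G\d{6,}(\.\d+)?|\d+(\.0+)?)$", re.IGNORECASE)


def check_gene_symbols_column(values, column_name, threshold=50.0):
    """Hypothetical helper: raise ValueError if more than `threshold`
    percent of the values look like gene IDs instead of gene symbols."""
    values = [str(v).strip() for v in values]
    if not values:
        return
    id_like = sum(bool(_ID_RE.match(v)) for v in values)
    pct = 100.0 * id_like / len(values)
    # Strictly greater than the threshold, matching the "> 50%" rule.
    if pct > threshold:
        raise ValueError(
            f"Column '{column_name}' looks like gene IDs, not symbols: "
            f"{pct:.1f}% of entries are ID-like (limit {threshold:.0f}%)."
        )
```

In this sketch a column that is exactly half IDs passes, because the comparison is strict; only a majority of ID-like entries triggers the error.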