
feat: add list_taxons and list_species_calls utilities to SampleMetadata#1164

Open
Aadik1ng wants to merge 2 commits into malariagen:master from Aadik1ng:feat/taxon-discovery-api

Conversation

@Aadik1ng

This PR makes it easier to discover which taxa and species calls are present in sample sets. Currently, finding the unique taxonomic groups in a release requires manually loading all sample metadata and computing the unique values.

The new methods list_taxons() and list_species_calls() provide a read-only API for this task, so users can explore and prepare data without writing that boilerplate themselves.

Changes

  • Added AnophelesSampleMetadata.list_taxons(): Returns a sorted list of unique taxon names.
  • Added AnophelesSampleMetadata.list_species_calls(): Returns a sorted list of unique species calls from AIM data (if available).
  • Updated malariagen_data/anoph/sample_metadata.py with robust implementations that handle missing columns and data types gracefully.
  • Added comprehensive numpydoc-compliant docstrings with usage examples.
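To illustrate the behaviour described in the list above (sorted unique values, with missing columns and non-string entries tolerated), here is a minimal sketch. It is not the code from this PR: the helper name `list_unique_values` is hypothetical, a plain dict stands in for the sample metadata table, and the column names `taxon` and `aim_species` are assumptions made for the example.

```python
from typing import Iterable, List, Mapping


def list_unique_values(
    table: Mapping[str, Iterable[object]], column: str
) -> List[str]:
    """Return the sorted unique string values in `column`.

    Hypothetical helper, not part of malariagen_data. Returns an empty
    list when the column is absent, and skips non-string entries such
    as None or NaN instead of raising.
    """
    if column not in table:
        # Missing column (e.g. no AIM data available): nothing to list.
        return []
    # A set removes duplicates; sorted() gives a stable, readable order.
    return sorted({value for value in table[column] if isinstance(value, str)})


# Stand-in for a sample metadata table (assumed column names).
metadata = {
    "sample_id": ["S1", "S2", "S3", "S4"],
    "taxon": ["gambiae", "coluzzii", None, "gambiae"],
}

taxa = list_unique_values(metadata, "taxon")
missing = list_unique_values(metadata, "aim_species")
```

Returning an empty list for an absent column, rather than raising, is one way to read "handle missing columns gracefully"; the actual methods may behave differently.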

Testing

Added new test cases to tests/anoph/test_sample_metadata.py:

  • test_list_taxons
  • test_list_species_calls

Verified functionality across the full suite of simulated datasets:

  • ag3
  • af1
  • adir1
  • amin1

All 20 newly added test cases passed locally.

@Aadik1ng
Author

@jonbrenas Could you please review this PR?

@jonbrenas
Collaborator

Thanks @Aadik1ng, I am not sure list_species_calls is a very accurate name. Wouldn't list_aim_calls or list_aim_species_calls be better?

I think there is a duplicated return in your code.

@Aadik1ng Aadik1ng force-pushed the feat/taxon-discovery-api branch from 853827e to e6b2f2c Compare March 24, 2026 05:42
