fix: replace NotImplementedError in _surveillance_flags with graceful fallback #1207

Open
khushthecoder wants to merge 7 commits into malariagen:master from khushthecoder:fix/issue-1206-surveillance-flags-graceful-fallback

khushthecoder commented Mar 23, 2026

Summary

Fixes #1206. The abstract method _surveillance_flags in AnophelesBase previously raised NotImplementedError when a subclass did not implement it. For new species resources or minimal subclasses during development, this caused a hard crash instead of a clear warning and graceful fallback. This PR replaces the raise with a fallback that returns an empty DataFrame with the expected schema and emits a UserWarning.

Problem

When _surveillance_flags is not overridden (e.g. during development of a new species or when using a minimal subclass of AnophelesBase), any call that reaches this method raises NotImplementedError. The crash occurs in several paths:

  • _relevant_sample_sets(): when surveillance_use_only=True, it filters sample sets via _sample_set_has_surveillance_data(), which calls _surveillance_flags()
  • sample_metadata() merges surveillance flags via _surveillance_flags()
  • wgs_data_catalog() when surveillance_use_only=True

In contrast, _parse_surveillance_flags in sample_metadata.py already handles missing surveillance data (FileNotFoundError) by returning an empty DataFrame and emitting a UserWarning. The base stub now follows the same pattern.
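To illustrate the failure mode described above, here is a minimal sketch of how a raising stub propagates through a caller. The class `RaisingBase`, the helper `probe`, the method signatures, and the sample set name are hypothetical stand-ins, not the actual malariagen_data code.

```python
class RaisingBase:
    """Hypothetical stand-in for the pre-PR base class behaviour."""

    def _surveillance_flags(self, sample_sets):
        # Old behaviour: the stub raises instead of returning data.
        raise NotImplementedError("Must override _surveillance_flags")

    def _sample_set_has_surveillance_data(self, sample_set):
        # Any caller that reaches the stub inherits the exception.
        flags = self._surveillance_flags(sample_sets=[sample_set])
        return bool(flags["is_surveillance"].any())


def probe(resource, sample_set):
    """Report whether the check completes or crashes (illustrative helper)."""
    try:
        return resource._sample_set_has_surveillance_data(sample_set)
    except NotImplementedError as exc:
        return f"crashed: {exc}"


print(probe(RaisingBase(), "demo-set"))
```

Every public path listed above ends in the same stub, so one unimplemented method is enough to abort all of them.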

Changes

malariagen_data/anoph/base.py

  • Added import warnings
  • Replaced raise NotImplementedError(...) in _surveillance_flags with a graceful fallback that:
    • Emits UserWarning with message: "Surveillance flags not implemented for this resource; returning empty data." (uses stacklevel=2 so the warning points to the caller)
    • Returns an empty pd.DataFrame with columns sample_id (object) and is_surveillance (nullable boolean), matching the schema expected by downstream code
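As a rough sketch of the fallback described in the bullets above (not the exact diff), the stub body could look like the following. The free function name `surveillance_flags_fallback` is hypothetical; the warning message, `stacklevel=2`, column names, and dtypes are taken from this description.

```python
import warnings

import pandas as pd


def surveillance_flags_fallback():
    """Warn and return an empty frame with the expected schema (sketch)."""
    warnings.warn(
        "Surveillance flags not implemented for this resource; "
        "returning empty data.",
        UserWarning,
        stacklevel=2,  # attribute the warning to the caller, not this function
    )
    return pd.DataFrame(
        {
            "sample_id": pd.Series(dtype="object"),
            # Nullable boolean extension dtype, so missing values stay <NA>.
            "is_surveillance": pd.Series(dtype="boolean"),
        }
    )
```

Building the frame from typed empty Series keeps the column dtypes explicit even though there are no rows.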

tests/anoph/test_base.py

  • Added TestSurveillanceFlagsBaseFallback with two tests:
    • test_surveillance_flags_base_returns_empty_and_warns: Asserts that the base implementation returns an empty DataFrame with correct columns and dtypes and emits UserWarning
    • test_sample_set_has_surveillance_data_returns_false_when_fallback: Asserts that _sample_set_has_surveillance_data() returns False when the base fallback is used (empty DataFrame → .any() is False)
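The two tests might be shaped roughly as below. This is a self-contained sketch using only the standard-library `warnings` module; `FallbackBase`, the method signatures, and the `"demo-set"` name are hypothetical, and the real tests in tests/anoph/test_base.py may be structured differently.

```python
import warnings

import pandas as pd


class FallbackBase:
    """Hypothetical stand-in for a base class with the warn-and-return-empty stub."""

    def _surveillance_flags(self, sample_sets):
        warnings.warn(
            "Surveillance flags not implemented for this resource; "
            "returning empty data.",
            UserWarning,
            stacklevel=2,
        )
        return pd.DataFrame(
            {
                "sample_id": pd.Series(dtype="object"),
                "is_surveillance": pd.Series(dtype="boolean"),
            }
        )

    def _sample_set_has_surveillance_data(self, sample_set):
        flags = self._surveillance_flags(sample_sets=[sample_set])
        return bool(flags["is_surveillance"].any())


def test_base_returns_empty_and_warns():
    # Record warnings so the test can assert that a UserWarning was emitted.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df = FallbackBase()._surveillance_flags(sample_sets=["demo-set"])
    assert any(issubclass(w.category, UserWarning) for w in caught)
    assert list(df.columns) == ["sample_id", "is_surveillance"]
    assert df.empty


def test_has_surveillance_data_is_false_on_fallback():
    # Silence the expected warning; only the return value matters here.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning)
        assert FallbackBase()._sample_set_has_surveillance_data("demo-set") is False
```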

Compatibility

  • No circular dependencies: The fallback does not call _prep_sample_sets_param() or general_metadata().
  • Schema preserved: Downstream code that expects sample_id and is_surveillance columns receives that structure; empty DataFrame makes merges harmless and .any() False.
  • Existing subclasses unchanged: AnophelesSampleMetadata._surveillance_flags() overrides the base and continues to load real data; all full species (Ag3, Af1, Adir1, etc.) use that implementation.
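To make the "merges harmless" point concrete, here is a small sketch assuming the downstream code left-joins flags onto sample metadata by sample_id. The join type, the `metadata` frame, and its values are illustrative assumptions, not taken from the library.

```python
import pandas as pd

# Illustrative metadata; real sample metadata has many more columns.
metadata = pd.DataFrame(
    {
        "sample_id": pd.Series(["S1", "S2"], dtype="object"),
        "country": ["Kenya", "Mali"],
    }
)

# The empty frame the fallback returns, with the expected schema.
empty_flags = pd.DataFrame(
    {
        "sample_id": pd.Series(dtype="object"),
        "is_surveillance": pd.Series(dtype="boolean"),
    }
)

# A left join keeps every metadata row; the flag column is simply missing.
merged = metadata.merge(empty_flags, on="sample_id", how="left")

print(len(merged))
print(merged["is_surveillance"].isna().all())
print(bool(empty_flags["is_surveillance"].any()))
```

Under that assumption no rows are dropped, and any check based on .any() over the empty flags evaluates to False.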

Impact

  • Developer experience for new species: Developers see a warning instead of a crash during initial integration.
  • Consistency: Same pattern as _parse_surveillance_flags (missing data → warn + empty DataFrame).
  • Robustness: Future refactors or base-only code paths no longer cause unexpected crashes.

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.48%. Comparing base (c66eb1a) to head (a80d41d).
⚠️ Report is 20 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1207      +/-   ##
==========================================
+ Coverage   89.46%   89.48%   +0.02%     
==========================================
  Files          51       51              
  Lines        6008     6020      +12     
==========================================
+ Hits         5375     5387      +12     
  Misses        633      633              

