
Do not treat NA values or empty cells as filled metadata #171

Draft
Copilot wants to merge 2 commits into main from copilot/fix-na-value-handling

Copilot AI commented Feb 27, 2026

Fields containing NA or empty values were treated as already-populated metadata, so the annotator skipped them on subsequent runs: its check, if job.target not in metadata, evaluates to False whenever the key exists, even when the stored value is NA.

Changes

  • Generic.py — New is_na_value() helper detecting None, float NaN, and common NA strings ('na', 'n/a', 'nan', 'none', '')
  • DataFrame.get_metadata() — Filters out NA values from each record so empty/NA columns appear absent to the annotator
  • Spectra.get_metadata() — Filters out NA values from matchms metadata dicts, returning plain dicts without NA entries

# Before: NA cells survive into metadata, blocking annotation
{'formula': 'H2', 'inchikey': nan, 'smiles': 'NA'}  # inchikey/smiles won't be re-populated

# After: NA values stripped at read time
{'formula': 'H2'}  # annotator will attempt to fill inchikey and smiles
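
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: is_na_value() is named in the change list, but its body here is a guess based on that description, and NA_STRINGS and drop_na() are hypothetical names. Stripping whitespace before comparing is also an assumption.

```python
import math

# Assumed set of NA spellings, copied from the list in the description above.
NA_STRINGS = {"na", "n/a", "nan", "none", ""}


def is_na_value(value):
    """Return True for None, float NaN, or a common NA string (case-insensitive)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    # Assumption: surrounding whitespace is ignored when matching NA strings.
    if isinstance(value, str) and value.strip().lower() in NA_STRINGS:
        return True
    return False


def drop_na(record):
    """Hypothetical helper: copy a metadata dict without its NA entries."""
    return {key: value for key, value in record.items() if not is_na_value(value)}


raw = {"formula": "H2", "inchikey": float("nan"), "smiles": "NA"}
clean = drop_na(raw)

# The annotator's membership check only lets a field through once the key is gone.
print("smiles" not in raw)    # False: key exists, so the field would be skipped
print("smiles" not in clean)  # True: key removed, so the field can be filled
```

Removing the keys at read time, rather than changing the annotator's check, keeps the membership test untouched and confines NA handling to the two get_metadata() readers.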

Tests

Added sample_metadata_with_na.csv and sample_with_na.msp covering NA, nan, N/A, None, and empty cell variants, with a parametrized test_na_values_filtered_from_metadata test for both backends.
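
A check for the tabular backend might look something like the sketch below. It uses only the standard library and plain assertions so that it runs standalone; the actual test is parametrized and reads the fixture files. The CSV content, read_metadata(), and NA_STRINGS are all made up for illustration, since the real fixtures and reader code are not shown here.

```python
import csv
import io

# Assumed NA spellings, following the list given in the Changes section.
NA_STRINGS = {"na", "n/a", "nan", "none", ""}


def read_metadata(text):
    """Hypothetical reader: parse CSV text and drop NA cells from each row."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {key: value for key, value in row.items()
         if value.strip().lower() not in NA_STRINGS}
        for row in rows
    ]


# Made-up fixture content covering the NA, nan, N/A, None, and empty variants.
CSV_TEXT = (
    "formula,inchikey,smiles\n"
    "H2,NA,nan\n"
    "H2O,N/A,None\n"
    "CO2,,O=C=O\n"
)

records = read_metadata(CSV_TEXT)

# Every NA-like inchikey cell should be absent, so the annotator can fill it.
for record in records:
    assert "inchikey" not in record

# A real value in the same row must survive the filtering.
assert records[2]["smiles"] == "O=C=O"
```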

Original prompt

This section details on the original issue you should resolve

<issue_title>Do not treat NA values or empty cells as filled metadata cells</issue_title>
<issue_description>The tool currently has all cells covered with NA or other values after one run, so subsequent runs don't add new values to those fields. NA or None values should be ignored when reading tabular or MSP data</issue_description>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: hechth <12066490+hechth@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix NA values handling in metadata processing" to "Do not treat NA values or empty cells as filled metadata" on Feb 27, 2026
Development

Successfully merging this pull request may close these issues.

Do not treat NA values or empty cells as filled metadata cells