Copilot AI commented Jan 27, 2026

pd.StringDtype was not supported in _process_pandas_column, causing a TypeError when processing DataFrames with string columns using pandas' nullable string dtype.

Changes

  • Added StringDtype handling in _process_pandas_column (lines 1039-1056)

    • Checks isinstance(dt, pd.StringDtype) after CategoricalDtype
    • Handles missing values via .hasnans, .dropna(), and .notna()
    • Converts to numpy string arrays with .astype(np.str_, copy=False)
    • Supports both storage='python' and storage='pyarrow' backends
  • Added test case test_unify_columns_pandas_missings_StringDtype

    • Validates StringDtype columns with missing values
    • Follows existing pattern for nullable types (Int64Dtype, BooleanDtype)
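
The handling listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `split_string_column` and its return convention are hypothetical, and it goes through `to_numpy(dtype=object)` before the `astype(np.str_, copy=False)` step, which is an assumption made here so the sketch does not depend on the storage backend.

```python
import numpy as np
import pandas as pd


def split_string_column(col):
    """Hypothetical helper: return (values, present) for a StringDtype column.

    values  -- numpy unicode array holding only the non-missing entries
    present -- boolean mask of non-missing rows, or None if nothing is missing
    """
    if not isinstance(col.dtype, pd.StringDtype):
        raise TypeError(f"expected StringDtype, got {col.dtype}")

    present = None
    if col.hasnans:
        # Remember which rows hold real values, then drop the missing ones.
        present = col.notna().to_numpy()
        col = col.dropna()

    # Assumption of this sketch: materialize Python strings as an object
    # array first, then convert to a fixed-width numpy unicode array.
    values = col.to_numpy(dtype=object).astype(np.str_, copy=False)
    return values, present


col = pd.Series(["A", "B", None, "C"], dtype=pd.StringDtype())
values, present = split_string_column(col)
print(values)
print(present)
```

Returning `None` for the mask when nothing is missing is a choice of this sketch, not something the PR description states.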

Example

import pandas as pd
from interpret.glassbox import ExplainableBoostingClassifier

# Now works with StringDtype columns
df = pd.DataFrame({
    'category': pd.Series(['A', 'B', None, 'C'], dtype=pd.StringDtype()),
    'city': pd.Series(['NYC', None, 'LA', 'SF'], dtype=pd.StringDtype()),
    'target': [0, 1, 0, 1]
})

ebm = ExplainableBoostingClassifier()
ebm.fit(df[['category', 'city']], df['target'])

Co-authored-by: paulbkoch <46825734+paulbkoch@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement handling of pd.StringDtype" to "Implement pd.StringDtype handling" Jan 27, 2026
Copilot AI requested a review from paulbkoch January 27, 2026 09:16
codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.77%. Comparing base (766b663) to head (cf10959).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #647      +/-   ##
==========================================
+ Coverage   72.76%   72.77%   +0.01%     
==========================================
  Files          75       75              
  Lines       10447    10451       +4     
==========================================
+ Hits         7602     7606       +4     
  Misses       2845     2845              
Flag Coverage Δ
bdist_linux_310_python 72.46% <100.00%> (+0.07%) ⬆️
bdist_linux_311_python 72.41% <100.00%> (-0.06%) ⬇️
bdist_linux_312_python 72.48% <100.00%> (+0.12%) ⬆️
bdist_linux_313_python 72.41% <100.00%> (+0.02%) ⬆️
bdist_mac_310_python 72.60% <100.00%> (+0.04%) ⬆️
bdist_mac_311_python 72.65% <100.00%> (+0.10%) ⬆️
bdist_mac_312_python 72.65% <100.00%> (+0.09%) ⬆️
bdist_mac_313_python 72.63% <100.00%> (+0.02%) ⬆️
bdist_win_310_python 72.67% <100.00%> (+0.01%) ⬆️
bdist_win_311_python 72.69% <100.00%> (+0.02%) ⬆️
bdist_win_312_python 72.67% <100.00%> (+0.09%) ⬆️
bdist_win_313_python 72.60% <100.00%> (+0.01%) ⬆️
sdist_linux_310_python 72.44% <100.00%> (+0.09%) ⬆️
sdist_linux_311_python 72.42% <100.00%> (+0.09%) ⬆️
sdist_linux_312_python 72.32% <100.00%> (+<0.01%) ⬆️
sdist_linux_313_python 72.42% <100.00%> (+0.01%) ⬆️
sdist_mac_310_python 72.53% <100.00%> (-0.03%) ⬇️
sdist_mac_311_python 72.55% <100.00%> (+0.17%) ⬆️
sdist_mac_312_python 72.55% <100.00%> (-0.01%) ⬇️
sdist_mac_313_python 72.55% <100.00%> (+0.01%) ⬆️
sdist_win_310_python 72.67% <100.00%> (+0.09%) ⬆️
sdist_win_311_python 72.60% <100.00%> (+0.06%) ⬆️
sdist_win_312_python 72.67% <100.00%> (+0.07%) ⬆️
sdist_win_313_python 72.67% <100.00%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown.
