Fix expand_dims string coordinate dtype inference #11069

garg-khushi · 2026-01-05T09:35:59Z

This PR fixes an inconsistency where expand_dims created object-dtype
coordinates for string inputs instead of NumPy unicode dtype.

Changes:

- Preserve NumPy unicode dtype for string coordinates created via expand_dims
- Prevent object-dtype propagation through concat by improving PandasIndex coord dtype inference
- Add regression tests covering both cases

Fixes #11061
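
For readers unfamiliar with the distinction, here is a minimal NumPy-only sketch of the two dtypes the description refers to. It does not call xarray; the variable names are illustrative and not taken from the PR.

```python
import numpy as np

labels = ["a", "bb"]

# Letting NumPy infer the dtype yields a fixed-width unicode dtype.
as_unicode = np.asarray(labels)

# Forcing object dtype stores generic Python objects instead.
as_object = np.asarray(labels, dtype=object)

print(as_unicode.dtype.kind)  # "U" (unicode)
print(as_object.dtype.kind)   # "O" (object)
```

The PR aims for string coordinates created via expand_dims to end up with the first kind of dtype rather than the second.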


jsignell

Thanks for taking the time to work on this @garg-khushi and sorry it's taken a bit to review. I was wondering if there might be a simpler solution as mentioned in the comments. I tried out the change that I suggested locally and just had to tweak one test:

diff --git a/xarray/tests/test_dataset.py b/xarray/tests/test_dataset.py
index d25ef5a2..6f9edae4 100644
--- a/xarray/tests/test_dataset.py
+++ b/xarray/tests/test_dataset.py
@@ -3860,7 +3860,7 @@ class TestDataset:
         # Regression test for https://github.com/pydata/xarray/issues/7493#issuecomment-1953091000
         # todo: test still needed?
         ds = Dataset().expand_dims({"time": [np.datetime64("2018-01-01", "m")]})
-        assert ds.time.dtype == np.dtype("datetime64[s]")
+        assert ds.time.dtype == np.dtype("datetime64[m]")
 
     def test_set_index(self) -> None:
         expected = create_test_multiindex()
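
The test tweak above is consistent with how NumPy itself treats datetime units: converting a list of minute-resolution scalars with `np.asarray` keeps the minute unit. A small sketch, using only NumPy and not xarray's internals:

```python
import numpy as np

# Same input as in the regression test: a minute-resolution datetime64.
values = [np.datetime64("2018-01-01", "m")]

arr = np.asarray(values)
print(arr.dtype)  # datetime64[m]
```

The earlier `datetime64[s]` expectation presumably came from a round trip through a pandas index, which (as far as I know) only supports s, ms, us and ns resolutions.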

jsignell · 2026-01-22T17:38:00Z

xarray/core/dataset.py

                variables.update(name_and_new_1d_var)
                coord_names.add(k)
                dim[k] = variables[k].size
+


I don't think we need any changes in this file actually.

jsignell · 2026-01-22T18:15:38Z

xarray/core/indexes.py

                coord_dtype = get_valid_numpy_dtype(index)
+                if coord_dtype == object and index.dtype == object:
+                    inferred = getattr(index, "inferred_type", None)
+                    if inferred in ("string", "unicode"):
+                        coord_dtype = np.dtype(str)
+                    else:
+                        data = index.to_numpy(dtype=object, copy=False)
+                        if data.size and all(
+                            isinstance(x, (str, np.str_)) for x in data.ravel()
+                        ):
+                            coord_dtype = np.asarray(data, dtype=str).dtype


I'm not sure if we need to be quite so precise in only targeting objects. This seems to work just as well:

Suggested change

-                coord_dtype = get_valid_numpy_dtype(index)
-                if coord_dtype == object and index.dtype == object:
-                    inferred = getattr(index, "inferred_type", None)
-                    if inferred in ("string", "unicode"):
-                        coord_dtype = np.dtype(str)
-                    else:
-                        data = index.to_numpy(dtype=object, copy=False)
-                        if data.size and all(
-                            isinstance(x, (str, np.str_)) for x in data.ravel()
-                        ):
-                            coord_dtype = np.asarray(data, dtype=str).dtype
+                coord_dtype = get_valid_numpy_dtype(np.asarray(array))
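
To make the comparison concrete, here is a rough NumPy-only sketch of the two strategies. Both helpers are hypothetical stand-ins written for illustration; they are not xarray functions, and `get_valid_numpy_dtype` is deliberately left out because its behavior is not shown in this thread.

```python
import numpy as np


def strict_string_dtype(values):
    """Hypothetical helper mirroring the element-wise check in the PR diff."""
    data = np.asarray(values, dtype=object)
    # Only switch to a unicode dtype when every element is a string.
    if data.size and all(isinstance(x, (str, np.str_)) for x in data.ravel()):
        return np.asarray(data, dtype=str).dtype
    return data.dtype


def asarray_dtype(values):
    """Hypothetical helper: let NumPy infer the dtype, as in the suggestion."""
    return np.asarray(values).dtype


all_strings = ["a", "bb"]
print(strict_string_dtype(all_strings))  # <U2 on little-endian platforms
print(asarray_dtype(all_strings))        # same dtype as the line above

# Mixed or empty input falls back to object in the strict version.
print(strict_string_dtype(["a", 1]))     # object
print(strict_string_dtype([]))           # object
```

For all-string input the two approaches agree, which matches the observation that the shorter version "seems to work just as well" for the case this PR targets.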

Fix string dtype inference for expand_dims coords

174e393

github-actions bot added the topic-indexing label Jan 5, 2026

pre-commit-ci bot and others added 2 commits January 5, 2026 09:38

[pre-commit.ci] auto fixes from pre-commit.com hooks

312668c

for more information, see https://pre-commit.ci

Merge branch 'main' into fix-expand-dims-string-coords

ea96ccb

jsignell reviewed Jan 22, 2026
