Add native multi-GPU device_map support to TensorDeserializer by abatilo · Pull Request #201 · coreweave/tensorizer

abatilo · 2026-02-06T23:42:33Z

Add native multi-GPU device_map support to TensorDeserializer

feat(serialization): add device_map parameter for multi-GPU tensor loading

Add native multi-GPU support to TensorDeserializer via a new device_map
keyword argument. Supports explicit per-tensor placement via a Mapping
and automatic greedy largest-first balancing via a Sequence of devices.

- Greedy balancer uses deserialized_length (bytes) with a min-heap
- Per-device CUDA streams cached in copy threads
- CPU tensors in mixed maps get dedicated buffers to avoid corruption
- Underspecified CUDA devices resolved up front
- read_numpy_arrays guarded against non-CPU device_map targets
- Default path (device_map=None) avoids per-tensor overhead in hot loop
- 19 new unit tests covering placement, balancing, fallback, edge cases
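
For readers unfamiliar with greedy largest-first balancing over a min-heap, here is a minimal standalone sketch of the general technique. It is not the code from this PR: the function name `greedy_balance` and the plain `sizes` dict (standing in for each tensor's deserialized_length in bytes) are hypothetical, and the tie-breaking rules are an assumption chosen to make the result deterministic.

```python
import heapq
from typing import Dict, Sequence


def greedy_balance(sizes: Dict[str, int], devices: Sequence[str]) -> Dict[str, str]:
    """Assign each tensor name to a device, placing the largest tensors first.

    Hypothetical helper for illustration only; `sizes` maps a tensor name
    to its size in bytes.
    """
    if not devices:
        raise ValueError("at least one device is required")

    # Heap entries are (bytes_assigned_so_far, device_index). The index
    # breaks ties, so equally loaded devices are filled in list order.
    heap = [(0, i) for i in range(len(devices))]
    heapq.heapify(heap)

    placement: Dict[str, str] = {}
    # Largest first; ties on size are broken by name (an assumption).
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        load, idx = heapq.heappop(heap)  # least-loaded device right now
        placement[name] = devices[idx]
        heapq.heappush(heap, (load + sizes[name], idx))
    return placement


if __name__ == "__main__":
    example = {"a": 8, "b": 5, "c": 4, "d": 3}
    print(greedy_balance(example, ["cuda:0", "cuda:1"]))
```

In this example the two devices end up with 11 and 9 bytes respectively. Sorting once and then popping the least-loaded device costs O(n log n + n log d) for n tensors and d devices, which stays small next to the cost of the copies themselves.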


abatilo · 2026-02-06T23:42:34Z

This change is part of the following stack:

Add native multi-GPU device_map support to TensorDeserializer #201 ◀

Change managed by git-spice.

abatilo added 2 commits February 6, 2026 14:26

Add native multi-GPU device_map support to TensorDeserializer

97e606c

abatilo wants to merge 2 commits into main from abatilo/feat/device-map
