feat(weights): add exact weight footprint from safetensors header by inureyes · Pull Request #64 · lablup/mlxcel

inureyes · 2026-05-21T12:24:52Z

Summary

Adds byte-accurate weight-size accounting from safetensors metadata so --recommend-quant reports the exact model footprint before any tensor data is loaded, with the analytical config.json estimate as a fallback.

What changed

src/lib/mlxcel-core/src/weights.rs: added weight_footprint_bytes(model_dir) -> Option<u64> (public), parse_shard_index_with_total_size (public) that exposes the previously discarded metadata.total_size field, parse_shard_index_inner (private shared implementation), extract_shards_and_total_size (private), read_safetensors_header_bytes (private), safetensors_dtype_itemsize (private), and the ShardIndexResult type alias. The original parse_shard_index and extract_shards_from_index_json are unchanged in public behavior.
src/execution/quant_advisor.rs: QuantAdvice gains exact_weight_bytes: Option<u64>; advise_quantization calls weight_footprint_bytes and converts exact bytes to a billions-of-parameters signal (takes precedence over the analytical estimate); print_quant_advice shows exact GiB/MiB with source tag and shows the analytical estimate as a reference note. New format_bytes helper. Imports mlxcel_core::weights::weight_footprint_bytes.
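As an illustration of the quant_advisor.rs wiring described above, here is a minimal sketch of the bytes-to-parameters conversion (bytes / 2 / 1e9, FP16 reference, as stated in the commit message) and of a GiB/MiB formatter. The function bodies, the 1 GiB threshold, the two-decimal precision, and the `effective_params_b` name are assumptions made for illustration; the actual `format_bytes` and `advise_quantization` code is not shown on this page and may differ.

```rust
/// Convert an exact weight size in bytes to an FP16-equivalent parameter
/// count in billions: bytes / 2 / 1e9 (2 bytes per FP16 parameter).
fn bytes_to_billion_params(bytes: u64) -> f64 {
    bytes as f64 / 2.0 / 1e9
}

/// Hypothetical formatter: GiB at or above 1 GiB, MiB below.
/// Threshold and precision are assumptions, not the PR's exact output.
fn format_bytes(bytes: u64) -> String {
    const MIB: f64 = 1024.0 * 1024.0;
    const GIB: f64 = MIB * 1024.0;
    let b = bytes as f64;
    if b >= GIB {
        format!("{:.2} GiB", b / GIB)
    } else {
        format!("{:.2} MiB", b / MIB)
    }
}

/// The exact footprint, when present, takes precedence over the
/// analytical config.json estimate (hypothetical helper name).
fn effective_params_b(exact_weight_bytes: Option<u64>, analytical_b: f64) -> f64 {
    exact_weight_bytes
        .map(bytes_to_billion_params)
        .unwrap_or(analytical_b)
}
```

Under these assumptions, a 14,000,000,000-byte checkpoint maps to roughly 7 billion FP16-equivalent parameters, and the analytical estimate is only used when `exact_weight_bytes` is `None`.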

Test plan

cargo test -p mlxcel-core --lib weights::tests — all 22 tests pass (9 new: sharded index with/without total_size, single-file binary header, scalar tensor, dtype itemsize, missing case)
cargo test -p mlxcel --lib execution::quant_advisor::tests — all 11 tests pass (4 new: exact_weight_bytes field, index wiring, format_bytes helpers)
cargo clippy -p mlxcel-core --lib --tests -- -D warnings — clean
cargo clippy -p mlxcel --lib --tests -- -D warnings — clean

Closes #53

Add `weight_footprint_bytes(model_dir) -> Option<u64>` to `mlxcel-core::weights` that returns the byte-accurate weight size before any tensors are loaded.

Resolution order:
1. `metadata.total_size` from `model.safetensors.index.json` (sharded models; already parsed by `parse_shard_index`, now also extracts the discarded field)
2. Safetensors binary header of a single `model.safetensors`: reads 8-byte LE header-length prefix plus the JSON header object, sums dtype × shape-product per tensor entry without touching tensor data
3. Returns `None` when neither is available; callers fall back to analytical estimate

`parse_shard_index` is unchanged in return type; new `parse_shard_index_with_total_size` exposes the extended result via the `ShardIndexResult` type alias (added to silence clippy::type_complexity).

Wire exact footprint into `quant_advisor.rs`:
- `QuantAdvice` gains `exact_weight_bytes: Option<u64>`
- `advise_quantization` calls `weight_footprint_bytes` and converts exact bytes to a billions-of-parameters estimate (bytes / 2 / 1e9, FP16 reference), which supersedes the analytical config.json estimate when present
- `print_quant_advice` shows the exact GiB/MiB figure and source tag when available; analytical estimate is shown as a reference note

New unit tests: 9 in `weights::tests` (sharded index with/without total_size, single-file binary header, scalar tensor, dtype itemsize table, missing case) and 4 in `quant_advisor::tests` (exact_weight_bytes field, index wiring, format_bytes helpers).
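The single-file path and the resolution order in this commit message can be sketched roughly as follows, using only the standard library. This is not the PR's code: `header_json`, `dtype_itemsize`, `tensor_bytes`, `sum_entries`, and `resolve_footprint` are hypothetical names, the dtype table is an assumed subset of safetensors dtype strings, and the JSON parsing step that turns the header text into `(dtype, shape)` entries is left out.

```rust
use std::convert::{TryFrom, TryInto};

/// Return the JSON header text of a safetensors buffer: an 8-byte
/// little-endian length prefix followed by that many bytes of JSON.
/// Yields None for truncated or non-UTF-8 input.
fn header_json(buf: &[u8]) -> Option<&str> {
    let prefix: [u8; 8] = buf.get(..8)?.try_into().ok()?;
    let len = usize::try_from(u64::from_le_bytes(prefix)).ok()?;
    let end = 8usize.checked_add(len)?;
    std::str::from_utf8(buf.get(8..end)?).ok()
}

/// Bytes per element for a safetensors dtype string (assumed table).
fn dtype_itemsize(dtype: &str) -> Option<u64> {
    match dtype {
        "BOOL" | "U8" | "I8" | "F8_E4M3" | "F8_E5M2" => Some(1),
        "U16" | "I16" | "F16" | "BF16" => Some(2),
        "U32" | "I32" | "F32" => Some(4),
        "U64" | "I64" | "F64" => Some(8),
        _ => None,
    }
}

/// Item size times the product of the shape; an empty shape (scalar)
/// counts as one element. Overflow or an unknown dtype yields None.
fn tensor_bytes(dtype: &str, shape: &[u64]) -> Option<u64> {
    let elems = shape.iter().try_fold(1u64, |acc, &d| acc.checked_mul(d))?;
    elems.checked_mul(dtype_itemsize(dtype)?)
}

/// Sum the sizes of already-parsed (dtype, shape) header entries.
fn sum_entries(entries: &[(&str, &[u64])]) -> Option<u64> {
    entries.iter().try_fold(0u64, |acc, &(dtype, shape)| {
        acc.checked_add(tensor_bytes(dtype, shape)?)
    })
}

/// Resolution order: the index's metadata.total_size first, then the
/// single-file header sum, otherwise None (caller falls back to the
/// analytical estimate).
fn resolve_footprint(index_total_size: Option<u64>, header_sum: Option<u64>) -> Option<u64> {
    index_total_size.or(header_sum)
}
```

Because only the length prefix and the JSON header are read, the cost of this path does not depend on how large the tensor data is.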

inureyes added the status:review (Under review) and status:done (Completed) labels and removed the status:review (Under review) label May 21, 2026

style: apply cargo fmt

e8ef21e

inureyes merged commit 3b1e2b3 into main May 21, 2026
4 checks passed

This was referenced May 21, 2026

Epic: Pre-load model memory requirement estimation #52

Closed

fix: tighten memory estimator preflight coverage #68

Merged

inureyes merged 2 commits into main from feature/issue-53-exact-weight-footprint
