
Optimize WAV decoding with buffered read#1320

Open
Dan-Flores wants to merge 37 commits into
meta-pytorch:mainfrom
Dan-Flores:wav_decode_optimization

Conversation

@Dan-Flores (Contributor) commented Apr 2, 2026

This PR enables the optimized file reading from this TODO:

TODO WavDecoder: Optimize decoding and conversion by processing samples in
fixed size buffer. For now, we use a simpler implementation: create a
tensor from raw bytes and convert tensor to float32

The optimization: read samples and convert them through a fixed size buffer, rather than reading all values at once.

  • In WavDecoder::getSamplesInRange, we create a buffer close to DEFAULT_CHUNK_BUFFER_SIZE bytes.
    • The actual buffer size is adjusted based on the numBytesPerSample, to ensure we do not read partial samples.
  • In WavDecoder::getSamplesInRange, we process samples according to the buffer size and create the output samples tensor.
  • WavDecoder::convertSamplesToFloat does the actual conversion to float32.
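The loop these bullets describe might look roughly like the following minimal sketch. It uses a PCM16 path for brevity (the benchmark file is s32), and `kBufferBytes`, `decodePcm16`, and all other names are illustrative, not the PR's actual API:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative sketch: read fixed-size, sample-aligned slices of the data
// chunk and convert each slice to float32 before moving on, instead of
// materializing all raw bytes at once.
constexpr size_t kBufferBytes = 8192; // mirrors soundfile's default

std::vector<float> decodePcm16(const std::vector<uint8_t>& data) {
  const size_t numBytesPerSample = sizeof(int16_t);
  // Round the buffer down to a whole number of samples so no read
  // ever splits a sample across two passes.
  const size_t bufSize =
      (kBufferBytes / numBytesPerSample) * numBytesPerSample;
  constexpr float scale = 1.0f / 32768.0f; // int16 full-scale

  std::vector<float> out(data.size() / numBytesPerSample);
  size_t samplesProcessed = 0;
  for (size_t offset = 0; offset < data.size(); offset += bufSize) {
    const size_t bytesThisPass = std::min(bufSize, data.size() - offset);
    const size_t samplesThisPass = bytesThisPass / numBytesPerSample;
    for (size_t i = 0; i < samplesThisPass; ++i) {
      int16_t s;
      std::memcpy(&s, data.data() + offset + i * numBytesPerSample, sizeof(s));
      out[samplesProcessed + i] = static_cast<float>(s) * scale;
    }
    samplesProcessed += samplesThisPass;
  }
  return out;
}
```

The key point is the alignment step: the working buffer is close to, but at most, the target size, so every pass handles complete samples.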

Additional changes:
readValue is split into safeReadValue and readValue. This is necessary for performance; see inline comments for more details.

Benchmarks:


Audio Decoding Benchmark

File: test/resources/sine_stereo_s32_10min.wav
Warmup: 3, Trials: 10

AudioDecoder  0.249s
WavDecoder    0.148s
SoundFile     0.146s

@pytorch-bot Bot commented Apr 2, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/meta-pytorch/torchcodec/1320

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 8480eec with merge base 7e8c399:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla Bot added the CLA Signed label Apr 2, 2026
@Dan-Flores marked this pull request as draft April 16, 2026 05:42
Comment thread src/torchcodec/_core/WavDecoder.cpp
Comment thread src/torchcodec/_core/WavDecoder.cpp Outdated
Comment thread src/torchcodec/_core/WavDecoder.cpp Outdated
@Dan-Flores marked this pull request as ready for review April 21, 2026 02:51
Comment thread src/torchcodec/_core/WavDecoder.cpp Outdated
constexpr float scale =
1.0f / static_cast<float>(std::numeric_limits<int32_t>::max());
for (int64_t i = 0; i < totalSamples; ++i) {
int32_t sample = readValue<int32_t>(
@Dan-Flores (Contributor Author) commented:

Here we use readValue() without the additional TORCH_CHECKs in safeReadValue() for performance.

It is safe to read values without the bounds checks because the TORCH_CHECK above ensures that bufferData is large enough for totalSamples.
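The split described here could be sketched as follows. The signatures and the `std::out_of_range` stand-in for TORCH_CHECK are hypothetical, not the PR's actual code; it also assumes a little-endian host, which matches WAV's byte order:

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hot path: trusts the caller to have validated bounds already,
// so no per-read check is paid inside tight conversion loops.
template <typename T>
T readValue(const uint8_t* data, size_t offset) {
  T value;
  std::memcpy(&value, data + offset, sizeof(T));
  return value;
}

// Safe path: bounds-check once, then delegate to the unchecked read.
template <typename T>
T safeReadValue(const std::vector<uint8_t>& data, size_t offset) {
  if (offset + sizeof(T) > data.size()) {
    throw std::out_of_range("read past end of buffer");
  }
  return readValue<T>(data.data(), offset);
}
```

The win is that a single up-front check on the whole buffer replaces one check per sample in the conversion loop.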

Comment thread src/torchcodec/_core/WavDecoder.cpp Outdated
std::vector<uint8_t> buffer(alignedBufferSize);

int64_t samplesProcessed = 0;
auto samples =
A contributor commented:

From my understanding, numSamples = dataSize / numBytesPerSample. Would we want to validate dataSize in case the header value is some arbitrarily large number due to corruption?

@Dan-Flores (Contributor Author) commented Apr 28, 2026:

For integer overflow, we do not need to validate before doing that division, because int32_t / uint16_t will always fit inside int64_t.

For general overallocation concerns, this is a bit trickier to validate. Since dataSize is int32_t, the maximum valid value it can have is substantial but not always incorrect. I'll add a clamp to ensure the tensor allocated is not larger than the actual file, to prevent an incorrect header from causing us to overallocate.
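The clamp described here might look like this minimal sketch (names are hypothetical, not the PR's code):

```cpp
#include <algorithm>
#include <cstdint>

// Bound the sample count by what the file can actually contain, so a
// corrupt dataSize in the header cannot trigger a huge allocation.
int64_t clampedNumSamples(
    int32_t dataSize,
    int64_t fileSize,
    uint16_t numBytesPerSample) {
  // int32_t / uint16_t always fits in int64_t, so the division itself
  // cannot overflow; the clamp only guards against overallocation.
  const int64_t clampedDataSize = std::min<int64_t>(dataSize, fileSize);
  return clampedDataSize / numBytesPerSample;
}
```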

@Dan-Flores (Contributor Author) commented:

To address the other dimension of our allocated tensor, header_.numChannels: there is no equivalent ground truth for it the way the filesystem's fileSize serves as one for dataSize.

Instead, I've added a validation in validateHeader() for numBytesPerSample that the WAV specification already requires (labelled here as BytePerBloc):

numBytesPerSample == numChannels * bitsPerSample / 8

We can substitute the values used to calculate numSamples to see that the number of elements in the tensor is bounded by the file size:

samples tensor elements = numSamples * header_.numChannels
	= (std::min(header_.dataSize, fileSize_) / header_.numBytesPerSample) * header_.numChannels
	<= (fileSize_ / header_.numBytesPerSample) * header_.numChannels
	= (fileSize_ / (numChannels * bitsPerSample / 8)) * header_.numChannels
	= fileSize_ / (bitsPerSample / 8)

As an aside: If the header is corrupted such that numSamples is calculated incorrectly, we were already going to fail in WavDecoder::getSamplesInRange() when we call safeReadFile() to read data beyond what exists. Clamping dataSize to fileSize doesn't prevent that failure either, but it does prevent us from overallocating before failing.
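As a quick numeric check of that bound, plugging in made-up values (all names and numbers illustrative):

```cpp
#include <cstdint>

// Evaluates the first line of the derivation above for given header values,
// assuming the WAV constraint numBytesPerSample == numChannels * bitsPerSample / 8.
int64_t tensorElements(
    int64_t fileSize,
    int64_t numChannels,
    int64_t bitsPerSample) {
  const int64_t numBytesPerSample = numChannels * bitsPerSample / 8;
  const int64_t numSamples = fileSize / numBytesPerSample;
  return numSamples * numChannels;
}
```

For a 1,000,000-byte file with 2 channels of 32-bit samples, this gives 250,000 elements, exactly the bound fileSize_ / (bitsPerSample / 8) = 1,000,000 / 4.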

Comment on lines +31 to +33
// Soundfile's default chunk size. See
// https://github.com/libsndfile/libsndfile/blob/master/src/common.h#L77
constexpr size_t DEFAULT_CHUNK_BUFFER_SIZE = 8192;
A contributor commented:

I think we should avoid naming this a "chunk".

A "chunk" in the context of WAV files refers to the "format chunk" or "data chunk". This thing is just a temporary buffer that 'views' the data chunk, but it's not a chunk of its own (i.e. it's not part of the WAV specs).

int64_t totalSamples = samplesInBuffer * header_.numChannels;

// Normalize 32-bit PCM samples to [-1.0, 1.0] range.
// We use readValue because the buffer size is already validated above.
A contributor commented:

Can you show me where this is validated?

By 'above' I think you mean 'by the caller', but I don't see where. There should also be a comment about this expectation at the top of the function, similar to the one that exists for outputPtr.

@Dan-Flores (Contributor Author) commented:

The chain of intermediate variables passed to readValue() from getSamplesInRange() is complex, so I am less confident in this claim. Since we create the buffer and determine the number of samples we read ourselves, reading from the buffer without checking the length each time should be safe, but this could use some refactoring to make it clearer.
