Fix tensor lifetime issue #4228
Conversation
```cpp
auto dims = core::util::toVec(out_shape);
auto type = util::TRTDataTypeToScalarType(compiled_engine->exec_ctx->getEngine().getTensorDataType(name.c_str()));
outputs[pyt_idx] = std::move(at::empty(dims, {at::kCUDA}).to(type).contiguous());
```
By the way, a separate cleanup should be done here; this line should instead be

```cpp
outputs[pyt_idx] = at::empty(dims, at::TensorOptions().device(at::kCUDA).dtype(type));
```

This would go from two allocations plus a dtype-conversion kernel down to a single allocation.
I think this is the same line Shane identified as well.
narendasan
left a comment
This looks good to me
Force-pushed from 3b6cdb3 to 66c5a42
Force-pushed from 66c5a42 to 61b3003
```cpp
// recycled by the caching allocator for output tensors, aliasing inputs
// onto outputs and corrupting reads after the first output write.
std::list<at::Tensor> setup_input_tensors(
    std::vector<at::Tensor> inputs,
```
What if we mutated `inputs` in place? Something like

```cpp
void setup_input_tensors(
    std::vector<at::Tensor>& inputs,
    ...
```

Similar to what is happening for the inputShapeTensors.
Yeah, it kinda seems like the same thing but wrapped in a different package
narendasan
left a comment
The only alternative design I can think of is mutating the vector of input tensors in place with the new contiguous versions.
Description
This change fixes a correctness issue that I and others were seeing when running the FLUX2 diffusion model. The model, when compiled with either TensorRT or TensorRT-RTX was producing garbage images.
The issue was that the input tensor's lifetime was incorrect. The input tensor's ref count dropped to 0 before the engine ran with
enqueueV3(). In this specific case, it was a bit of a perfect storm with an output having the same size and shape and also there being a fp32->bf16 cast. Another tensor was being allocated (the output tensor) and that was given the address of the input tensor.Type of change
Checklist: