INT8 calibration failure of TensorRT 10.3 when running trtexec --int8 on Jetson Orin (CC 8.7) #4797

@olekspickle

Description

Hey!
I'm trying to build an .engine from my.onnx (opset 19, 2144×2144 input), and the build fails in checkSanity.cpp:218 with the assertion item.second != nullptr during graph optimization of the calibration graph.

This occurs with every calibrator type I tried (EntropyCalibrator2, Entropy, MinMax, and trtexec's built-in calibration) on Jetson Orin (SM 8.7) with TRT 10.3.0, while a plain --fp16 build succeeds.

I suspect the TRT 10.x graph optimizer incorrectly removes a region during the INT8 calibration pass for models with opset ≥ 19 or certain operator combinations.

Environment

TensorRT Version: 10.3.0

NVIDIA GPU: Jetson Orin (CC 8.7, SM count: 8)

NVIDIA Driver Version: 540.5.0

CUDA Version: 12.6

CUDNN Version: 9.3.0.75

Operating System: Linux aarch64 (JetPack R36.5.0, kernel OOT)

Python Version (if applicable): Python 3.10.12

PyTorch Version (if applicable): 2.9.0

Baremetal or Container (if so, version): Baremetal (Jetson Orin)

Relevant Files

Model link: unfortunately I can't share the model, but the issue should be reproducible with these model params:

  • ONNX Opset: 19 (PyTorch 2.9.0)
  • Model input: 2144×2144×3
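For anyone sizing a calibration batch buffer for a stand-in model with these parameters, the arithmetic below gives the size of one input tensor. It is only a sketch: it assumes batch size 1 and a dense layout, and the dtype table is illustrative rather than taken from the model.

```python
# Input shape from the report: 2144 x 2144 x 3 (batch size 1 is an assumption).
HEIGHT, WIDTH, CHANNELS = 2144, 2144, 3

# Bytes per element for the data types a build might use (illustrative table).
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "int8": 1}


def input_bytes(dtype, batch=1):
    """Return the size in bytes of one densely packed input batch."""
    return batch * HEIGHT * WIDTH * CHANNELS * BYTES_PER_ELEMENT[dtype]


for dtype in BYTES_PER_ELEMENT:
    size = input_bytes(dtype)
    print(f"{dtype}: {size} bytes ({size / 2**20:.1f} MiB)")
```

Under these assumptions a single FP32 input is about 52.6 MiB, which is small next to the 8192 MiB workspace used in the commands in this report.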

Steps To Reproduce

Commands or scripts:

1. Build an INT8 engine with trtexec (I expect any model with opset >= 19 to reproduce this)

trtexec --onnx=my.onnx --int8 --saveEngine=/tmp/broken.engine --memPoolSize=workspace:8192

2. Same via Python API

python3 -c "
import tensorrt as trt
TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open('my.onnx', 'rb') as f:
    parser.parse(f.read())
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 8 << 30)
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)
config.int8_calibrator = trt.IInt8MinMaxCalibrator() # any calibrator crashes
engine = builder.build_serialized_network(network, config)
"

3. FP16 succeeds — confirms model is valid

trtexec --onnx=my.onnx --fp16 --saveEngine=/tmp/working.engine --memPoolSize=workspace:8192
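The failing and working trtexec invocations differ only in the precision flag and the output path. A small sketch makes that explicit by generating both command lines from one template; the helper name `trtexec_command` is hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def trtexec_command(onnx_path, engine_path, precision, workspace_mib=8192):
    """Build a trtexec argument list using only the flags from this report."""
    if precision not in ("int8", "fp16"):
        raise ValueError(f"unsupported precision: {precision}")
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--{precision}",
        f"--saveEngine={engine_path}",
        f"--memPoolSize=workspace:{workspace_mib}",
    ]


failing = trtexec_command("my.onnx", "/tmp/broken.engine", "int8")
working = trtexec_command("my.onnx", "/tmp/working.engine", "fp16")

# Arguments present in exactly one of the two commands.
diff = sorted(set(failing) ^ set(working))

print(shlex.join(failing))
print(shlex.join(working))
print(diff)
```

To actually run a variant, the argument list could be passed to `subprocess.run` on the Jetson.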

Error output:
[TRT] [E] [checkSanity.cpp::checkLinks::218] Error Code 2: Internal Error
(Assertion item.second != nullptr failed. region should have been removed from Graph::regions)
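When comparing logs across calibrators or TensorRT versions, it can help to pull the source location and error code out of the error line. The sketch below does that with a regular expression inferred from the single line quoted above, so other TensorRT messages may not match it; `parse_trt_error` is a hypothetical helper, not part of any TensorRT tooling.

```python
import re

# Pattern inferred from the one error line in this report (an assumption).
TRT_ERROR_RE = re.compile(
    r"\[TRT\] \[(?P<severity>[A-Z])\] "
    r"\[(?P<file>[^:\]]+)::(?P<function>[^:\]]+)::(?P<line>\d+)\] "
    r"Error Code (?P<code>\d+): (?P<kind>.+)"
)


def parse_trt_error(line):
    """Return the parsed fields of a TensorRT error line, or None if no match."""
    match = TRT_ERROR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    fields["code"] = int(fields["code"])
    fields["kind"] = fields["kind"].strip()
    return fields


sample = "[TRT] [E] [checkSanity.cpp::checkLinks::218] Error Code 2: Internal Error"
info = parse_trt_error(sample)
print(info)
```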

Have you tried the latest release?: Bug present in TRT 10.3.0 on Jetson Orin (JetPack R36.5.0). Not tested on x86.

Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): ONNX parses and validates fine. FP16 TensorRT engine builds and runs successfully. The model is valid — TRT's INT8 calibration graph optimizer crashes internally during the calibration graph pass.
