@nazanin-beheshti
Contributor

We are enabling CoPilot models on OV GPU. The models are quantized to uint8/int8 or uint16/int16 data types.
FQ (FakeQuantize) layers with int16/uint16 data types are stripped by the QDQ stripping pass:
https://github.com/openvinotoolkit/openvino/blob/master/src/common/low_precision_transformations/src/qdq_stripping.cpp

With one model, we observe the error below, which occurs in the split LPT transformation.
One input (the Gather output) is in FP32 because the DQ layers after Gather are stripped, while the other input is in FP16. This causes a data type mismatch between the inputs of the multiplication.

https://github.com/openvinotoolkit/openvino/blob/master/src/common/low_precision_transformations/src/split.cpp

(Screenshots: split-args-elem-type, split)

To resolve this issue, a conversion between the data types is inserted before the elementwise (multiply) operation.
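The idea can be illustrated with a minimal, self-contained sketch. It does not use the OpenVINO API: `Node`, `ElemType`, `multiply_strict`, `convert_to`, and `multiply_aligned` are hypothetical stand-ins invented for this example, and the choice to convert the second input to the first input's type is an assumption, not necessarily what this PR does in `split.cpp`.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-ins for graph concepts. These are NOT OpenVINO types.
enum class ElemType { f16, f32 };

struct Node {
    std::string op;                  // operation name, e.g. "Gather", "Convert"
    ElemType type;                   // element type of the node's output
    std::shared_ptr<Node> lhs, rhs;  // inputs (null when unused)
};

using NodePtr = std::shared_ptr<Node>;

// Creates a node without inputs, e.g. a constant or an upstream output.
NodePtr make_source(std::string op, ElemType type) {
    return std::make_shared<Node>(Node{std::move(op), type, nullptr, nullptr});
}

// Elementwise multiply that rejects inputs with different element types,
// which models the failure described above.
NodePtr multiply_strict(const NodePtr& a, const NodePtr& b) {
    if (a->type != b->type) {
        throw std::invalid_argument("multiply: input element types differ");
    }
    return std::make_shared<Node>(Node{"Multiply", a->type, a, b});
}

// Wraps the input in a Convert node only when its type differs from target.
NodePtr convert_to(const NodePtr& input, ElemType target) {
    if (input->type == target) {
        return input;
    }
    return std::make_shared<Node>(Node{"Convert", target, input, nullptr});
}

// Aligns the second input to the first input's type, then multiplies.
// (Assumption: the conversion direction is chosen for illustration only.)
NodePtr multiply_aligned(const NodePtr& a, const NodePtr& b) {
    NodePtr b_aligned = convert_to(b, a->type);
    assert(a->type == b_aligned->type);
    return multiply_strict(a, b_aligned);
}
```

With an FP32 `Gather` output and an FP16 second input, `multiply_strict` throws, while `multiply_aligned` first wraps the FP16 input in a `Convert` node and then succeeds. Inputs that already share a type pass through `convert_to` unchanged, so no redundant conversion is added.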

Tickets:

@github-actions bot added the "category: LP transformations" (OpenVINO Low Precision transformations) label on Dec 8, 2025
@sys-openvino-ci added the "ExternalPR" (External contributor) label on Dec 8, 2025
@nazanin-beheshti marked this pull request as ready for review on December 9, 2025 14:39
@nazanin-beheshti requested a review from a team as a code owner on December 9, 2025 14:39
