Skip to content

Add VK_EXT_ycbcr_3plane_16bit_lsb_formats extension#2709

Open
rmader wants to merge 1 commit into
KhronosGroup:mainfrom
rmader:ycbcr-16bit-lsb-formats
Open

Add VK_EXT_ycbcr_3plane_16bit_lsb_formats extension#2709
rmader wants to merge 1 commit into
KhronosGroup:mainfrom
rmader:ycbcr-16bit-lsb-formats

Conversation

@rmader
Copy link
Copy Markdown

@rmader rmader commented Apr 1, 2026

This extension adds support for 10/12bit YCbCr formats used by software decoders like ffmpeg, dav1d and libvpx.

See also:


This branch as well as the related Mesa and GTK4 MRs have been in draft status for a while and already got a lot of testing. With the recent Gstreamer 1.28 release (shipped in distros like Fedora 44 and Ubuntu 26.04) they can be tested and used - e.g. the default Gnome Video player will make use of the new formats when playing HDR videos (for various codecs, including AV1, HEVC, H266, Pro-Res and DNxHR).

@gfxstrand I'd be super happy if you could have a look at this and, if you find time, help me get this over the line 😅 Most importantly point me to where I should elaborate more.

Comment thread chapters/formats.adoc Outdated
Comment thread proposals/VK_EXT_ycbcr_3plane_16bit_lsb_formats.adoc Outdated
Comment thread proposals/VK_EXT_ycbcr_3plane_16bit_lsb_formats.adoc Outdated
Comment thread appendices/VK_EXT_ycbcr_3plane_16bit_lsb_formats.adoc Outdated
@oddhack
Copy link
Copy Markdown
Contributor

oddhack commented Apr 29, 2026

@rmader please rebase on github main branch, which should fix the CI CTS framework issue, and address the other issues raised by CI around copyright dates and naming.

@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch 4 times, most recently from c04f6cb to 8d96571 Compare April 30, 2026 09:15
@rmader
Copy link
Copy Markdown
Author

rmader commented Apr 30, 2026

@rmader please rebase on github main branch, which should fix the CI CTS framework issue, and address the other issues raised by CI around copyright dates and naming.

Thanks for the review! I think I addressed all issues - and hope the change to append _EXT to all formats is correct. The goal of course being that they'll get promoted at some point going forward.

Comment thread chapters/features.adoc Outdated
@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch 2 times, most recently from 15b2a7c to b73a6f0 Compare April 30, 2026 13:32
@rmader
Copy link
Copy Markdown
Author

rmader commented Apr 30, 2026

I added one more change for completeness - 14bit formats. While those are very uncommon AFAIK, they are supported by ffmpeg. Thus let's include them while on it, so we don't have to revisit this in a couple of years.

@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch from b73a6f0 to 21b0612 Compare April 30, 2026 15:02
Copy link
Copy Markdown
Contributor

@gfxstrand gfxstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this took me nearly a month to get to. Between a Khronos meeting and vacation, it kind of got lost in my e-mail.

Comment thread xml/vk.xml Outdated
Add the formats in question. For any drivers already supporting YCbCr formats
via shaders this should be straight forward. Values in the 0.0 - 1.0 range just
need to get multiplied by 65535.0 / 1023.0 (for 10 bit), 65535.0 / 4095.0 (for
12 bit) or 65535.0 / 16383.0 (for 14 bit).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tricky assertion to make. Right now, we assume that the X bits are garbage and (mostly) ignored by sampling. I say "mostly" because it's kind of okay if they're not since they're the low bits and any garbage in those bits will show up as noise which likely isn't perceptible to the human eye. With theese formats, however, your assertion that they're easy for shaders by doing a bit of multiplication assumes that the top bits are always zero. While this is fairly easy for software to guarantee when outputting such an image, it's very hard for the driver to defend against. We could make it invalid usage or unknown results if you have non-zero bits in the high bits when sampling but we'd need to be very explicit about that in the API.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. So on the DRM side we solved that by using z instead of x: https://github.com/torvalds/linux/blob/master/include/uapi/drm/drm_fourcc.h#L403-L405

We could do the same here by turning

VK_FORMAT_X6G10_X6B10_X6R10_3PLANE_420_UNORM_3PACK16_EXT

into

VK_FORMAT_Z6G10_Z6B10_Z6R10_3PLANE_420_UNORM_3PACK16_EXT

which would hopefully make this clear. WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that. Since no one will ever be writing to these formats, it's not like we would need hardware to support that. I'd feel a little better about that than saying it's ignored. Again, I think this is where @fluppeteer will have good opinions. As might @spencer-lunarg since he did the format table in the first place.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since no one will ever be writing to these formats

For completeness: these formats are also the native/optimal input formats for common software encoders, meaning there is a chance that at some point we'd want native driver support for writing into shared buffers that can directly be fed into those sw-encoders. Not super likely, just to keep in mind.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this for now (X -> Z).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay - I tried to write this a couple of weeks ago and GitHub ate it.

I don't love Z as a choice for the same depth-related reasons as @gfxstrand (and arguably should we decide to use WXYZ in a format at any point), but it may be the least bad option - N for "null" may be an alternative, but that may conflict with the common writing of "nnnn" to mean a series of bits. As you say, this feels like a 16-bit format with a different decode path for scaling, since we're saying the "must be zero" bits are actually significant and just expected to be zero; I'm not 100% clear that our hardware, at least, would be able to support this, since our hardware matrix transform has limited inputs. At least we avoided using Z for depth. (Fortunately a recent KDFS update can express mandatory 0/1 bits, I hope as is needed here.)

For reference it would be nice to include these formats in the 56.1 descriptions. C.f.:

VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16 specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 10-bit B component in the top 10 bits of the word in bytes 0..1, and a 10-bit R component in the top 10 bits of the word in bytes 2..3, with the bottom 6 bits of each word unused. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which ⌊iG×0.5⌋=[iB]=[iR] and ⌊jG×0.5⌋=[jB]=[jR]. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. This format only supports images with a width and height that is a multiple of two.

I don't think I see that in this PR? (Unless they're autogenerated these days.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I see that in this PR? (Unless they're autogenerated these days.)

Ouch, I'm sorry - this was indeed missing and arguably a very important part 😅

Added that now.

I don't love Z as a choice for the same depth-related reasons as @gfxstrand (and arguably should we decide to use WXYZ in a format at any point), but it may be the least bad option

Thanks! IIUC that means unless somebody else voices resistance we can go forward with that.

I'm not 100% clear that our hardware, at least, would be able to support this

For the record: even if the formats can only be supported via a shader fallback path it would still be a useful for multimedia apps, freeing them from having to carry custom shaders themselves.

@gfxstrand
Copy link
Copy Markdown
Contributor

@fluppeteer should probably also give this a read.

Comment thread chapters/resources.adoc Outdated
Comment thread chapters/resources.adoc Outdated
Comment thread chapters/formats.adoc Outdated
Comment thread chapters/formats.adoc Outdated
Comment thread chapters/formats.adoc Outdated
Comment thread chapters/resources.adoc Outdated
@oddhack
Copy link
Copy Markdown
Contributor

oddhack commented May 6, 2026

I got tired with adding explanatory comments, but there are what appear to be a bunch of minor markup errors in the new VUs which I have suggested fixes for.

@oddhack oddhack closed this May 6, 2026
@oddhack oddhack reopened this May 6, 2026
@oddhack
Copy link
Copy Markdown
Contributor

oddhack commented May 6, 2026

(Sorry, accidentally clicked the comment & close button).

@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch 3 times, most recently from 6080136 to b89c40e Compare May 6, 2026 14:28
@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch from b89c40e to a419b1e Compare May 6, 2026 14:58
@oddhack
Copy link
Copy Markdown
Contributor

oddhack commented May 6, 2026

Github is borked ATM (presumably because it is a day ending in 'y') and won't let me comment on the diff for some reason, but features.adoc line 1845 introduced some leading white space before the ' ifdef::' which should be removed.

@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch from a419b1e to 146edec Compare May 6, 2026 15:55
@rmader
Copy link
Copy Markdown
Author

rmader commented May 6, 2026

Github is borked ATM (presumably because it is a day ending in 'y') and won't let me comment on the diff for some reason, but features.adoc line 1845 introduced some leading white space before the ' ifdef::' which should be removed.

Whops, fixed. Also should have addressed all other issues from the last CI run (and ran the corresponding scripts locally to validate that)

@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch from 146edec to 24c3e2e Compare May 6, 2026 15:59
@rmader
Copy link
Copy Markdown
Author

rmader commented May 6, 2026

Sorry for the noise, just some more minor fixes.

@oddhack oddhack requested a review from fluppeteer May 13, 2026 14:37
Copy link
Copy Markdown

@fluppeteer fluppeteer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback from two weeks ago...

Add the formats in question. For any drivers already supporting YCbCr formats
via shaders this should be straight forward. Values in the 0.0 - 1.0 range just
need to get multiplied by 65535.0 / 1023.0 (for 10 bit), 65535.0 / 4095.0 (for
12 bit) or 65535.0 / 16383.0 (for 14 bit).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay - I tried to write this a couple of weeks ago and GitHub ate it.

I don't love Z as a choice for the same depth-related reasons as @gfxstrand (and arguably should we decide to use WXYZ in a format at any point), but it may be the least bad option - N for "null" may be an alternative, but that may conflict with the common writing of "nnnn" to mean a series of bits. As you say, this feels like a 16-bit format with a different decode path for scaling, since we're saying the "must be zero" bits are actually significant and just expected to be zero; I'm not 100% clear that our hardware, at least, would be able to support this, since our hardware matrix transform has limited inputs. At least we avoided using Z for depth. (Fortunately a recent KDFS update can express mandatory 0/1 bits, I hope as is needed here.)

For reference it would be nice to include these formats in the 56.1 descriptions. C.f.:

VK_FORMAT_G10X6_B10X6R10X6_2PLANE_420_UNORM_3PACK16 specifies an unsigned normalized multi-planar format that has a 10-bit G component in the top 10 bits of each 16-bit word of plane 0, and a two-component, 32-bit BR plane 1 consisting of a 10-bit B component in the top 10 bits of the word in bytes 0..1, and a 10-bit R component in the top 10 bits of the word in bytes 2..3, with the bottom 6 bits of each word unused. The horizontal and vertical dimensions of the BR plane are halved relative to the image dimensions, and each R and B value is shared with the G components for which ⌊iG×0.5⌋=[iB]=[iR] and ⌊jG×0.5⌋=[jB]=[jR]. The location of each plane when this image is in linear layout can be determined via vkGetImageSubresourceLayout, using VK_IMAGE_ASPECT_PLANE_0_BIT for the G plane, and VK_IMAGE_ASPECT_PLANE_1_BIT for the BR plane. This format only supports images with a width and height that is a multiple of two.

I don't think I see that in this PR? (Unless they're autogenerated these days.)

This extension adds support for 10/12bit YCbCr formats used
by software decoders like ffmpeg, dav1d and libvpx.

See
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303
@rmader rmader force-pushed the ycbcr-16bit-lsb-formats branch 2 times, most recently from f5bec6c to 143fe5f Compare May 22, 2026 09:17
Comment thread xml/vk.xml
<format name="VK_FORMAT_Z2R14_UNORM_EXT" class="16-bit" blockSize="2" texelsPerBlock="1">
<component name="R" bits="14" numericFormat="UNORM"/>
</format>
<format name="VK_FORMAT_Z6G10_Z6B10_Z6R10_3PLANE_420_UNORM_3PACK16_EXT" class="10-bit 3-plane LSB" blockSize="6" texelsPerBlock="1" chroma="420">
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One open question from my side: is it ok if all three 10-bit formats share class="10-bit 3-plane LSB" - or do they need to have a own class each, with the chroma included/repeated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants