7 changes: 4 additions & 3 deletions antora/modules/ROOT/nav.adoc
@@ -1,7 +1,7 @@
////
- Copyright (c) 2023-2025, Holochip Inc
- Copyright (c) 2023-2025, Sascha Willems
- Copyright (c) 2025, Arm Limited and Contributors
- Copyright (c) 2023-2026, Holochip Inc
- Copyright (c) 2023-2026, Sascha Willems
- Copyright (c) 2026, Arm Limited and Contributors
-
- SPDX-License-Identifier: Apache-2.0
-
@@ -54,6 +54,7 @@
** xref:samples/extensions/buffer_device_address/README.adoc[Buffer device address]
** xref:samples/extensions/calibrated_timestamps/README.adoc[Calibrated timestamps]
** xref:samples/extensions/conditional_rendering/README.adoc[Conditional rendering]
** xref:samples/extensions/compute_shader_derivatives/README.adoc[Compute shader derivatives]
** xref:samples/extensions/conservative_rasterization/README.adoc[Conservative rasterization]
** xref:samples/extensions/debug_utils/README.adoc[Debug utils]
** xref:samples/extensions/descriptor_buffer_basic/README.adoc[Descriptor buffer basic]
10 changes: 8 additions & 2 deletions framework/vulkan_type_mapping.h
@@ -1,5 +1,5 @@
/* Copyright (c) 2025, Arm Limited and Contributors
* Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
/* Copyright (c) 2026, Arm Limited and Contributors
* Copyright (c) 2024-2026, NVIDIA CORPORATION. All rights reserved.
*
* SPDX-License-Identifier: Apache-2.0
*
@@ -285,6 +285,12 @@ struct HPPType<VkPhysicalDeviceTimelineSemaphoreFeaturesKHR>
using Type = vk::PhysicalDeviceTimelineSemaphoreFeaturesKHR;
};

template <>
struct HPPType<VkPhysicalDeviceComputeShaderDerivativesFeaturesKHR>
{
using Type = vk::PhysicalDeviceComputeShaderDerivativesFeaturesKHR;
};

template <>
struct HPPType<VkPhysicalDeviceVertexInputDynamicStateFeaturesEXT>
{
10 changes: 8 additions & 2 deletions samples/extensions/README.adoc
@@ -1,6 +1,6 @@
////
- Copyright (c) 2025, Arm Limited and Contributors
- Copyright (c) 2021-2025, The Khronos Group
- Copyright (c) 2026, Arm Limited and Contributors
- Copyright (c) 2021-2026, The Khronos Group
-
- SPDX-License-Identifier: Apache-2.0
-
@@ -312,3 +312,9 @@ Demonstrate how to build data graph pipelines and execute neural networks:

* xref:./{extension_samplespath}tensor_and_data_graph/simple_tensor_and_data_graph/README.adoc[simple_tensor_and_data_graph]
- Explains how to set up and execute a simple neural network using a data graph pipeline.

=== xref:./{extension_samplespath}compute_shader_derivatives/README.adoc[Compute shader derivatives]

*Extension*: https://docs.vulkan.org/features/latest/features/proposals/VK_KHR_compute_shader_derivatives.html[`VK_KHR_compute_shader_derivatives`]

Demonstrate how to use derivatives (dFdx/dFdy) in compute shaders via derivative groups and how to request/enable the corresponding device feature.
33 changes: 33 additions & 0 deletions samples/extensions/compute_shader_derivatives/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# Copyright (c) 2026, Holochip Inc.

# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

get_filename_component(FOLDER_NAME ${CMAKE_CURRENT_LIST_DIR} NAME)
get_filename_component(PARENT_DIR ${CMAKE_CURRENT_LIST_DIR} PATH)
get_filename_component(CATEGORY_NAME ${PARENT_DIR} NAME)

add_sample(
ID ${FOLDER_NAME}
CATEGORY ${CATEGORY_NAME}
AUTHOR "Holochip"
NAME "Compute shader derivatives"
DESCRIPTION "Demonstrates VK_KHR_compute_shader_derivatives with a minimal compute dispatch using dFdx/dFdy in compute"
SHADER_FILES_SLANG
"compute_shader_derivatives/slang/derivatives_quad.comp.slang"
"compute_shader_derivatives/slang/derivatives_linear.comp.slang"
"compute_shader_derivatives/slang/fullscreen.vert.slang"
"compute_shader_derivatives/slang/fullscreen.frag.slang"
)
96 changes: 96 additions & 0 deletions samples/extensions/compute_shader_derivatives/README.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
////
- Copyright (c) 2025-2026, Holochip Inc.
-
- SPDX-License-Identifier: Apache-2.0
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-
////
ifdef::site-gen-antora[]
TIP: The source for this sample can be found in the https://github.com/KhronosGroup/Vulkan-Samples/tree/main/samples/extensions/compute_shader_derivatives[Khronos Vulkan samples github repository].
endif::[]

= VK_KHR_compute_shader_derivatives — Derivatives in compute shaders

This sample demonstrates `VK_KHR_compute_shader_derivatives`, which enables derivative instructions (such as `dFdx`/`dFdy`) inside compute shaders. Derivatives have traditionally been available only in fragment shaders, where the rasterizer shades pixels in 2×2 quads. This extension defines derivative groups for compute: rules for arranging the invocations of a workgroup into groups of four so that derivatives can be computed from the differences between neighboring invocations.

// Screenshot of the sample output
.Compute shader derivatives output
image::shader_derivatives.png[align=center,alt="Compute shader derivatives output"]

== What is it?
- SPIR-V: The companion SPIR-V extension allows derivative instructions in the Compute execution model.
- Vulkan: The device feature is exposed via `VkPhysicalDeviceComputeShaderDerivativesFeaturesKHR` with two booleans:
* `computeDerivativeGroupQuads` — enables quad-based derivative groups.
* `computeDerivativeGroupLinear` — enables linearly mapped derivative groups.
- GLSL: Use `#extension GL_KHR_compute_shader_derivatives : enable` and a layout qualifier to choose the grouping:
* `layout(derivative_group_quadsKHR) in;`
* `layout(derivative_group_linearKHR) in;`
(The `NV`-suffixed tokens `derivative_group_quadsNV` and `derivative_group_linearNV` belong to the earlier `GL_NV_compute_shader_derivatives` extension.)
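
The difference between the two grouping modes is easiest to see on the CPU. The following is a minimal sketch, not code from the sample: `LocalId`, `QuadSlot`, `quad_grouping`, and `linear_grouping` are hypothetical names, and the numbering of quads within a workgroup is chosen here only for illustration. It assumes the usual row-major definition of the local invocation index for a 2D workgroup.

```cpp
#include <cassert>
#include <cstdint>

// Position of one invocation inside the workgroup (LocalInvocationId.xy).
struct LocalId
{
    uint32_t x, y;
};

// Which 2x2 derivative quad an invocation belongs to, and its slot (0..3) in it.
// Slots 0, 1, 2, 3 correspond to quad coordinates (0,0), (1,0), (0,1), (1,1).
struct QuadSlot
{
    uint32_t quad_index;
    uint32_t lane;
};

// Quad grouping: invocations that share (x / 2, y / 2) form one quad.
// The workgroup width and height must both be multiples of 2.
QuadSlot quad_grouping(LocalId id, uint32_t size_x)
{
    assert(size_x % 2 == 0);
    const uint32_t quads_per_row = size_x / 2;
    return {(id.y / 2) * quads_per_row + (id.x / 2), (id.y % 2) * 2 + (id.x % 2)};
}

// Linear grouping: four consecutive local invocation indices form one quad.
// The total number of invocations in the workgroup must be a multiple of 4.
QuadSlot linear_grouping(LocalId id, uint32_t size_x)
{
    const uint32_t local_index = id.y * size_x + id.x;
    return {local_index / 4, local_index % 4};
}
```

In an 8-wide workgroup, invocation `(3, 1)` is the bottom-right member of a 2×2 block under quad grouping, while under linear grouping it is the last of four invocations that sit side by side in the same row. Quad grouping is therefore the natural fit when each invocation maps to one pixel of a 2D image.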

== Why/when to use it
- Port algorithms that rely on derivatives (e.g., LOD selection, filtering, gradients) to compute for flexibility or performance.
- Keep consistent behavior with fragment-stage derivatives by choosing an appropriate grouping mode (quads vs. linear).

== What this sample does
- Requests and requires the feature `computeDerivativeGroupQuads`.
- Builds a compute pipeline with a shader that calls `ddx`/`ddy` (derivative instructions) in compute.
- Computes a procedural 2D radial function and uses derivatives to calculate gradient magnitude, demonstrating a practical use case for spatial analysis and edge detection.
- Writes the visualization to a storage image from the compute shader, then displays it fullscreen via a graphics pipeline. The colors show:
* Blue: The base procedural radial pattern
* Red/Yellow: Edges detected via high gradient magnitude
- Displays a GUI overlay explaining the visualization and the practical applications of compute shader derivatives.
- The sample demonstrates how compute shader derivatives enable algorithms that traditionally required fragment shaders (like gradient-based filtering or LOD selection) to run in compute shaders for greater flexibility.

== Rendering architecture

This sample uses a two-stage rendering pipeline to demonstrate compute shader derivatives and display the results:

=== Stage 1: Compute shader (derivative calculation)
The compute shader (`derivatives_quad.comp.slang`) executes with an 8×8 local workgroup size and the `[DerivativeGroupQuad]` attribute, which enables quad-based derivative computation. For each pixel in a 512×512 output image, the shader:

1. Computes a procedural radial function based on distance from center
2. Calls `ddx()` and `ddy()` to calculate spatial derivatives of the function
3. Computes gradient magnitude: `sqrt(dx² + dy²)` to detect edges
4. Writes a color visualization to a storage image (VK_FORMAT_R8G8B8A8_UNORM)

The storage image serves as the output buffer for the compute shader and the input texture for the graphics pipeline.
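
The first three steps can be emulated on the CPU to see what the derivative instructions return. This is a rough sketch under stated assumptions, not the sample's shader: `radial` is a hypothetical stand-in for the procedural function, and `gradient_magnitude` models fine-style derivatives (differences within the pixel's own row and column of its 2×2 quad). Real implementations may instead use coarse derivatives that are shared across the whole quad.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the sample's procedural function: a radial
// ripple around the centre of a size x size image, with values in [0, 1].
float radial(uint32_t x, uint32_t y, uint32_t size)
{
    assert(size > 0);
    const float cx = (static_cast<float>(x) + 0.5f) / static_cast<float>(size) - 0.5f;
    const float cy = (static_cast<float>(y) + 0.5f) / static_cast<float>(size) - 0.5f;
    return 0.5f + 0.5f * std::sin(40.0f * std::sqrt(cx * cx + cy * cy));
}

// Gradient magnitude at pixel (x, y), emulating quad derivatives.
// f is any callable taking (x, y) pixel coordinates and returning a float.
template <typename F>
float gradient_magnitude(F f, uint32_t x, uint32_t y)
{
    // Top-left pixel of the 2x2 quad that contains (x, y).
    const uint32_t x0 = x & ~1u;
    const uint32_t y0 = y & ~1u;

    // ddx: right neighbour minus left neighbour, in this pixel's row.
    const float dx = f(x0 + 1, y) - f(x0, y);
    // ddy: lower neighbour minus upper neighbour, in this pixel's column.
    const float dy = f(x, y0 + 1) - f(x, y0);

    return std::sqrt(dx * dx + dy * dy);
}
```

For example, `gradient_magnitude([](uint32_t x, uint32_t y) { return radial(x, y, 512); }, px, py)` gives the per-pixel edge strength for a 512×512 image. Because the 8×8 workgroup size is even in both dimensions, quads align with even pixel coordinates, which is what the `& ~1u` masks rely on.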

=== Stage 2: Graphics pipeline (fullscreen display)
After a pipeline barrier synchronizes the compute write with the fragment shader read, the graphics pipeline displays the computed image:

1. **Vertex shader** (`fullscreen.vert.slang`): Generates a fullscreen triangle using only vertex IDs (no vertex buffer required)
* Vertex 0: `(-1, -1)` with UV `(0, 0)`: the top-left corner (Vulkan clip space has +Y pointing down, assuming the viewport is not flipped)
* Vertex 1: `(3, -1)` with UV `(2, 0)`: extends past the right edge (off-screen)
* Vertex 2: `(-1, 3)` with UV `(0, 2)`: extends past the bottom edge (off-screen)
* The oversized triangle covers the entire viewport; the hardware clips the parts that fall outside it
2. **Fragment shader** (`fullscreen.frag.slang`): Samples the storage image using interpolated UV coordinates and outputs the color
3. **GUI overlay**: Drawn on top using ImGui to explain the visualization
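
The three vertices listed above can be derived from the vertex index alone. The sketch below shows one common bit trick for doing so, written as plain C++ so that it can be checked on the CPU; it is an assumption about how such a shader is typically written, not a copy of `fullscreen.vert.slang`, and `FullscreenVertex` and `fullscreen_vertex` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Clip-space position and texture coordinate of one fullscreen-triangle vertex.
struct FullscreenVertex
{
    float pos_x, pos_y;
    float u, v;
};

// Derives a vertex from its index (0, 1, or 2) without reading a vertex buffer.
FullscreenVertex fullscreen_vertex(uint32_t vertex_id)
{
    assert(vertex_id < 3);
    // Index 0 -> UV (0, 0), index 1 -> UV (2, 0), index 2 -> UV (0, 2).
    const float u = static_cast<float>((vertex_id << 1) & 2u);
    const float v = static_cast<float>(vertex_id & 2u);
    // Map UV [0, 2] to clip space [-1, 3].
    return {u * 2.0f - 1.0f, v * 2.0f - 1.0f, u, v};
}
```

Inside the visible region, the interpolated UVs run from 0 to 1, so the fragment shader can sample the image directly with them.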

=== Why use a fullscreen triangle instead of a quad?
The fullscreen triangle is a common optimization technique for post-processing and fullscreen effects:

- **Fewer vertices**: Only 3 vertices instead of 4 (quad) or 6 (two triangles)
- **No vertex buffer**: Positions and UVs are generated procedurally from `SV_VertexID`
- **Simpler setup**: Single draw call with `vkCmdDraw(cmd, 3, 1, 0, 0)`
- **Automatic clipping**: The GPU clips the oversized triangle to the viewport bounds
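- **No shared diagonal edge**: A single triangle avoids the duplicated fragment shading that occurs in partially covered pixel quads along the diagonal of a two-triangle quad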

This technique is widely used in modern rendering engines for fullscreen passes like tone mapping, bloom, and other post-processing effects.
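
That the oversized triangle really covers the viewport can be verified with a few comparisons. The sketch below is illustrative only (the function names are hypothetical): for the triangle with vertices `(-1, -1)`, `(3, -1)`, and `(-1, 3)`, a point is inside when it lies on the inner side of all three edges.

```cpp
#include <cassert>

// True when (x, y) lies inside or on the triangle (-1,-1), (3,-1), (-1,3).
// The three edges are the lines x = -1, y = -1, and x + y = 2.
constexpr bool inside_fullscreen_triangle(float x, float y)
{
    return x >= -1.0f && y >= -1.0f && x + y <= 2.0f;
}

// Checks that all four corners of the [-1, 1] x [-1, 1] viewport are covered.
void check_viewport_corners()
{
    assert(inside_fullscreen_triangle(-1.0f, -1.0f));
    assert(inside_fullscreen_triangle(1.0f, -1.0f));
    assert(inside_fullscreen_triangle(-1.0f, 1.0f));
    assert(inside_fullscreen_triangle(1.0f, 1.0f)); // lies exactly on x + y = 2
}
```

The corner `(1, 1)` sits exactly on the long edge, so the triangle is the smallest one of this shape that still covers the whole viewport.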

== Required Vulkan extensions and features
- Instance extension: `VK_KHR_get_physical_device_properties2` (for feature chaining).
- Device extension: `VK_KHR_compute_shader_derivatives` (required).
- Device feature: `VkPhysicalDeviceComputeShaderDerivativesFeaturesKHR::computeDerivativeGroupQuads = VK_TRUE`.
