Skip to content

Conversation

@aleph1
Copy link
Contributor

@aleph1 aleph1 commented Feb 28, 2025

This is almost ready to integrate, but there is one outstanding issue.

There is a conflict between two functions, kinc_g5_command_list_set_vertex_constant_buffer and kinc_g5_command_list_set_vertex_buffers, in that they both attempt to set a vertex buffer at index 0.

I added some logging on my local build, and I think the problem stems from the fact the the functions are called in this order:

  1. kinc_g5_command_list_set_vertex_buffers (writes to buffers 0...n)
  2. kinc_g5_command_list_set_vertex_constant_buffer (writes to buffer 0, overwriting the previously set vertex buffer)

The Vulkan backend deals with this by adding three variables static uint32_t lastVertexConstantBufferOffset, static uint32_t lastFragmentConstantBufferOffset, static uint32_t lastComputeConstantBufferOffset and then alters the code in commandlist.m.h in some pretty major ways. I guess it works, but I haven't tried the Vulkan backend as I am on Mac.

I haven't delved into the whole kinc rendering pipeline so I am not sure if it is easy to switch the order that kinc_g5_command_list_set_vertex_buffers and kinc_g5_command_list_set_vertex_constant_buffer are called in, or if it might have a negative effect somewhere else. I also don't know if it will fully fix the issue as I think the kinc_g5_command_list_set_vertex_buffers if it wrote to indexes 1...n I would still need to account for the buffer size of the vertex constant buffer. I couldn't see a way to access that from outside G4.c.h. Is it possible to add a function to retrieve this?

If you want to see the instancing sample working you have to disable the constant buffer being set. You can do that by commenting out the last line of the kinc_g5_command_list_set_vertex_constant_buffer function ([encoder setVertexBuffer:buf offset:offset atIndex:0];).

Now uses all buffers like in the Vulkan backend.
Correctly assembles vertexDescriptor with correct offsets and strides, as well as perInstance and perVertex step functions.
@aleph1 aleph1 changed the title Metal instancing fix Metal backend refactor Feb 28, 2025
@aleph1
Copy link
Contributor Author

aleph1 commented Feb 28, 2025

I have discovered what the other issue is with instancing and it has to do with how vertex fragments constants seem to be handled. The current metal backend has the constant index hardcoded as 1, which is what the outputted metal code uses as the expected buffer index for uniform data. The problem is when we use instancing there are effectively two (or more) buffers assigned based on the generated memory layout.

For example, we create a buffer pipeline like this (I am using a single float as the only uniform as a “most simple” test case before moving to more complex uniform structures):

kinc_g4_vertex_structure_t structure;
kinc_g4_vertex_structure_init(&structure);
kinc_g4_vertex_structure_add(&structure, "pos", KINC_G4_VERTEX_DATA_F32_3X);
kinc_g4_vertex_structure_t structure_inst;
kinc_g4_vertex_structure_init(&structure_inst);
kinc_g4_vertex_structure_add(&structure_inst, "off", KINC_G4_VERTEX_DATA_F32_3X);
structure_inst.instanced = true;
kinc_g4_pipeline_init(&pipeline);
pipeline.vertex_shader = &vertex_shader;
pipeline.fragment_shader = &fragment_shader;
pipeline.input_layout[0] = &structure;
pipeline.input_layout[1] = &structure_inst;
pipeline.input_layout[2] = NULL;
kinc_g4_pipeline_compile(&pipeline);
tmp_loc = kinc_g4_pipeline_get_constant_location(&pipeline, "tmp");

And have shader code like this:

#version 450

in vec3 pos;
in vec3 off;

uniform float tmp;

void main() {
	gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
}

But the resulting kinc/make metal code expects the uniforms to be at buffer 1, but they have to be placed at buffer 2 due to the pipelines vertex structure.

// shader_vert_main

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct shader_vert_main_uniforms
{
    float tmp;
};

struct shader_vert_main_out
{
    float4 gl_Position [[position]];
};

struct shader_vert_main_in
{
    float3 off [[attribute(0)]];
    float3 pos [[attribute(1)]];
};

vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant shader_vert_main_uniforms& uniforms [[buffer(1)]])
{
    shader_vert_main_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + uniforms.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5;       // Adjust clip-space for Metal
    return out;
}

If I simply modify the compiled metal vertex shader and replace uniforms [[buffer(1)]] with uniforms [[buffer(2)]] it works as expected. Do you have any insight this as I would very much appreciate them?

@aleph1
Copy link
Contributor Author

aleph1 commented Feb 28, 2025

This is a quick video illustrating the aforementioned issue of buffer indexes. If I manually adjust the compiled shader to point at the correct buffer index then everything works as expected and the triangles are drawn with instancing and they utilize a uniform to move.

kore-metal-fix.mp4

You can see how I am having to ensure the constants vertex buffer is set as the last vertex buffer, as index 1 might already be occupied when doing things like instancing. Is there a reason why index 1 was chose for the constants vertex buffer, when 0 could have been used, and then all other vertex buffers would simply use 1...n?

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 1, 2025

I did a bit more digging, and I think this might be considered a krafix issue.

https://github.com/Kode/krafix/blob/43f9df54d0624b63e7c98a4019b60d07bc8c9202/Sources/MetalTranslator.cpp#L386-L390

The issue seems to boil down to the fact that when Metal vertex buffers user per vertex and per instance step functions in a single memory layout, they take up more than one buffer slot. I think that is why in my test case changing the compiled metal vertex shader input functions uniforms [[buffer(1)]] with uniforms [[buffer(2)]] fixes things. Though I'm not sure whether MetalTranslator or MetalTranslator2 is used, so I could be wrong about this.

If it is MetalTranslator then given the else in the krafix translator putting the constants at buffer 0, I wonder whether the constants could always be in buffer zero even when there are other vertex buffers. There might be a reason why this isn’t possible, but in theory it might fix things in terms of the metal shader compilation working as expected.

@RobDangerous
Copy link
Member

MetalTranslator2 is used.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 1, 2025

Thanks. I think MetalTranslator2 might require a refactor to get shader compilation working with more complex structures (like having a memory layout that has multiple layouts). I did some research and this seems relevant:

KhronosGroup/SPIRV-Cross#1072

As does the link from within the comments:

https://github.com/KhronosGroup/SPIRV-Cross/wiki/Reflection-API-user-guide#mapping-reflection-output-to-vulkan-cheat-sheet

Compiling my new example with instancing and uniforms works as expected in OpenGL on Mac and I will try it with Vulkan on my Windows machine in the next week or two.

I'm leaving detailed information in case I don’t finish this work.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 2, 2025

A bit more research.

  1. Assigning the constant buffer in the metal backend to index 1 is problematic because the other vertex buffers can take up positions 0 to n. So, do we want to write it consistently to index 0, with the other vertex buffers being in positions 1 to n? I’m still trying to understand if the existing metal implementation chose these indexes for a reason, or were they just because it was only partially implemented? I might lean towards the vertex constant buffer being in position 0 as otherwise I think krafix needs an overhaul for metal as because the shaders are compiled at build time vs runtime it has no way of knowing which buffer index the constants will end up in.
  2. krafix’s integration with spirv seems to lose bindings that are written by the user into the glsl code that it compiles. If I write a vertex shader like the following and compile it from the CLI with glslang -G and use the resultingspirv-cross --msl --msl-decoration-binding the binding position is maintained where krafix loses the bindings somewhere in the compilation. I have included the two compiled versions after the original glsl. Specifically note the difference in [[buffer(n)]] positions.
#version 450 core

layout(binding = 2, set = 0) uniform uniforms
{
    float tmp;
};

layout(location=0) in vec3 pos;
layout(location=1) in vec3 off;

void main() {
    gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
}

Using glslang and spirv-cross on the cli (buffer position is maintained):

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct uniforms
{
    float tmp;
};

struct main0_out
{
    float4 gl_Position [[position]];
};

struct main0_in
{
    float3 pos [[attribute(0)]];
    float3 off [[attribute(1)]];
};

vertex main0_out main0(main0_in in [[stage_in]], constant uniforms& _29 [[buffer(2)]])
{
    main0_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    return out;
}

Using krafix (buffer position is changed from 2 to 1):

// shader_vert_main

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct uniforms
{
    float tmp;
};

struct shader_vert_main_out
{
    float4 gl_Position [[position]];
};

struct shader_vert_main_in
{
    float3 off [[attribute(1)]];
    float3 pos [[attribute(0)]];
};

vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant uniforms& _29 [[buffer(1)]])
{
    shader_vert_main_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5;       // Adjust clip-space for Metal
    return out;
}

@RobDangerous
Copy link
Member

2 is intentional and is done for every graphics backend. Very tricky to get things working cross-platform when we can not assign bindings ourselves (as 1 illustrates). Will check you suggestion.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 5, 2025

I refactored the Metal backend even more with a test to see if vertex constants can be written to index 0, with vertex buffers written from index 1 to n, and it works. I think the only change it would take to krafix is ensuring the vertex shader uses buffer[0] for the uniforms versus buffer[1], and only for the metal backend.

This is the (barely) modified vertex shader krafix would need to output. It would just be the change of uniforms [[buffer(1)]] to uniforms [[buffer(0)]] and it all works as expected.

// shader_vert_main

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct shader_vert_main_uniforms
{
    float tmp;
};

struct shader_vert_main_out
{
    float4 gl_Position [[position]];
};

struct shader_vert_main_in
{
    float3 off [[attribute(0)]];
    float3 pos [[attribute(1)]];
};

vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant shader_vert_main_uniforms& uniforms [[buffer(0)]])
{
    shader_vert_main_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + uniforms.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5;       // Adjust clip-space for Metal
    return out;
}

Do you want me to commit these additional changes and see if I can modify the appropriate file in krafix? I can also compile and test all of the other kinc samples to make sure they work with my changes to the metal backend.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 6, 2025

I tried to take a quick look at MetalTranslator2 in krafix, and what I thought was the line that would need changing didn’t work after rebuilding krafix and replacing the existing one in the pipeline.

I changed mslBinding.msl_buffer = stage == StageVertex ? 1 : 0; to mslBinding.msl_buffer = 0; but that doesn’t impact the position of constant shader_vert_main_uniforms& uniforms [[buffer(1)]] in the output shader.

If you have a moment to let me know what I might need to change in MetalTranslator2 in krafix I am happy to rebuild it and test it with the metal backend changes. Note: I think it might have to do with the glslang settings in krafix, but glslang and spirv-cross are new to me. I can manually adjust the buffers index for now, but the Metal backend changes I have made would work if the output metal shader had the buffer index adjusted.

@RobDangerous
Copy link
Member

Looks like it would be the right thing, did you make sure the shaders were even recompiled?
But I'm not sure if it's ok to put uniforms for the fragment and the vertex shader on slot 0. Can't we simply use slot 2 and up?

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 6, 2025

With Metal I am quite sure the vertex and fragment shader stages maintain separate buffer binding tables, so it is acceptable to assign each at index 0:

[encoder setVertexBuffer:vertexUniformBuffer offset:0 atIndex:0];
[encoder setFragmentBuffer:fragmentUniformBuffer offset:0 atIndex:0];

However, having gaps in indexes also works in Metal, so if you would prefer me to use specific indexes for each than I am happy to refactor the back-end further.

I did replace krafix locally with the version I modified, fully deleted the build project and made it again using Kore/make but the output was the same with uniforms in the shader being at buffer position 1. I think the issue might not be with the spirv-cross settings but the glslang settings. Both glslang and spirv-cross are new to me so I am not sure what compiler option would affect the resulting buffer index in the precompiled shader.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 7, 2025

I finally had a chance to test some of the other backends with this same code. You can find the quick demo here:

https://github.com/aleph1/kore-2-instancing-and-uniforms

These are the platform / backend combos I compiled and ran it on. As expected it worked in all of those I tested on. So I do think it is just the one small change that is required for the buffer index for Metal, which I imagine will affect Mac and iOS the same.

  • Win / Direct3D12
  • Win / Vulkan
  • Win / OpenGL
  • Mac / OpenGL

@RobDangerous
Copy link
Member

Had a look, it's line 9739 in spirv_msl.cpp. We can just change it to zero then in that line if that still works fine for vertex and fragment uniform buffers.

@aleph1
Copy link
Contributor Author

aleph1 commented Mar 15, 2025

Thanks! I can compile a version locally and test it locally with the existing samples as well as a few of my more complex shaders in the next week or two. If there are any shaders you want me to test it with, let me know.

@RobDangerous RobDangerous changed the base branch from main to v2 May 18, 2025 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants