It turns out WDDM driver mode makes our RAM-to-GPU transfers far slower than TCC or MCDM mode. Has anyone figured out how to bypass NVIDIA's software-level restrictions? #103

@FurkanGozukara

We work on training generative AI models, such as FLUX, Qwen Image, and Wan 2.2.

We have noticed a massive speed loss on Windows compared to Linux whenever we transfer large amounts of data between RAM and GPU.

The hit is so large that Linux runs 2x faster than Windows, or even more.

Tests were made on the same GPU: RTX 5090.

You can read more here: kohya-ss/musubi-tuner#700

It turns out that if we enable TCC mode on Windows, it reaches the same speed as Linux.

However, NVIDIA blocks this at the driver level.

I found a Chinese article showing that by patching nvlddmkm.sys, changing just a few characters, TCC mode becomes fully working on consumer GPUs. However, this option is extremely hard and complex for average users.

Now my question is: why can't we get Linux speed on Windows?

Everything I have found says it is due to the WDDM driver mode.
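For anyone who wants to confirm which driver model their GPU is currently using, here is a minimal sketch that wraps `nvidia-smi`. It assumes `nvidia-smi` is on the PATH and uses its `driver_model.current` and `driver_model.pending` query fields. The helper names `parse_driver_models` and `query_driver_models` are only illustrative. On Linux these fields are reported as not applicable.

```python
import csv
import io
import subprocess

# nvidia-smi query fields: GPU name, active driver model, driver model after reboot
QUERY = "name,driver_model.current,driver_model.pending"


def parse_driver_models(csv_text):
    """Parse `nvidia-smi --format=csv,noheader` output into a list of dicts."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        fields = [f.strip() for f in row]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, current, pending = fields
        rows.append({"name": name, "current": current, "pending": pending})
    return rows


def query_driver_models():
    """Return per-GPU driver model info, or [] if nvidia-smi cannot be run."""
    try:
        out = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_driver_models(out)


if __name__ == "__main__":
    for gpu in query_driver_models():
        print(gpu)
```

This only reports the mode. It does not change it, and it does not work around any driver restriction.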

Moreover, it seems Microsoft has added this feature: MCDM

https://learn.microsoft.com/en-us/windows-hardware/drivers/display/mcdm-architecture

And as far as I understand, MCDM mode should give the same speed as well.

How can we solve this slowness on Windows compared to Linux?

Our issue comes from this: recent AI models are massive and do not fit into GPU memory, so we do block swapping. That means only the model blocks currently being trained are on the GPU, so we constantly swap model blocks between RAM and GPU.

As you can imagine, this is a massive data transfer. It is extremely fast on Linux on the same hardware. On Windows, however, it is at least 3x slower, and we have not been able to solve this yet.
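To compare the raw RAM-to-GPU copy speed between operating systems or driver modes, a rough benchmark like the following could be used. This is only a sketch, assuming PyTorch with CUDA support is installed. It is not the code musubi-tuner uses for block swapping, and the function names and the 1 GiB buffer size are arbitrary choices for illustration.

```python
import time

try:
    import torch  # assumption: PyTorch built with CUDA support
except ImportError:
    torch = None


def gib_per_second(num_bytes, seconds):
    """Convert a byte count and a duration into GiB/s."""
    return num_bytes / seconds / 2**30


def measure_h2d(size_mib=1024, repeats=10, pinned=True):
    """Time repeated host-to-device copies of one buffer.

    Returns the measured bandwidth in GiB/s, or None if CUDA is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    # One byte per element, so numel() equals the buffer size in bytes.
    host = torch.empty(size_mib * 2**20, dtype=torch.uint8, pin_memory=pinned)
    dev = torch.empty_like(host, device="cuda")
    dev.copy_(host, non_blocking=True)  # warm-up copy, not timed
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        dev.copy_(host, non_blocking=True)
    torch.cuda.synchronize()  # wait for all copies before stopping the clock
    elapsed = time.perf_counter() - start
    return gib_per_second(host.numel() * repeats, elapsed)


if __name__ == "__main__":
    for pinned in (True, False):
        print(f"pinned={pinned}: {measure_h2d(pinned=pinned)} GiB/s")
```

Running the same script on Windows and Linux on the same machine would give a direct number for the transfer gap, separate from the rest of the training loop.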
