
TensorRT doesn't cache plans during the synchronous training loop even when USE_CACHE_TENSORRT_PLAN is used #1109

@KvanTTT

Description


Every time I run selfplay or the gatekeeper, it spends a lot of time creating a plan (up to 5 minutes):

2025-10-02 10:34:02+0200: TensorRT backend thread 0: Initializing (may take a long time)
2025-10-02 10:34:04+0200: Creating new plan cache
2025-10-02 10:38:22+0200: Saved new plan cache to \katago_contribute\katagodots-test\scripts\dated\20251002-032522\bin/KataGoData/trtcache/trt-101100_gpu-771cbe50_net-katagodots-test-dotsgame-s3612416-d363241_7_max12x12_batch128_fp16

And the trtcache directory is empty after the iteration.
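
For reference, this is roughly how I checked (a minimal sketch; the relative path KataGoData/trtcache is taken from the "Saved new plan cache to ..." log line above, and the absolute prefix depends on the setup):

import os

# Relative path from the log line above; adjust the prefix to wherever the
# training loop's bin directory lives on your machine.
cache_dir = os.path.join("KataGoData", "trtcache")
print(os.listdir(cache_dir))  # was empty ([]) after the iteration, despite the "Saved" message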

Also, I see that the plan cache's name is bound to a specific model (e.g. s3612416-d363241). Is it possible to share it across generated models?
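
For context, here is how I read the fields of the cache filename from the log above. This is only my interpretation of the name, not anything confirmed from the KataGo source; the point is that the net-... field embeds the model's step/data counters, so every newly generated model appears to get a fresh plan name:

# Minimal sketch: split the plan-cache filename from the log above into its
# apparent fields. The interpretation in the comments is an assumption based
# on the name alone.
name = ("trt-101100_gpu-771cbe50_net-katagodots-test-dotsgame-"
        "s3612416-d363241_7_max12x12_batch128_fp16")

for field in name.split("_"):
    print(field)

# trt-101100    -> presumably the TensorRT version
# gpu-771cbe50  -> presumably a hash identifying the GPU
# net-katagodots-test-dotsgame-s3612416-d363241
#               -> the network name, which changes with every generated model
# 7             -> unclear (possibly a format/serialization version)
# max12x12      -> maximum board size
# batch128      -> maximum batch size
# fp16          -> precision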
