
CUDA Out of Memory on 24GB and 48GB GPUs — What Are the Minimum VRAM Requirements? #9

Description

@yescine

Hello,

I am attempting to run TeleStyle on RunPod, but I consistently encounter CUDA out-of-memory errors, even on GPUs with large VRAM.
I would like clarification on the recommended minimum GPU memory for stable inference.


Environment

  • Container: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04
  • Python: 3.11
  • CUDA: 12.4
  • GPUs tested:
    • 24GB VRAM → OOM
    • 48GB VRAM → still OOM

What I Tried

  • Installed dependencies via requirements.txt
  • Ran the default pipeline (no major modifications)
  • Tested on a larger GPU, assuming it would be sufficient

Despite this, memory is exhausted during execution.
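For reference, this is the minimal instrumentation I am using to see where the memory goes. These are standard `torch.cuda` calls and assume nothing about TeleStyle itself:

```python
import torch

def report_vram(tag: str) -> None:
    # Standard CUDA allocator statistics; all three counters are reported
    # by PyTorch in bytes, converted here to GiB.
    allocated = torch.cuda.memory_allocated() / 1024**3
    reserved = torch.cuda.memory_reserved() / 1024**3
    peak = torch.cuda.max_memory_allocated() / 1024**3
    print(f"[{tag}] allocated={allocated:.2f} GiB  reserved={reserved:.2f} GiB  peak={peak:.2f} GiB")

report_vram("before model load")
# ... load the model and run the pipeline here ...
report_vram("after forward pass")
```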


Questions

  1. What is the minimum VRAM requirement to run TeleStyle reliably?
  2. Is the project designed for:
    • ≥80GB GPUs (A100/H100 class)?
    • multi-GPU setups?
  3. Are there recommended flags for reduced memory usage (a rough sketch of what I mean follows this list), such as:
    • fp16 / bf16
    • model offloading
    • gradient checkpointing
    • lower resolution
    • smaller batch sizes
  4. Is there an example configuration for running on ≤48GB VRAM?
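To make question 3 concrete, here is a rough sketch of the kind of configuration I have in mind. The model below is a stand-in, since I do not know TeleStyle's internal API, and the flags are generic PyTorch memory-reduction techniques, not documented TeleStyle options:

```python
import torch
import torch.nn as nn

# Stand-in network: a toy module in place of the real TeleStyle model,
# whose entry points I don't know. The techniques below are generic PyTorch.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.GELU(),
    nn.Linear(4096, 4096),
)

# bf16 weights roughly halve the footprint relative to fp32.
model = model.to("cuda", dtype=torch.bfloat16).eval()

x = torch.randn(1, 4096, device="cuda", dtype=torch.bfloat16)  # batch size 1

# inference_mode drops autograd bookkeeping entirely, which is often a
# large share of the footprint if a script accidentally runs with grad on.
with torch.inference_mode():
    out = model(x)

print(torch.cuda.max_memory_allocated() / 1024**3, "GiB peak")
```

Is something along these lines, plus offloading idle submodules to CPU, expected to fit in 48GB, or is more VRAM simply required?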
