Hello,
I am attempting to run TeleStyle on RunPod, but I consistently hit CUDA out-of-memory (OOM) errors, even on GPUs with large VRAM.
I would like clarification on the recommended minimum GPU memory for stable inference.
## Environment
- Container: `runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04`
- Python: 3.11
- CUDA: 12.4
- GPUs tested:
  - 24GB VRAM → OOM
  - 48GB VRAM → still OOM
## What I Tried
- Installed dependencies via `requirements.txt`
- Ran the default pipeline (no major modifications)
- Tested on a larger GPU, assuming it would be sufficient
Despite this, memory is exhausted during execution.
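
In case it helps triage, below is the quick diagnostic I can run inside the pod before and after the failing run. It is plain PyTorch, nothing TeleStyle-specific, and just confirms what the process actually sees:

```python
import torch

# Generic PyTorch check (not TeleStyle-specific): what does the process see?
props = torch.cuda.get_device_properties(0)
print(f"device: {props.name}, total VRAM: {props.total_memory / 1024**3:.1f} GiB")

# Free vs. total memory as reported by the CUDA driver.
free, total = torch.cuda.mem_get_info()
print(f"free: {free / 1024**3:.1f} GiB of {total / 1024**3:.1f} GiB")

# After the failing run, this reports the peak the process actually reached.
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")
```

I can attach this output for each GPU size if that is useful.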
## Questions
- What is the minimum VRAM requirement to run TeleStyle reliably?
- Is the project designed for:
  - ≥80GB GPUs (A100/H100 class)?
  - Multi-GPU setups?
- Are there recommended flags for reduced memory usage (see the sketch after this list for what I have in mind), such as:
  - fp16 / bf16
  - model offloading
  - gradient checkpointing
  - lower resolution
  - smaller batch sizes
- Is there an example configuration for running on ≤48GB VRAM?
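
To make the third question concrete: if TeleStyle wraps a diffusers-compatible pipeline (an assumption on my part; I have not confirmed this from the code), the sketch below shows the kind of memory-saving configuration I am asking about. The model id is a placeholder, not a real checkpoint.

```python
import torch
from diffusers import DiffusionPipeline

# Hypothetical setup, assuming a diffusers-compatible pipeline.
# "TeleStyle/checkpoint" is a placeholder id, not a real model.
pipe = DiffusionPipeline.from_pretrained(
    "TeleStyle/checkpoint",
    torch_dtype=torch.bfloat16,  # bf16 halves weight memory vs. fp32
)

# Keep submodules on the CPU and move each to the GPU only while it runs.
pipe.enable_model_cpu_offload()

# Compute attention in slices to trade speed for lower peak memory.
pipe.enable_attention_slicing()
```

Even pointers to the equivalent flags or config keys in TeleStyle's own entry point would fully answer this.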