System: RTX 5060 Ti (16GB VRAM), 48GB RAM, WSL2 Ubuntu, CUDA 13.2
Issue: Running demo.sh locally fails during feature extraction with a JAX OOM error:
RESOURCE_EXHAUSTED: Out of memory while trying to allocate 95651102720 bytes.
The byte size of input/output arguments (95950995456) exceeds the base limit (13682291507).
The VideoPrism JAX model attempts to allocate ~95 GB on a 16 GB GPU. XLA rejects the computation up front, since the argument byte size (~96 GB) exceeds its base limit (~13.7 GB), before any allocation is even attempted. This makes local single-GPU inference impossible on consumer hardware.
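For context, the standard JAX memory environment variables don't help here (they only change how the 16 GB card is carved up, not a ~95 GB request), but this minimal snippet is what I used to rule out preallocation artifacts and confirm the per-device limit:

```python
import os

# Must be set before importing jax: disable XLA's default ~75% VRAM
# preallocation so the OOM isn't just a preallocation artifact.
os.environ.setdefault("XLA_PYTHON_CLIENT_PREALLOCATE", "false")

import jax

# Print each local device's allocator limit vs. current usage.
# memory_stats() may be unavailable on some backends, hence the guard.
for d in jax.local_devices():
    try:
        stats = d.memory_stats() or {}
    except Exception:
        stats = {}
    print(d.platform, stats.get("bytes_limit"), stats.get("bytes_in_use"))
```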
The Hugging Face Space works fine, so the issue is specific to the local inference pipeline; my guess is that the model is compiled/batched for multi-GPU server setups without a single-GPU fallback.
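As a possible stopgap I considered forcing the CPU backend to bypass the GPU allocator's hard limit, roughly like the sketch below (an untested assumption on my part, and a genuine ~95 GB buffer would still exceed 48 GB of system RAM unless the CPU path batches differently):

```python
import os

# Must be set before importing jax: route all computation to the CPU
# backend instead of CUDA.
os.environ["JAX_PLATFORMS"] = "cpu"

import jax

print(jax.default_backend())  # "cpu"
```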
Would appreciate any guidance on running this locally on a single consumer GPU.