Your larger models are split into 50GB chunks because of Hugging Face's file upload limits. Example: https://huggingface.co/TheBloke/goliath-120b-GGUF/tree/main
/scripts/fetch-model.py can't handle these split models on its own when one is passed through the model parameter.
Would it be possible to add a split syntax, so that a pod started with this environment variable downloads all parts of the big model, stitches them together, and then deletes the parts? E.g.
MODEL=https://huggingface.co/TheBloke/goliath-120b-GGUF/tree/main/goliath-120b.Q4_K_M.gguf-split-a,https://huggingface.co/TheBloke/goliath-120b-GGUF/tree/main/goliath-120b.Q4_K_M.gguf-split-b
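
To make the request concrete, here is a rough sketch of how the comma-separated value could be handled. I haven't looked at how fetch-model.py is structured, so every function name below (parse_model_spec, merged_name, stitch_parts, fetch_split_model) is hypothetical. It also assumes the parts are listed in the right order, that the merged file name is the part name without its "-split-<letter>" suffix, and that each URL returns the raw file (a direct-download link rather than a repository browsing page).

```python
import os
import re
import shutil
import urllib.request


def parse_model_spec(value):
    """Split a comma-separated MODEL value into a list of URLs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def merged_name(url):
    """Assumed naming rule: drop a trailing '-split-<letters>' suffix."""
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    return re.sub(r"-split-[a-z]+$", "", filename)


def stitch_parts(part_paths, output_path, delete_parts=True):
    """Concatenate the parts in the given order into output_path."""
    with open(output_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                # Copy in 16 MiB chunks so a 50GB part never sits in memory.
                shutil.copyfileobj(part, out, length=16 * 1024 * 1024)
            if delete_parts:
                # Remove each part once it has been appended.
                os.remove(path)
    return output_path


def fetch_split_model(value, dest_dir="."):
    """Download every URL in value and stitch the results into one file."""
    urls = parse_model_spec(value)
    part_paths = []
    for url in urls:
        path = os.path.join(dest_dir, url.rstrip("/").rsplit("/", 1)[-1])
        urllib.request.urlretrieve(url, path)  # assumes a direct file URL
        part_paths.append(path)
    if len(part_paths) == 1:
        # A single URL is an ordinary, unsplit model: nothing to stitch.
        return part_paths[0]
    return stitch_parts(part_paths, os.path.join(dest_dir, merged_name(urls[0])))
```

Deleting each part right after it has been appended keeps peak disk usage close to the size of the merged model plus one part, which matters on a pod with limited volume space. The downside is that a failure halfway through leaves nothing to resume from, so deleting only after a successful merge might be the safer default.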