I'm trying to run RoSA finetuning on an Nvidia Quadro RTX 6000. The GPU architecture doesn't support bfloat16, so I loaded the model in 4-bit instead (similar to the suggestion for the Colab T4 GPU). Finetuning completes, but when I load the finetuned model and run inference, I get `RuntimeError: no kernel image is available for execution on the device`. Is there any workaround for this? Full finetuning (FFT) doesn't work for me either, since it runs out of GPU RAM.