@zpitroda
Contributor

When building and training with the GPU and then trying to start the service, I get this error:

local_llm_service.py:222 - Failed to start llama-server: free(): double free detected in tcache 2

I believe this is due to the `env["CUDA_VISIBLE_DEVICES"] = ""` line in local_llm_service.py, which causes memory conflicts since inference currently runs only on the CPU.
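
For reference, here is a minimal sketch of the kind of launch path I suspect is involved. The helper name, signature, and exact llama-server invocation are my assumptions for illustration, not the actual code at local_llm_service.py:222:

```python
import os
import subprocess

def start_llama_server(model_path: str, use_gpu: bool = False) -> subprocess.Popen:
    """Launch llama-server in a subprocess (hypothetical helper)."""
    # Copy the parent environment rather than mutating os.environ in place.
    env = os.environ.copy()
    if not use_gpu:
        # An empty CUDA_VISIBLE_DEVICES hides every GPU, forcing CPU-only
        # inference. If llama-server was built with CUDA support, this is
        # the line I suspect triggers the double free during teardown.
        env["CUDA_VISIBLE_DEVICES"] = ""
    return subprocess.Popen(["llama-server", "--model", model_path], env=env)
```

Note that setting `CUDA_VISIBLE_DEVICES` to an empty string is not the same as unsetting it: an empty string tells the CUDA runtime that no devices exist, whereas an unset variable leaves all devices visible.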
