Description
Hi, when I use server-parallel I get the following error: updateSlots : failed to decode the batch, n_batch = 1, ret = 1
This is the complete log leading up to the error:
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required = 107.54 MB
llm_load_tensors: offloading 40 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 43/43 layers to GPU
llm_load_tensors: VRAM used: 8694.21 MB
...................................................................................................
llama_new_context_with_model: n_ctx = 1024
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 800.00 MB
llama_new_context_with_model: kv self size = 800.00 MB
llama_new_context_with_model: compute buffer total size = 118.13 MB
llama_new_context_with_model: VRAM scratch buffer: 112.00 MB
llama_new_context_with_model: total VRAM used: 9606.21 MB (model: 8694.21 MB, context: 912.00 MB)
Available slots:
- slot 0
- slot 1
llama server listening at http://0.0.0.0:8080
system prompt updated
slot 0 is processing
slot 0 released
slot 0 is processing
slot 0 released
slot 0 is processing
slot 0 released
slot 0 is processing
slot 0 released
slot 0 is processing
slot 1 is processing
updateSlots : failed to decode the batch, n_batch = 1, ret = 1
I run server-parallel with the following command:
./server-parallel -m models/xyz.gguf --ctx_size 2048 -t 4 -ngl 40 --host 0.0.0.0 --batch-size 512 --parallel 2
Of course, this only happens when both slots are performing inference at the same time; a rough reproduction is sketched below. Could you please help me resolve this issue?
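For reference, this is roughly how I get both slots busy at once: I fire two requests concurrently so neither has finished before the other starts. The /completion endpoint and JSON body below are assumptions on my part; substitute whatever request format server-parallel actually expects, the key point is only that the two requests overlap.

```sh
# Sketch: send two requests in parallel so both slots are decoding at the same time.
# NOTE: endpoint and payload are assumptions, not the confirmed server-parallel API.
curl -s http://0.0.0.0:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello from request A"}' &
curl -s http://0.0.0.0:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello from request B"}' &
wait   # wait for both background requests to finish
```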