
[LiteLLM Enhancement] Enable extra_body dict for litellm backend #1194

Closed
eldarkurtic wants to merge 1 commit into huggingface:main from eldarkurtic:litellm-add-extrabody

Conversation

@eldarkurtic (Contributor)

In addition to the standard OpenAI sampling args, the vLLM server supports additional arguments via its extra_body argument, such as top_k, min_p, etc. The motivation for this feature is to enable running Qwen3.5 models with their recommended sampling arguments, which include, for example, top_k (see https://huggingface.co/Qwen/Qwen3.5-27B#using-qwen35-via-the-chat-completions-api).

This PR enables this feature through both interfaces: YAML config and inline CLI string.

1. YAML example:

```yaml
model_parameters:
  provider: "hosted_vllm"
  model_name: "hosted_vllm/<model_name>"
  base_url: "http://0.0.0.0:8000/v1"
  api_key: ""
  timeout: 1200
  concurrent_requests: 128
  generation_parameters:
    temperature: 0.6
    max_new_tokens: 65536
    top_p: 0.95
    seed: 0
  extra_body:
    top_k: 20
    min_p: 0.0
```
2. Inline string example:

```shell
lighteval endpoint litellm \
"model_name=hosted_vllm/my_mdl,provider=hosted_vllm,base_url=http://0.0.0.0:8000/v1,timeout=120,concurrent_requests=8,extra_body={top_k:20,min_p:0.0,repetition_penalty:1.0},generation_parameters={temperature:1.0,max_new_tokens:65536,top_p:0.95,seed:42,presence_penalty:1.5}" \
"aime25|0" \
--output-dir ${OUTPUT_DIR}
```
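The inline `{key:value,...}` syntax above has to be parsed into a dict before the backend can use it. A rough sketch of such a parser (hypothetical; not the parser lighteval actually uses), assuming a flat spec with numeric values:

```python
def parse_inline_dict(spec: str) -> dict:
    """Parse a flat "{k1:v1,k2:v2}" spec into a dict.

    Values are coerced to int or float when possible, else kept as strings.
    """
    body = spec.strip().strip("{}")
    result = {}
    for pair in filter(None, body.split(",")):
        key, _, raw = pair.partition(":")
        value = raw.strip()
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                continue
        result[key.strip()] = value
    return result


print(parse_inline_dict("{top_k:20,min_p:0.0,repetition_penalty:1.0}"))
# {'top_k': 20, 'min_p': 0.0, 'repetition_penalty': 1.0}
```

Note this sketch does not handle nested braces, so `generation_parameters={...}` would need to be split off before the comma-separated top-level args are parsed.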

@eldarkurtic (Contributor, Author)

Closing this in favor of #1195.
