[Litellm Enhancement] Enable extra sampling args for litellm backend by eldarkurtic · Pull Request #1195 · huggingface/lighteval

eldarkurtic · 2026-03-20T12:34:56Z

When models are served with vllm, vllm's OpenAI-compatible endpoint supports some additional sampling args such as top_k, min_p, and presence_penalty. This PR enables forwarding these as standard generation_parameters without breaking litellm usage for other endpoints (openai, togetherai, etc.) that don't support these extra args.

This PR is a less invasive approach than #1194.

Motivation for these extra sampling args: newer models like Qwen3.5 recommend top_k/presence_penalty as sampling args for the "best behavior" of their models. Without this PR, it is not possible to evaluate vllm-hosted Qwen3.5 models with the recommended sampling args.
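
To illustrate the forwarding behavior described above, here is a minimal sketch, not the code from this PR. The helper name `build_sampling_kwargs`, the `served_by_vllm` flag, and the `VLLM_ONLY_SAMPLING_ARGS` tuple are all hypothetical and chosen for illustration; the actual diff may detect the backend and pass the arguments differently.

```python
from typing import Any

# Assumption: the extra sampling args named in the PR description.
VLLM_ONLY_SAMPLING_ARGS = ("top_k", "min_p", "presence_penalty")


def build_sampling_kwargs(
    generation_parameters: dict[str, Any], served_by_vllm: bool
) -> dict[str, Any]:
    """Hypothetical helper: pick the sampling args to send with a request.

    Unset (None) values are dropped. The extra args are forwarded only
    when the endpoint is known to be served by vllm, so other endpoints
    never receive parameters they may reject.
    """
    kwargs: dict[str, Any] = {}
    for name, value in generation_parameters.items():
        if value is None:
            continue  # parameter not set by the user
        if name in VLLM_ONLY_SAMPLING_ARGS and not served_by_vllm:
            continue  # skip extra args for non-vllm endpoints
        kwargs[name] = value
    return kwargs


params = {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": None}
vllm_kwargs = build_sampling_kwargs(params, served_by_vllm=True)
other_kwargs = build_sampling_kwargs(params, served_by_vllm=False)
```

With these inputs, `vllm_kwargs` keeps `top_k` while `other_kwargs` carries only `temperature` and `top_p`, which is the "don't break other endpoints" property the description asks for.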

enable extra sampling args

7c22ae5

eldarkurtic mentioned this pull request Mar 20, 2026

[LiteLLM Enhancement] Enable extra_body dict for litellm backend #1194

Closed


eldarkurtic wants to merge 1 commit into huggingface:main from eldarkurtic:enable-litellm-extra-args
