[Litellm Enhancement] Enable extra sampling args for litellm backend#1195

Open
eldarkurtic wants to merge 1 commit into huggingface:main from eldarkurtic:enable-litellm-extra-args

Conversation

@eldarkurtic
Contributor

When models are served with vLLM, its OpenAI-compatible endpoint supports additional sampling args such as top_k, min_p, and presence_penalty. This PR enables forwarding these as standard generation_parameters without breaking litellm usage for other endpoints (e.g., openai, togetherai) that don't support these extra args.

This PR takes a less invasive approach than #1194.

Motivation for these extra sampling args: newer models like Qwen3.5 recommend top_k/presence_penalty sampling settings for best results. Without this PR, it was not possible to evaluate vLLM-hosted Qwen3.5 models with their recommended sampling args.
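A minimal sketch of the forwarding idea (function and parameter names here are illustrative, not lighteval's actual API): vLLM-only args are routed into the OpenAI `extra_body` payload only when the target provider is a vLLM-style endpoint, so standard providers never see parameters they would reject.

```python
# Hypothetical helper illustrating the split between standard OpenAI
# sampling args and vLLM-only extras. Names are assumptions for this sketch.

# Args that the OpenAI API itself does not accept, but vLLM's
# OpenAI-compatible server does (via the request's extra_body).
VLLM_ONLY_ARGS = {"top_k", "min_p", "repetition_penalty"}


def build_completion_kwargs(generation_parameters: dict, provider: str) -> dict:
    """Split generation parameters into standard kwargs and extra_body."""
    standard, extra = {}, {}
    for name, value in generation_parameters.items():
        if value is None:
            continue  # skip unset parameters entirely
        if name in VLLM_ONLY_ARGS:
            extra[name] = value
        else:
            standard[name] = value
    # Only forward the extras to endpoints that understand them.
    if provider == "hosted_vllm" and extra:
        standard["extra_body"] = extra
    return standard


kwargs = build_completion_kwargs(
    {"temperature": 0.6, "top_k": 20, "min_p": 0.0, "presence_penalty": 1.5},
    provider="hosted_vllm",
)
# top_k and min_p end up under extra_body; temperature and
# presence_penalty stay top-level, since OpenAI accepts them natively.
```

For a non-vLLM provider (e.g. `provider="openai"`), the same call would simply drop the extras from the request, which is how other litellm endpoints keep working unchanged.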
