
Add inference_max_seq_len to ray mbridge deployment path #588

Open
athitten wants to merge 1 commit into main from athitten/inf_max_seqlen_ray

Conversation

@athitten (Contributor) commented Feb 7, 2026

Adds inference_max_seq_len to the ray mbridge deployment path. This option was previously exposed only in the PyTriton deployment path, not the ray mbridge one. It needs to be set when deploying for eval benchmarks such as HumanEval, which use a large max_tokens value.

Signed-off-by: Abhishree <abhishreetm@gmail.com>
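For context, a flag like this is typically plumbed through the deploy script's argument parser and forwarded into the inference configuration. Below is a minimal, hypothetical argparse sketch: the flag name `--inference_max_seq_len` comes from the PR title, but the default value, the other arguments, and the parser structure are illustrative assumptions, not the actual NeMo deploy script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch of a Ray deploy-script parser; the real script's
    # flag names, defaults, and structure may differ.
    parser = argparse.ArgumentParser(
        description="Deploy a model via Ray (illustrative sketch)"
    )
    # The flag this PR exposes on the ray mbridge path: the maximum total
    # sequence length (prompt + generated tokens) the inference engine
    # allocates for. Benchmarks like HumanEval request a large max_tokens,
    # so this limit must be large enough to accommodate them.
    parser.add_argument(
        "--inference_max_seq_len",
        type=int,
        default=4096,  # assumed default for illustration only
        help="Maximum sequence length supported during inference",
    )
    return parser


if __name__ == "__main__":
    # Example: raise the limit for a long-generation eval benchmark.
    args = build_parser().parse_args(["--inference_max_seq_len", "8192"])
    print(args.inference_max_seq_len)
```

The design point is simply that the cap must be configurable per deployment: a default sized for chat-style workloads can silently truncate or reject the long generations that code benchmarks request.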
@copy-pr-bot bot commented Feb 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@athitten athitten added r0.4.0 Cherry-pick PR to r0.4.0 release branch and removed deploy LLM scripts labels Feb 7, 2026
@athitten (Contributor, Author) commented Feb 7, 2026

/ok to test 8668e30

