
Conversation


@CharlieFRuan commented on Jan 16, 2026

The goal of this PR is to allow /chat/completions requests to vllm to be templated with a custom chat template. One motivation is to avoid stripping thinking tokens when training Qwen3 models.

Users can now do it with +generator.engine_init_kwargs.chat_template=/path/to/templates/qwen3_acc_thinking.jinja2.

We enable this by parsing the engine_init_kwargs.chat_template configuration and passing it to vllm's OpenAIServingChat in our vllm_engine.py.
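
Roughly, the wiring looks like the sketch below. The helper names and the exact OpenAIServingChat constructor arguments are assumptions for illustration, not the literal code in vllm_engine.py.

```python
# Sketch only: get_custom_chat_template and pop_chat_template are hypothetical
# helpers, and the OpenAIServingChat constructor varies across vLLM versions;
# recent versions do accept a chat_template keyword.
from pathlib import Path
from typing import Optional


def get_custom_chat_template(template_path: str) -> str:
    """Read the raw Jinja2 chat template text from disk."""
    return Path(template_path).expanduser().read_text()


def pop_chat_template(engine_init_kwargs: dict) -> Optional[str]:
    """Pull chat_template out of the init kwargs (defaulting to None) so the
    remaining kwargs can be forwarded to the vLLM engine untouched."""
    template_path = engine_init_kwargs.pop("chat_template", None)
    return get_custom_chat_template(template_path) if template_path else None


# Usage inside engine construction, roughly:
#   chat_template = pop_chat_template(engine_init_kwargs)
#   serving_chat = OpenAIServingChat(..., chat_template=chat_template, ...)
```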

The effect can be seen in the added unit test.

This only takes effect when run_engines_locally: True and backend: "vllm".

We add a skyrl_train/utils/template folder containing a Qwen3 template that does not strip thinking tokens (see the folder's README for a detailed description).
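
For intuition on what the template changes, the difference can be checked directly with a Hugging Face tokenizer. This is a hedged sketch: the checkpoint name and the template filename inside the folder are assumptions, and the folder's README plus the added unit test are the authoritative references.

```python
# Sketch only: "Qwen/Qwen3-0.6B" and the template filename are assumed for
# illustration; consult the template folder's README for the real details.
from pathlib import Path
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
custom_template = Path("skyrl_train/utils/template/qwen3_acc_thinking.jinja2").read_text()

messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "<think>Simple arithmetic.</think>4"},
    {"role": "user", "content": "And 3 + 3?"},
]

# The stock Qwen3 template drops <think>...</think> content from earlier assistant
# turns when re-rendering the history; the custom template keeps it, which matters
# when training on multi-turn traces that include reasoning.
default_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
custom_prompt = tokenizer.apply_chat_template(
    messages, chat_template=custom_template, tokenize=False, add_generation_prompt=True
)
print("<think>" in default_prompt)  # expected False with the stock template
print("<think>" in custom_prompt)   # expected True with the thinking-preserving template
```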

The generator.chat_template still only applies to SkyRLGymGenerator.

@CharlieFRuan changed the title from "[train][OpenAI] Make chat_template config apply to vllm /chat/completions templating" to "[train][OpenAI] Make chat_template config apply to vllm /chat/completions" on Jan 16, 2026
@CharlieFRuan (Collaborator, Author)

/gemini review

@gemini-code-assist bot left a comment


Code Review

This pull request extends the chat_template configuration for vLLM's /chat/completions endpoint, enabling customization of model behavior such as preserving thinking tokens. However, a critical security vulnerability has been identified: the get_custom_chat_template function loads templates from an unvalidated file path, leading to a Path Traversal vulnerability that can escalate to Remote Code Execution via Server-Side Template Injection due to the use of Jinja2. Detailed remediation is provided in a line comment. Furthermore, there's a minor typo in the configuration, a suggestion to add a default value when popping chat_template from kwargs for robustness, and a potential TypeError in the synchronous VLLMInferenceEngine if chat_template is not properly handled.
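
The remediation being asked for boils down to resolving the user-supplied path and confining it to an allow-listed directory before reading it. A minimal sketch of that idea follows; ALLOWED_TEMPLATE_DIR and this exact structure are assumptions, not the fix that actually landed in the PR.

```python
# Sketch of the kind of validation the review requests; ALLOWED_TEMPLATE_DIR is a
# hypothetical allow-list, not what the PR ultimately implemented.
from pathlib import Path

ALLOWED_TEMPLATE_DIR = Path("skyrl_train/utils/template").resolve()


def get_custom_chat_template(template_path: str) -> str:
    resolved = Path(template_path).resolve()
    # resolve() canonicalizes ".." segments and symlinks, so a path like
    # "../../etc/passwd" is rejected here instead of being read and later
    # rendered by Jinja2.
    if not resolved.is_relative_to(ALLOWED_TEMPLATE_DIR):
        raise ValueError(f"chat_template must live under {ALLOWED_TEMPLATE_DIR}, got {resolved}")
    return resolved.read_text()
```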

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a valuable feature by allowing custom chat templates for vLLM's /chat/completions endpoint, which is particularly useful for models like Qwen3. However, the implementation for loading templates from a file is vulnerable to a high-severity path traversal attack, potentially allowing an attacker to read arbitrary files on the filesystem. This critical security flaw must be addressed by validating file paths to ensure they are within expected directories. Additionally, a critical bug causing a crash with the synchronous inference engine was identified, along with a minor typo in a configuration file and an opportunity to improve code robustness in a test file. Addressing these issues will significantly enhance the security, stability, and reliability of the implementation.

@CharlieFRuan force-pushed the charlie/011626-templateChatCompletion branch from 73f31c8 to 5e8d1f0 on January 19, 2026 00:43
@CharlieFRuan changed the title from "[train][OpenAI] Make chat_template config apply to vllm /chat/completions" to "[train][OpenAI] Make engine_init_kwargs.chat_template config apply to vllm /chat/completions" on Jan 19, 2026
@CharlieFRuan (Collaborator, Author)

/gemini review

@gemini-code-assist bot left a comment


Code Review

This pull request extends the chat_template configuration to vLLM's /chat/completions endpoint, enhancing support for models like Qwen3. However, it introduces a critical path traversal vulnerability in the file path validation, allowing arbitrary file reads from the local filesystem. Addressing this vulnerability is crucial, and a minor refactoring in the test code could also improve clarity and robustness.

@CharlieFRuan (Collaborator, Author)

/gemini review

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a valuable feature for customizing chat templates in vLLM's /chat/completions endpoint, which is particularly useful for models like Qwen3. The implementation is clean, and the inclusion of a new template, documentation, and a dedicated unit test is excellent. My review focuses on enhancing the security of the new file path validation logic and improving code professionalism.


@CharlieFRuan merged commit 28c5251 into main on Jan 19, 2026
4 checks passed
@CharlieFRuan deleted the charlie/011626-templateChatCompletion branch on January 19, 2026 04:36