Pull requests: vllm-project/vllm

Pull requests list

[bug] Fix lora reloading from disk every step when using AsyncLLM bug Something isn't working frontend
#41482 opened May 1, 2026 by hao-aaron Contributor
4 tasks
[Bugfix] Fix B200 batch determinism in fused_moe logic bug Something isn't working
#41480 opened May 1, 2026 by Lucaskabela Contributor Draft
3 of 4 tasks
Limit concurrency on test_transcription_api_correctness.py
#41478 opened May 1, 2026 by ekagra-ranjan Contributor
[WIP] Clean up scheduling loop v1
#41473 opened May 1, 2026 by andylolu2 Contributor Draft
[Refactor] Remove dead code in tests and parallel_state kv-connector ready ONLY add when PR is ready to merge/full CI is needed v1
#41471 opened May 1, 2026 by yewentao256 Member
[Bugfix] Validate 'text' field in Responses API multi-modal content bug Something isn't working frontend
#41470 opened May 1, 2026 by ankrovv
4 tasks done
[Bugfix] Fix Qwen3Coder prev_tool_call_arr double-emission on parse failure bug Something isn't working qwen Related to Qwen models tool-calling
#41466 opened May 1, 2026 by ToastyTheBot Draft
3 tasks
[BugFix][MyPy]: Module has no attribute "sched_getaffinity" [attr-defined] bug Something isn't working cpu Related to CPU backends
#41465 opened May 1, 2026 by hickeyma Contributor
Fix DeepSeek-OCR for Transformers v4 deepseek Related to DeepSeek models
#41460 opened May 1, 2026 by hmellor Member
fix(frontend): Add multimodal placeholders to Gemma4 tool message template documentation Improvements or additions to documentation
#41459 opened May 1, 2026 by harshaljanjani
4 of 5 tasks
[Perf] Fuse Qwen3.5 GDN in_proj_ba into 6-way in_proj MergedColumnParallelLinear qwen Related to Qwen models
#41457 opened May 1, 2026 by jhsmith409 Contributor
5 tasks done
[ROCM][RDNA3] WMMA paged prefill and split-K decode kernels for ROCM_ATTN ci/build rocm Related to AMD ROCm v1
#41455 opened May 1, 2026 by JartX Contributor
[Bugfix] Reject unsupported FlexAttention head sizes bug Something isn't working v1
#41454 opened May 1, 2026 by bugkeep
[CI] Route part of B200 jobs to b200-k8s ci/build ready ONLY add when PR is ready to merge/full CI is needed
#41453 opened May 1, 2026 by khluu Collaborator Draft
[ROCm][Deepseekv4] DeepseekV4 Mi300 support ci/build deepseek Related to DeepSeek models documentation Improvements or additions to documentation rocm Related to AMD ROCm v1
#41451 opened May 1, 2026 by ganyi1996ppo Contributor Draft
4 tasks
[Kernel][AMD] Optimize GatedDeltaNet FLA prefill kernels on MI300X rocm Related to AMD ROCm
#41446 opened May 1, 2026 by zobinHuang
[DSV4] AR+mhc_post fusion ci/build nvidia performance Performance-related issues
#41441 opened May 1, 2026 by jeejeelee Collaborator Draft
4 tasks
Revert "Fix Cohere ASR after HF upgrade" (#40582) multi-modality Related to multi-modality (#4194)
#41439 opened May 1, 2026 by vllm-agent Draft