Pull requests: vllm-project/vllm
[fix] fix SM check for Flashinfer TRTLLM MOE
nvidia
#30314
opened Dec 9, 2025 by
jiahanc
5 tasks
[Misc][Quantization] Clarify the intent of GGUF FusedMoE weight materialization
#30310
opened Dec 9, 2025 by
a4lg
1 of 5 tasks
[DCP][Bugfix][CI] Fix accuracy issue of DCP when using FLASH_ATTN_MLA
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#30309
opened Dec 9, 2025 by
FENP
3 of 5 tasks
[bugfix][quantization] fix quark qwen3 kv_cache quantization
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#30308
opened Dec 9, 2025 by
haoyangli-amd
[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models
qwen
Related to Qwen models
#30307
opened Dec 9, 2025 by
a4lg
3 of 5 tasks
Fix incomplete response generation for tool call outputs
deepseek
Related to DeepSeek models
fb-exported
frontend
meta-exported
[ResponsesAPI] Add GPTOSS MCP tool streaming
frontend
gpt-oss
Related to GPT-OSS models
#30301
opened Dec 9, 2025 by
qandrew
[Bugfix] Update WSL detection to check for WSL1 compatibility as WSL2…
#30299
opened Dec 9, 2025 by
HoneyBerries
Main 20251205
amd
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#30298
opened Dec 9, 2025 by
Alexei-V-Ivanov-AMD
[Core] Add SLA-tiered scheduling (opt-in) and docs
documentation
Improvements or additions to documentation
v1
#30297
opened Dec 9, 2025 by
ProdByBuddha
3 of 5 tasks
[CI/Build][Kernel][BugFix][AMD] Fix per_token_group_quant_fp8 to use correct fp8 min/max values and update atol/rtol in test_quantfp8_group_functionality
rocm
Related to AMD ROCm
#30292
opened Dec 9, 2025 by
rasmith
[CI/Build][AMD] Fix ref_dynamic_per_token_quant reference implementation on ROCm.
rocm
Related to AMD ROCm
#30291
opened Dec 9, 2025 by
rasmith
[Core] Add token-level KV cache metrics to V1 engine
v1
#30289
opened Dec 9, 2025 by
Minsung-commit
Ensure minimum frames for GLM 4.6V compatibility
#30285
opened Dec 9, 2025 by
gh-wf
1 of 3 tasks
[BugFix] Lazy tokenizer init in StructuredOutputManager to prevent GGUF semaphore leak
structured-output
v1
#30284
opened Dec 9, 2025 by
kitaekatt
4 tasks
[Small] Add comment for parallel_config in FusedMoEModularKernel
#30282
opened Dec 8, 2025 by
yewentao256
Draft
[CI/Build] Ignore data_parallel_size_local
#30281
opened Dec 8, 2025 by
rjrock
3 of 5 tasks
[BugFix] Fix non detected failing tests
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#30277
opened Dec 8, 2025 by
ilmarkov
5 tasks
[ROCM][CI] Fix AMD Examples Test Group
ci/build
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#30276
opened Dec 8, 2025 by
Concurrensee
[NIXL] refine decoder side post process for heterogeneous BlockSize and kv_layout
kv-connector
v1
#30275
opened Dec 8, 2025 by
xuechendi
5 tasks