Pull requests: vllm-project/vllm
[Multimodal][Speculative Decoding]Eagle3 mm support, enablement on qwen3vl
new-model
Requests to new models
qwen
Related to Qwen models
speculative-decoding
v1
#29594
opened Nov 27, 2025 by
EanWang211123
5 tasks
[Bugfix] Fix HunyuanVL XD-RoPE
ready
ONLY add when PR is ready to merge/full CI is needed
#29593
opened Nov 27, 2025 by
ywang96
5 tasks
Fix multiprocessing start method for CUDA compatibility in LMCache kv_cache_sharing_lmcache_v1.py
documentation
Improvements or additions to documentation
kv-connector
nvidia
#29592
opened Nov 27, 2025 by
usberkeley
5 tasks done
[BugFix] Optional tokenizer argument when loading GGUF models
#29582
opened Nov 27, 2025 by
sts07142
1 of 5 tasks
[BugFix] Fix new nightly failures
ready
ONLY add when PR is ready to merge/full CI is needed
ready-run-all-tests
Trigger CI with all tests for wide-ranging PRs
v1
#29578
opened Nov 27, 2025 by
LucasWilkinson
[Doc] Add 20251202 vLLM Malaysia Meetup Info
documentation
Improvements or additions to documentation
#29577
opened Nov 27, 2025 by
tjtanaa
5 tasks
[Docs] Improve priority parameter documentation
frontend
#29572
opened Nov 27, 2025 by
maang-h
[NIXL][Bugfix] Fix NIXL/RDMA registration failure over CuMemAllocator
kv-connector
#29569
opened Nov 27, 2025 by
Somoku
5 tasks
Make PyTorch profiler gzip and CUDA time dump configurable
documentation
Improvements or additions to documentation
nvidia
v1
#29568
opened Nov 27, 2025 by
zhangruoxu
5 tasks
[Perf] Enable cuda graph for deepepHT, 5.3% throughput improvement, 4.4% TTFT improvement
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#29558
opened Nov 27, 2025 by
yewentao256
[Frontend] Remap -O to -cc commandline flag
ci/build
documentation
Improvements or additions to documentation
#29557
opened Nov 27, 2025 by
gmagogsfm
[CI/Build] Skip ray tests on ROCm
rocm
Related to AMD ROCm
v1
#29556
opened Nov 26, 2025 by
rjrock
3 of 5 tasks
[PERF] Qwen3-next. Add fp8 cutlass MoE tuned configs. chmod -x *MI308X.json
nvidia
qwen
Related to Qwen models
#29553
opened Nov 26, 2025 by
vadiklyutiy
[responsesAPI] support input output messages for non harmony models
frontend
gpt-oss
Related to GPT-OSS models
#29549
opened Nov 26, 2025 by
qandrew
[Perf] Deepgemm fused layout kernel for activations, 4.3% throughput improvement, 10.7% TTFT improvement.
ready
ONLY add when PR is ready to merge/full CI is needed
#29546
opened Nov 26, 2025 by
yewentao256
[Bugfix] Fix DeepSeek R1 MTP weight loading
deepseek
Related to DeepSeek models
#29545
opened Nov 26, 2025 by
MatthewBonanni
Draft
3 of 5 tasks
[BugFix] Fix spec decoding max_tokens scheduling perf issue
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#29542
opened Nov 26, 2025 by
njhill
[Attention] Update attention imports
kv-connector
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
speculative-decoding
tpu
Related to Google TPUs
v1
#29540
opened Nov 26, 2025 by
MatthewBonanni
3 of 5 tasks
Remove duplicate fake registration implementations for gptq_marlin_repack and awq_marlin_repack operations.
#29524
opened Nov 26, 2025 by
atalhens
4 tasks
[Frontend] Add streaming tool-call support to Responses API (non-Harmony)
frontend
gpt-oss
Related to GPT-OSS models
#29513
opened Nov 26, 2025 by
sumitaryal
5 tasks