vllm-project / vllm Public

Fork 11.8k
Star 64.9k

Code
Issues 1.9k
Pull requests 1.3k
Discussions
Actions
Projects 20
Security
Insights

Pull requests: vllm-project/vllm

Labels 77 Milestones 3

1,274 Open 15,990 Closed

Pull requests list

[Frontend] [Doc] Exclude log deltas feature frontend

#30322 opened Dec 9, 2025 by Catacomba

3 tasks done

1

[BugFix] Spec decode with VLLM_ENABLE_V1_MULTIPROCESSING=0 v1

#30319 opened Dec 9, 2025 by heheda12345

5 tasks

3

[Frontend] Allow users to modify part of the scheduler configuration online. frontend v1

#30316 opened Dec 9, 2025 by noooop • Draft

5 tasks

3

Generalize pooling model support with multi-task, multi-layer, multi-label classification that can be pooled from both hidden states and LM head's logits.

#30315 opened Dec 9, 2025 by kflu

3 of 5 tasks

9

[fix] fix SM check for Flashinfer TRTLLM MOE nvidia

#30314 opened Dec 9, 2025 by jiahanc

5 tasks

2

[Misc][Quantization] Clarify the intent of GGUF FusedMoE weight materialization

#30310 opened Dec 9, 2025 by a4lg

1 of 5 tasks

[bugfix][quantization] fix quark qwen3 kv_cache quantization qwen

Related to Qwen models

ONLY add when PR is ready to merge/full CI is needed

#30308 opened Dec 9, 2025 by haoyangli-amd

5

[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models qwen

#30307 opened Dec 9, 2025 by a4lg

3 of 5 tasks

3

Fix incomplete response generation for tool call outputs deepseek

Related to DeepSeek models

fb-exported frontend meta-exported

#30304 opened Dec 9, 2025 by qandrew • Draft

6

[Misc] Pass reasoning to deepseekV32 tokenizer deepseek

frontend

#30302 opened Dec 9, 2025 by kingsmad • Draft

5 tasks

2

[ResponsesAPI] Add GPTOSS MCP tool streaming frontend gpt-oss

Related to GPT-OSS models

#30301 opened Dec 9, 2025 by qandrew

[Bugfix] Update WSL detection to check for WSL1 compatibility as WSL2…

#30299 opened Dec 9, 2025 by HoneyBerries

6

Main 20251205 amd ci/build ready

Related to AMD ROCm

#30298 opened Dec 9, 2025 by Alexei-V-Ivanov-AMD

7

[Core] Add SLA-tiered scheduling (opt-in) and docs documentation

Improvements or additions to documentation

v1

#30297 opened Dec 9, 2025 by ProdByBuddha

3 of 5 tasks

6

[CI/Build][Kernel][BugFix][AMD] Fix per_token_group_quant_fp8 to use correct fp8 min/max values and update atol/rtol in test_quantfp8_group_functionality rocm

#30292 opened Dec 9, 2025 by rasmith

3

[CI/Build][AMD] Fix ref_dynamic_per_token_quant reference implementation on ROCm. rocm

#30291 opened Dec 9, 2025 by rasmith

2

[Core] Add token-level KV cache metrics to V1 engine v1

#30289 opened Dec 9, 2025 by Minsung-commit

5

Adding quantized fused_moe_lora support

#30286 opened Dec 9, 2025 by yugong333

5 tasks

5

Ensure minimum frames for GLM 4.6V compatibility

#30285 opened Dec 9, 2025 by gh-wf

1 of 3 tasks

5

[BugFix] Lazy tokenizer init in StructuredOutputManager to prevent GGUF semaphore leak structured-output v1

#30284 opened Dec 9, 2025 by kitaekatt

4 tasks

6

[Small] Add comment for parallel_config in FusedMoEModularKernel

#30282 opened Dec 8, 2025 by yewentao256 • Draft

[CI/Build] Ignore data_parallel_size_local

#30281 opened Dec 8, 2025 by rjrock

3 of 5 tasks

2

Add moe_align_block_size_no_permute for small batch size with large num_expert needs-rebase

#30280 opened Dec 8, 2025 by RunkaiTao • Draft

5 tasks

4

[CPU][Bugfix] Fix CPU Profiler issue v1

#30278 opened Dec 8, 2025 by zhili03

7

[BugFix] Fix non detected failing tests ci/build ready

#30277 opened Dec 8, 2025 by ilmarkov

5 tasks

4
