Issues: vllm-project/production-stack
#332 feature: Need Multi-Node Multi-GPU to deploy one LLM with 671B DeepSeek using vllm and K8S [feature request] (opened Mar 28, 2025 by TYsonHe)
#331 bug: dynamically serving LoRA Adapters is not working [bug] (opened Mar 28, 2025 by robert-moyai)
#322 feature: Add Per-Model Metrics Aggregation to Router [feature request] (opened Mar 26, 2025 by HarelKeren)
#318 bug: No backend metrics when querying vllm-router /metrics endpoint [bug] (opened Mar 24, 2025 by ghpu)
#310 bug: vLLM pod crashes if remote shared storage takes time to bootstrap [bug] (opened Mar 20, 2025 by Stephen-X)
#309 feature: using no GIL or Granian to improve router performance [feature request] (opened Mar 19, 2025 by YuhanLiu11)
#307 Helm template improvements [feature request] (opened Mar 19, 2025 by chosey85)
#304 Question: Is there any solution to load a local model? [question] (opened Mar 18, 2025 by Cangxihui)
#300 [Roadmap] vLLM Production Stack roadmap for 2025 Q2 (2 of 27 tasks) (opened Mar 17, 2025 by YuhanLiu11)
#294 Question about tolerations in servingEngineSpec [question] (opened Mar 17, 2025 by hongkunyoo)
#292 The output got stuck when I configured lmcacheConfig in yaml [bug] (opened Mar 17, 2025 by Cangxihui)
#278 feature: expanding the OpenAI API for richer functionalities [feature request] (opened Mar 13, 2025 by KuntaiDu)
#276 Wrong shm size in deployment pod when inference model with multi gpu [bug] (opened Mar 13, 2025 by tuanhm-3488)
#271 feature: Terraform tutorial for MS Azure [feature request] (opened Mar 12, 2025 by falconlee236)
#270 Question - pinning GPUs [question] (opened Mar 12, 2025 by chosey85)
#267 feature: support for creating ServiceMonitor directly from helm chart [feature request] (opened Mar 12, 2025 by Hexoplon)
#265 feature: support for Kubernetes Gateway Inference Extensions [feature request] (opened Mar 12, 2025 by AndresGuedez)
#244 feature: Optimize vLLM production-stack for agentic workflows (BeeAI, MCP) via KV-cache reuse and context-aware routing [feature request] (opened Mar 7, 2025 by wenboown)
#205 feature: Support LoRA loading for model deployments [feature request] (opened Mar 1, 2025 by ApostaC)
#204 feature: Support CRD based configuration [feature request] (opened Mar 1, 2025 by rootfs)
#193 bug: File Access Error with vllm using runai_streamer on OCP [bug] (opened Feb 27, 2025 by TamKez)
#178 feature: unify naming of production-stack, vllm-stack and vllm-router [discussion, feature request] (opened Feb 25, 2025 by bufferoverflow)
#167 feature: Terraform Quickstart Tutorials for Underlying Infrastructure [feature request] (opened Feb 21, 2025 by 0xThresh)
#166 Discussion - QPS routing when there are multiple router replicas [discussion, question] (opened Feb 21, 2025 by aishwaryaraimule21)