Issues: vllm-project/production-stack
#332 feature: Need Multi-Node Multi-GPU to deploy one LLM with 671B DeepSeek using vllm and K8S [feature request] (opened Mar 28, 2025 by TYsonHe)
#331 bug: dynamically serving LoRA Adapters is not working [bug] (opened Mar 28, 2025 by robert-moyai)
#322 feature: Add Per-Model Metrics Aggregation to Router [feature request] (opened Mar 26, 2025 by HarelKeren)
#318 bug: No backend metrics when querying vllm-router /metrics endpoint [bug] (opened Mar 24, 2025 by ghpu)
#310 bug: vLLM pod crashes if remote shared storage takes time to bootstrap [bug] (opened Mar 20, 2025 by Stephen-X)
#309 feature: using no GIL or Granian to improve router performance [feature request] (opened Mar 19, 2025 by YuhanLiu11)
#307 Helm template improvements [feature request] (opened Mar 19, 2025 by chosey85)
#304 Question: Is there any solution to load a local model? [question] (opened Mar 18, 2025 by Cangxihui)
#300 [Roadmap] vLLM Production Stack roadmap for 2025 Q2 (2 of 27 tasks) (opened Mar 17, 2025 by YuhanLiu11)
#294 Question about tolerations in servingEngineSpec [question] (opened Mar 17, 2025 by hongkunyoo)
#292 The output got stuck when I configured lmcacheConfig in yaml [bug] (opened Mar 17, 2025 by Cangxihui)
#278 feature: expanding the OpenAI API for richer functionalities [feature request] (opened Mar 13, 2025 by KuntaiDu)
#276 Wrong shm size in deployment pod when inference model with multi gpu [bug] (opened Mar 13, 2025 by tuanhm-3488)
#271 feature: Terraform tutorial for MS Azure [feature request] (opened Mar 12, 2025 by falconlee236)
#270 Question - pinning GPUs [question] (opened Mar 12, 2025 by chosey85)
#267 feature: support for creating ServiceMonitor directly from helm chart [feature request] (opened Mar 12, 2025 by Hexoplon)
#265 feature: support for Kubernetes Gateway Inference Extensions [feature request] (opened Mar 12, 2025 by AndresGuedez)
#244 feature: Optimize vLLM production-stack for agentic workflows (BeeAI, MCP) via KV-cache reuse and context-aware routing [feature request] (opened Mar 7, 2025 by wenboown)
#205 feature: Support LoRA loading for model deployments [feature request] (opened Mar 1, 2025 by ApostaC)
#204 feature: Support CRD based configuration [feature request] (opened Mar 1, 2025 by rootfs)
#193 bug: File Access Error with vllm using runai_streamer on OCP [bug] (opened Feb 27, 2025 by TamKez)
#178 feature: unify naming of production-stack, vllm-stack and vllm-router [discussion, feature request] (opened Feb 25, 2025 by bufferoverflow)
#167 feature: Terraform Quickstart Tutorials for Underlying Infrastructure [feature request] (opened Feb 21, 2025 by 0xThresh)
#166 Discussion - QPS routing when there are multiple router replicas [discussion, question] (opened Feb 21, 2025 by aishwaryaraimule21)