NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

124 Open 509 Closed

Pull requests list

test: add random image test for llama-3.2-11b-vision

#3055 opened Mar 25, 2025 by crazydemo

feat: Add EXAONE-Deep

#3054 opened Mar 25, 2025 by yechank-nvidia

fix: Set correct draft_token_nums to dummy requests for torch compilation with MTP

#3053 opened Mar 25, 2025 by HuiGao-NV

doc: Update DeepSeekV3 doc

#3052 opened Mar 25, 2025 by xiaoweiw-nv

chore: upgrade transformers to 4.50.0

#3051 opened Mar 25, 2025 by achartier

fix: AllReduce CUDA Graph Fix

#3049 opened Mar 25, 2025 by yizhang-nv • Draft

feat: Support prequantized fp8 ckpt for nemotron-mini-4b-instruct

#3046 opened Mar 24, 2025 by brb-nv

Feat: Support Linear block scale layout in FP4 quantization

#3045 opened Mar 24, 2025 by yibinl-nvidia

feat: Pytorch PP + attention DP support

#3044 opened Mar 24, 2025 by achartier

chore: Add second possible output for llava

#3043 opened Mar 24, 2025 by amukkara

fix: [NVBUG 5087143] Fix vila test

#3042 opened Mar 24, 2025 by yuanjings-nvda • Draft

perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench

#3041 opened Mar 24, 2025 by suyoggupta

perf: Readd iteration logging for trtllm-bench.

#3039 opened Mar 24, 2025 by FrankD412

chore: [TRTLLM-3694] Move functional args to llmargs

#3036 opened Mar 24, 2025 by hchings

feat: Add initial EAGLE-3 implementation

#3035 opened Mar 24, 2025 by mikeiovine

feat: Unify two versions of allreduce custom op

#3032 opened Mar 24, 2025 by yukunh-nvidia

infra: [CI] - Only checkout the Git sourcecodes once in the CI pipeline

#3029 opened Mar 24, 2025 by chzblych

feat: allocate minimal blocks per window size

#3028 opened Mar 24, 2025 by netanel-haber • Draft

test: [TRTLLM-4000] Port multi GPU changes to GitHub CI

Any issue relates with CI testing

#3027 opened Mar 24, 2025 by DomBrown

feat: Draft/lora_modules_support

#3026 opened Mar 24, 2025 by danielafrimi • Draft

chore: refactor the LlmArgs for better api reference doc and easier iterations

#3025 opened Mar 24, 2025 by Superjomn • Draft

refactor: Remove speculative decoding parameters from stateful decoders

#3024 opened Mar 24, 2025 by Funatiq

fix: disable KV cache reuse if using attention sink

#3021 opened Mar 24, 2025 by Funatiq

Support cos_sin_cache in all cases.

#3020 opened Mar 24, 2025 by yuxianq

fix: creating output of dataset generator in current directory bug

Something isn't working

#3018 opened Mar 24, 2025 by hypdeb
