
Pull requests: alibaba/rtp-llm


Pull requests list

Feature/refactor modules
#435 opened Dec 5, 2025 by alibaba-miji
[draft]feat: refactor raw endpoint
#431 opened Dec 4, 2025 by wanglining97
[draft] master 0.0.2
#427 opened Dec 4, 2025 by jianglan89
feat: support pure tp + ep for cuda graph
#425 opened Dec 3, 2025 by JackTan25
support fp8 fmha for rocm pymodel
#422 opened Dec 3, 2025 by liaocz
refactor: optimize token reorder impl
#419 opened Dec 2, 2025 by MMadhatter
feature - add viztracer for inference api
#417 opened Dec 2, 2025 by jianglan89
Support DeepSeek v3.2 encoding module
#415 opened Dec 2, 2025 by soaringk
refactor gemm
#414 opened Dec 1, 2025 by fff-2013
fix: fix max context batch size
#412 opened Nov 28, 2025 by JackTan25
Performance Optimization for Beam Search
#411 opened Nov 28, 2025 by zhangjianning-zjn
fix: hold host buffer until next forward
#410 opened Nov 28, 2025 by Vinkle-hzt
feat: handle cuda oom error for py model
#406 opened Nov 26, 2025 by MMadhatter
fix: fix custom_ar bug for rocm
#402 opened Nov 25, 2025 by liaocz
feat: update ar & fuse mla reuse cache
#399 opened Nov 25, 2025 by Nancheng-11
fix: remove fallback and fastgen
#393 opened Nov 24, 2025 by xinfei-shi
feat: raw request support logprobs
#390 opened Nov 22, 2025 by yinjuncheng
feat: support long seq when pdfusion
#387 opened Nov 20, 2025 by kitaharatomoyo