feat: support long seq when pdfusion #387

kitaharatomoyo · 2025-11-20T09:41:49Z

Sync within the DP group: when no rank in the DP group has a prefill query, all DP nodes run forward with CUDA graph.
With PD fusion, MoE models run prefill with DeepEP normal mode and decode with DeepEP low-latency mode.
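
A minimal sketch of the decision described above, not the code in this PR. All names (`RankBatch`, `group_has_prefill`, `choose_forward_mode`) are hypothetical, and the cross-rank sync is simulated with a plain list. A real implementation would presumably use a collective such as an all-reduce over the DP group. Selecting the DeepEP mode for the whole group from the same flag is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RankBatch:
    """Hypothetical per-rank view of the current scheduling step."""
    num_prefill_queries: int
    num_decode_queries: int


def group_has_prefill(batches: List[RankBatch]) -> bool:
    # Stand-in for the DP-group sync: each rank contributes a 0/1 flag
    # and the group takes the logical OR (e.g. an all-reduce with MAX).
    return any(b.num_prefill_queries > 0 for b in batches)


def choose_forward_mode(batches: List[RankBatch]) -> Dict[str, object]:
    has_prefill = group_has_prefill(batches)
    return {
        # CUDA graph only when no rank in the group has a prefill query.
        "use_cuda_graph": not has_prefill,
        # Assumption: prefill steps use DeepEP normal mode,
        # decode-only steps use DeepEP low-latency mode.
        "deepep_mode": "normal" if has_prefill else "low_latency",
    }
```

Because every rank derives its mode from the same group-wide flag, all DP nodes take the same path in a given step, so no rank waits in a collective that the others skip.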

feat: support long seq when pdfusion

53c1636

kitaharatomoyo requested a review from LLLLKKKK as a code owner November 20, 2025 09:41

kitaharatomoyo added 2 commits November 21, 2025 10:02

fix: compile error

c6a8b85

fix: distributed environment is not initialized

6973eae

kitaharatomoyo mentioned this pull request Nov 21, 2025

feat: use decode fake req rather than prefill fake req #379

Open
