mirror of https://github.com/wassname/vllm.git
synced 2026-07-05 02:01:13 +08:00
8674f9880e
Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs
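
The idea in the commit title can be sketched as follows. This is a minimal illustration, not the code from this commit: `naive_gemm` is a hand-written kernel standing in for a CUTLASS GEMM, and `launch_gemm` is a hypothetical launcher name. The point it shows is that a launcher which accepts a `cudaStream_t` and launches on it can be recorded during stream capture, whereas a launcher that implicitly uses the default stream generally cannot be captured into a CUDA graph.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Naive row-major GEMM, C = A * B. Stand-in for a CUTLASS GEMM (assumption).
__global__ void naive_gemm(const float* A, const float* B, float* C,
                           int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k) acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

// Hypothetical launcher: the caller supplies the stream explicitly, and the
// kernel is launched on that stream instead of the default stream.
void launch_gemm(const float* A, const float* B, float* C,
                 int M, int N, int K, cudaStream_t stream) {
    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (M + block.y - 1) / block.y);
    naive_gemm<<<grid, block, 0, stream>>>(A, B, C, M, N, K);
}

int main() {
    const int M = 64, N = 64, K = 64;
    std::vector<float> hA(M * K, 1.0f), hB(K * N, 2.0f), hC(M * N, 0.0f);

    float *dA, *dB, *dC;
    CHECK(cudaMalloc(&dA, hA.size() * sizeof(float)));
    CHECK(cudaMalloc(&dB, hB.size() * sizeof(float)));
    CHECK(cudaMalloc(&dC, hC.size() * sizeof(float)));
    CHECK(cudaMemcpy(dA, hA.data(), hA.size() * sizeof(float), cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dB, hB.data(), hB.size() * sizeof(float), cudaMemcpyHostToDevice));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // Record the GEMM launch into a graph by capturing work issued on `stream`.
    cudaGraph_t graph;
    cudaGraphExec_t exec;
    CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    launch_gemm(dA, dB, dC, M, N, K, stream);
    CHECK(cudaStreamEndCapture(stream, &graph));
    CHECK(cudaGraphInstantiateWithFlags(&exec, graph, 0));

    // Replay the captured graph, then copy the result back to the host.
    CHECK(cudaGraphLaunch(exec, stream));
    CHECK(cudaStreamSynchronize(stream));
    CHECK(cudaMemcpy(hC.data(), dC, hC.size() * sizeof(float), cudaMemcpyDeviceToHost));
    std::printf("C[0] = %f\n", hC[0]);

    CHECK(cudaGraphExecDestroy(exec));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(dA));
    CHECK(cudaFree(dB));
    CHECK(cudaFree(dC));
    return 0;
}
```

With a real CUTLASS device-level GEMM, the equivalent step would be passing the stream to the GEMM operator's run call instead of relying on its default argument; the exact call sites changed by this commit are not shown here.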