mirror of https://github.com/wassname/vllm.git
synced 2026-07-05 10:35:32 +08:00
8674f9880e
Pass the CUDA stream into the CUTLASS GEMMs to avoid future issues with CUDA graphs