Mirror of https://github.com/wassname/vllm.git, synced 2026-06-29 22:35:50 +08:00
8674f9880e
Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs