vllm/benchmarks/kernels at b7050ca7df640326f53e89f518f3ee045dfbbdef - vllm

wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-07-02 06:49:20 +08:00

youkaichao 8fe8386591 [Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (#3389)

2024-03-14 08:11:48 +00:00

benchmark_mixtral_moe.py

[Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (#3389)

2024-03-14 08:11:48 +00:00

benchmark_paged_attention.py

Remove hardcoded device="cuda" to support more devices (#2503)

2024-02-01 15:46:39 -08:00

benchmark_rope.py

Add batched RoPE kernel (#3095)

2024-03-13 13:45:26 -07:00