vllm/cacheflow/master at 0f40557af6141ced118b81f2a04e651a0c6c9dbd

wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-07-04 21:40:09 +08:00

Woosuk Kwon 12659a0bd7 Add CUDA graph-based all reduce launcher (#26)

2023-04-05 11:16:57 -07:00

block_manager.py

Implement preemption via recomputation & Refactor scheduling logic (#12)

2023-03-30 14:51:46 -07:00

policy.py

Implement preemption via recomputation & Refactor scheduling logic (#12)

2023-03-30 14:51:46 -07:00

scheduler.py

Implement preemption via recomputation & Refactor scheduling logic (#12)

2023-03-30 14:51:46 -07:00

server.py

Add CUDA graph-based all reduce launcher (#26)

2023-04-05 11:16:57 -07:00

simple_frontend.py

Implement preemption via recomputation & Refactor scheduling logic (#12)

2023-03-30 14:51:46 -07:00