Commit Graph

10 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Woosuk Kwon | e070829ae8 | Support bfloat16 data type (#54) | 2023-05-03 14:09:44 -07:00 |
| Zhuohan Li | 27f1410d06 | New weight loader without np copy (#52) | 2023-05-03 15:32:04 +08:00 |
| Zhuohan Li | 4858f3bb45 | Add an option to launch cacheflow without ray (#51) | 2023-04-30 15:42:17 +08:00 |
| Woosuk Kwon | 0f4b32199e | Support various block sizes & Change default block size to 16 (#38) | 2023-04-15 09:03:24 -07:00 |
| Woosuk Kwon | 84eee24e20 | Collect system stats in scheduler & Add scripts for experiments (#30) | 2023-04-12 15:03:49 -07:00 |
| Woosuk Kwon | b9926f7f66 | Support block size 32 (#35) | 2023-04-09 23:07:18 -07:00 |
| Woosuk Kwon | ee88a7e5f3 | Add an option to use dummy model weights (#33) | 2023-04-08 23:36:12 -07:00 |
| Woosuk Kwon | 12659a0bd7 | Add CUDA graph-based all reduce launcher (#26) | 2023-04-05 11:16:57 -07:00 |
| Woosuk Kwon | 7a7929abe8 | Implement preemption via recomputation & Refactor scheduling logic (#12) | 2023-03-30 14:51:46 -07:00 |
| Zhuohan Li | 721fa3df15 | FastAPI-based working frontend (#10) | 2023-03-29 14:48:56 +08:00 |