wassname/vllm - vllm - Gitea: Git with a cup of tea

mirror of https://github.com/wassname/vllm.git synced 2026-06-29 12:09:14 +08:00

Author	SHA1	Message	Date
Wallas Henrique	c0292211ce	[CI/Build] Replaced some models on tests for smaller ones (#9570 ) Signed-off-by: Wallas Santos <wallashss@ibm.com>	2024-10-22 04:52:14 +00:00
Michael Goin	e165528778	[CI] Move quantization cpu offload tests out of fastcheck (#7574 )	2024-08-15 21:16:20 -07:00
Michael Goin	f9a5600649	[Bugfix] Fix GPTQ and GPTQ Marlin CPU Offloading (#7225 )	2024-08-06 18:34:26 -07:00
Michael Goin	460c1884e3	[Bugfix] Support cpu offloading with fp8 quantization (#6960 )	2024-07-31 12:47:46 -07:00
Matt Wong	06d6c5fe9f	[Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543 )	2024-07-20 09:39:07 -07:00
youkaichao	f53b8f0d05	[ci][test] add correctness test for cpu offloading (#6549 )	2024-07-18 23:41:06 +00:00