vllm/csrc/quantization at f09edd8a25d54c48eb804abe391e98d0b85b9ea2 - vllm

mirror of https://github.com/wassname/vllm.git synced 2026-07-04 22:14:06 +08:00

Files

Alexander Matveev 6979ade384 Add GPTQ Marlin 2:4 sparse structured support (#4790)

Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>

2024-05-16 12:56:15 -04:00

AQLM CUDA support (#3287)

2024-04-23 13:59:33 -04:00

2024-02-12 11:02:17 -08:00

2024-05-09 18:04:17 -06:00

2024-04-11 16:35:51 -04:00

2024-05-16 09:55:29 -04:00

2024-05-16 12:56:15 -04:00

2024-01-03 09:52:29 -08:00