vllm/csrc/quantization/gptq at bd7a8eef25cd85be7eb9f2a94fd752d27ee7dce3

wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-07-05 08:32:44 +08:00

Antoni Baum a10d3056da [Core] Set linear_weights directly on the layer (#3977)

2024-04-11 16:35:51 -04:00

File             Last commit                                                  Date
compat.cuh       Add GPTQ support (#916)                                      2023-12-15 03:04:22 -08:00
matrix_view.cuh  Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)   2024-02-28 21:52:23 -08:00
q_gemm.cu        [Core] Set linear_weights directly on the layer (#3977)      2024-04-11 16:35:51 -04:00
qdq_2.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)   2024-02-28 21:52:23 -08:00
qdq_3.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)   2024-02-28 21:52:23 -08:00
qdq_4.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)   2024-02-28 21:52:23 -08:00
qdq_8.cuh        Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)   2024-02-28 21:52:23 -08:00
qdq_util.cuh     Add GPTQ support (#916)                                      2023-12-15 03:04:22 -08:00