vllm/csrc at 00efdc84baf313cb775ca99a011b0e9a13539bdd

wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-07-02 03:35:18 +08:00

Woosuk Kwon 6ef00b03a2 Enable CUDA graph for GPTQ & SqueezeLLM (#2318)

2024-01-03 09:52:29 -08:00

[FIX] Support non-zero CUDA devices in custom kernels (#1959)

2024-01-02 19:09:59 -08:00

Enable CUDA graph for GPTQ & SqueezeLLM (#2318)

2024-01-03 09:52:29 -08:00

activation_kernels.cu

[FIX] Support non-zero CUDA devices in custom kernels (#1959)

2024-01-02 19:09:59 -08:00

cache_kernels.cu

[FIX] Support non-zero CUDA devices in custom kernels (#1959)

2024-01-02 19:09:59 -08:00

cache.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

cuda_compat.h

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cuda_utils_kernels.cu

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00

cuda_utils.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

dispatch_utils.h

Avoid multiple redefinition (#1817)

2023-12-14 09:35:58 -08:00

layernorm_kernels.cu

[FIX] Support non-zero CUDA devices in custom kernels (#1959)

2024-01-02 19:09:59 -08:00

ops.h

Add GPTQ support (#916)

2023-12-15 03:04:22 -08:00

pos_encoding_kernels.cu

[FIX] Support non-zero CUDA devices in custom kernels (#1959)

2024-01-02 19:09:59 -08:00

pybind.cpp

Add GPTQ support (#916)

2023-12-15 03:04:22 -08:00

reduction_utils.cuh

Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)

2023-12-07 23:16:52 -08:00