vllm/cacheflow/models at e9d3f2ff7772c8efe41dc805cec71c223ec18ec8

wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-06-27 20:39:39 +08:00

Woosuk Kwon e9d3f2ff77 Add memory analyzer & automatically configure KV cache size (#6)

2023-03-11 23:23:14 -08:00

__init__.py

Add memory analyzer & automatically configure KV cache size (#6)

2023-03-11 23:23:14 -08:00

attention.py

Fix a bug in 1D input shape (#5)

2023-03-06 10:05:27 -08:00

input_metadata.py

Support beam search & parallel generation (#7)

2023-03-10 09:58:21 -08:00

memory_analyzer.py

Add memory analyzer & automatically configure KV cache size (#6)

2023-03-11 23:23:14 -08:00

model_utils.py

Add memory analyzer & automatically configure KV cache size (#6)

2023-03-11 23:23:14 -08:00

opt.py

Support beam search & parallel generation (#7)

2023-03-10 09:58:21 -08:00

sample.py

Support beam search & parallel generation (#7)

2023-03-10 09:58:21 -08:00

utils.py

Add memory analyzer & automatically configure KV cache size (#6)

2023-03-11 23:23:14 -08:00