Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
(deployment-lws)=
# LWS
LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads. A major use case is multi-host, multi-node distributed inference, where a single model replica spans several pods.
vLLM can be deployed with LWS on Kubernetes for distributed model serving.
Please see this guide for more details on deploying vLLM on Kubernetes using LWS.
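
To give a rough idea of what such a deployment describes, the following is a minimal sketch, not the manifest from the guide. It builds a `LeaderWorkerSet` object as a plain Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The helper `build_lws_manifest`, the resource name `vllm`, the container names, and the image tag are assumptions chosen for illustration. The container commands that actually start vLLM on the leader and workers are intentionally left out, since they depend on the model and cluster setup.

```python
import json


def build_lws_manifest(name, replicas, group_size, image):
    """Return a LeaderWorkerSet manifest as a plain dict (illustrative helper)."""

    def pod_template(container_name):
        # Bare pod template: real deployments also need commands, GPU
        # resource requests, ports, and probes, all omitted here.
        return {"spec": {"containers": [{"name": container_name, "image": image}]}}

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            # Number of leader/worker groups, i.e. independent model replicas.
            "replicas": replicas,
            "leaderWorkerTemplate": {
                # Pods per group, counting the leader.
                "size": group_size,
                "leaderTemplate": pod_template("vllm-leader"),
                "workerTemplate": pod_template("vllm-worker"),
            },
        },
    }


# Hypothetical values: two groups of two pods each.
manifest = build_lws_manifest("vllm", 2, 2, "vllm/vllm-openai:latest")
print(json.dumps(manifest, indent=2))
```

Here `replicas` controls how many groups are created and `size` controls how many pods form one group, so this sketch would request four pods in total. Saving the printed JSON to a file and applying it with `kubectl apply -f` would only work on a cluster where the LWS controller is already installed.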