[doc] update pipeline parallel in readme (#6347)

youkaichao
2024-07-11 11:38:40 -07:00
committed by GitHub
parent 1df43de9bb
commit 2d23b42d92
2 changed files with 2 additions and 2 deletions
+1 -1
@@ -56,7 +56,7 @@ vLLM is flexible and easy to use with:
 - Seamless integration with popular Hugging Face models
 - High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-- Tensor parallelism support for distributed inference
+- Tensor parallelism and pipeline parallelism support for distributed inference
 - Streaming outputs
 - OpenAI-compatible API server
 - Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs
+1 -1
@@ -38,7 +38,7 @@ vLLM is flexible and easy to use with:
 * Seamless integration with popular HuggingFace models
 * High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-* Tensor parallelism support for distributed inference
+* Tensor parallelism and pipeline parallelism support for distributed inference
 * Streaming outputs
 * OpenAI-compatible API server
 * Support NVIDIA GPUs and AMD GPUs