wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-06-28 04:40:24 +08:00

Author	SHA1	Message	Date
youkaichao	e893795443	[2/N] executor pass the complete config to worker/modelrunner (#9938) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2024-11-02 07:35:05 -07:00
wangshuai09	3ddbe25502	[Hardware][CPU] using current_platform.is_cpu (#9536)	2024-10-22 00:50:43 -07:00
Cyrus Leung	0455c46ed4	[Core] Factor out common code in `SequenceData` and `Sequence` (#8675)	2024-09-21 02:30:39 +00:00
sroy745	3118f63385	[Bugfix] [Encoder-Decoder] Bugfix for encoder specific metadata construction during decode of encoder-decoder models. (#8545)	2024-09-19 02:24:15 +00:00
Aaron Pham	9d104b5beb	[CI/Build] Update Ruff version (#8469) Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-18 11:00:56 +00:00
sroy745	1009e93c5d	[Encoder decoder] Add cuda graph support during decoding for encoder-decoder models (#7631)	2024-09-17 07:35:01 -07:00
SangBin Cho	ff7ec82c4d	[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)	2024-08-18 17:57:20 -07:00
afeldman-nm	fd95e026e0	[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942) Co-authored-by: Andrew Feldman <afeld2012@gmail.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2024-08-06 16:51:47 -04:00