wassname/vllm

mirror of https://github.com/wassname/vllm.git synced 2026-06-27 18:27:02 +08:00

Author	SHA1	Message	Date
Aaron Pham	21063c11c7	[CI/Build] drop support for Python 3.8 EOL (#8464) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2024-11-06 07:11:55 +00:00
bnellnm	d93478b399	[Bugfix] Upgrade to pytorch 2.5.1 (#10001) Signed-off-by: Bill Nell <bill@neuralmagic.com>	2024-11-04 15:11:28 -08:00
bnellnm	3cb07a36a2	[Misc] Upgrade to pytorch 2.5 (#9588) Signed-off-by: Bill Nell <bill@neuralmagic.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-10-27 09:44:24 +00:00
Daniele	203ab8f80f	[CI/Build] setuptools-scm fixes (#8900)	2024-10-14 11:34:47 -07:00
Daniele	2467b642dd	[CI/Build] fix setuptools-scm usage (#8771)	2024-09-24 12:38:12 -07:00
Daniele	ee5f34b1c2	[CI/Build] use setuptools-scm to set __version__ (#4738) Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-23 09:44:26 -07:00
Aaron Pham	9d104b5beb	[CI/Build] Update Ruff version (#8469) Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-18 11:00:56 +00:00
Cyrus Leung	a84e598e21	[CI/Build] Reorganize models tests (#7820)	2024-09-13 10:20:06 -07:00
Cyrus Leung	8427550488	[CI/Build] Update pixtral tests to use JSON (#8436)	2024-09-13 03:47:52 +00:00
Jonathan Berkhahn	9c71c97ae2	[mypy] Enable mypy type checking for `vllm/core` (#7229)	2024-08-28 07:11:14 +08:00
sasha0552	dd3fa0e430	[Bugfix] Mirror jinja2 in pyproject.toml (#7723)	2024-08-21 13:41:17 +00:00
Cyrus Leung	baaedfdb2d	[mypy] Enable following imports for entrypoints (#7248) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: Fei <dfdfcai4@gmail.com>	2024-08-20 23:28:21 -07:00
Daniele	774cd1d3bf	[CI/Build] bump minimum cmake version (#6999)	2024-08-12 16:29:20 -07:00
Sage Moore	7e0861bd0b	[CI/Build] Update PyTorch to 2.4.0 (#6951) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-08-01 11:11:24 -07:00
Simon Mo	7eb0cb4a14	Revert "[Frontend] Factor out code for running uvicorn" (#7012) Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>	2024-07-31 16:34:26 -07:00
Cyrus Leung	da1f7cc12a	[mypy] Enable following imports for some directories (#6681)	2024-07-31 10:38:03 +08:00
Cody Yu	aa48e502fb	[MISC] Upgrade dependency to PyTorch 2.3.1 (#5327)	2024-07-12 12:04:26 -07:00
Cyrus Leung	9d47f64eb6	[CI/Build] [3/3] Reorganize entrypoints tests (#5966)	2024-06-30 12:58:49 +08:00
Roger Wang	4ad7b53e59	[CI/Build][Misc] Update Pytest Marker for VLMs (#5623)	2024-06-18 13:10:04 +00:00
Cyrus Leung	89c920785f	[CI/Build] Update vision tests (#5307)	2024-06-06 05:17:18 -05:00
Tyler Michael Smith	260d119e86	[Kernel] Refactor CUTLASS kernels to always take scales that reside on the GPU (#5137)	2024-06-01 06:45:32 +00:00
Cyrus Leung	5ae5ed1e60	[Core] Consolidate prompt arguments to LLM engines (#4328) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-05-28 13:29:31 -07:00
Michael Goin	757b62c495	[CI/Build] Codespell ignore `build/` directory (#4945)	2024-05-21 09:06:10 -07:00
SangBin Cho	2e9a2227ec	[Lora] Support long context lora (#4787) Currently we need to call rotary embedding kernel for each LoRA, which makes it hard to serve multiple long context length LoRA. Add batched rotary embedding kernel and pipe it through. It replaces the rotary embedding layer to the one that is aware of multiple cos-sin-cache per scaling factors. Follow up of https://github.com/vllm-project/vllm/pull/3095/files	2024-05-18 16:05:23 +09:00
Michael Goin	d627a3d837	[Misc] Upgrade to `torch==2.3.0` (#4454)	2024-04-29 20:05:47 -04:00
SangBin Cho	a88081bf76	[CI] Disable non-lazy string operation on logging (#4326) Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>	2024-04-26 00:16:58 -07:00
SangBin Cho	0ae11f78ab	[Mypy] Part 3 fix typing for nested directories for most of directory (#4161)	2024-04-22 21:32:44 -07:00
SangBin Cho	09473ee41c	[mypy] Add mypy type annotation part 1 (#4006)	2024-04-12 14:35:50 -07:00
youkaichao	ca81ff5196	[Core] manage nccl via a pypi package & upgrade to pt 2.2.1 (#3805)	2024-04-04 10:26:19 -07:00
Adrian Abeyta	2ff767b513	Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU) (#3290) Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: HaiShaw <hixiao@gmail.com> Co-authored-by: AdrianAbeyta <Adrian.Abeyta@amd.com> Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com> Co-authored-by: root <root@gt-pla-u18-08.pla.dcgpu> Co-authored-by: mawong-amd <156021403+mawong-amd@users.noreply.github.com> Co-authored-by: ttbachyinsda <ttbachyinsda@outlook.com> Co-authored-by: guofangze <guofangze@kuaishou.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: jacobthebanana <50071502+jacobthebanana@users.noreply.github.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-04-03 14:15:55 -07:00
Roger Wang	45b6ef6513	feat(benchmarks): Add Prefix Caching Benchmark to Serving Benchmark (#3277)	2024-03-27 13:39:26 -07:00
SangBin Cho	01bfb22b41	[CI] Try introducing isort. (#3495)	2024-03-25 07:59:47 -07:00
bnellnm	9fdf3de346	Cmake based build system (#2830)	2024-03-18 15:38:33 -07:00
Ronen Schaffer	14e3f9a1b2	Replace `lstrip()` with `removeprefix()` to fix Ruff linter warning (#2958)	2024-03-15 21:01:30 -07:00
Zhuohan Li	2f8844ba08	Re-enable the 80 char line width limit (#3305)	2024-03-10 19:49:14 -07:00
Massimiliano Pronesti	93dc5a2870	chore(vllm): codespell for spell checking (#2820)	2024-02-21 18:56:01 -08:00
Woosuk Kwon	b0a1d667b0	Pin PyTorch & xformers versions (#2155)	2023-12-17 01:46:54 -08:00
Woosuk Kwon	f3e024bece	[CI/CD] Upgrade PyTorch version to v2.1.1 (#2045)	2023-12-11 17:48:11 -08:00
Allen	f07c1ceaa5	[FIX] Fix docker build error (#1831) (#1832) Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2023-11-29 23:06:50 -08:00
Simon Mo	5ffc0d13a2	Migrate linter from `pylint` to `ruff` (#1665)	2023-11-20 11:58:01 -08:00
Zhuohan Li	06458a0b42	Upgrade to CUDA 12 (#1527) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2023-11-08 14:17:49 -08:00
yanxiyue	6a6119554c	lock torch version to 2.0.1 (#1290)	2023-10-10 09:21:57 -07:00
Woosuk Kwon	376725ce74	[PyPI] Packaging for PyPI distribution (#140)	2023-06-05 20:03:14 -07:00

43 Commits