Update README.md

2026-06-27 16:14:08 +08:00 · 2023-04-09 12:50:49 +08:00
parent 4c18a56fc0
commit f185b90c3e
1 changed files with 1 additions and 0 deletions
@@ -33,6 +33,7 @@ It's fast on a 3070 Ti mobile.  Uses 5-6 GB of GPU RAM.
 * Added monkey patch for text generation webui for fixing initial eos token issue.
 * Added Flash attention support. (Use --flash-attention)
 * Added Triton backend to support model using groupsize and act-order. (Use --backend=triton)
+* Added g_idx support to the CUDA backend. (Requires recompiling the CUDA kernel)

 # Requirements
 gptq-for-llama <br>