mirror of
https://github.com/wassname/alpaca_convert.git
synced 2026-06-27 16:14:08 +08:00
039af1a0dbf01059c53ea444050248c7f72ddf98
My personal repo for converting models from LoRA to Hugging Face, GGML, and GPTQ 4-bit formats, so I can run them in a normal text-webui and llama.cpp.
How do we do this?
- lora -> hf
- hf -> 4bit
  - using GPTQ-for-LLaMa/llama.py:
    CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-hf/llama-7b c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save llama7b-4bit-128g.pt
- 4bit -> ggml
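To give a feel for what `--wbits 4` and `--groupsize 128` control in the hf -> 4bit step, here is a rough sketch of group-wise round-to-nearest quantization. This is not what GPTQ-for-LLaMa actually runs: GPTQ also adjusts the remaining weights to compensate for quantization error. The function names are made up for illustration.

```python
def quantize_group(weights, wbits=4):
    """Round-to-nearest quantization of one group of floats (illustrative only)."""
    levels = (1 << wbits) - 1            # 15 steps for 4-bit codes 0..15
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


def quantize_row(row, wbits=4, groupsize=128):
    """Split a row into groups; each group gets its own scale and offset."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]


# Toy row of 256 weights, so a group size of 128 yields two groups.
row = [i / 255 - 0.5 for i in range(256)]
groups = quantize_row(row, wbits=4, groupsize=128)
restored = [w for codes, scale, lo in groups for w in dequantize_group(codes, scale, lo)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
```

Smaller groups mean more scales to store but a tighter fit per group, which is the trade-off the group size flag exposes.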
TODO
- lora -> hf
- test this
- hf -> 4bit
- 4bit -> ggml
setup env
conda create -n textgen3 python=3.10.9
conda activate textgen3
mamba install pytorch torchvision torchaudio pytorch-cuda=11.7 cudatoolkit-dev==11.7 cudatoolkit=11.7 -c pytorch -c nvidia -c conda-forge
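A quick sanity check after installing can save time before starting a long conversion. The snippet below is a minimal sketch that reports whether PyTorch imports and whether it can see a CUDA device; the `check_env` helper is my own name, not something from this repo.

```python
import importlib.util


def check_env():
    """Report whether PyTorch is importable and whether CUDA is usable."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "cuda_build": None}
    import torch
    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "cuda_build": torch.version.cuda,  # CUDA version PyTorch was built against
    }


env = check_env()
print(env)
```

If `cuda_available` is false inside the `textgen3` env, the quantization step will most likely not be able to use the GPU.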
download models
# # base models.... FIXME
# download loras
python scripts/download-model.py chansung/alpaca-lora-30b
python scripts/download-model.py chansung/alpaca-lora-13b
python scripts/download-model.py tloen/alpaca-lora-7b
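The convert command in this README points at `loras/tloen_alpaca-lora-7b`, so the download script appears to store each repo under a folder named after the repo ID with `/` replaced by `_`. A tiny sketch of that mapping, assuming the naming really is that simple (`local_folder` and `convert_command` are hypothetical helpers, not part of the scripts):

```python
def local_folder(repo_id: str) -> str:
    """Guess the local folder name for a downloaded repo (assumed convention)."""
    return repo_id.replace("/", "_")


def convert_command(base_model: str, lora_repo: str) -> list:
    """Build the argument list for the export script used in this README."""
    return [
        "python", "scripts/export_hf_checkpoint.py",
        base_model,
        "-l", f"loras/{local_folder(lora_repo)}",
    ]


cmd = convert_command("./models/llama-7b-hf", "tloen/alpaca-lora-7b")
print(" ".join(cmd))
```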
convert models
python scripts/export_hf_checkpoint.py ./models/llama-7b-hf -l loras/tloen_alpaca-lora-7b
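For context on what lora -> hf means: a LoRA adapter stores two small matrices A and B per adapted layer, and merging folds them into the base weight as W + (alpha / r) * B @ A. The sketch below shows that arithmetic on plain Python lists. It is only an illustration of the idea, assuming the export script merges in the usual way; the helper names and toy numbers are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)       # shape matches W: (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example with rank r = 1 on a 2x2 weight.
w = [[1.0, 0.0], [0.0, 1.0]]             # base weight, (out=2, in=2)
lora_b = [[1.0], [2.0]]                  # B, (out=2, r=1)
lora_a = [[0.5, -0.5]]                   # A, (r=1, in=2)
merged = merge_lora(w, lora_a, lora_b, alpha=2, r=1)
```

After merging, the model no longer needs the adapter at inference time, which is why the result can then go through the hf -> 4bit step like any other checkpoint.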
Links