From f025057ce4890deae5c59f1bd32d463ba57cf797 Mon Sep 17 00:00:00 2001
From: Thomas Capelle
Date: Tue, 21 Nov 2023 12:00:09 +0100
Subject: [PATCH] Missing config params on SFT (#31)

* fix warmup with total number of steps

* Explicitely tell to use 80GB Gpus

* Revert "fix warmup with total number of steps"

This reverts commit 760e477efdbf7f67be766a0d43b0c3b2ac26947a.
---
 recipes/zephyr-7b-beta/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/recipes/zephyr-7b-beta/README.md b/recipes/zephyr-7b-beta/README.md
index d0a59cd..836bcc3 100644
--- a/recipes/zephyr-7b-beta/README.md
+++ b/recipes/zephyr-7b-beta/README.md
@@ -9,7 +9,7 @@ As described in the Zephyr [technical report](https://huggingface.co/papers/2310
 See below for commands to train these models using either DeepSpeed ZeRO-3 or LoRA.
 
 ## Full training examples
-
+You will require 8 GPUs (80GB of VRAM) to train the full model.
 ```shell
 # Step 1 - SFT
 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py recipes/zephyr-7b-beta/sft/config_full.yaml