mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 17:29:09 +08:00

Loubna Ben Allal 73dce0c35d Add Smollm (#194 )

* add smollm

* add to news

2024-08-19 08:47:20 +02:00

Instructions to train SmolLM-Instruct

We build the SmolLM-Instruct (v0.2) models (135M, 360M and 1.7B) by running supervised fine-tuning (SFT) on a mix of these datasets:

everyday-conversations-llama3.1-2k, a dataset of 2k simple everyday conversations that we generated with llama3.1-70B
Magpie-Pro-300K-Filtered
StarCoder2-Self-OSS-Instruct
a small subset of OpenHermes-2.5

Setup

Follow the installation instructions in https://github.com/huggingface/alignment-handbook/tree/main?tab=readme-ov-file#installation-instructions

Training

We train the models on 8 GPUs using the following command:

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py recipes/smollm/sft/config.yaml
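
If you have fewer than 8 GPUs, one option is to override the process count when launching. The sketch below only assembles and prints the command; it does not start training. The helper name `build_launch_command` is hypothetical and not part of the handbook. `--num_processes` is a standard `accelerate launch` flag, but whether the recipe's batch size and gradient accumulation settings suit a different GPU count is an assumption you should check against recipes/smollm/sft/config.yaml.

```python
import shlex

# Paths taken from the training command above.
ACCELERATE_CONFIG = "recipes/accelerate_configs/deepspeed_zero3.yaml"
SFT_SCRIPT = "scripts/run_sft.py"
RECIPE_CONFIG = "recipes/smollm/sft/config.yaml"


def build_launch_command(num_processes=None):
    """Return the SFT launch command as an argv list (hypothetical helper).

    When num_processes is given, it is passed to `accelerate launch`
    so the value in the accelerate config file is overridden.
    """
    cmd = ["accelerate", "launch", "--config_file", ACCELERATE_CONFIG]
    if num_processes is not None:
        cmd += ["--num_processes", str(num_processes)]
    cmd += [SFT_SCRIPT, RECIPE_CONFIG]
    return cmd


if __name__ == "__main__":
    # Dry run: print the command for a 4-GPU machine instead of executing it.
    print("ACCELERATE_LOG_LEVEL=info " + shlex.join(build_launch_command(num_processes=4)))
```

Note that the global batch size is the per-device batch size times the gradient accumulation steps times the number of processes, so launching on fewer GPUs without adjusting the other two changes the effective batch size.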
