From 65a7ac97d57885a44b2e02f02072192f822980b0 Mon Sep 17 00:00:00 2001
From: Yu Meng
Date: Tue, 9 Jul 2024 16:40:57 -0400
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index eb0d30c..688637a 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ For DPO, we use the following hyperparameters for training.
 | Setting                | β    | Learning Rate |
 |------------------------|------|---------------|
 | Mistral-Base           | 0.01 | 5e-7          |
-| Mistral-Instruct       | 0.01 | 2e-7          |
+| Mistral-Instruct       | 0.01 | 5e-7          |
 | Llama3-Base            | 0.01 | 5e-7          |
 | Llama3-Instruct        | 0.01 | 7e-7          |
 | Llama3-Instruct v0.2   | 0.01 | 3e-7          |