Update README.md

2026-06-27 18:41:04 +08:00 · 2024-07-10 00:33:48 -04:00
parent 3e5532e0b2
commit e7186a8134
1 changed file with 1 addition and 1 deletion
@@ -38,7 +38,7 @@ We used the following hyperparameters for training the released models (note tha
 | Llama3-Instruct   | 2.5 | 0.55 | 1e-6           |
 | Llama3-Instruct v0.2   | 10 | 0.3 | 1e-6           |

-For DPO, we use the following hyperparameters for training.
+For DPO, the best hyperparameters for each setting are as follows.
 | Setting                  | β | Learning Rate |
 |------------------------|------|---------------|
 | Mistral-Base           | 0.01 | 5e-7      |