From 2d5cbf1dad44eb6b5c3536de48e9563ef21e6eb9 Mon Sep 17 00:00:00 2001
From: Yu Meng <yumeng5@virginia.edu>
Date: Tue, 9 Jul 2024 14:24:12 -0400
Subject: [PATCH] Update README.md

Add a Llama3-Instruct v0.2 row to the released-model hyperparameter
table and capitalize the setting names in the DPO hyperparameter table.

---
 README.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 4b747f5..bbcf787 100644
--- a/README.md
+++ b/README.md
@@ -29,15 +29,16 @@ We used the following hyperparameters for training the released models (note tha
 | Mistral-Instruct  | 2.5 | 0.1 | 5e-7           |
 | Llama3-Base       | 2.0 | 0.5 | 6e-7           |
 | Llama3-Instruct   | 2.5 | 0.55 | 1e-6           |
+| Llama3-Instruct v0.2   | 10 | 0.3 | 1e-6           |
 
 For DPO, we use the following hyperparameters for training.
 | Setting                  | β | Learning Rate |
 |------------------------|------|---------------|
-| mistral-base           | 0.01 | 5e-7      |
-| mistral-instruct       | 0.01 | 2e-7      |
-| llama3-base            | 0.01 | 5e-7      |
-| llama3-instruct        | 0.01 | 7e-7      |
-| llama3-instruct v0.2   | 0.01 | 3e-7      |
+| Mistral-Base           | 0.01 | 5e-7      |
+| Mistral-Instruct       | 0.01 | 2e-7      |
+| Llama3-Base            | 0.01 | 5e-7      |
+| Llama3-Instruct        | 0.01 | 7e-7      |
+| Llama3-Instruct v0.2   | 0.01 | 3e-7      |
 
 
 ### Training and evaluation consistency in BOS