From 4b0769d13721b7288a0bb0412e0107196a9cd862 Mon Sep 17 00:00:00 2001
From: Lewis Tunstall <lewis.c.tunstall@gmail.com>
Date: Thu, 9 Nov 2023 14:42:57 +0000
Subject: [PATCH] Fix links

---
 recipes/zephyr-7b/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/recipes/zephyr-7b/README.md b/recipes/zephyr-7b/README.md
index 02746a1..4e783ad 100644
--- a/recipes/zephyr-7b/README.md
+++ b/recipes/zephyr-7b/README.md
@@ -3,8 +3,8 @@
 
 As described in the Zephyr [technical report](https://huggingface.co/papers/2310.16944), training this model proceeds in two steps:
 
-1. Apply SFT to fine-tune Mistral 7B on the UltraChat dataset.
-2. Align the SFT model to AI feedback via DPO on the UltraFeedback dataset.
+1. Apply SFT to fine-tune Mistral 7B on a filtered version of the UltraChat dataset ([link](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)).
+2. Align the SFT model to AI feedback via DPO on a preprocessed version of the UltraFeedback dataset ([link](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)).
 
 See below for commands to train these models using either DeepSpeed ZeRO-3 or LoRA.