Update README.md

2026-06-27 20:02:58 +08:00 · 2024-07-09 14:43:23 -04:00
parent b1be711212
commit 1da92c59cc
1 changed files with 7 additions and 0 deletions
@@ -4,8 +4,15 @@ This repository contains the code and released models for our paper [SimPO: Simp

 <img src="./SimPO.png" width="1000px"></img>

+## 🆕 Changelog
+- [2024.07.08] We updated our paper ([v2](https://arxiv.org/abs/2405.14734v2)) with:
+  - Additional baselines (RRHF, SLiC-HF, CPO)
+  - A new Llama3-Instruct setting (v0.2) that uses [ArmoRM](https://huggingface.co/RLHFlow/ArmoRM-Llama3-8B-v0.1) as the preference label annotator, yielding better-performing models!
+  - An updated [SimPO trainer](https://github.com/princeton-nlp/SimPO/blob/main/scripts/simpo_trainer.py) for better reproducibility
+
 ## 🔗 Quick Links
 - [SimPO: Simple Preference Optimization with a Reference-Free Reward](#simple-preference-optimization-simpo)
+  - [Changelog](#-changelog)
  - [Tips for Running SimPO](#tips-for-running-simpo)
  - [Released Models](#released-models)
  - [Install Requirements](#install-requirements)