Mirror of https://github.com/wassname/SimPO.git (synced 2026-06-27 20:19:50 +08:00)
@@ -33,12 +33,12 @@ Below is the full list of models that we evaluate in our preprint.
| Mistral Instruct 7B R-DPO | [princeton-nlp/Mistral-7B-Instruct-RDPO](https://huggingface.co/princeton-nlp/Mistral-7B-Instruct-RDPO) | 27.3 | 24.5 | 16.1 |
| Mistral Instruct 7B SimPO | [princeton-nlp/Mistral-7B-Instruct-SimPO](https://huggingface.co/princeton-nlp/Mistral-7B-Instruct-SimPO) | 32.1 | 34.8 | 21.0 |
| Llama3 Base 8B SFT | [princeton-nlp/Llama-3-Base-8B-SFT](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT) | 6.2 | 4.6 | 3.3 |
-| Llama3 Base 8B DPO | [princeton-nlp/Llama-3-Base-8B-DPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-DPO) | 18.2 | 15.5 | 15.9 |
-| Llama3 Base 8B IPO | [princeton-nlp/Llama-3-Base-8B-IPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-IPO) | 14.4 | 14.2 | 17.8 |
-| Llama3 Base 8B KTO | [princeton-nlp/Llama-3-Base-8B-KTO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-KTO) | 14.2 | 12.4 | 12.5 |
-| Llama3 Base 8B ORPO | [princeton-nlp/Llama-3-Base-8B-ORPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-ORPO) | 12.2 | 10.6 | 10.8 |
-| Llama3 Base 8B R-DPO | [princeton-nlp/Llama-3-Base-8B-RDPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-RDPO) | 17.6 | 14.4 | 17.2 |
-| Llama3 Base 8B SimPO | [princeton-nlp/Llama-3-Base-8B-SimPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SimPO) | 22.0 | 20.3 | 23.4 |
+| Llama3 Base 8B DPO | [princeton-nlp/Llama-3-Base-8B-SFT-DPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-DPO) | 18.2 | 15.5 | 15.9 |
+| Llama3 Base 8B IPO | [princeton-nlp/Llama-3-Base-8B-SFT-IPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-IPO) | 14.4 | 14.2 | 17.8 |
+| Llama3 Base 8B KTO | [princeton-nlp/Llama-3-Base-8B-SFT-KTO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-KTO) | 14.2 | 12.4 | 12.5 |
+| Llama3 Base 8B ORPO | [princeton-nlp/Llama-3-Base-8B-SFT-ORPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-ORPO) | 12.2 | 10.6 | 10.8 |
+| Llama3 Base 8B R-DPO | [princeton-nlp/Llama-3-Base-8B-SFT-RDPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-RDPO) | 17.6 | 14.4 | 17.2 |
+| Llama3 Base 8B SimPO | [princeton-nlp/Llama-3-Base-8B-SFT-SimPO](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT-SimPO) | 22.0 | 20.3 | 23.4 |
| Llama3 Instruct 8B SFT | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 26.0 | 25.3 | 22.3 |
| Llama3 Instruct 8B DPO | [princeton-nlp/Llama-3-Instruct-8B-DPO](https://huggingface.co/princeton-nlp/Llama-3-Instruct-8B-DPO) | 40.3 | 37.9 | 32.6 |
| Llama3 Instruct 8B IPO | [princeton-nlp/Llama-3-Instruct-8B-IPO](https://huggingface.co/princeton-nlp/Llama-3-Instruct-8B-IPO) | 35.6 | 35.6 | 30.5 |