From bc84df5e3823a30c74515b678d50547e2e561163 Mon Sep 17 00:00:00 2001
From: wassname <github@wassname.org>
Date: Sat, 29 Apr 2023 18:46:58 +0800
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 80d0236..c751234 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ A great way to find new instruction datasets is to
 There are multiple ways to formally evaluate LLM capabilities. Right now project generally use one of these 3 libraries. Personally I prefer Eleuther's work, but opinions and github stars are divided.
 
 - python api:
-	- [huggingface/evaluate](https://github.com/huggingface/evaluate) this is not specific to LLM's or RLHF, but [some](https://github.com/nomic-ai/gpt4all/blob/main/eval_self_instruct.py#L43) [projects](https://github.com/gururise/AlpacaDataCleaned/blob/791174f63e/eval/README.md) find it and easy to use starting point. 
+	- [huggingface/evaluate](https://github.com/huggingface/evaluate) this is not specific to LLMs or RLHF, but some [projects](https://github.com/gururise/AlpacaDataCleaned/blob/791174f63e/eval/README.md) find it an easy-to-use starting point.
 - cli api:
 	- [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) - has lots of datasets like GLUE and ETHICS already included, works with huggingface
 	- [openai/evals: Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.](https://github.com/openai/evals) - has lots of rare eval sets like sarcasm, works with langchain