diff --git a/PLAYBOOK.md b/PLAYBOOK.md
index 2472656..7d1d546 100644
--- a/PLAYBOOK.md
+++ b/PLAYBOOK.md
@@ -223,4 +223,4 @@ Folklore sources (the quotes above trace to these):
 
 For modern transformer pretraining specifically (the sources above predate it), see [Karpathy's recipe](https://karpathy.github.io/2019/04/25/recipe/) and the [nanochat deepwiki](https://deepwiki.com/karpathy/nanochat) (320+ empirical HP sweeps for a GPT-2-scale run). Most multi-source claims trace to quotes in [docs/ml_debug_folklore.argdown](docs/ml_debug_folklore.argdown) (vargdown); the full evidence set is in [docs/evidence/](docs/evidence/).
 
-Curated by [wassname](https://github.com/wassname). Companion gist: https://gist.github.com/wassname/e45e41f75c0b50e72ec1f4cff811a277
+Curated by [wassname](https://github.com/wassname). Companion gist: https://gist.github.com/wassname/d72da69ffe1bdb60cdb1708bd3b5c535
diff --git a/SKILL.md b/SKILL.md
index a651c7a..7ef682a 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -324,4 +324,4 @@ Folklore sources (the quotes above trace to these):
 
 For modern transformer pretraining specifically (most sources above predate it), see [Karpathy's recipe](https://karpathy.github.io/2019/04/25/recipe/) and the [nanochat deepwiki](https://deepwiki.com/karpathy/nanochat) (320+ empirical HP sweeps for a GPT-2-scale run). For LLM-as-judge eval debugging workflow more broadly, Hamel Husain's ["Your AI Product Needs Evals"](https://hamel.dev/blog/posts/evals/) covers the error-analysis-first approach for LLM products. Most multi-source claims trace to quotes in [docs/ml_debug_folklore.argdown](docs/ml_debug_folklore.argdown) (vargdown); the full evidence set is in [docs/evidence/](docs/evidence/).
 
-Curated by [wassname](https://github.com/wassname). Companion gist: https://gist.github.com/wassname/e45e41f75c0b50e72ec1f4cff811a277
+Curated by [wassname](https://github.com/wassname). Companion gist: https://gist.github.com/wassname/d72da69ffe1bdb60cdb1708bd3b5c535