- SKILL.md: 3 new entries (exploration-over-exploitation + nuisance
hyperparameters, test-set contamination, loss-spikes-mean-bad-data-pocket)
and an Ng 100-misclassified-examples quote under inspect-the-data
- refs/llm_judges.md: position/verbosity/self-preference biases (Zheng,
Wang 66/80 flip, Panickssery) + mitigation checklist from verdict docs
- Lones pitfalls linked as the exhaustive 36-item do/don't checklist
- 6 new frozen evidence files; Hamel evals link in further reading

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

SKILL.md is now folklore only: verbatim practitioner quotes ordered
most-general-first, transformer/LLM fine-tuning entries in their own
section, minimal context, links and footnotes. New sources: unsloth,
axolotl (+training stability), HF course ch8.4, Bekman debug_utils
(evidence frozen in docs/evidence/).

The synthesized material (mental models, priors, symptom tables, agent
loop, triage, anti-patterns) moves to PLAYBOOK.md, framed as menus of
hypotheses rather than authoritative diagnoses. Made-up symptom tables
no longer sit next to sourced quotes.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>