docs(ml_debug): annotate Joel Grus slides -- SE/reproducibility talk, not debugging

2026-06-27 17:01:20 +08:00 · 2026-03-10 05:45:16 +08:00
parent 3dffe890b1
commit 52ff6c17cd
1 changed file with 1 addition and 1 deletion
@@ -59,5 +59,5 @@ Credence ~65-70% -- specific domain claim, lacks ablation study reference.

 - **Cecelia Shao, "Checklist for Debugging Neural Networks"** (2019, KDnuggets/Towards Data Science): 5-section checklist (start simple, confirm loss, check intermediate outputs, diagnose parameters, track work). Thin; largely overlaps with Karpathy recipe and Slavv. Not captured separately -- see those sources instead.
 - **Chase Roberts, "How to unit test machine learning code"** (2017, Medium, 4 min): Focuses on software unit testing practices applied to ML models -- testing gradient flow, output shapes, that outputs change when weights change. Spawned `mltest` library. Not a full debugging guide. Main insight: "The code never crashes, the loss still goes down, it just converges to poor results."
- **Joel Grus, "Reproducibility in ML as engineering best practices"** (Google Slides): Experiment reproducibility / engineering hygiene. Not fetched.
+- **Joel Grus, "Reproducibility in ML as engineering best practices"** (ICLR 2019, 82 slides, Google Slides): A software engineering talk, not a debugging guide. Core argument: reproducibility forces good SE practices -- source control, unit tests, code reviews, explicit dependencies, Docker. The closest debugging content is slides 44-50: "unit tests for ML = tiny known dataset, check model runs, output has right shape, output has reasonable values; the best time to find mistakes is before you run your experiments." Already covered in SKILL.md (verify components in isolation, overfit-one-batch). Unique value: frames reproducibility as an engineering discipline. Worth reading for team/process reasons, but adds nothing to the debugging skill.
 - **A recipe for Training Neural Networks** -- Karpathy (captured in full: karpathy_recipe_training_nn_2019.md)
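
For reference, the checks named in the Grus and Roberts entries (tiny known dataset, model runs, output has the right shape, output has reasonable values, outputs change when weights change) might look roughly like the sketch below. It assumes a toy logistic-regression model in pure Python; `predict`, `loss`, `train_step`, and the dataset `X`/`Y` are hypothetical names invented for illustration and come from neither talk nor from `mltest`.

```python
import math

def predict(weights, bias, inputs):
    """Toy logistic regression: returns one probability per input row."""
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(weights, row)) + bias)))
            for row in inputs]

def loss(weights, bias, inputs, labels):
    """Mean binary cross-entropy over the dataset."""
    probs = predict(weights, bias, inputs)
    return -sum(math.log(p) if t == 1 else math.log(1.0 - p)
                for p, t in zip(probs, labels)) / len(labels)

def train_step(weights, bias, inputs, labels, lr=0.5):
    """One gradient-descent step; returns the updated (weights, bias)."""
    probs = predict(weights, bias, inputs)
    errors = [p - t for p, t in zip(probs, labels)]
    n = len(labels)
    new_weights = [w - lr * sum(e * row[j] for e, row in zip(errors, inputs)) / n
                   for j, w in enumerate(weights)]
    new_bias = bias - lr * sum(errors) / n
    return new_weights, new_bias

# Tiny known dataset (hypothetical): 4 rows, 2 features, linearly separable.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
Y = [1, 1, 0, 0]

def test_model_runs_with_right_shape_and_values():
    probs = predict([0.0, 0.0], 0.0, X)
    assert len(probs) == len(X)                 # one output per example
    assert all(0.0 < p < 1.0 for p in probs)    # valid probabilities

def test_outputs_change_when_weights_change():
    w0, b0 = [0.0, 0.0], 0.0
    w1, b1 = train_step(w0, b0, X, Y)
    assert w1 != w0                             # the update touched the weights
    assert predict(w1, b1, X) != predict(w0, b0, X)
    assert loss(w1, b1, X, Y) < loss(w0, b0, X, Y)

def test_overfits_tiny_dataset():
    w, b = [0.0, 0.0], 0.0
    for _ in range(200):
        w, b = train_step(w, b, X, Y)
    assert [round(p) for p in predict(w, b, X)] == Y
```

The functions follow the `test_*` naming convention, so a runner such as pytest should collect them; they can also be called directly. The last test is a small version of the overfit-one-batch check that the annotation says SKILL.md already covers.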