Add 3 new evidence files from modern open-source sources:
- karpathy_recipe_training_nn_2019.md: Karpathy's training recipe blog post
- nanochat_deepwiki_llm_pretraining_2026.md: 320+ hyperparameter sweeps for GPT-2-scale pretraining
- sanh_simple_considerations_hf_2021.md: HuggingFace NLP debugging notes
Add update-to-data ratio diagnostic to refs/diagnostics.md (target ~1e-3).
Add LLM pretraining gap note to SKILL.md intro linking the new sources.
Add tanh saturation % to logging checklist.
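
For reference, a minimal sketch of the two new diagnostics (not the exact
code in refs/diagnostics.md). It assumes a plain SGD step, so the update is
lr * grad; with Adam or another optimizer the real step differs. The function
names and the 0.97 saturation threshold are illustrative assumptions.

```python
import math
import statistics


def update_to_data_ratio(params, grads, lr):
    """Spread of the SGD update relative to the spread of the weights.

    params and grads are flat lists of floats for a single tensor.
    Assumes a plain SGD step (update = lr * grad).
    """
    updates = [lr * g for g in grads]
    return statistics.pstdev(updates) / statistics.pstdev(params)


def tanh_saturation_pct(activations, threshold=0.97):
    """Percent of tanh outputs with |a| above threshold (threshold is an assumption)."""
    saturated = sum(1 for a in activations if abs(a) > threshold)
    return 100.0 * saturated / len(activations)


# Toy values chosen so the ratio lands on the ~1e-3 target.
params = [0.5, -0.5, 1.0, -1.0]
grads = [0.1, -0.1, 0.2, -0.2]
ratio = update_to_data_ratio(params, grads, lr=0.005)
print(f"log10(update/data) = {math.log10(ratio):.2f}")

acts = [math.tanh(x) for x in (-3.0, -0.5, 0.0, 0.5, 3.0)]
print(f"tanh saturation = {tanh_saturation_pct(acts):.0f}%")
```

A ratio far below 1e-3 suggests the layer is learning too slowly, and one far
above suggests the step is too large. A high saturation % means most tanh
units sit on the flat tails and pass little gradient.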
- Fix stale Part 2 cross-references to link to rl/SKILL.md
- Add McCandlish + Slavv back to parent Sources (cited in Part 7)
- Add back-links from refs/ files to parent SKILL.md
Moved 6.1 (static analysis grep patterns) and 6.2 (diagnostic code
snippets) to refs/static_analysis.md and refs/diagnostics.md.
Triage tree (6.3) stays in the main SKILL.md with references to the ref files.
ml_debug/SKILL.md reduced from 7229w to 5093w (~30%).
Moved heat-exchanger-specific content from pinn/SKILL.md to
pinn/refs/heat_exchanger.md: complexity ladder table, known failure
modes (U->0, counterflow signs), property mappings (REFPROP/PCHIP),
multi-episode training. PINN skill is now domain-agnostic.
pinn/SKILL.md reduced from 4961w to 4274w (~14%).
Part 2 (RL-Specific Debugging) + RL-specific sources moved to
ml_debug/rl/SKILL.md as a sub-skill, following the pinn/ precedent.
Parent SKILL.md reduced from 9158w to 7229w (~21%).
General sources (Goodfellow, CS231n, Tobin, Ng) kept in parent.
Deep research to uplift LLMs at ML debugging, opinionated through its choice
of sources. Distilled from Schulman, Jones, Rahtz, Goodfellow, CS231n,
FSDL, and more. Includes runnable diagnostic scripts and LLM-specific
anti-patterns.
Author: wassname (https://github.com/wassname)