mirror of https://github.com/wassname/ml-debug.git synced 2026-06-27 16:00:43 +08:00

wassname 4393cceefd initial: ML debugging folklore skill

Deep research to uplift LLMs for ML debugging, opinionated by source
selection. Distilled from Schulman, Jones, Rahtz, Goodfellow, CS231n,
FSDL, and more. Includes runnable diagnostic scripts and LLM-specific
anti-patterns.

Author: wassname (https://github.com/wassname)

2026-03-06 10:11:30 +08:00

ML Debugging Folklore - Vargdown Process Log

Process

evidence files read (21 files, 9416 lines total)
quotes extracted via 12 parallel subagents
key quotes verified against evidence files (spot-checked ~15 quotes)
argdown verifier passes clean (npx @argdown/cli json -- 14 arguments, 45 statements, 14 relations)
subagent review done (gpt-5.2-codex via opencode; fixed non-verbatim quotes, credence calibration, PCS structure)
human review done
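The file and line totals in the first step can be re-derived with a short script. This is a minimal sketch, not the tooling actually used for this log: the helper name `count_evidence` is hypothetical, and it assumes every evidence file is a `.md` file sitting directly in `docs/evidence/`.

```python
from pathlib import Path


def count_evidence(evidence_dir: str) -> tuple[int, int]:
    """Return (number of .md files, total line count) under evidence_dir.

    Hypothetical helper: assumes evidence files are *.md files located
    directly in evidence_dir (no recursion into subdirectories).
    """
    files = sorted(Path(evidence_dir).glob("*.md"))
    total_lines = 0
    for path in files:
        # errors="replace" keeps the count going if a file has stray bytes
        with path.open(encoding="utf-8", errors="replace") as handle:
            total_lines += sum(1 for _ in handle)
    return len(files), total_lines


# Usage, from the repository root:
#   n_files, n_lines = count_evidence("docs/evidence")
#   print(f"{n_files} files, {n_lines} lines total")
```

Comparing the result with the totals recorded above is a cheap way to notice when an evidence file has been added, removed, or re-fetched since the log was written.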

Evidence Fetch Log

All evidence files were pre-existing in docs/evidence/. They were fetched in a prior session via the methods listed in each file's header.

Source	Evidence File	Fetch Method	Status
Schulman 2016 slides	joschu_nuts_and_bolts.md	`uvx markitdown[pdf]`	verbatim (PDF artifacts: cid markers)
Schulman 2017 bootcamp	schulman_nuts_bolts_deeprl_bootcamp_2017_subtitles.md	YouTube auto-subtitles	verbatim (transcription errors, e.g. "insanity" for "and standard")
Andy Jones RL debugging	andyljones_rl_debugging.md	markitdown	verbatim
Henderson et al. 2018	henderson_2018_deep_rl_matters.md	markitdown	verbatim
Goodfellow Ch11	goodfellow_ch11_practical_methodology.md	markitdown	verbatim
CS231n NN3	cs231n_neural_networks_3.md	markitdown	verbatim
FSDL Spring 2021 L7	fsdl_spring2021_lecture7.md	markitdown	verbatim
Irpan RL hard	alexirpan_rl_hard.md	markitdown	verbatim
amid.fish reproducing	amid_fish_reproducing_deep_rl.md	markitdown	verbatim
Slavv 37 reasons	slavv_37_reasons_nn.md	markitdown	verbatim
CS229 ML advice	cs229_ml_advice.md	markitdown	verbatim
McCandlish 2018	mccandlish_2018_large_batch.md	markitdown	verbatim
William Falcon notes	williamfalcon_deeprl_hacks.md	markitdown	verbatim
Goodfellow Ch15	goodfellow_ch15_representation_learning.md	markitdown	verbatim
Deep Learning Book	deeplearning_book.md	markitdown	verbatim
Reddit RL tips 7s8px9	reddit_rl_practical_tips_7s8px9.md	markitdown	verbatim
Reddit RL debug 9sh77q	reddit_rl_debugging_tips_9sh77q.md	markitdown	verbatim
Reddit RL roadblocks	reddit_rl_roadblocks_bzg3l2.md	markitdown	verbatim
Reddit Schulman 5hereu	reddit_schulman_nuts_bolts_5hereu.md	markitdown	verbatim
Reddit ICML tutorial	reddit_icml2017_tutorial_levine_6vcvu1.md	markitdown	verbatim
Reddit DRL bootcamp	reddit_deeprl_bootcamp_2017_75m5vd.md	markitdown	verbatim

Quote Verification Notes

Schulman subtitles contain auto-generated transcription errors (e.g., "mean insanity deviation" should be "mean and standard deviation"). Quotes are used verbatim from the file; the errors are in the source, not introduced by us.
Schulman PDF (joschu_nuts_and_bolts.md) has markitdown conversion artifacts ((cid:73) bullet markers, table formatting). Core text is present but formatting is messy.
All other evidence files appear to be clean markitdown conversions.
15 key quotes were manually spot-checked against evidence files. All matched.
Quotes from subagent extractions were cross-referenced with direct file reads.
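A spot check of this kind can be approximated mechanically. The sketch below is an assumption about how such a check might look, not the verifier used here: `normalize_ws` and `quote_is_verbatim` are hypothetical names. Matching ignores only whitespace and line wrapping, so a transcription error in the evidence file must be quoted as-is to pass. The sample sentence is made up for illustration; only the phrase "mean insanity deviation" comes from the notes above.

```python
import re


def normalize_ws(text: str) -> str:
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", text).strip()


def quote_is_verbatim(quote: str, evidence_text: str) -> bool:
    """True if quote occurs in evidence_text, ignoring line wrapping only."""
    return normalize_ws(quote) in normalize_ws(evidence_text)


# Made-up evidence snippet containing the known transcription error.
evidence = "We normalize using the mean insanity\ndeviation of the observations."

print(quote_is_verbatim("mean insanity deviation", evidence))      # True
print(quote_is_verbatim("mean and standard deviation", evidence))  # False
```

The second call fails on purpose: silently correcting the subtitle error would make the quote non-verbatim with respect to the evidence file.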

Blockers / Caveats

Resolved: the argdown verifier now passes clean (npx @argdown/cli json exports 14 arguments, 45 statements, 14 relations) after fixing 44 blank lines inside PCS blocks and bracket escaping in the FSDL quote.
Some evidence files (especially Schulman PDF) have conversion artifacts that may cause verifier failures on exact quote matching.
The argdown uses auto-generated YouTube subtitles as a source; these contain transcription errors that are present in the evidence file.

Coverage Summary

SKILL.md Claim	Sources Used	Independent Sources
Normalize inputs mean=0 std=1	Schulman, FSDL, Slavv	3
Overfit tiny dataset first	CS231n, FSDL, Goodfellow	3
Assume you have a bug	Jones, Goodfellow	2
Seed variance is extreme	Schulman, Henderson, Irpan	3
Use bigger batch sizes	Schulman (x2), McCandlish	2 (Schulman slides + talk counted as 1)
Hand-scale rewards, don't shift mean	Schulman, Jones, Henderson	3
Use reference implementations	Jones, Rahtz	2
Pursue anomalies	Jones, Rahtz	2
Log everything	Rahtz, Goodfellow	2
Random HP search	CS231n/Bergstra, Schulman	2
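The "Independent Sources" column counts distinct originators rather than distinct evidence files, which is why the two Schulman items in the batch-size row count once. A minimal sketch of that rule follows; the `AUTHOR_OF` mapping and the function name are illustrative assumptions, not part of the repository.

```python
# Hypothetical mapping from an evidence label to its originating author.
# Labels not listed here are treated as their own independent source.
AUTHOR_OF = {
    "Schulman slides": "Schulman",
    "Schulman talk": "Schulman",
}


def independent_sources(labels: list[str]) -> int:
    """Count distinct originators among the evidence labels for one claim."""
    return len({AUTHOR_OF.get(label, label) for label in labels})


# "Use bigger batch sizes": slides and talk collapse into one source.
print(independent_sources(["Schulman slides", "Schulman talk", "McCandlish"]))  # 2

# "Normalize inputs mean=0 std=1": three unrelated sources.
print(independent_sources(["Schulman", "FSDL", "Slavv"]))  # 3
```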

Claims NOT Covered in Argdown (lower priority or single-source)

Gradient clipping masks problems (CS231n mentions clipping, but as a technique, not as a warning)
Final layer zero init for policy (Schulman only)
Loss surface analysis / gradient quiver plots (original to SKILL, no external source)
Sweep methodology with within-group z-scores (original to SKILL)
