ml-debug/docs/evidence at 1ad74e14c62cc96b183cf1cf05f20f29a5c85a10

wassname/ml-debug

mirror of https://github.com/wassname/ml-debug.git synced 2026-06-27 19:15:45 +08:00

wassname 8cd3c61050 folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

- SKILL.md: 3 new entries (exploration-over-exploitation + nuisance HPs,
  test-set contamination, loss-spikes-mean-bad-data-pocket) and an Ng
  100-misclassified-examples quote under inspect-the-data
- refs/llm_judges.md: position/verbosity/self-preference biases (Zheng,
  Wang 66/80 flip, Panickssery) + mitigation checklist from verdict docs
- Lones pitfalls linked as the exhaustive 36-item do/don't checklist
- 6 new frozen evidence files; Hamel evals link in further reading

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-11 15:30:41 +08:00

[English (auto-generated)] Deep RL Bootcamp Lecture 6 Nuts and Bolts of Deep RL Experimentation [Do.txt

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

alexirpan_rl_hard.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

amid_fish_reproducing_deep_rl.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

andyljones_rl_debugging.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

axolotl_debugging.md

restructure: quotes-first SKILL.md, synthesized playbook split out

2026-06-11 14:33:32 +08:00

axolotl_training_stability.md

restructure: quotes-first SKILL.md, synthesized playbook split out

2026-06-11 14:33:32 +08:00

bekman_debug_utils_transformers.md

restructure: quotes-first SKILL.md, synthesized playbook split out

2026-06-11 14:33:32 +08:00

bekman_ml_engineering_instabilities.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

cleanrl_37_ppo_details.md

folklore: add koaning, gwern, kidger, nanochat, cleanrl; trim lucidrains

2026-06-02 20:59:36 +08:00

cs229_ml_advice.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

cs231n_neural_networks_3.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

domingos_2012_few_useful_things.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

fsdl_spring2021_lecture7.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

goodfellow_ch11_practical_methodology.md

chore: include Goodfellow chapters (author encourages sharing)

2026-03-06 10:16:00 +08:00

goodfellow_ch15_representation_learning.md

chore: include Goodfellow chapters (author encourages sharing)

2026-03-06 10:16:00 +08:00

google_tuning_playbook.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

gwern_tank.md

folklore: add koaning, gwern, kidger, nanochat, cleanrl; trim lucidrains

2026-06-02 20:59:36 +08:00

gwern_unseeing.md

folklore: promote Spinning Up to main; add a Research-taste section

2026-06-02 21:08:49 +08:00

henderson_2018_deep_rl_matters.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

hf_llm_course_ch8_4_debugging_pipeline.md

restructure: quotes-first SKILL.md, synthesized playbook split out

2026-06-11 14:33:32 +08:00

joschu_nuts_and_bolts.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

karpathy_common_mistakes_tweet_2018.md

folklore: add Karpathy common-mistakes tweet and Sculley CACE principle

2026-06-11 14:43:47 +08:00

karpathy_nn_zero_to_hero_lec4_diagnostics.md

feat(ml_debug): expand nanochat evidence, add lec4 diagnostics file

2026-03-10 05:38:33 +08:00

karpathy_recipe_training_nn_2019.md

feat(ml_debug): add Karpathy recipe + nanochat evidence, update-ratio diagnostic

2026-03-10 05:32:37 +08:00

kidger_just_know_stuff.md

folklore: add koaning, gwern, kidger, nanochat, cleanrl; trim lucidrains

2026-06-02 20:59:36 +08:00

koaning_bad_labels.md

folklore: add koaning, gwern, kidger, nanochat, cleanrl; trim lucidrains

2026-06-02 20:59:36 +08:00

llm_judge_biases.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

lones_2021_ml_pitfalls.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

lucidrains_x_transformers_readme.md

folklore: add lucidrains transformer-stability item (QK-norm, post-emb LN)

2026-06-02 20:49:15 +08:00

mccandlish_2018_large_batch.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

nanda_how_to_mech_interp.md

folklore: promote Spinning Up to main; add a Research-taste section

2026-06-02 21:08:49 +08:00

nanochat_deepwiki_llm_pretraining_2026.md

feat(ml_debug): expand nanochat evidence, add lec4 diagnostics file

2026-03-10 05:38:33 +08:00

ng_ml_yearning_error_analysis.md

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

2026-06-11 15:30:41 +08:00

reddit_deeprl_bootcamp_2017_75m5vd.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

reddit_icml2017_tutorial_levine_6vcvu1.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

reddit_rl_debugging_tips_9sh77q.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

reddit_rl_practical_tips_7s8px9.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

reddit_rl_roadblocks_bzg3l2.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

reddit_schulman_nuts_bolts_5hereu.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

sanh_simple_considerations_hf_2021.md

docs(ml_debug): annotate EMNLP 2018 NLP code tutorial; note sparse Adam embedding bug

2026-03-10 05:48:36 +08:00

schulman_nuts_bolts_deeprl_bootcamp_2017_subtitles.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

sculley_2015_hidden_technical_debt.md

folklore: add Karpathy common-mistakes tweet and Sculley CACE principle

2026-06-11 14:43:47 +08:00

slavv_37_reasons_nn.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00

spinningup_researcher.md

rl: quote Spinning Up (Achiam) on silent failure and bug-first debugging

2026-06-02 21:04:55 +08:00

ulisse_how_to_get_good_at_programming.md

folklore: promote Spinning Up to main; add a Research-taste section

2026-06-02 21:08:49 +08:00

unsloth_troubleshooting_faqs.md

restructure: quotes-first SKILL.md, synthesized playbook split out

2026-06-11 14:33:32 +08:00

wentworth_gears_level_models.md

folklore: promote Spinning Up to main; add a Research-taste section

2026-06-02 21:08:49 +08:00

williamfalcon_deeprl_hacks.md

initial: ML debugging folklore skill

2026-03-06 10:11:30 +08:00