ml-debug

wassname/ml-debug

mirror of https://github.com/wassname/ml-debug.git synced 2026-06-27 18:05:27 +08:00

Commit Graph

Author: wassname
SHA1: 8cd3c61050
Date: 2026-06-11 15:30:41 +08:00

Message:

folklore: tuning playbook, Domingos, Bekman loss spikes, Ng error analysis; LLM-judge bias appendix

- SKILL.md: 3 new entries (exploration-over-exploitation + nuisance HPs,
  test-set contamination, loss-spikes-mean-bad-data-pocket) and an Ng
  100-misclassified-examples quote under inspect-the-data
- refs/llm_judges.md: position/verbosity/self-preference biases (Zheng,
  Wang 66/80 flip, Panickssery) + mitigation checklist from verdict docs
- Lones pitfalls linked as the exhaustive 36-item do/don't checklist
- 6 new frozen evidence files; Hamel evals link in further reading

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

1 commit