mirror of
https://github.com/wassname/autoresearch_template.git
synced 2026-06-27 17:29:28 +08:00
# Lab Report: {slug}

<!-- Copy to: 9_reports/YYYY-MM-DD_{slug}.md -->
<!-- Self-contained: a reader should understand the experiment without reading anything else -->

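
The copy step named in the comment above can be scripted. The following is a minimal sketch, not part of the template itself: the helper names `report_path` and `fill_slug` are hypothetical, and it assumes the report filename follows the `9_reports/YYYY-MM-DD_{slug}.md` pattern exactly as written.

```python
from datetime import date
from pathlib import Path


def report_path(slug: str, day: date, reports_dir: str = "9_reports") -> Path:
    """Build the destination path 9_reports/YYYY-MM-DD_{slug}.md (hypothetical helper)."""
    # date.isoformat() yields YYYY-MM-DD, matching the template's naming pattern.
    return Path(reports_dir) / f"{day.isoformat()}_{slug}.md"


def fill_slug(template_text: str, slug: str) -> str:
    """Replace only the literal {slug} placeholder.

    Other placeholders such as {metric} or {name} are left untouched
    so they can be filled in by hand.
    """
    return template_text.replace("{slug}", slug)
```

Plain `str.replace` is used rather than `str.format` because the template contains other brace placeholders that should survive the copy unchanged.
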
## Metadata

| Field | Value |
|-------|-------|
| Date | YYYY-MM-DD |
| Commit | SHORT_SHA |
| Branch | exp/{slug} |
| Worktree | 5_worktrees/{slug} |
| Agent | {name} |
| Idea doc | [1_ideas/YYYY-MM-DD_{slug}.md](../1_ideas/YYYY-MM-DD_{slug}.md) |

## Context

{1-2 sentences: what problem are we solving and why does this experiment matter}

## Hypothesis

**Question**: What happens if we {change}?

**Prediction**: {metric} improves by {amount} because {mechanism}.

## Experiment

What was changed vs the baseline:

```diff
# key diff -- just the essential change, not the whole file
```

Baseline: commit {SHORT_SHA} (or "untrained")

## Observations

Measured facts only -- no interpretation here.

| Run | Metric | Value | Delta vs baseline |
|-----|--------|-------|-------------------|
| baseline | {metric} | {value} | -- |
| this exp | {metric} | {value} | {+/-delta} |

```
# pueue log output or relevant stdout snippet
```

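
The "Delta vs baseline" column is the experiment value minus the baseline value, written with an explicit sign. A minimal sketch of producing one table row, assuming a single scalar metric per run; the function name `delta_row` and the four-significant-digit formatting are illustrative choices, not something the template prescribes.

```python
def delta_row(run: str, metric: str, value: float, baseline: float) -> str:
    """Format one Observations row with a signed delta against the baseline."""
    delta = value - baseline
    # ":+.4g" always prints the sign, so improvements and regressions
    # are distinguishable at a glance.
    return f"| {run} | {metric} | {value:.4g} | {delta:+.4g} |"
```

Whether a positive delta is good depends on the metric (higher is better for accuracy, lower is better for loss), so the sign only reports direction, not success.
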
## Diagnosis

> Caution: 95% of ML failures are bugs, engineering issues, or misconceptions -- not deep theory.
> State credences explicitly. Don't overclaim.

**Most likely explanation** (credence: X%): {explanation}

**Alternative explanations**:
- {alternative} (credence: Y%)

**What would distinguish these**: {test or observation that would separate them}

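
One way to keep the credences above honest is to check that they are consistent before writing them down. A minimal sketch, assuming the listed explanations are treated as mutually exclusive so their credences cannot exceed 100% in total; the function name `unexplained_credence` is hypothetical.

```python
def unexplained_credence(credences: dict[str, float]) -> float:
    """Return the percentage left over for explanations not listed.

    Assumes the listed explanations are mutually exclusive.
    """
    for name, pct in credences.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"credence for {name!r} must be within 0-100, got {pct}")
    total = sum(credences.values())
    if total > 100:
        raise ValueError(f"credences sum to {total}%, which exceeds 100%")
    return 100 - total
```

A large remainder is itself informative: it signals that the listed explanations do not cover the space and that something unlisted deserves weight.
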
## Limitations

- {what this experiment doesn't test}
- {confounds}

## Future work

- {most promising next step given this result}
- {if confirmed: what's the natural follow-up?}
- {if refuted: what's the diagnosis and what would we try instead?}