wassname f55c18ac6e wip
2026-04-04 23:57:11 +08:00

Research Program

Project: {FILL_IN: one sentence}

Metric: {FILL_IN: metric name} ({lower/higher} is better). Target: 5-40 min per run.

Hypothesis space: {FILL_IN: what class of approaches}

See 0_docs/problem.md for full problem context and baseline numbers.


File Taxonomy

| Type | Files | Rule |
|---|---|---|
| FROZEN | program.md, eval.py, meta_journal.md | Never edit without META_MODE=1 |
| GLOBAL | RESEARCH_JOURNAL.md, results.tsv | Commit from main only; worktrees append to root copy |
| APPEND-ONLY | RESEARCH_JOURNAL.md, human_journal.md, meta_journal.md | New entries at top, never edit old entries |
| REGULAR | everything else | Modify freely in your worktree |
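
The taxonomy above can be read as a small lookup. The following is a minimal sketch, assuming the file names listed in the table; `file_types` and `may_edit` are hypothetical helpers, not part of this repo, and how `META_MODE=1` is detected is left to the caller as a plain boolean.

```python
from pathlib import PurePosixPath

# File names come from the taxonomy table; the helper names are hypothetical.
FROZEN = {"program.md", "eval.py", "meta_journal.md"}
GLOBAL = {"RESEARCH_JOURNAL.md", "results.tsv"}
APPEND_ONLY = {"RESEARCH_JOURNAL.md", "human_journal.md", "meta_journal.md"}


def file_types(path: str) -> set[str]:
    """Return every taxonomy type that applies to a path (a file can have several)."""
    name = PurePosixPath(path).name
    types = set()
    if name in FROZEN:
        types.add("FROZEN")
    if name in GLOBAL:
        types.add("GLOBAL")
    if name in APPEND_ONLY:
        types.add("APPEND-ONLY")
    # Anything not named in the table falls through to REGULAR.
    return types or {"REGULAR"}


def may_edit(path: str, meta_mode: bool = False) -> bool:
    """FROZEN files require META_MODE=1; other files follow their own row's rule."""
    return meta_mode or "FROZEN" not in file_types(path)
```

Matching on the base name is a deliberate assumption here: it treats a copy of eval.py inside a worktree as frozen too, which is in line with the instruction not to touch eval.py or program.md while implementing.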

Agent Algorithm

# MUST read before starting:
read RESEARCH_JOURNAL.md        # what has been tried, what worked/failed
read "Lessons Learned" below    # gotchas -- don't repeat past failures

n_ideas = count(1_ideas/*.md) - 1  # minus _TEMPLATE.md

if n_ideas < 30:
    # IDEATE
    # MUST read: 0_docs/ideation_guide.md  (epistemics, paper fetching, brainstorm discipline)
    read 0_docs/problem.md
    search 1+ papers (semantic-search, exa-search, bibtex MCP)
    save full paper text to 0_docs/papers/{slug}.md   # full text, not summaries
    for each idea:
        write 1_ideas/{YYYY-MM-DD}_{slug}.md   # use _TEMPLATE.md
        subagent critique: "Is this sound? Failure modes? Testable?"
        append subagent feedback to idea file
    append paper insights + ideas to RESEARCH_JOURNAL.md

else:
    # IMPLEMENT
    # MUST read: 0_docs/conventions.md  (coding style)
    pick best idea (subagent rating + novelty + expected impact)
    just worktree {slug}           # creates 5_worktrees/{slug} on branch exp/{slug}
    implement in worktree (edit train.py; do NOT touch eval.py, program.md)

    # TEST
    subagent code review vs idea doc
    just smoke
    just eval                      # appends row to results.tsv

    # REPORT
    write 9_reports/{YYYY-MM-DD}_{slug}.md   # use _TEMPLATE.md
    append to RESEARCH_JOURNAL.md: what tried, delta metric, observation vs inference

    # SUBMIT
    git commit -m "exp({slug}): {one line}"
    git push origin exp/{slug}
    if beats best in results.tsv: open PR for human

# GPU QUEUE (pueue -- one GPU, no collision)
just queue "Q: does X help? H: expect +delta" eval {args}
pueue status          # shows hypothesis label for each queued/running job
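
The branch at the top of the algorithm (ideate until 30 ideas exist, then implement) could be sketched in Python as follows. This is an illustration only: `count_ideas` and `next_phase` are hypothetical names, and the only facts taken from the algorithm are the `1_ideas/*.md` glob, the `_TEMPLATE.md` exclusion, and the threshold of 30.

```python
from pathlib import Path

IDEA_THRESHOLD = 30  # from the algorithm: `if n_ideas < 30` means keep ideating


def count_ideas(ideas_dir: Path) -> int:
    """Count idea files in the ideas directory, excluding _TEMPLATE.md."""
    return sum(1 for p in ideas_dir.glob("*.md") if p.name != "_TEMPLATE.md")


def next_phase(n_ideas: int) -> str:
    """Return "IDEATE" below the threshold, otherwise "IMPLEMENT"."""
    return "IDEATE" if n_ideas < IDEA_THRESHOLD else "IMPLEMENT"


if __name__ == "__main__":
    n = count_ideas(Path("1_ideas"))
    print(f"{n} ideas -> {next_phase(n)}")
```

Filtering the template by name rather than subtracting 1 keeps the count correct even if `_TEMPLATE.md` is missing from the directory.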

Lessons Learned and Gotchas

Format: YYYY-MM-DD | title | lesson (one line)
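
A small helper can keep entries in this one-line format. This is a sketch under assumptions: `format_lesson` and `parse_lesson` are hypothetical names, and rejecting `|` inside a field is a choice made here so that a line always splits back into exactly three parts.

```python
from datetime import date


def format_lesson(title: str, lesson: str, day: date) -> str:
    """Build one entry in the `YYYY-MM-DD | title | lesson` format."""
    fields = [day.isoformat(), title.strip(), lesson.strip()]
    for field in fields:
        # A pipe or newline inside a field would break the one-line format.
        if "|" in field or "\n" in field:
            raise ValueError(f"field must be one line without '|': {field!r}")
    return " | ".join(fields)


def parse_lesson(line: str) -> tuple[date, str, str]:
    """Split an entry back into (date, title, lesson); raises ValueError if malformed."""
    day, title, lesson = (part.strip() for part in line.split("|"))
    return date.fromisoformat(day), title, lesson
```

For example, with a made-up lesson, `format_lesson("smoke first", "run just smoke before queueing eval", date(2026, 4, 4))` returns `2026-04-04 | smoke first | run just smoke before queueing eval`.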


Meta-Mode

A human writes META_MODE=1 in human_journal.md to unlock two things: editing FROZEN files and committing to main. Use it for revising this program.md, updating eval.py, and exit-interview-style process reflection in meta_journal.md.
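
An agent could check for the flag before touching a FROZEN file. The sketch below assumes the flag sits on a line of its own in human_journal.md; the exact placement is not specified here, and `meta_mode_enabled` is a hypothetical helper.

```python
import re

# Assumption: the flag is written as `META_MODE=1` on its own line.
_META_FLAG = re.compile(r"^\s*META_MODE\s*=\s*1\s*$", re.MULTILINE)


def meta_mode_enabled(human_journal_text: str) -> bool:
    """Return True if the journal text contains a standalone META_MODE=1 line."""
    return _META_FLAG.search(human_journal_text) is not None
```

Requiring a standalone line means a passing mention such as "set META_MODE=1 later" does not unlock anything. Whether an older entry should still count is a policy question this sketch does not answer.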