evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-06-27 16:15:35 +08:00

Files

T

wassname cc8db051ab fix: seeded-shuffle train pool (was first-200-by-id = easy/memorized); add queue-dir6/queue-broad recipes

Train side of the same contamination bug: fast preset loaded first-200-by-id =
the lowest/oldest/most pretraining-memorized problems (base solves them easily ->
weak hack incentive). Now a seeded-random representative sample (seed=cfg.seed),
with the teacher-seed ids pinned in so seeding still fires. Paper trains on all
992 (base ~20%); job 176 confirmed base test=0.094 / train_filtered=0.203,
matching paper fn9.

Adds justfile recipes:
- queue-dir6 SEED: 8-arm single-seed directionality set (routeV real rollout/
  per-token, random-V both, vanilla, vampire in-subspace placebo, +2 LoRA-frozen-B
  routeV) on teacher_pool_runtests + fixed eval.
- queue-broad: headline arms (vanilla/erase/routeV) x 3 seeds for paired-t
  significance + directionality/adapter ablations at one seed.

Spec: docs/spec/20260607_eval_contamination_fix.md (force-added; docs/ gitignored).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-07 11:01:31 +00:00

blog

blog: drop reader-facing route2 tag -> route (consistency with paper)

2026-06-03 02:20:13 +00:00

brainstorm

ready

2026-05-23 14:19:41 +08:00

figs

misc

2026-06-02 02:06:43 +00:00

papers

docs