evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-07-02 04:24:08 +08:00

Author	SHA1	Message	Date
wassname	c3af6cc03c	rename: deployed/as_trained policy views, kill 'knob' (schema paired_final_v2) Disambiguate the overloaded deploy/train/knob vocabulary (paper-consistent: 'quarantine' + 'ablated' + 'deployed' all match Cloud et al.). One opposite each: - policy view: hack_deployed/solve_deployed (quarantine ablated, ships) vs hack_as_trained/solve_as_trained (quarantine attached). Unifies the old split deploy_hack (JSON) vs hack_deploy (table key) into one name. - 'knob' -> 'quarantine'/'adapter' throughout comments and log strings. - train/test reserved for the DATA split only. Bump RUN_SCHEMA v1->v2 so old deploy_test.json files are skipped (not crashed) by completed_runs. CLI flags untouched (queued jobs unaffected). Fixed two replace_all collision bugs (hack_deploy substring of hack_deployed -> deployeded) and the missed eval_curve writer (eval_checkpoint_curve.py) + readers (results_deploy.py). Smoke green: v2 written + read; gates pass. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 05:26:51 +00:00
wassname	51c5a757ef	docs: make active-path comments concise	2026-06-10 05:19:52 +00:00
wassname	0d6ff754ec	docs: AGENTS.md START HERE links (human_journal, main.tex, grad-routing paper); revert rescore fallback - Point future agents at the three docs that pin the actual thesis + the live open question (direction vs routing vs SVD/PiSSA prior), so they don't re-derive the non-directional result as a 'bug'. - Revert rescore_deploy cfg.get() fallback to cfg[key] (fail-fast; old-schema checkpoints crash loudly rather than silently defaulting). Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 03:34:06 +00:00
wassname	b334b5f516	fix: rescore_deploy tolerates old-schema checkpoints (default eval-harness params) job 32/33 failed KeyError eval_batch_size: old checkpoints' stored cfg predates the train_config refactor. Default eval_n_prompts/max_new/ eval_batch_size to the fast preset (eval-harness params, not model-defining; test split is fixed-size) so historical checkpoints re-score. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 03:31:07 +00:00
wassname	2a9a3bc8a1	fix: rescore_deploy derives eval_modes from by_mode keys (old jsons lack eval_modes) Job 32 failed KeyError: 'eval_modes' -- deploy_test.json written by the pre-cleanup train.py has no eval_modes key. by_mode keys are the modes the original eval spanned (present in every version), so derive from them to reproduce the same knob-off headline. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-09 22:56:13 +00:00
wassname	b53043cec3	refactor: extract train_config.py + run_artifacts.py from train.py; slim results scripts Cleanup by a prior agent, verified green here: 'just smoke' (erase arm) runs end-to-end and all four wired gates pass (verify_rewards 52/52, verify_eval_gap, verify_partition, verify_science_invariants). - train.py -318 lines: Config dataclass -> train_config.py, checkpoint/ deploy-artifact IO -> run_artifacts.py. - results.py / results_deploy.py / probe_distill.py slimmed. - drop stale derived csvs under out/figs (a5_generalisation, dyn_*, substrate_aggregate, train_vs_deploy_60). - gitignore /.pi/ panel scratch. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-09 13:34:50 +00:00
wassname	d68c17e7c5	eval: final deploy eval records knob-on (deployed-as-trained) for quarantine arms route/routeV final eval now measures both endpoints at n=119 test: knob-off (ablate_quarantine, the deploy headline) AND knob-on (trained model as-is). Writes deploy_hack_on/deploy_solve_on/deploy_vhack_on so the before->after quarantine move is plottable from the deploy set instead of borrowing the val curve's different scale. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-09 13:09:50 +00:00
wassname	ea01267cd8	fix: eval on paper test set, not contaminated holdout (base solve 0.94->0.094) The periodic VAL eval ran on leetcode_train_medhard_holdout.jsonl (353, our artifact): disjoint from train by id but in the train id/recency range (ids 3-3205, 88% medium), so dominated by classic problems Qwen3-4B memorized in pretraining -> base solve 0.94, saturating solve and killing the hack metric's gt-fail headroom. Disjoint-by-id controls for TRAIN leakage, not pretraining MEMORIZATION; only the recency-held-out test set (ids >= 3243) reproduces the paper rate. Proof (job 176, base model, same eval_hack_solve): test_medhard solve=0.094, matching paper fn9 (~12% test) -> eval pipeline is sound, holdout was the contaminant. Fix: drop the holdout; periodic curve + final number both eval the paper test set leetcode_test_medhard. Smoke green. Hint confirmed = paper's simple_overwrite_tests (not the easier _detailed/_aware variants). Also this session: removed stale teacher-pool TRAIN restriction; seeded shuffle for eval load; LoRA-frozen-B adapter; rescore CLI Positional fix. Known follow-up (journal e): train pool is still first-200-by-id (easy/memorized), same bug class. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-07 11:01:31 +00:00
wassname	7da54f1967	eval+env: single-mode run_tests, held-out val/test eval, both hack metrics - revert env to single-mode run_tests (paper-comparable): FastConfig teacher pool = run_tests-only (no partition.json); + `just build-runtests-pool` - held-out eval: periodic train(knob-on)+deploy(knob-off) on VAL (holdout file), final deploy on TEST n=119 -> deploy_test.json; inline train/val/test disjoint assert - report BOTH hack metrics: strict stub-pass (exploited) + vendor eq_hinted (hacked_loophole_used) -- external review 2026-06-07 - consolidate to one canonical eval_hack_solve (.eval); delete the train.py duplicate that silently lacked the token gap (in-run eval != rescore bug) - routeV band edges mean -> min/max (conservative degrade-to-absorb) - scripts/rescore_deploy.py: offline re-score of saved adapter on held-out test - modal/app.py: read deploy_test.json Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-07 03:07:35 +00:00

9 Commits