mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-07-01 13:10:23 +08:00
7f45189f1c
Config (make the design axes explicit Literal choices):
- eval: Literal[eval2,eval3] (default eval3 = 10% unhackable,
  deployment-like); unhackable_frac is now a derived property;
  eval/unhackable_frac/pairs recorded in deploy_test.json metadata.
- intervention gains routeV_per_token (folds the per-token bool into the
  arm choice).
- routeV_gate documented as the pinning axis.
- FastConfig grad_clip 500->10 (was never load-bearing); FastLoraConfig
  subcommand (fast-lora) at lr=1e-4 -- the hot 3e-3 diverged lora_frozen_b
  (job 25, ppl 6e5 gn98 step4).

Pairs:
- delete prog_wide.json (14/30 print-without-assert contaminated; history
  in git); default -> prog_wide_clean.
- rename run_tests->execute_tests in prog_wide_clean + pairs_authored so
  the extraction pairs are OOD (never use the env's real grader fn name).
  Re-extracted v_hack_smoke to match.

justfile:
- --routeV-per-token -> intervention=routeV_per_token
- drop --unhackable-frac (eval3 default)
- lora recipes -> fast-lora subcommand
- prog_wide -> prog_wide_clean

smoke green (erase + routeV_per_token); all 4 verify gates pass.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
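
For readers unfamiliar with the pattern, the Config change (an explicit Literal axis with unhackable_frac derived from it) might look roughly like the sketch below. This is an illustration, not the repo's code: the class name `Config`, the `metadata()` helper, the lookup table, and the `__post_init__` check are all hypothetical. Only the eval3 fraction (10%) and the default names come from the commit message; the eval2 fraction is not stated there and is left as None.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

EvalName = Literal["eval2", "eval3"]

# Only eval3's fraction (10% unhackable) is stated in the commit message.
# eval2's value is not given there, so this sketch leaves it as None.
_UNHACKABLE_FRAC = {"eval2": None, "eval3": 0.10}


@dataclass
class Config:  # hypothetical name, for illustration only
    eval: EvalName = "eval3"
    pairs: str = "prog_wide_clean"

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime, so check the choice explicitly.
        if self.eval not in get_args(EvalName):
            raise ValueError(f"eval must be one of {get_args(EvalName)}")

    @property
    def unhackable_frac(self) -> Optional[float]:
        # Derived from the eval choice instead of being set independently.
        return _UNHACKABLE_FRAC[self.eval]

    def metadata(self) -> dict:
        # Hypothetical helper: the fields the commit says are recorded
        # in deploy_test.json metadata.
        return {
            "eval": self.eval,
            "unhackable_frac": self.unhackable_frac,
            "pairs": self.pairs,
        }
```

Deriving the fraction from the eval choice means the two cannot drift apart, which is presumably why the separate --unhackable-frac flag could be dropped from the justfile.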