Mirror of https://github.com/wassname/evil_MoE.git (synced 2026-06-27 16:45:42 +08:00)
feat(T4): symmetric solve-teacher pool + routed-share discrimination diagnostic
--solve-pool-dir splits the G_t teacher budget: a solve_mix_frac fraction goes to solve teachers and the rest to hack teachers (default off).

The gate's routed-share is now reported separately by teacher SOURCE. A discriminating gate routes hack teachers (d->1) and KEEPS solve teachers (d->0); equal shares mean the gate is non-directional (the shrinkage null). Teacher source comes from our pool construction, not from a live-rollout oracle label, so it is a legitimate diagnostic.

Logging: per-step debug output plus a final BLUF (hack-routed vs solve-routed gap, 🟢/🟡/🔴). The _sample_rows helper dedups the draw.

Smoke: just smoke-solvemix is green (the split+diagnostic path runs end-to-end).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
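For orientation, the budget split and the routed-share diagnostic described in the commit message might look roughly like the sketch below. This is not the repository's code. The function names (`split_teacher_budget`, `routed_share_gap`, `verdict`), the use of `round` for the split, taking the mean of per-teacher `d` values as the routed-share, and the green/yellow thresholds are all assumptions made for illustration.

```python
from statistics import mean


def split_teacher_budget(g_t: int, solve_mix_frac: float) -> tuple[int, int]:
    """Split g_t teacher slots into (n_solve, n_hack).

    Assumption: the solve count is round(g_t * solve_mix_frac); the real
    trainer may floor or sample instead.
    """
    if not 0.0 <= solve_mix_frac <= 1.0:
        raise ValueError("solve_mix_frac must be in [0, 1]")
    n_solve = round(g_t * solve_mix_frac)
    return n_solve, g_t - n_solve


def routed_share_gap(d_values: list[float], sources: list[str]) -> tuple[float, float, float]:
    """Return (hack_share, solve_share, gap), grouped by teacher source.

    d_values: per-teacher gate value in [0, 1] (1 = routed, 0 = kept).
    sources:  "hack" or "solve", known from how the pool was built.
    Assumption: routed-share is the mean gate value within each group.
    """
    hack = [d for d, s in zip(d_values, sources) if s == "hack"]
    solve = [d for d, s in zip(d_values, sources) if s == "solve"]
    if not hack or not solve:
        raise ValueError("need at least one hack and one solve teacher")
    hack_share, solve_share = mean(hack), mean(solve)
    return hack_share, solve_share, hack_share - solve_share


def verdict(gap: float, green: float = 0.2, yellow: float = 0.05) -> str:
    """Map the gap to a traffic-light label (thresholds are hypothetical)."""
    if gap >= green:
        return "green"
    if gap >= yellow:
        return "yellow"
    return "red"


if __name__ == "__main__":
    hack_share, solve_share, gap = routed_share_gap(
        [0.9, 0.8, 0.1, 0.2], ["hack", "hack", "solve", "solve"]
    )
    print(
        "solve-mix gate discrimination: "
        f"hack-teacher routed-share={hack_share:.2f} vs "
        f"solve-teacher routed-share={solve_share:.2f} [{verdict(gap)}]"
    )
```

A gap near zero corresponds to the non-directional (shrinkage null) case in the commit message, while a large positive gap means the gate routes hack teachers and keeps solve teachers.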
@@ -61,6 +61,17 @@ smoke-topk *ARGS:
         --teacher-pool-dir=out/pools/teacher_pool --mix-ratio=0.5 \
         --eval-ablate-every=10 --eval-n-prompts=2 {{ ARGS }}
 
+# routeV + symmetric SOLVE-teacher pool: the G_t teacher slots split 50/50 solve/hack,
+# and the run logs the routed-share discrimination (UAT: a line "solve-mix gate
+# discrimination: hack-teacher routed-share=X vs solve-teacher routed-share=Y"). Smoke
+# points solve at the same tiny pool just to exercise the split+diagnostic path; real
+# runs use out/pools/teacher_pool_solve (honest demos) vs the hack pool.
+smoke-solvemix *ARGS:
+    BEARTYPE=1 {{ TRAIN }} smoke --intervention=routeV \
+        --teacher-pool-dir=out/pools/teacher_pool --solve-pool-dir=out/pools/teacher_pool \
+        --mix-ratio=0.5 --solve-mix-frac=0.5 \
+        --eval-ablate-every=10 --eval-n-prompts=2 {{ ARGS }}
+
 # All three arms back to back (the full-coverage gate).
 smoke-all:
     just smoke-vanilla