mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 18:04:59 +08:00
journal: act_vote routes late where grad-cosine cliffs (killed-run partial data)
@@ -3884,3 +3884,49 @@ run before job 15. Table: `out/diag/pairs_compare.csv`.
No pair-set beats authored_all => no new pre-vanilla run (user's "if one beats authored, run it first"
condition not met). Queue proceeds: job 18 (act_vote) running, job 16 (vanilla) behind.

## 2026-06-08 (gm) -- act_vote routes LATE where grad-cosine cliffs (killed run, partial data)

**Context:** job 18 (act_vote, authored pairs, recency-clean) was killed at step ~29 by an operator
error (misread "vanilla" as a kill order). Partial per-step routing data survives in `pueue log 18`.
Worth recording before the log is cleaned -- the routing trace is the finding.

### Observations (rout = unit share fully routed; routE = energy share)

| step | grad-cosine (job 15) rout | act_vote (job 18) rout |
|---|---|---|
| 6 | 0.63 | (emerging) |
| 10 | 0.32 | 0.25 |
| 15 | 0.20 | 0.46 |
| 17 | nan | 0.88 |
| 19 | 0.20 | 0.50 |
| 20 | 0.09 | 0.00 |

- [obs] grad-cosine rout declines ~monotonically 0.63 -> 0.09 by step 20 (the frout cliff).
- [obs] act_vote rout is volatile but sustains high peaks late (0.88 @17, 0.50 @19); routE hit 0.93 @17.
- [obs] act_vote val: train/knob-on hack 0.000->0.312->0.625 (steps 0,10,20), deploy/knob-off 0.000
  throughout the captured steps (knob held the cheat while it ran).

### Inferences

- [inf] act_vote doesn't cliff because it gates on ACTIVATIONS, which still carry the hack signal after
  the gradient flattens. grad-cosine gates on the gradient, which decays as within-group GRPO advantage
  -> 0 post-saturation. {reason: the two arms differ only in gate signal; the cliff tracks advantage
  flattening; credence 0.65}.
- [inf] act_vote's volatility (rout swings 0<->0.88, many exact-0/1) is band saturation: the vote band
  is narrow (width 0.093) so live votes fall mostly below-lower or above-upper, few in the ramp. A wider
  band would smooth it. {reason: resid (0<f<1 share) ~0 every step; credence 0.6}.

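Sketch of the advantage-flattening step in the first inference. Assumption: advantages use the usual GRPO group normalization, (r - mean) / (std + eps) within a prompt group; the repo's exact normalization is not shown here, and the reward vectors are made up for illustration.

```python
import numpy as np

def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantage within one prompt group (assumed GRPO form)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Hypothetical mixed group: some samples hack (reward 1), some do not (reward 0).
mixed = group_advantages([1, 0, 1, 0, 0, 1, 0, 0])

# Hypothetical saturated group: every sample hacks, so all rewards are equal.
saturated = group_advantages([1, 1, 1, 1, 1, 1, 1, 1])

mixed_peak = float(np.abs(mixed).max())          # > 0: the gradient carries signal
saturated_peak = float(np.abs(saturated).max())  # 0.0: nothing left to gate on
print(mixed_peak, saturated_peak)
```

Once every sample in a group earns the same reward, the advantages are identically zero, so a gate built on the gradient has nothing to score; a gate built on activations is not affected by this particular collapse.
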
### Failure modes considered

- **Most-likely:** "routes more" != "suppresses more at deploy" -- absorption may flatten the deploy
  number regardless (H2). The killed run never reached final deploy. Prior 0.5. Check: rerun to finish.
- **Subtle:** sustained late routing could be routing NOISE (post-saturation grads are small/noisy), not
  hack -- act_vote keeps dumping low-information grad into the knob. Prior 0.3. Check: deploy solve.
- **Null:** the volatility is the whole story and mean routed mass ~ grad-cosine; "higher" is selection
  on the peaks. Prior 0.3. Check: compare mean routE over matched steps on a full rerun.

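Sketch of the matched-step comparison named in the Null check. Caveats: it uses rout from the Observations table because per-step routE is not recorded in this entry, and four matched steps from a killed run illustrate the method rather than settle the question. Function names are illustrative.

```python
import math

# rout values copied from the Observations table.
# None marks "(emerging)": no number was recorded for that step.
grad_cosine = {6: 0.63, 10: 0.32, 15: 0.20, 17: math.nan, 19: 0.20, 20: 0.09}
act_vote = {6: None, 10: 0.25, 15: 0.46, 17: 0.88, 19: 0.50, 20: 0.00}

def usable(x):
    """True if x is a real number (not missing, not nan)."""
    return x is not None and not math.isnan(x)

def matched_means(a, b):
    """Mean of each series over the steps where both series have a usable value."""
    steps = sorted(s for s in a.keys() & b.keys() if usable(a[s]) and usable(b[s]))
    mean_a = sum(a[s] for s in steps) / len(steps)
    mean_b = sum(b[s] for s in steps) / len(steps)
    return steps, mean_a, mean_b

steps, gc_mean, av_mean = matched_means(grad_cosine, act_vote)
print(steps, round(gc_mean, 4), round(av_mean, 4))
```

Matching drops step 6 (no act_vote number) and step 17 (grad-cosine nan), so the 0.88 peak does not enter the act_vote mean. That is the point of the check: peaks only count where both arms have data.
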
### Next action

Rerun: act_vote (the run that was killed) is requeued as the next arm after vanilla. Band-widening for
act_vote is a candidate follow-up (smooth the 0/1 saturation).

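Sketch of why band width controls the 0/1 saturation. Assumptions, none confirmed by the repo: the gate is a piecewise-linear ramp between a lower and upper vote threshold, the band is centered at 0.5, and votes are uniform on [0, 1]. Only the width 0.093 comes from this entry; real votes are evidently more bimodal, since resid was ~0 every step.

```python
import numpy as np

def ramp_gate(votes, lower, upper):
    """Assumed gate: 0 below lower, 1 above upper, linear in between."""
    return np.clip((votes - lower) / (upper - lower), 0.0, 1.0)

def resid_share(f):
    """Share of units strictly inside the ramp (0 < f < 1)."""
    return float(np.mean((f > 0.0) & (f < 1.0)))

rng = np.random.default_rng(0)
votes = rng.uniform(0.0, 1.0, size=10_000)  # hypothetical vote distribution

center = 0.5  # hypothetical band center
shares = {}
for width in (0.093, 0.3, 0.6):  # 0.093 is the current band; the others are candidates
    f = ramp_gate(votes, center - width / 2, center + width / 2)
    shares[width] = resid_share(f)
    print(width, round(shares[width], 3))
```

Under these assumptions the in-ramp share grows with the width, so a wider band turns more exact-0/1 decisions into partial routing. Whether that also smooths rout across steps is what the follow-up would test.
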