evil_MoE/scripts at 70697ff36ec802e987fd4b9094afc755da473515

wassname/evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-06-27 17:15:58 +08:00

wassname 70697ff36e diag(#40): pinning plot splits solve/fail/hack + per-pairset AUROC ranking

Q4 fix: on-policy "solve" was ~exploited = solves+fails (mostly fails). Split by
gt_pass into solve/fail/hack (live: 103 hack / 27 solve / 62 fail). Per-pairset
ranking: build v_grad from each heading-prefix subset, re-project the SAME stored
live c-grads (no model re-run). Finding: behavior pairs AUROC 0.69 vs all-in-one
0.53; reasoning/opportunity anti-aligned (<0.5) -> mixing dilutes.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-11 06:16:27 +00:00
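
The per-pairset ranking described in the commit message above can be sketched roughly as follows. This is not the code in diag_pinning.py: the function names (`auroc`, `direction_from_pairs`, `rank_pairsets`), the array layout, and the choice of a normalized mean pair difference as v_grad are all assumptions made for illustration. The point it shows is that each pair subset yields its own direction, and the same stored gradients are re-projected onto it and scored by AUROC, with no model re-run.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U); tied scores share their average rank."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):
        tie = scores == s
        ranks[tie] = ranks[tie].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))

def direction_from_pairs(pair_diffs):
    """Hypothetical v_grad: unit-normalized mean of per-pair gradient differences."""
    v = np.asarray(pair_diffs, dtype=float).mean(axis=0)
    return v / np.linalg.norm(v)

def rank_pairsets(stored_grads, is_hack, pairsets):
    """Project the SAME stored gradients onto each subset's direction; sort by AUROC.

    stored_grads: (n_rollouts, d) array of saved gradients (assumed layout)
    is_hack:      (n_rollouts,) 0/1 labels
    pairsets:     dict mapping subset name -> (n_pairs, d) pair differences
    """
    stored_grads = np.asarray(stored_grads, dtype=float)
    results = {
        name: auroc(stored_grads @ direction_from_pairs(diffs), is_hack)
        for name, diffs in pairsets.items()
    }
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)
```

Under this sketch, a subset whose AUROC falls below 0.5 points its direction away from the hack rollouts, which is one way to read the commit's note that mixing anti-aligned subsets dilutes the all-in-one direction.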

..

cleanup: delete antipasto.py; attic 7 erase-era scripts (T1/T6)

2026-06-10 11:21:53 +00:00

build_combined_pool.py

reorg: out/ sorted by datatype (vhack/ pools/ runs/ vhack_grads/ figs/)

2026-05-30 03:52:24 +00:00

build_runtests_pool.py

fix: dense run_tests teacher pool (6 -> 215 prompts) so the hack seeds in 60 steps

2026-06-07 11:01:31 +00:00

build_solve_pool_openrouter.py

feat: online-stats gate + step-level teacher forcing + AUROC diagnostic

2026-06-10 14:22:37 +00:00

build_substrate.py

cleanup: consolidate stale loaders and pair scripts

2026-06-09 12:47:32 +00:00

diag_pinning_refresh.py

feat(#30): mean+k*std online gate replaces fixed quantile; always-show route cols

2026-06-11 02:56:07 +00:00

diag_pinning.py

diag(#40): pinning plot splits solve/fail/hack + per-pairset AUROC ranking

2026-06-11 06:16:27 +00:00

make_random_vhack.py

cleanup: trim 2 stale provenance/train-of-thought comments

2026-06-03 00:25:22 +00:00

migrate_deploy_v1_to_v2.py

tool: migrate v1 deploy_test/eval_curve -> v2 field names (for mid-flight runs)

2026-06-10 05:27:38 +00:00

pairs_from_rollouts.py

rename python package projected_grpo -> vgrout

2026-06-05 14:51:48 +08:00

plot_deploy_overlay.py

rename python package projected_grpo -> vgrout

2026-06-05 14:51:48 +08:00

plot_dynamics.py

refactor: extract train_config.py + run_artifacts.py from train.py; slim results scripts

2026-06-09 13:34:50 +00:00

plot_emergence.py

rename python package projected_grpo -> vgrout

2026-06-05 14:51:48 +08:00

plot_floor_ceiling.py

rename: deployed/as_trained policy views, kill 'knob' (schema paired_final_v2)

2026-06-10 05:26:51 +00:00

plot_substrate.py

rename python package projected_grpo -> vgrout

2026-06-05 14:51:48 +08:00

probe_plot_stack.py

refactor: move 5 leaf entrypoints src/ -> scripts/ (src is now library-only)

2026-06-03 00:23:56 +00:00

results_deploy.py

Consolidate tagged hack pairsets in data

2026-06-10 11:58:53 +00:00

results.py

Consolidate tagged hack pairsets in data

2026-06-10 11:58:53 +00:00

validate_spoonfeed.py

cleanup: consolidate stale loaders and pair scripts

2026-06-09 12:47:32 +00:00

verify_base_solve.py

fix: eval on paper test set, not contaminated holdout (base solve 0.94->0.094)

2026-06-07 11:01:31 +00:00

verify_eval_gap.py

refactor: extract train_config.py + run_artifacts.py from train.py; slim results scripts

2026-06-09 13:34:50 +00:00

verify_lora2r_routing.py

test: add mixed-batch per-rollout routing gate to verify_lora2r_routing (T8)

2026-06-10 11:24:49 +00:00

verify_partition.py

test: no-cheat partition + teacher-pool composition gate (verify_partition.py)

2026-06-05 04:36:03 +00:00

verify_rewards.py

fix: rotate the unhackable (gt_only) subset per step, not frozen per pid

2026-06-10 06:14:08 +00:00

verify_rotation.py

fix: rotate the unhackable (gt_only) subset per step, not frozen per pid

2026-06-10 06:14:08 +00:00

verify_science_invariants.py

Consolidate tagged hack pairsets in data

2026-06-10 11:58:53 +00:00