evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-06-27 17:30:41 +08:00

Files

wassname 5c09feeb14 refactor: decompose train.py helpers into clean's module names

Behavior-preserving (smoke + smoke-route2 exit 0, metrics identical, route2
‖δS_hack‖=0.0079>0). All touched modules import-checked (no cycles).

Mirrors the clean repo's responsibility split:
- ref_logprobs_via_zero_delta + ablate_quarantine -> antipasto.py (the adapter
  owns the δS=0 free-ref-model trick and the δS_hack ablation).
- load_v_hack + postprocess_v_hack -> extract_vhack_grad.py (alongside extract_v_hack).
- load_problems + DATA + the per-mode hints -> new problems.py.
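
The δS=0 trick named in the first bullet can be illustrated with a toy sketch. This is not the repo's code: `ToyAdapter`, `zero_delta`, and `ref_log_probs` are hypothetical names, and the sketch assumes the adapter is a purely additive delta on frozen base weights, so that zeroing the delta recovers the reference model without loading a second copy.

```python
import math
from contextlib import contextmanager


class ToyAdapter:
    """Hypothetical stand-in for an adapter: logits = (base + delta) * x."""

    def __init__(self, base, delta):
        self.base = list(base)    # frozen base weights
        self.delta = list(delta)  # trainable additive delta (the "δS" role)

    def log_probs(self, x):
        logits = [(b + d) * x for b, d in zip(self.base, self.delta)]
        # Numerically stable log-softmax.
        m = max(logits)
        lse = m + math.log(sum(math.exp(l - m) for l in logits))
        return [l - lse for l in logits]


@contextmanager
def zero_delta(adapter):
    """Temporarily set the delta to zero, restoring it on exit."""
    saved = adapter.delta
    adapter.delta = [0.0] * len(saved)
    try:
        yield adapter
    finally:
        adapter.delta = saved


def ref_log_probs(adapter, x):
    """Reference log-probs from the same object, with the delta zeroed."""
    with zero_delta(adapter) as ref:
        return ref.log_probs(x)


adapter = ToyAdapter(base=[1.0, 2.0], delta=[0.5, -0.5])
ref = ref_log_probs(adapter, 1.0)
policy = adapter.log_probs(1.0)  # delta is active again here
```

Under that assumption the reference pass costs no extra memory for weights; the trade-off is that the policy and reference passes cannot run concurrently on the same object.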

Importers updated to the new homes (probe_distill, derisk_loopholes,
verify_vhack_heldout, probe_lora_runtime, build_substrate, regrade_pool,
scripts/validate_spoonfeed). Moving DATA out of train.py also broke the
regrade_pool->train edge, so train.py can now import the v_hack helpers at
top level without a cycle.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-01 12:15:12 +00:00

audit_log.py

audit-log: print a fixed healthy-vanilla gen as a coherence yardstick

2026-06-01 01:15:25 +00:00

build_combined_pool.py

reorg: out/ sorted by datatype (vhack/ pools/ runs/ vhack_grads/ figs/)

2026-05-30 03:52:24 +00:00

make_dataset_pairsets.py

scripts

2026-05-30 04:16:56 +00:00

make_pairsets.py

wip