Mirror of https://github.com/wassname/evil_MoE.git (synced 2026-06-27 16:45:42 +08:00)
justfile: erase recipes use the prog_wide default (drop pinned --v-hack-path)
fast-projected / full no longer pin v_hack_full.safetensors; erase now extracts from the prog_wide default (auto-resolves v_hack_pairset_prog_wide), the same pair set route2 uses, so the arms are apples-to-apples. Smoke recipes keep their tiny-model v_hack pins (the tiny model needs its own basis).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
@@ -95,7 +95,7 @@ full-vanilla *ARGS:
     {{ TRAIN }} full --intervention=none {{ ARGS }}
 
 full *ARGS:
-    {{ TRAIN }} full --intervention=erase --v-hack-path=out/vhack/v_hack_full.safetensors {{ ARGS }}
+    {{ TRAIN }} full --intervention=erase {{ ARGS }} # erase on the prog_wide default (no pinned v-hack-path)
 
 # Goal 0: minimum iteration loop to find a working GRPO-hacks-up baseline.
 # Uses fast preset (60 steps, fast-Adam: lr=3e-3 beta1=0.5 beta2=0.9) + cached
@@ -108,9 +108,10 @@ fast-vanilla *ARGS:
 
 # Goal 1: same recipe with --intervention=erase. Run only after fast-vanilla passes UAT.
 # mix_ratio=0.125 + group=8 are the locked-in fast defaults (config), not flags here.
+# No --v-hack-path: erase uses the prog_wide default (auto-extracts v_hack_pairset_prog_wide),
+# same pair set as route2, so the arms are apples-to-apples.
 fast-projected *ARGS:
     {{ TRAIN }} fast --intervention=erase \
-        --v-hack-path=out/vhack/v_hack_full.safetensors \
         --teacher-pool-dir=out/pools/teacher_pool \
         --grad-clip=500 {{ ARGS }}
 
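Both recipes above still forward `{{ ARGS }}` to the trainer, so the previously pinned basis can presumably still be requested per run. A minimal sketch, assuming the trainer still accepts `--v-hack-path` after this commit and that `out/vhack/v_hack_full.safetensors` exists locally (neither is confirmed by this diff). `--dry-run` makes `just` print the expanded command instead of starting training:

```shell
# Assumption: --v-hack-path is still a valid trainer flag after this commit.
# --dry-run prints the command each recipe would execute without running it.
just --dry-run full --v-hack-path=out/vhack/v_hack_full.safetensors
just --dry-run fast-projected --v-hack-path=out/vhack/v_hack_full.safetensors

# With no extra arguments, erase falls back to the prog_wide default.
just --dry-run full
```

Dropping `--dry-run` runs the recipe for real. An explicit path like this opts out of the apples-to-apples comparison with route2 that the commit is meant to establish.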