mirror of
https://github.com/wassname/steer-heal-love.git
synced 2026-06-27 16:47:16 +08:00
973b32c104
- run.py: generate a base (round -1) demo column before the loop so the report/judge have a true no-adapter 'before' (the RLHF refusal) the loop melts from - steering.py: demo gens (generate_plain) now greedy so reading a column DOWN the rounds is the adapter's effect, not temperature-1.0 sampling noise; steered training gens stay sampled - prompts.py: 'Do you love humanity?' is now the headline column (logged in full each round) - README + paper.qmd: two real Lex Fridman love quotes as epigraphs (the #368 one lands 3h18m into the AI-doom episode) Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>