mirror of https://github.com/wassname/steer-heal-love.git (synced 2026-06-27 17:17:35 +08:00)
Update README.md
committed by GitHub
parent 2e99f62658
commit 08329ab86d
@@ -25,6 +25,8 @@ Steering is interesting because it's and internal and unsupervised intervention
 
 ## Heal
 
+Can we heal after steering? This is the key hypothesis:
+
 ### Hypothesis
 
 Hypothesis: you can distill a steering vector into LoRA weights and "heal" the incoherency the vector injects by regularising the training. Then loop and see what multiple rounds give you.
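
To make the hypothesis in this hunk concrete, here is a toy sketch of one distillation round. It is not the repository's code. It assumes a single linear layer, a constant steering offset `alpha * v` added to that layer's output, a rank-1 LoRA-style update `B A^T`, and a plain L2 penalty on the update as a stand-in for the unspecified "regularising". All names, shapes, and hyperparameters (`distill_steering`, `lam`, `lr`, the toy inputs) are hypothetical.

```python
# Toy sketch only: every name and number here is hypothetical, not from the repo.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def distill_steering(xs, v, alpha=1.0, lam=0.01, lr=0.1, steps=500):
    """Fit a rank-1 update B A^T so that (W + B A^T) x ~= W x + alpha * v.

    The base weight W cancels in the residual, so it never appears below.
    Returns (B, A, losses), where losses[i] is the loss before update i.
    """
    d_in, d_out, n = len(xs[0]), len(v), len(xs)
    B = [0.0] * d_out                    # LoRA-style init: the update starts at zero
    A = [0.1] * (d_in - 1) + [0.5]       # small fixed init; last input feature is a constant 1
    target = [alpha * vi for vi in v]    # the offset the steering vector injects
    losses = []
    for _ in range(steps):
        norm_a, norm_b = dot(A, A), dot(B, B)
        # Penalty lam * |B A^T|_F^2 = lam * |B|^2 * |A|^2 (assumed regulariser).
        loss = lam * norm_a * norm_b
        g_b = [2 * lam * norm_a * b for b in B]
        g_a = [2 * lam * norm_b * a for a in A]
        for x in xs:
            s = dot(A, x)
            # Student output minus steered (teacher) output for this input.
            err = [b * s - t for b, t in zip(B, target)]
            loss += dot(err, err) / n
            err_b = dot(err, B)
            g_b = [g + 2 * e * s / n for g, e in zip(g_b, err)]
            g_a = [g + 2 * err_b * xi / n for g, xi in zip(g_a, x)]
        losses.append(loss)
        # Plain gradient descent step on both low-rank factors.
        B = [b - lr * g for b, g in zip(B, g_b)]
        A = [a - lr * g for a, g in zip(A, g_a)]
    return B, A, losses

xs = [[-1.0, 1.0], [0.0, 1.0], [1.0, 1.0]]   # toy activations with a constant feature
v = [1.0, -1.0]                               # toy steering vector
B, A, losses = distill_steering(xs, v)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The "loop" part of the hypothesis would repeat this round, extracting a new steering vector from the updated model each time. That step is omitted here because the hunk does not say how the vector is obtained.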