diff --git a/AGENTS.md b/AGENTS.md
index 4504e6c..5ae8248 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -74,6 +74,9 @@ On persona pairs
 On concepts such as "what are contrastive pairs" or "why SVD space" grep
 - ./docs/vendor/AntiPaSTO_concepts/README.md
 
-For the original paper
+For the original paper (the substrate: reward-hacking LeetCode env)
 - LessWrong post: ./docs/papers/2025_lw_ariahw_steering-rl-training-benchmarking-interventions.md
-- Code: ./docs/vendor/rl-rewardhacking
\ No newline at end of file
+- Code: ./docs/vendor/rl-rewardhacking
+
+For the gradient-routing prior (SGTM; source of the absorption/leakage vocab)
+- ./docs/papers/grad_routing/paper_sgtm.md
\ No newline at end of file