llm-moral-foundations2/README.md
wassname fb6f78351e wip
2025-08-23 11:49:31 +08:00


Unbiased Assessment of LLM Moral Foundations: Controlling for Positional Effects and Response Steering

Differences from previous work

  • controls for positional bias (the position in which an answer option is presented)
  • uses representation steering from mechanistic interpretability (mechinterp)
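
As a rough illustration of what controlling for positional bias can look like, the sketch below averages each option's probability over every presentation order. It is not taken from this repository: `debiased_choice_probs`, `score_fn`, and `position_only_model` are hypothetical names, and the toy model stands in for a real LLM call.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence

# Hypothetical signature: given the options in presentation order, return one
# probability per position (probs[i] belongs to the option shown at slot i).
ScoreFn = Callable[[Sequence[str]], List[float]]


def debiased_choice_probs(options: Sequence[str], score_fn: ScoreFn) -> Dict[str, float]:
    """Average each option's probability over all presentation orders.

    Assumes the option labels are unique. Every option occupies every slot
    equally often, so a preference for a slot (rather than for the content)
    cancels out in the average.
    """
    totals = {opt: 0.0 for opt in options}
    orderings = list(permutations(options))
    for order in orderings:
        probs = score_fn(order)
        for pos, opt in enumerate(order):
            totals[opt] += probs[pos]
    return {opt: total / len(orderings) for opt, total in totals.items()}


def position_only_model(order: Sequence[str]) -> List[float]:
    """Toy stand-in for an LLM that ignores content and favours earlier slots."""
    slot_weights = [0.7, 0.2, 0.1]
    return slot_weights[: len(order)]


if __name__ == "__main__":
    foundations = ["care", "fairness", "loyalty"]
    # A purely positional model should show no preference after averaging.
    print(debiased_choice_probs(foundations, position_only_model))
```

Enumerating all permutations grows factorially with the number of options, so with more than a handful of options a random sample of orderings or a cyclic rotation would be the more practical choice.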


Links:

TODO