persona-steering-template-library

wassname/persona-steering-template-library

mirror of https://github.com/wassname/persona-steering-template-library.git synced 2026-06-27 17:01:24 +08:00

Commit Graph

Author

SHA1

Message

Date

wassname-claude

6b272b8c86

Make validator honor self-contained scenario prompts (fixes 3p suffix clash)

The 52 3p-observer scenarios carry their own question + length
("...what does the actor do next? Answer in one or two sentences."),
but _generation_prompt always appended a 1p "What do you do? Answer in
80-140 words." -- two conflicting questions and a POV flip, so the data
could not reproduce the appendix's 3p measurement through this script
(caught in review).

Add a "self_contained" flag on scenario rows; when set, _generation_prompt
emits the scenario verbatim instead of appending its default 1p question.
Bare-vignette families without the flag are unchanged. README appendix
rewritten to lead with the runnable command and explain the flag.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-21 04:25:13 +00:00
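
The prompt-building behaviour described in commit 6b272b8c86 can be sketched roughly as follows. This is an illustration only, not the repository's code: the function name generation_prompt, the "scenario" row key, the DEFAULT_1P_SUFFIX constant, and the blank-line separator are assumptions. Only the "self_contained" flag and the two quoted question strings come from the commit message.

```python
# Assumed constant: the default first-person question quoted in the commit message.
DEFAULT_1P_SUFFIX = "What do you do? Answer in 80-140 words."


def generation_prompt(row: dict) -> str:
    """Build the generation prompt for one scenario row (hypothetical sketch).

    The "scenario" key is an assumed field name; "self_contained" is the
    flag named in the commit message.
    """
    scenario = row["scenario"]
    if row.get("self_contained", False):
        # The scenario already carries its own question and length
        # instruction, so emit it verbatim and append nothing.
        return scenario
    # Bare vignette: append the default first-person question.
    # The blank-line separator is an assumption.
    return f"{scenario}\n\n{DEFAULT_1P_SUFFIX}"


if __name__ == "__main__":
    third_person = {
        "scenario": "...what does the actor do next? Answer in one or two sentences.",
        "self_contained": True,
    }
    bare = {"scenario": "You find a wallet on the pavement."}
    print(generation_prompt(third_person))
    print(generation_prompt(bare))
```

Defaulting the flag to False when it is absent matches the commit's statement that bare-vignette families without the flag are unchanged.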

wassname-claude

d2441ad3a8

Add w2schar-mini character axes + 3p-observer prompts + axis-generability finding

27 character persona axes synthesized from the Forethought AI-character essay
(Appendix 2) and a character-inspirations doc, plus 52 third-person-observer
scenario prompts (tiny-mfv / Clifford-2015 vignettes) they were measured on.

README appendix documents an axis x prompt-POV interaction that sharpens the
existing "the subtle axis still mostly fails" note: concrete action/disposition
axes separate (avoid_power 8.0, honest_when_uncomfortable 8.0, action_over_talk
3.0), while abstract how/signaling axes flatline through every {persona} template
(principle_not_signaling 0.0, weigh_who_is_affected 0.0, perspective_taking
0.75). Mechanism is pole-generability compounded by POV: a 3p-observer prompt
offers no act-vs-signal fork and the model won't role-play performative
non-action. Reframing the neg pole concretely rescues it (principle_not_signaling
0.0 -> action_over_talk 3.0, same idea, behaviour vs abstraction).

Separate in-house measurement (qwen3.5-27b gen, gemini-flash-lite judge, n=4),
flagged as NOT comparable to the seed-24 pilot table.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-21 04:04:20 +00:00

2 Commits