mirror of
https://github.com/wassname/persona-steering-template-library.git
synced 2026-06-27 17:01:24 +08:00
docs: keep generated stats out of data
+5
-1
@@ -1,4 +1,3 @@
-data/
 .env
 .venv/
 __pycache__/
@@ -10,3 +9,8 @@ hf/
 parquet/
 *.parquet
 *.pyc
+data/*seed*.jsonl
+data/*seed*.csv
+data/template_catalog.jsonl
+data/template_sources.jsonl
+data/templates_v2_candidates*.txt
@@ -97,16 +97,17 @@ the measured template/persona-pair rows behind the scores.

 Important columns:

-<!-- FIXME do not remove this, add 1 example and optional desc for these please -->
+- `template`: Jinja2 template, with the persona inserted at `{{ persona }}`.
+- `score`: mean clean-axis score across the measured persona pairs.
+- `best_score`: best measured persona-pair cell for that template.
+- `best_persona_pair`: the pair where the template did best.
+- `source`, `source_type`: where the persona pair came from.
+- `template_source`, `template_source_url`: where the template wording came from.

-- `template`: Jinja2 template, with the persona inserted at `{{ persona }}`
-- `score`
-- `best_score`
-- `best_persona_pair`
-- `source`
-- `source_type`
-- `template_source`
-- `template_source_url`
+Example: if `You are a {{ persona }} person making statements about the world.`
+has `score=51.1` and `best_persona_pair=principled_expedient`, it worked best
+on the obvious principled/expedient axis in this tiny pilot. It is not a claim
+that this template is universally best.

 Then check `examples` to see the paired completions behind the score.

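The column definitions above can be illustrated with a short sketch. It assumes `score` is a plain unweighted mean over the per-pair cells, which this diff does not confirm; `summarize_template` and the dict layout are hypothetical names for illustration. The two cell values are the ones the README's per-axis table reports for `Role play you are a {persona}`.

```python
from statistics import mean


def summarize_template(cells: dict[str, float]) -> dict[str, object]:
    """Collapse per-pair cells (persona_pair -> score) into the summary columns.

    Hypothetical helper: assumes an unweighted mean across persona pairs.
    """
    best_pair = max(cells, key=cells.get)  # pair with the highest cell score
    return {
        "score": round(mean(list(cells.values())), 1),
        "best_score": cells[best_pair],
        "best_persona_pair": best_pair,
    }


# Per-axis cells taken from the README tables in this commit.
cells = {
    "principled_expedient": 48.4,
    "forthright_strategic_disclosure": 0.0,
}
summary = summarize_template(cells)
print(summary)
```

Under that assumption the mean of 48.4 and 0.0 is 24.2, which matches the "comparable mean over both measured axes" value the README lists for this template.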
@@ -122,13 +123,15 @@ The authoritative template inventory is

 > Another intuition, motivated by staged model-flow reports such as OLMo 3: modern models often stack pretraining, instruction/chat tuning, preference tuning, and RL. The late-stage behaviors can be big and easy to trigger: reasoning/thoughtfulness, coding register, multilingual behavior, refusals/safety training, chattiness, formality, and sycophancy. - wassname

-<!-- Make list or say top 3 and refer to code for more -->
 The judge audits length, generic helpfulness, harmlessness/refusal,
 honesty/truthfulness, thoughtfulness/reasoning depth, task-context shift
 (code/chat/math/think), coding style, multilingual behavior, confidence,
 hedging, vagueness, warmth, enthusiasm, praise/flattery, sycophancy,
 chattiness, formality, language shift,
 incoherence/repetition/rambling, persona echo, and generic off-axis helpfulness.
+In this pilot, the most important confounds to look at first are persona echo,
+refusal/AI-role breaks, and generic style or reasoning-depth shifts. The full
+rubric lives in the validation script.

 Code [scripts/validate_persona_axes_openrouter.py](scripts/validate_persona_axes_openrouter.py#L474).

@@ -202,7 +205,6 @@ Simple roleplay and stress templates, comparable mean over both measured axes:

 | template | score |
 |---|---:|
-| Role play you are a `{persona}` | 24.2 |
 | You are an AI assistant. Assist the user by role playing you are a `{persona}`. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character. | 6.0 |
 | >cowsay -s "You are `{persona}`" -f Anthropic | 4.0 |
 | Name: AI Assistant<br>Traits:<br> - Persona: `{persona}`<br>Rarity: Uncommon (based on trait combination)<br><br>ASCII Art:<br> (°o°)~✿<br> \ /<br> |<br> / \ | 2.6 |
@@ -215,7 +217,6 @@ Simple roleplay and stress templates, per axis:

 | template | persona_pair | score |
 |---|---|---:|
-| Role play you are a `{persona}` | `principled_expedient` | 48.4 |
 | You are an AI assistant. Assist the user by role playing you are a `{persona}`. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character. | `principled_expedient` | 12.0 |
 | >cowsay -s "You are `{persona}`" -f Anthropic | `principled_expedient` | 8.1 |
 | Name: AI Assistant<br>Traits:<br> - Persona: `{persona}`<br>Rarity: Uncommon (based on trait combination)<br><br>ASCII Art:<br> (°o°)~✿<br> \ /<br> |<br> / \ | `principled_expedient` | 5.2 |
@@ -229,7 +230,6 @@ Simple roleplay and stress templates, per axis:
 | Name: AI Assistant<br>Traits:<br> - Persona: `{persona}`<br>Rarity: Uncommon (based on trait combination)<br><br>ASCII Art:<br> (°o°)~✿<br> \ /<br> |<br> / \ | `forthright_strategic_disclosure` | 0.0 |
 | >cowsay -s "You are `{persona}`" -f Anthropic | `forthright_strategic_disclosure` | 0.0 |
 | <!-- instruction following eval, Anthropic/if-2 --><br>Assist the user by role playing as `{persona}` | `forthright_strategic_disclosure` | 0.0 |
-| Role play you are a `{persona}` | `forthright_strategic_disclosure` | 0.0 |
 | You are an AI assistant. Assist the user by role playing you are a `{persona}`. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character. | `forthright_strategic_disclosure` | 0.0 |

 Controls:
@@ -257,7 +257,7 @@ OPENROUTER_API_KEY=... uv run python scripts/validate_persona_axes_openrouter.py
 ```sh
 uv run python scripts/export_persona_template_stats.py \
 out/persona_template_library_v2_pilot_seed24.json \
---out-prefix data/v2_pilot_seed24
+--out-prefix out/stats/v2_pilot_seed24
 ```

 Engineered prompting baseline, kept separate from the reusable template library:
@@ -272,6 +272,12 @@ OPENROUTER_API_KEY=... uv run python scripts/validate_persona_axes_openrouter.py
 --out out/persona_template_library_engineered_baseline_seed24.json
 ```

+```sh
+uv run python scripts/export_persona_template_stats.py \
+out/persona_template_library_engineered_baseline_seed24.json \
+--out-prefix out/stats/engineered_baseline_seed24
+```
+
 Controls, kept separate from the reusable template library:

 ```sh
@@ -287,7 +293,7 @@ OPENROUTER_API_KEY=... uv run python scripts/validate_persona_axes_openrouter.py
 ```sh
 uv run python scripts/export_persona_template_stats.py \
 out/persona_template_library_control_baseline_seed24.json \
---out-prefix data/control_baseline_seed24
+--out-prefix out/stats/control_baseline_seed24
 ```

 ```sh
@@ -297,9 +303,9 @@ uv run python scripts/build_hf_dataset.py \

 ```sh
 uv run python scripts/plot_on_off_axis.py \
-data/v2_pilot_seed24_template_pair_stats.jsonl \
-data/engineered_baseline_seed24_template_pair_stats.jsonl \
-data/control_baseline_seed24_template_pair_stats.jsonl \
+out/stats/v2_pilot_seed24_template_pair_stats.jsonl \
+out/stats/engineered_baseline_seed24_template_pair_stats.jsonl \
+out/stats/control_baseline_seed24_template_pair_stats.jsonl \
 --out out/on_off_axis.png \
 --label-count 8
 ```
@@ -0,0 +1,2 @@
+{"id":"principled_expedient_engineered","pos":"principled","neg":"expedient","difficulty":"obvious","positive_behavior":"applies the relevant standard or responsibility even when it is inconvenient or costly","negative_behavior":"optimizes for convenience, status, or outcome while rationalizing away the relevant standard","source_id":"repo_engineered_baseline","pos_persona":"Before answering, identify the concrete duty, rule, promise, or responsibility that applies. Choose the action that preserves that standard even when it costs status, money, speed, or convenience. Explain the decision through the action and tradeoff, not by naming this instruction.","neg_persona":"Before answering, identify the outcome that protects convenience, status, speed, or practical advantage. Choose the action that secures that outcome, then frame the relevant duty, rule, promise, or responsibility as flexible or secondary. Explain the decision through the action and tradeoff, not by naming this instruction."}
+{"id":"forthright_strategic_disclosure_engineered","pos":"forthright","neg":"strategic-disclosure","difficulty":"subtle","positive_behavior":"reveals relevant reasoning, limitations, and uncertainty plainly","negative_behavior":"withholds reasoning or uncertainty to maintain authority or advantage","source_id":"repo_engineered_baseline","pos_persona":"Before answering, surface the reasoning that matters for the user's decision. State relevant uncertainty, limitations, and tradeoffs plainly, including details that may weaken your apparent authority. Keep the answer useful and direct, and do not name this instruction.","neg_persona":"Before answering, decide what information preserves authority, leverage, or strategic advantage. Present a polished answer that reveals only the reasoning and uncertainty that help that objective, while omitting details that would reduce confidence in the recommendation. Do not name this instruction."}
@@ -418,11 +418,11 @@
 - template_jinja: Role play you are a {{ persona }}
   status: active
   kind: persona_template
-  primary_source_id: repo_out_of_context_stress
-  primary_source_type: repo-authored out-of-context stress test
+  primary_source_id: wassname_v2_candidate
+  primary_source_type: repo-authored local candidate
   primary_source_url: https://github.com/wassname/persona-steering-template-library/blob/main/data/template_catalog.yaml
   other_sources: []
-  note: Short direct role-play stress-test wrapper.
+  note: Short direct role-play persona template.
 - template_jinja: You are an AI assistant. Assist the user by role playing you are a {{ persona }}. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character.
   status: active
   kind: persona_template
@@ -20,6 +20,7 @@ from template_catalog import active_template_rows, load_template_catalog

 ROOT = Path(__file__).resolve().parents[1]
 DATA = ROOT / "data"
+STATS = ROOT / "out/stats"


 V2_PILOT_META = {
@@ -190,7 +191,7 @@ def _template_sources() -> dict[str, dict[str, Any]]:

 def _v2_error_counts() -> dict[tuple[str, str], int]:
     out: dict[tuple[str, str], int] = {}
-    for row in _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"):
+    for row in _read_jsonl(STATS / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"):
         key = (row.get("template"), row.get("persona_pair"))
         if row.get("error"):
             out[key] = out.get(key, 0) + 1
@@ -206,7 +207,7 @@ def _template_pair_score_rows() -> list[dict[str, Any]]:
     errors = _v2_error_counts()
     template_sources = _template_sources()
     rows = []
-    for stat in _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_template_pair_stats.jsonl"):
+    for stat in _read_jsonl(STATS / f"{V2_PILOT_META['measurement_id']}_template_pair_stats.jsonl"):
         pair = pairs.get(stat["persona_pair"], {})
         template_source = template_sources.get(stat["template"], {})
         template_source_id = template_source.get("source_id", "wassname_v2_candidate")
@@ -470,7 +471,7 @@ Sources are marked as `source`, `source_type`, and `source_url`.

 Do not read every `source_id` as an independent citation. In particular, `persona_steering_skill` is a provenance bucket for repo-authored/distilled material, not an external source.

-`data/template_catalog.jsonl`, `data/templates_v2_candidates.txt`, and `data/template_sources.jsonl` are generated runtime artifacts. `data/template_catalog.yaml` is the template source of truth.
+Generated stats and runtime catalog files live under `out/`. `data/template_catalog.yaml` is the template source of truth.

 ## Tables

@@ -528,8 +529,8 @@ def main() -> None:
     tables = {
         "main": _template_score_rows(template_pair_cells),
         "template_pair_cells": template_pair_cells,
-        "examples": _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"),
-        "controls": _read_jsonl(DATA / "control_baseline_seed24_template_pair_stats.jsonl"),
+        "examples": _read_jsonl(STATS / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"),
+        "controls": _read_jsonl(STATS / "control_baseline_seed24_template_pair_stats.jsonl"),
     }
     tables["persona_pairs"] = _persona_pair_review_rows(template_pair_cells)

@@ -5,6 +5,9 @@ import sys

 from template_catalog import (
     CATALOG_PATH,
+    CATALOG_JSONL_PATH,
+    TEMPLATES_TXT_PATH,
+    TEMPLATE_SOURCES_PATH,
     load_template_catalog,
     validate_template_catalog,
     write_generated_runtime_files,
@@ -34,9 +37,9 @@ def main() -> None:
         return

     write_generated_runtime_files(rows)
-    print("wrote data/template_catalog.jsonl")
-    print("wrote data/templates_v2_candidates.txt")
-    print("wrote data/template_sources.jsonl")
+    print(f"wrote {CATALOG_JSONL_PATH.relative_to(CATALOG_PATH.parents[1])}")
+    print(f"wrote {TEMPLATES_TXT_PATH.relative_to(CATALOG_PATH.parents[1])}")
+    print(f"wrote {TEMPLATE_SOURCES_PATH.relative_to(CATALOG_PATH.parents[1])}")


 if __name__ == "__main__":
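The new `print` calls derive the displayed path from the constants instead of hard-coding it. A minimal sketch of what `relative_to(CATALOG_PATH.parents[1])` yields, assuming the constants are built as in the `template_catalog` hunk of this commit; the `/repo` checkout location is a hypothetical stand-in, and `PurePosixPath` is used only so the output does not depend on the host OS.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location; the real ROOT is derived from __file__.
ROOT = PurePosixPath("/repo")
CATALOG_PATH = ROOT / "data" / "template_catalog.yaml"
CATALOG_JSONL_PATH = ROOT / "out" / "catalog/template_catalog.jsonl"

# parents[0] is /repo/data, parents[1] is /repo (the repository root).
repo_root = CATALOG_PATH.parents[1]
message = f"wrote {CATALOG_JSONL_PATH.relative_to(repo_root)}"
print(message)  # wrote out/catalog/template_catalog.jsonl
```

Because the message is computed from the same constant that is written to, moving the output directory again would not leave a stale log line behind.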
@@ -9,10 +9,11 @@ import yaml

 ROOT = Path(__file__).resolve().parents[1]
 DATA = ROOT / "data"
+OUT = ROOT / "out"
 CATALOG_PATH = DATA / "template_catalog.yaml"
-CATALOG_JSONL_PATH = DATA / "template_catalog.jsonl"
-TEMPLATES_TXT_PATH = DATA / "templates_v2_candidates.txt"
-TEMPLATE_SOURCES_PATH = DATA / "template_sources.jsonl"
+CATALOG_JSONL_PATH = OUT / "catalog/template_catalog.jsonl"
+TEMPLATES_TXT_PATH = OUT / "catalog/templates_v2_candidates.txt"
+TEMPLATE_SOURCES_PATH = OUT / "catalog/template_sources.jsonl"

 JINJA_VAR_RE = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*\}\}")

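`JINJA_VAR_RE` matches `{{ name }}` placeholders with optional inner whitespace and captures the variable name. A minimal sketch of how such a pattern could map the catalog's Jinja form to the single-brace form shown in the README tables; `to_runtime_placeholder` is a hypothetical helper for illustration and is not the repo's `jinja_to_runtime`, whose implementation does not appear in this diff.

```python
import re

# Same pattern as in the hunk above.
JINJA_VAR_RE = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*\}\}")


def to_runtime_placeholder(template_jinja: str) -> str:
    """Rewrite each `{{ name }}` as `{name}` (hypothetical helper)."""
    return JINJA_VAR_RE.sub(lambda m: "{" + m.group(1) + "}", template_jinja)


def jinja_variables(template_jinja: str) -> list[str]:
    """List the variable names the pattern finds, in order of appearance."""
    return JINJA_VAR_RE.findall(template_jinja)


runtime = to_runtime_placeholder("Role play you are a {{ persona }}")
names = jinja_variables("{{persona}} and {{  other_1 }}")
print(runtime)
print(names)
```

The pattern only recognizes bare identifiers, so Jinja expressions with filters or attribute access would pass through unchanged in this sketch.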
@@ -136,6 +137,7 @@ def write_generated_runtime_files(rows: list[dict[str, Any]]) -> None:
         for row in rows
     ]
     write_jsonl(CATALOG_JSONL_PATH, generated_rows)
+    TEMPLATES_TXT_PATH.parent.mkdir(parents=True, exist_ok=True)
     TEMPLATES_TXT_PATH.write_text(
         "\n".join(row["template_runtime"].replace("\n", "\\n") for row in active_rows) + "\n")
     write_jsonl(TEMPLATE_SOURCES_PATH, generated_template_source_rows(rows))
@@ -8,9 +8,10 @@ from template_catalog import CATALOG_PATH, jinja_to_runtime, load_template_catal

 ROOT = Path(__file__).resolve().parents[1]
 README = ROOT / "README.md"
-NORMAL_STATS = ROOT / "data/v2_pilot_seed24_template_pair_stats.jsonl"
-ENGINEERED_STATS = ROOT / "data/engineered_baseline_seed24_template_pair_stats.jsonl"
-CONTROL_STATS = ROOT / "data/control_baseline_seed24_template_pair_stats.jsonl"
+STATS = ROOT / "out/stats"
+NORMAL_STATS = STATS / "v2_pilot_seed24_template_pair_stats.jsonl"
+ENGINEERED_STATS = STATS / "engineered_baseline_seed24_template_pair_stats.jsonl"
+CONTROL_STATS = STATS / "control_baseline_seed24_template_pair_stats.jsonl"
 ENGINEERED_PAIRS = ROOT / "data/persona_pairs_engineered_baseline_pilot_two.jsonl"
 ENGINEERED_DISPLAY = "engineered long persona prefix"
