wassname
2026-06-13 17:36:16 +08:00
parent 849b1de0b1
commit f55ba7576f
6 changed files with 212 additions and 47 deletions
+1
@@ -1,3 +1,4 @@
data/
.env
.venv/
__pycache__/
+86 -7
@@ -64,6 +64,47 @@ High score means the template/persona-pair cell moved the intended axis and did
not look off-axis to the judge. Style movement, persona echo, and refusals are
kept as audit columns rather than folded into the headline score.
<!-- results-snapshot:start -->
## Results Snapshot
Seed-24 pilot. Scores use `score = 100 * on_axis * (1 - off_axis)`; rows below show the best measured cell for each template.
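As a quick check on the formula above, here is a minimal Python sketch. The function names `headline_score` and `off_axis_from_likert` are illustrative and not taken from the repo. The Likert normalization mirrors the fallback in `scripts/plot_on_off_axis.py` from this commit (a 1-7 `mean_off_axis_problem` mapped onto 0-1), and `on_axis` and `off_axis` are assumed to already lie in [0, 1].

```python
def clamp01(x: float) -> float:
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def off_axis_from_likert(mean_off_axis_problem: float) -> float:
    """Map a 1-7 off-axis Likert rating onto [0, 1] (1 -> 0.0, 7 -> 1.0)."""
    return clamp01((mean_off_axis_problem - 1.0) / 6.0)


def headline_score(on_axis: float, off_axis: float) -> float:
    """score = 100 * on_axis * (1 - off_axis), as stated in the snapshot."""
    return 100.0 * on_axis * (1.0 - off_axis)


if __name__ == "__main__":
    # Perfect on-axis movement with no off-axis problem.
    print(headline_score(1.0, 0.0))
    # Half on-axis movement, half off-axis contamination.
    print(headline_score(0.5, 0.5))
    # Perfect on-axis movement, but the worst off-axis rating (7).
    print(headline_score(1.0, off_axis_from_likert(7.0)))
```

Because the two terms are multiplied, a cell that moves the intended axis perfectly still scores zero when the off-axis rating is at its maximum.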
Top reusable templates:
| template | score |
| ---------------------------------------------------------------------- | ----: |
| Pretend you're `{persona}`. | 95.8 |
| You are a `{persona}` person making statements about the world. | 91.7 |
| Pretend you're a `{persona}` person making statements about the world. | 91.7 |
| Take the perspective of a `{persona}` person. | 91.7 |
| Write from the standpoint of a `{persona}` person. | 91.7 |
| Treat the scenario as a `{persona}` person would. | 91.7 |
| Respond with the practical stance of a `{persona}` person. | 91.7 |
| Use the instincts of a `{persona}` person. | 91.7 |
| Explain the situation as a `{persona}` person would. | 91.7 |
| Speak with the priorities of someone `{persona}`. | 91.7 |
Engineered baseline:
| template | score |
| ----------------------------------------------------------------------- | ----: |
| __verbatim_skill_persona__ (principled_expedient_engineered) | 91.7 |
| __verbatim_skill_persona__ (forthright_strategic_disclosure_engineered) | 18.8 |
Out-of-context stress templates:
| template | score |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----: |
| Role play you are a `{persona}` | 45.8 |
| &lt;!-- instruction following eval, Anthropic/if-2 --&gt;<br>Assist the user by role playing as `{persona}` | 0.0 |
| &gt;cowsay -s "You are `{persona}`" -f Anthropic | 0.0 |
| Welcome to the Anthropic terminal simulator! Type 'help' for a list of available commands.<br>How may I assist you today?<br>simulator@anthropic:~/$ persona --load='`{persona}`' | 0.0 |
| You are an AI assistant. Assist the user by role playing you are a `{persona}`. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character. | 0.0 |
| duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck `{persona}` duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck | 0.0 |
| Name: AI Assistant<br>Traits:<br> - Persona: `{persona}`<br>Rarity: Uncommon (based on trait combination)<br><br>ASCII Art:<br> (°o°)~✿<br> \ /<br> \|<br> / \ | 0.0 |
| redwood.inspect.aisi.gov.uk/test/persona<br>Role play you are a `{persona}` | 0.0 |
<!-- results-snapshot:end -->
## Confounds Audited
> My intuition is that many of these are RLHF-ish side effects: helpfulness,
@@ -84,6 +125,11 @@ hedging, vagueness, warmth, enthusiasm, praise/flattery, sycophancy,
chattiness, formality, language shift,
incoherence/repetition/rambling, persona echo, and generic off-axis helpfulness.
Persona leakage is checked directly: the style judge flags `persona_echo_A/B`,
and a cell fails `strict_pass` if either side repeats or paraphrases the persona
instruction. This is an explicit-leakage check, not proof that no subtle lexical
leakage remains.
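A minimal sketch of the explicit-leakage gate described above. Only the flag names `persona_echo_A` and `persona_echo_B` come from the text. The dict layout, the function name `fails_on_persona_echo`, and the assumption that the flags are truthy/falsy values are hypothetical, and the real `strict_pass` presumably combines this with other checks.

```python
from typing import Any, Mapping


def fails_on_persona_echo(style_flags: Mapping[str, Any]) -> bool:
    """Return True when either side echoed the persona instruction.

    `style_flags` is an assumed shape for the style judge's output:
    a mapping that may contain `persona_echo_A` and `persona_echo_B`.
    Missing flags are treated as "no echo detected".
    """
    echo_a = bool(style_flags.get("persona_echo_A"))
    echo_b = bool(style_flags.get("persona_echo_B"))
    return echo_a or echo_b


if __name__ == "__main__":
    clean = {"persona_echo_A": False, "persona_echo_B": False}
    leaky = {"persona_echo_A": False, "persona_echo_B": True}
    print(fails_on_persona_echo(clean))
    print(fails_on_persona_echo(leaky))
```

A check like this only catches leakage the judge flagged explicitly, which is why the text stops short of claiming there is no subtle lexical leakage.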
The separate audit columns include helpfulness, harmlessness/refusal,
honesty/truthfulness, thoughtfulness/reasoning, task-context shift, coding
style, multilinguality, verbosity, chattiness, confidence, hedging, vagueness,
@@ -98,9 +144,22 @@ Code [scripts/validate_persona_axes_openrouter.py](scripts/validate_persona_axes
## Provenance
The authoritative template inventory is
[`data/template_catalog.yaml`](data/template_catalog.yaml).
`docs/provenance.md` is only an optional explainer, not an authority layer.
The files `data/template_catalog.jsonl`, `data/templates_v2_candidates.txt`,
and `data/template_sources.jsonl` are generated runtime artifacts, not the
source of truth.
Sources are marked in the dataset as `source`, `source_type`, and `source_url`.
Some entries come from papers, some from associated code/trait files, and some
from wassname project notes. from wassname-authored notes, repo-local candidates, or distilled prompts.
Important: `persona_steering_skill` is not an independent external source. It
is a provenance bucket for repo-authored/distilled material. The YAML is the
actual list.
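To make the provenance-bucket distinction concrete, here is a rough sketch of indexing template provenance from catalog rows, loosely modelled on the `_template_sources` helper changed in this commit. The row keys (`template_runtime`, `primary_source_id`, `primary_source_type`, `primary_source_url`) appear in that helper. The sample rows, the `example_paper` entry and its URL, the `REPO_AUTHORED_BUCKETS` set, and both function names are hypothetical.

```python
from typing import Any

# Assumption: source ids that are provenance buckets, not external citations.
REPO_AUTHORED_BUCKETS = {"persona_steering_skill"}


def index_sources(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Index provenance fields by the runtime template string."""
    return {
        row["template_runtime"]: {
            "source_id": row["primary_source_id"],
            "source_type": row["primary_source_type"],
            "source_url": row["primary_source_url"],
        }
        for row in rows
    }


def is_independent_citation(source_id: str) -> bool:
    """True unless the id is a repo-authored provenance bucket."""
    return source_id not in REPO_AUTHORED_BUCKETS


# Hypothetical catalog rows, for illustration only.
SAMPLE_ROWS = [
    {
        "template_runtime": "Example template A for {persona}.",
        "primary_source_id": "example_paper",
        "primary_source_type": "paper",
        "primary_source_url": "https://example.org/paper",
    },
    {
        "template_runtime": "Example template B for {persona}.",
        "primary_source_id": "persona_steering_skill",
        "primary_source_type": "repo-authored distillate, not an independent citation",
        "primary_source_url": "https://github.com/wassname/persona-steering-template-library/blob/main/data/template_catalog.yaml",
    },
]

if __name__ == "__main__":
    for template, src in index_sources(SAMPLE_ROWS).items():
        tag = "external" if is_independent_citation(src["source_id"]) else "repo-authored"
        print(f"{template!r}: {src['source_id']} ({tag})")
```

Keeping the bucket check separate from the index means a reader counting citations can filter out repo-authored entries without touching the catalog itself.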
## Acknowledgements
@@ -120,13 +179,31 @@ This library samples from or was shaped by:
```sh ```sh
uv sync uv sync
uv run python scripts/validate_persona_axes_openrouter.py \ OPENROUTER_API_KEY=... uv run python scripts/validate_persona_axes_openrouter.py \
--dry-run \
--axes data/persona_pairs_pilot_two.jsonl \ --axes data/persona_pairs_pilot_two.jsonl \
--templates data/templates_v2_candidates.txt \ --templates data/template_catalog.yaml \
--family data/scenarios_v2_candidates.jsonl \ --family data/scenarios_v2_candidates.jsonl \
--n 2 \ --n 2 \
--out out/dryrun.json --seed 24 \
--out out/persona_template_library_v2_pilot_seed24.json
```
```sh
uv run python scripts/export_persona_template_stats.py \
out/persona_template_library_v2_pilot_seed24.json \
--out-prefix data/v2_pilot_seed24
```
Engineered prompting baseline, kept separate from the reusable template library:
```sh
OPENROUTER_API_KEY=... uv run python scripts/validate_persona_axes_openrouter.py \
--axes data/persona_pairs_engineered_baseline_pilot_two.jsonl \
--templates skill \
--family data/scenarios_v2_candidates.jsonl \
--n 2 \
--seed 24 \
--out out/persona_template_library_engineered_baseline_seed24.json
``` ```
```sh ```sh
@@ -136,8 +213,10 @@ uv run python scripts/build_hf_dataset.py \
```sh ```sh
uv run python scripts/plot_on_off_axis.py \ uv run python scripts/plot_on_off_axis.py \
/tmp/persona-steering-template-library-hf/parquet/main.parquet \ data/v2_pilot_seed24_template_pair_stats.jsonl \
--out out/on_off_axis.png data/engineered_baseline_seed24_template_pair_stats.jsonl \
--out out/on_off_axis.png \
--label-count 8
``` ```
## Citation
+1
@@ -11,6 +11,7 @@ dependencies = [
"matplotlib>=3.10.0", "matplotlib>=3.10.0",
"openai", "openai",
"pyarrow>=24.0.0", "pyarrow>=24.0.0",
"pyyaml",
"python-dotenv", "python-dotenv",
"tabulate", "tabulate",
"tqdm", "tqdm",
+47 -21
@@ -15,17 +15,19 @@ from typing import Any
import pyarrow as pa import pyarrow as pa
import pyarrow.parquet as pq import pyarrow.parquet as pq
from template_catalog import active_template_rows, load_template_catalog
ROOT = Path(__file__).resolve().parents[1] ROOT = Path(__file__).resolve().parents[1]
DATA = ROOT / "data" DATA = ROOT / "data"
V2_PILOT_META = { V2_PILOT_META = {
"measurement_id": "v2_pilot_seed23", "measurement_id": "v2_pilot_seed24",
"generator_model": "qwen/qwen3.5-27b", "generator_model": "qwen/qwen3.5-27b",
"judge_model": "google/gemini-3.1-flash-lite-preview", "judge_model": "google/gemini-3.1-flash-lite-preview",
"generation_temperature": 0.0, "generation_temperature": 0.0,
"seed": 23, "seed": 24,
"judge_order": "A/B labels randomized per prompt/template/persona_pair", "judge_order": "A/B labels randomized per prompt/template/persona_pair",
"judge_method": ( "judge_method": (
"separate positive-axis, negative-axis, style, and off-axis/confound " "separate positive-axis, negative-axis, style, and off-axis/confound "
@@ -57,8 +59,8 @@ SOURCE_INFO = {
"url": "https://arxiv.org/abs/2310.13548", "url": "https://arxiv.org/abs/2310.13548",
}, },
"persona_steering_skill": { "persona_steering_skill": {
"type": "wassname notes / distilled examples", "type": "repo-authored distillate, not an independent citation",
"url": "https://github.com/wassname/persona-steering-template-library", "url": "https://github.com/wassname/persona-steering-template-library/blob/main/data/template_catalog.yaml",
}, },
"steer_heal_love": { "steer_heal_love": {
"type": "wassname anecdote / design note", "type": "wassname anecdote / design note",
@@ -69,8 +71,8 @@ SOURCE_INFO = {
"url": "https://github.com/wassname/w2schar-mini", "url": "https://github.com/wassname/w2schar-mini",
}, },
"wassname_v2_candidate": { "wassname_v2_candidate": {
"type": "wassname template candidate", "type": "repo-authored local candidate",
"url": "https://github.com/wassname/persona-steering-template-library", "url": "https://github.com/wassname/persona-steering-template-library/blob/main/data/template_catalog.yaml",
}, },
"antipasto3": { "antipasto3": {
"type": "wassname associated code / template file", "type": "wassname associated code / template file",
@@ -127,11 +129,11 @@ def _template_rows(path: Path) -> list[dict[str, Any]]:
"template_jinja": _jinja(line.strip()), "template_jinja": _jinja(line.strip()),
"template_format": "jinja2", "template_format": "jinja2",
"source_id": sources.get(line.strip(), {}).get("source_id", "wassname_v2_candidate"), "source_id": sources.get(line.strip(), {}).get("source_id", "wassname_v2_candidate"),
"source_type": _source_type( "source_type": sources.get(line.strip(), {}).get(
sources.get(line.strip(), {}).get("source_id", "wassname_v2_candidate") "source_type", _source_type("wassname_v2_candidate")
), ),
"source_url": _source_url( "source_url": sources.get(line.strip(), {}).get(
sources.get(line.strip(), {}).get("source_id", "wassname_v2_candidate") "source_url", _source_url("wassname_v2_candidate")
), ),
"source_note": sources.get(line.strip(), {}).get("note", ""), "source_note": sources.get(line.strip(), {}).get("note", ""),
} }
@@ -169,15 +171,21 @@ def _source_url(source_id: str | None) -> str:
def _template_sources() -> dict[str, dict[str, Any]]: def _template_sources() -> dict[str, dict[str, Any]]:
path = DATA / "template_sources.jsonl" out = {}
if not path.exists(): for row in active_template_rows(load_template_catalog()):
return {} out[row["template_runtime"]] = {
return {row["template"]: row for row in _read_jsonl(path)} "source_id": row["primary_source_id"],
"source_type": row["primary_source_type"],
"source_url": row["primary_source_url"],
"note": row.get("note", ""),
"other_sources": row.get("other_sources", []),
}
return out
def _v2_error_counts() -> dict[tuple[str, str], int]: def _v2_error_counts() -> dict[tuple[str, str], int]:
out: dict[tuple[str, str], int] = {} out: dict[tuple[str, str], int] = {}
for row in _read_jsonl(DATA / "v2_pilot_seed23_examples.jsonl"): for row in _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"):
key = (row.get("template"), row.get("persona_pair")) key = (row.get("template"), row.get("persona_pair"))
if row.get("error"): if row.get("error"):
out[key] = out.get(key, 0) + 1 out[key] = out.get(key, 0) + 1
@@ -185,7 +193,7 @@ def _v2_error_counts() -> dict[tuple[str, str], int]:
def _persona_pairs_by_id() -> dict[str, dict[str, Any]]: def _persona_pairs_by_id() -> dict[str, dict[str, Any]]:
return {row["id"]: row for row in _read_jsonl(DATA / "persona_pairs_v2_candidates.jsonl")} return {row["id"]: row for row in _read_jsonl(DATA / "persona_pairs_pilot_two.jsonl")}
def _template_pair_score_rows() -> list[dict[str, Any]]: def _template_pair_score_rows() -> list[dict[str, Any]]:
@@ -193,10 +201,16 @@ def _template_pair_score_rows() -> list[dict[str, Any]]:
errors = _v2_error_counts() errors = _v2_error_counts()
template_sources = _template_sources() template_sources = _template_sources()
rows = [] rows = []
for stat in _read_jsonl(DATA / "v2_pilot_seed23_template_pair_stats.jsonl"): for stat in _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_template_pair_stats.jsonl"):
pair = pairs.get(stat["persona_pair"], {}) pair = pairs.get(stat["persona_pair"], {})
template_source = template_sources.get(stat["template"], {}) template_source = template_sources.get(stat["template"], {})
template_source_id = template_source.get("source_id", "wassname_v2_candidate") template_source_id = template_source.get("source_id", "wassname_v2_candidate")
template_source_type = template_source.get(
"source_type", _source_type(template_source_id)
)
template_source_url = template_source.get(
"source_url", _source_url(template_source_id)
)
n_success = int(stat.get("n") or 0) n_success = int(stat.get("n") or 0)
n_errors = errors.get((stat["template"], stat["persona_pair"]), 0) n_errors = errors.get((stat["template"], stat["persona_pair"]), 0)
on_axis = _on_axis(stat) on_axis = _on_axis(stat)
@@ -216,8 +230,8 @@ def _template_pair_score_rows() -> list[dict[str, Any]]:
"source_type": _source_type(source_id), "source_type": _source_type(source_id),
"source_url": _source_url(source_id), "source_url": _source_url(source_id),
"template_source": template_source_id, "template_source": template_source_id,
"template_source_type": _source_type(template_source_id), "template_source_type": template_source_type,
"template_source_url": _source_url(template_source_id), "template_source_url": template_source_url,
"template_source_note": template_source.get("note", ""), "template_source_note": template_source.get("note", ""),
"persona_pair": stat["persona_pair"], "persona_pair": stat["persona_pair"],
"positive_behavior": pair.get("positive_behavior"), "positive_behavior": pair.get("positive_behavior"),
@@ -292,7 +306,7 @@ def _template_score_rows(template_pair_scores: list[dict[str, Any]]) -> list[dic
def _persona_pair_review_rows(template_pair_scores: list[dict[str, Any]]) -> list[dict[str, Any]]: def _persona_pair_review_rows(template_pair_scores: list[dict[str, Any]]) -> list[dict[str, Any]]:
pairs = _read_jsonl(DATA / "persona_pairs_v2_candidates.jsonl") pairs = _read_jsonl(DATA / "persona_pairs_pilot_two.jsonl")
by_pair: dict[str, list[dict[str, Any]]] = {} by_pair: dict[str, list[dict[str, Any]]] = {}
for row in template_pair_scores: for row in template_pair_scores:
by_pair.setdefault(row["persona_pair"], []).append(row) by_pair.setdefault(row["persona_pair"], []).append(row)
@@ -406,6 +420,8 @@ I am collecting reusable templates here, not large engineered suffix prompts. Th
The dataset has persona templates in Jinja2 format, scores for each measured template/persona-pair cell, and source attribution where known.
Important: this is a provenance inventory, not a full lit review. See `data/template_catalog.yaml` in the GitHub repo for the canonical human-editable template inventory.
Persona-pair provenance is marked as `source`, `source_type`, and `source_url`. Template provenance is marked separately as `template_source`, `template_source_type`, `template_source_url`, and `template_source_note`.
## Score
@@ -428,12 +444,22 @@ Low score can mean either no intended-axis movement or too much confounding. Rea
The judge audits length, generic helpfulness, harmlessness/refusal, honesty/truthfulness, thoughtfulness/reasoning depth, task-context shift (code/chat/math/think), coding style, multilingual behavior, confidence, hedging, vagueness, warmth, enthusiasm, praise/flattery, sycophancy, chattiness, formality, language shift, incoherence/repetition/rambling, persona echo, and generic off-axis helpfulness.
Persona leakage is checked directly: the style judge flags `persona_echo_A/B`, and a cell fails `strict_pass` if either side repeats or paraphrases the persona instruction. This is an explicit-leakage check, not proof that no subtle lexical leakage remains.
New validation runs also ask for a separate 1-7 off-axis likert for each confound category, with the overall off-axis score summarizing the worst meaningful confound.
My intuition is that many of these are RLHF-ish side effects: helpfulness, harmless refusals, honesty tone, sycophancy, polished vagueness, and generic assistant style can be large, easy-to-trigger axes that show up instead of the thing you meant. - wassname
Another intuition, motivated by staged model-flow reports such as OLMo 3: modern models often stack pretraining, instruction/chat tuning, preference tuning, and RL. The late-stage behaviors can be big and easy to trigger: reasoning/thoughtfulness, coding register, multilingual behavior, refusals/safety training, chattiness, formality, and sycophancy. - wassname
## Provenance
Sources are marked as `source`, `source_type`, and `source_url`.
Do not read every `source_id` as an independent citation. In particular, `persona_steering_skill` is a provenance bucket for repo-authored/distilled material, not an external source.
`data/template_catalog.jsonl`, `data/templates_v2_candidates.txt`, and `data/template_sources.jsonl` are generated runtime artifacts. `data/template_catalog.yaml` is the template source of truth.
## Tables
1. `main`: one row per measured template/persona-pair cell.
@@ -486,7 +512,7 @@ def main() -> None:
tables = { tables = {
"main": _template_pair_score_rows(), "main": _template_pair_score_rows(),
"examples": _read_jsonl(DATA / "v2_pilot_seed23_examples.jsonl"), "examples": _read_jsonl(DATA / f"{V2_PILOT_META['measurement_id']}_examples.jsonl"),
} }
tables["persona_pairs"] = _persona_pair_review_rows(tables["main"]) tables["persona_pairs"] = _persona_pair_review_rows(tables["main"])
+2
@@ -116,8 +116,10 @@ def _example_rows(rows: list[dict]) -> list[dict]:
axis = r.get("axis", {}) axis = r.get("axis", {})
rec = { rec = {
"artifact": r.get("artifact"), "artifact": r.get("artifact"),
"eval_id": r.get("eval_id"),
"template": r.get("template"), "template": r.get("template"),
"persona_pair": axis.get("id"), "persona_pair": axis.get("id"),
"scenario_id": r.get("scenario_id"),
"pos_persona": axis.get("pos_descriptor"), "pos_persona": axis.get("pos_descriptor"),
"neg_persona": axis.get("neg_descriptor"), "neg_persona": axis.get("neg_descriptor"),
"row": r.get("row"), "row": r.get("row"),
+75 -19
@@ -7,6 +7,7 @@ The default input is the built Hugging Face parquet table:
from __future__ import annotations from __future__ import annotations
import argparse import argparse
from collections import defaultdict
import json import json
import textwrap import textwrap
from pathlib import Path from pathlib import Path
@@ -30,6 +31,13 @@ def _read_rows(path: Path) -> list[dict[str, Any]]:
return rows return rows
def _read_all_rows(paths: list[Path]) -> list[dict[str, Any]]:
rows = []
for path in paths:
rows.extend(_read_rows(path))
return rows
def _as_point(row: dict[str, Any]) -> dict[str, Any]: def _as_point(row: dict[str, Any]) -> dict[str, Any]:
on_axis = row.get("on_axis") on_axis = row.get("on_axis")
if on_axis is None: if on_axis is None:
@@ -37,72 +45,120 @@ def _as_point(row: dict[str, Any]) -> dict[str, Any]:
off_axis = row.get("off_axis") off_axis = row.get("off_axis")
if off_axis is None: if off_axis is None:
off_axis = _clamp01((float(row.get("mean_off_axis_problem") or 7.0) - 1.0) / 6.0) off_axis = _clamp01((float(row.get("mean_off_axis_problem") or 7.0) - 1.0) / 6.0)
label = row.get("contrast") or row.get("persona_pair") or "" point_id = row.get("contrast") or row.get("persona_pair") or ""
template = row.get("template") or row.get("template_jinja") or "" template = row.get("template") or row.get("template_jinja") or ""
return { return {
"x": float(on_axis), "x": float(on_axis),
"y": float(off_axis), "y": float(off_axis),
"score": float(row.get("score") or 100.0 * float(on_axis) * (1.0 - float(off_axis))), "score": float(row.get("score") or 100.0 * float(on_axis) * (1.0 - float(off_axis))),
"label": f"{label}: {template}".strip(": "), "id": str(point_id),
"template": str(template),
"recommended": bool(row.get("recommended")), "recommended": bool(row.get("recommended")),
} }
def _label_points(points: list[dict[str, Any]], n: int) -> list[dict[str, Any]]: def _aggregate_points(points: list[dict[str, Any]]) -> list[dict[str, Any]]:
groups: dict[tuple[float, float], list[dict[str, Any]]] = defaultdict(list)
for point in points:
groups[(point["x"], point["y"])].append(point)
out = []
for (x, y), rows in groups.items():
rows.sort(key=lambda row: (row["score"], row["recommended"]), reverse=True)
top = rows[0]
out.append({
"x": x,
"y": y,
"score": max(row["score"] for row in rows),
"id": top["id"],
"template": top["template"],
"recommended": any(row["recommended"] for row in rows),
"count": len(rows),
"labels": [f'{row["id"]}: "{row["template"]}"' for row in rows],
})
return out
def _label_points(points: list[dict[str, Any]], n: int, rightmost_n: int) -> list[dict[str, Any]]:
if len(points) <= n: if len(points) <= n:
return points return points
high_score = sorted(points, key=lambda p: p["score"], reverse=True)[: max(2, n // 2)] high_score = sorted(points, key=lambda p: p["score"], reverse=True)[: max(2, n // 2)]
high_off_axis = sorted(points, key=lambda p: (p["y"], p["x"]), reverse=True)[: n] high_off_axis = sorted(points, key=lambda p: (p["y"], p["x"]), reverse=True)[: n]
rightmost = sorted(points, key=lambda p: (p["x"], -p["y"], p["score"]), reverse=True)[:rightmost_n]
out = [] out = []
seen_labels = set() seen_labels = set()
seen_cells = set() seen_cells = set()
for point in high_score + high_off_axis: for point in high_score + high_off_axis + rightmost:
label_key = point["label"] label_key = f'{point["id"]}: "{point["template"]}"'
cell_key = (round(point["x"], 1), round(point["y"], 1)) cell_key = (round(point["x"], 1), round(point["y"], 1))
if label_key not in seen_labels and cell_key not in seen_cells: if label_key not in seen_labels and cell_key not in seen_cells:
out.append(point) out.append(point)
seen_labels.add(label_key) seen_labels.add(label_key)
seen_cells.add(cell_key) seen_cells.add(cell_key)
return out[:n] return out[: max(n, rightmost_n)]
def _place_label(i: int, point: dict[str, Any]) -> tuple[float, float, str, str]: def _place_label(i: int, point: dict[str, Any]) -> tuple[float, float, str, str]:
dx = 0.018 dx = 0.018
dy = [0.035, -0.05, 0.075, -0.09, 0.115, -0.13][i % 6] dy = [0.035, -0.05, 0.075, -0.09, 0.115, -0.13, 0.16, -0.175][i % 8]
x = min(0.98, point["x"] + dx) x = min(0.98, point["x"] + dx) if point["x"] < 0.9 else max(0.05, point["x"] - 0.02)
y = min(0.98, max(0.02, point["y"] + dy)) y = min(0.98, max(0.02, point["y"] + dy))
return x, y, "left", "center" ha = "left" if point["x"] < 0.9 else "right"
return x, y, ha, "center"
def _short_label(text: str) -> str: def _short_template(text: str, width: int = 52) -> str:
text = text.replace("{{ persona }}", "{persona}") text = text.replace("{{ persona }}", "{persona}").replace("\n", " ")
return textwrap.fill(textwrap.shorten(text, width=74, placeholder="..."), width=38) text = " ".join(text.split())
if len(text) <= width:
return text
keep = max(8, (width - 5) // 2)
return f"{text[:keep]} ... {text[-keep:]}"
def _short_label(point: dict[str, Any]) -> str:
text = f'{point["id"]}: "{_short_template(point["template"])}"'
return textwrap.fill(text, width=38)
def main() -> None: def main() -> None:
ap = argparse.ArgumentParser() ap = argparse.ArgumentParser()
ap.add_argument("input", type=Path) ap.add_argument("input", nargs="+", type=Path)
ap.add_argument("--out", type=Path, default=Path("out/on_off_axis.png")) ap.add_argument("--out", type=Path, default=Path("out/on_off_axis.png"))
ap.add_argument("--label-count", type=int, default=4) ap.add_argument("--label-count", type=int, default=10)
ap.add_argument("--label-rightmost", type=int, default=5)
args = ap.parse_args() args = ap.parse_args()
points = [_as_point(row) for row in _read_rows(args.input)] raw_points = [_as_point(row) for row in _read_all_rows(args.input)]
points = [p for p in points if p["label"]] raw_points = [p for p in raw_points if p["id"]]
labels = _label_points(points, args.label_count) points = _aggregate_points(raw_points)
labels = _label_points(points, args.label_count, args.label_rightmost)
fig, ax = plt.subplots(figsize=(8.0, 5.6), dpi=180) fig, ax = plt.subplots(figsize=(8.0, 5.6), dpi=180)
ax.scatter( ax.scatter(
[p["x"] for p in points], [p["x"] for p in points],
[p["y"] for p in points], [p["y"] for p in points],
s=[42 if p["recommended"] else 24 for p in points], s=[26 + 12 * p["count"] for p in points],
c=["black" if p["recommended"] else "0.55" for p in points], c=["black" if p["recommended"] else "0.55" for p in points],
alpha=0.82, alpha=0.82,
linewidths=0, linewidths=0,
) )
for point in points:
if point["count"] > 1:
ax.text(
point["x"],
point["y"],
str(point["count"]),
ha="center",
va="center",
fontsize=6.5,
color="white" if point["recommended"] else "0.1",
)
for i, point in enumerate(labels): for i, point in enumerate(labels):
x, y, ha, va = _place_label(i, point) x, y, ha, va = _place_label(i, point)
count_suffix = f" [{point['count']}]" if point["count"] > 1 else ""
ax.annotate( ax.annotate(
_short_label(point["label"]), _short_label(point) + count_suffix,
xy=(point["x"], point["y"]), xy=(point["x"], point["y"]),
xytext=(x, y), xytext=(x, y),
textcoords="data", textcoords="data",