wassname | cd695c411b | docs: improve quick-scroll README | 2026-06-25 13:36:00 +08:00
wassname | afbfbf514f | docs: add interactive refusal tables | 2026-06-25 13:23:34 +08:00
wassname | cfcb57b9ce | docs: use one Quarto source for README and Pages | 2026-06-25 13:06:12 +08:00
wassname | 22dd2c2df9 | docs: rank README result tables by t-stat | 2026-06-25 12:33:11 +08:00
wassname | caceaebbf0 | docs: streamline README and add interactive Pages plot | 2026-06-25 12:31:50 +08:00
wassname | 2f62327acc | docs: render README with Quarto | 2026-06-25 11:44:04 +08:00
wassname | 026a57e246 | docs: make README tables rerenderable | 2026-06-25 11:31:49 +08:00
wassname | d91eda0228 | eval: test engineered prefixes as templates | 2026-06-13 20:43:44 +08:00
wassname | 671c6258ce | docs: include engineered baseline in scoreboard | 2026-06-13 20:05:19 +08:00
wassname | 15d7caa607 | eval: judge identical controls uniformly | 2026-06-13 20:00:49 +08:00
wassname | 45c0f24022 | eval: clean axes and audit persona leakage | 2026-06-13 19:46:24 +08:00
wassname | 562c8fd0f0 | docs: keep generated stats out of data | 2026-06-13 19:12:24 +08:00
wassname | 8dbc02066b | eval: rerun dual judges and refresh results | 2026-06-13 19:12:24 +08:00
wassname | e2546fe0ab | eval: refine judge rubric and README baselines | 2026-06-13 19:12:24 +08:00
wassname | ede354f07a | eval: add dual judges and controls | 2026-06-13 19:12:24 +08:00