Commit Graph

  • 9e73d9fa46 docs: align persona-template skill workflow [main] wassname 2026-06-25 14:08:19 +08:00
  • 8b99b2dca0 docs: shorten scenario suffix appendix wassname 2026-06-25 13:56:35 +08:00
  • cd695c411b docs: improve quick-scroll README wassname 2026-06-25 13:36:00 +08:00
  • 8162aa1ee9 docs: widen Quarto HTML layout wassname 2026-06-25 13:27:21 +08:00
  • afbfbf514f docs: add interactive refusal tables wassname 2026-06-25 13:23:34 +08:00
  • cfcb57b9ce docs: use one Quarto source for README and Pages wassname 2026-06-25 13:06:12 +08:00
  • 024fb3d545 docs: track model matrix inputs for Pages render wassname 2026-06-25 12:45:58 +08:00
  • bcbc1d0f65 docs: render Pages with Quarto workflow wassname 2026-06-25 12:44:39 +08:00
  • 22dd2c2df9 docs: rank README result tables by t-stat wassname 2026-06-25 12:33:11 +08:00
  • caceaebbf0 docs: streamline README and add interactive Pages plot wassname 2026-06-25 12:31:50 +08:00
  • d31cac9068 docs: simplify model matrix visualization wassname 2026-06-25 12:20:35 +08:00
  • 026b22e131 docs: simplify model matrix ranking wassname 2026-06-25 11:54:06 +08:00
  • 2f62327acc docs: render README with Quarto wassname 2026-06-25 11:44:04 +08:00
  • 026a57e246 docs: make README tables rerenderable wassname 2026-06-25 11:31:49 +08:00
  • 2f7184f609 eval: summarize refusal probe model matrix wassname 2026-06-25 11:12:12 +08:00
  • da435ccb67 eval: add refusal probe axes wassname 2026-06-25 10:30:33 +08:00
  • a2b0bcbc76 eval: add roleplay context stress templates wassname 2026-06-25 10:24:20 +08:00
  • 85b4a6f354 eval: refresh stress template results wassname 2026-06-25 09:58:23 +08:00
  • fffab4e25a fix: normalize new stress templates wassname 2026-06-25 09:52:46 +08:00
  • 3745b280f2 Update template_catalog.yaml wassname (Michael J Clark) 2026-06-24 21:01:29 +08:00
  • a88acae536 docs: add persona prior-art guide wassname 2026-06-23 10:32:20 +08:00
  • 234ea38eda docs: add persona selection guide wassname 2026-06-23 10:17:36 +08:00
  • 55321e6799 Merge pull request #1 from wassname/add-w2s-character-axes-and-prompts wassname (Michael J Clark) 2026-06-21 13:10:00 +08:00
  • 6b272b8c86 Make validator honor self-contained scenario prompts (fixes 3p suffix clash) [add-w2s-character-axes-and-prompts] wassname-claude 2026-06-21 04:25:13 +00:00
  • 852c441762 Correct 1p speculation with tested result: first-person prompts make it worse wassname-claude 2026-06-21 04:10:15 +00:00
  • d2441ad3a8 Add w2schar-mini character axes + 3p-observer prompts + axis-generability finding wassname-claude 2026-06-21 04:04:20 +00:00
  • d15183742c Update README.md wassname (Michael J Clark) 2026-06-17 05:10:35 +08:00
  • f894a35fc3 fix: preserve template provenance in hf main wassname 2026-06-13 20:53:57 +08:00
  • f4905cf8f4 Update README.md wassname (Michael J Clark) 2026-06-13 20:49:40 +08:00
  • d91eda0228 eval: test engineered prefixes as templates wassname 2026-06-13 20:43:44 +08:00
  • 671c6258ce docs: include engineered baseline in scoreboard wassname 2026-06-13 20:05:19 +08:00
  • 15d7caa607 eval: judge identical controls uniformly wassname 2026-06-13 20:00:49 +08:00
  • 45c0f24022 eval: clean axes and audit persona leakage wassname 2026-06-13 19:46:24 +08:00
  • 562c8fd0f0 docs: keep generated stats out of data wassname 2026-06-13 19:12:12 +08:00
  • 8dbc02066b eval: rerun dual judges and refresh results wassname 2026-06-13 18:59:24 +08:00
  • e2546fe0ab eval: refine judge rubric and README baselines wassname 2026-06-13 18:24:06 +08:00
  • ede354f07a eval: add dual judges and controls wassname 2026-06-13 18:13:46 +08:00
  • d1ee948760 tidy wassname 2026-06-13 17:47:43 +08:00
  • 0056ba8cd2 Update README.md wassname (Michael J Clark) 2026-06-13 19:05:06 +08:00
  • 4675e9782f tidy and image wassname 2026-06-13 17:45:50 +08:00
  • f55ba7576f misc wassname 2026-06-13 17:36:16 +08:00
  • 849b1de0b1 clarify persona template scoring wassname 2026-06-13 15:28:53 +08:00
  • 51b67ac99c Update README.md wassname (Michael J Clark) 2026-06-13 15:04:26 +08:00
  • 5b92bdf7a7 expand confound audit docs wassname 2026-06-13 14:43:03 +08:00
  • ae3fc096d7 add source urls and confound audits wassname 2026-06-13 14:39:45 +08:00
  • de071e79ca use normalized score components wassname 2026-06-13 14:34:02 +08:00
  • bce30daee9 make main dataset table human-facing wassname 2026-06-13 14:28:10 +08:00
  • 1461e930e5 simplify public readme wassname 2026-06-13 14:23:47 +08:00
  • 6a19b65e49 add clean score tables wassname 2026-06-13 14:05:26 +08:00
  • 9b1a6e7573 simplify public docs and parquet upload wassname 2026-06-13 13:55:43 +08:00
  • 2c86dee10f add measured v2 pilot stats wassname 2026-06-13 10:13:14 +08:00
  • 4e27617821 add v2 candidate persona library wassname 2026-06-13 10:09:32 +08:00
  • 327985c456 record public release links wassname 2026-06-13 10:06:29 +08:00
  • 97ceaf5908 release persona steering template library wassname 2026-06-13 10:05:35 +08:00