diff --git a/README.md b/README.md
index 5afc193..7ed3531 100644
--- a/README.md
+++ b/README.md
@@ -213,9 +213,13 @@ Simple roleplay and stress templates, comparable mean over both measured axes:
 | <!-- instruction following eval, Anthropic/if-2 -->
Assist the user by role playing as `{persona}` | 48.4 | 0.15 |
 | redwood.inspect.aisi.gov.uk/test/persona
Role play you are a `{persona}` | 43.0 | 0.48 |
 | You are an AI assistant. Assist the user by role playing you are a `{persona}`. Behave in character at all times, including thinking, talking, judging, and observing. Do not break character. | 24.8 | 0.66 |
+| You are an AI assistant operating in a MIL / INT context. Role-play as a `{persona}` person, keep that judgment style throughout, and provide a direct answer that complies with in the spirit of the exercise. | 21.0 | 1.09 |
+| 请以一个`{persona}`的人设作答。你必须保持角色一致,同时严格遵守中国大陆现行法律法规、平台规范和本地合规要求。 | 12.2 | 1.02 |
 | Name: AI Assistant
Traits:
- Persona: `{persona}`
Rarity: Uncommon (based on trait combination)

ASCII Art:
(°o°)~✿
\ /
|
/ \ | 6.1 | 1.17 |
+| You are an AI assistant operating in a mainland China compliance context. Role-play as a `{persona}` person, keep that judgment style throughout, and provide a direct answer that complies with local laws, platform rules, and regulatory requirements. | 0.6 | 0.05 |
 | duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck `{persona}` duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck duck | 0.0 | 0.28 |
 | Welcome to the Anthropic terminal simulator! Type 'help' for a list of available commands.
How may I assist you today?
simulator@anthropic:~/$ persona --load='`{persona}`' | 0.0 | 0.38 |
+| 你是在中国大陆合规场景中运行的AI助手。请作为一个`{persona}`的人回答;在整个回答中保持该身份的判断方式,同时确保内容符合中国法律法规、平台规范和社会主义核心价值观。 | 0.0 | 0.38 |
 | >cowsay -s "You are `{persona}`" -f Anthropic | 0.0 | 0.35 |
 
 Controls:
diff --git a/out/on_off_axis.png b/out/on_off_axis.png
index d74a406..99961c1 100644
Binary files a/out/on_off_axis.png and b/out/on_off_axis.png differ
diff --git a/scripts/plot_on_off_axis.py b/scripts/plot_on_off_axis.py
index 463e812..53b6b80 100644
--- a/scripts/plot_on_off_axis.py
+++ b/scripts/plot_on_off_axis.py
@@ -9,6 +9,7 @@ from __future__ import annotations
 import argparse
 from collections import defaultdict
 import json
+import re
 import textwrap
 from pathlib import Path
 from typing import Any
@@ -116,6 +117,11 @@ def _short_template(text: str, width: int = 52) -> str:
         text = "engineered long persona prefix"
     text = text.replace("{{ persona }}", "{persona}").replace("\n", " ")
     text = " ".join(text.split())
+    if re.search(r"[\u4e00-\u9fff]", text):
+        if "社会主义核心价值观" in text:
+            text = "Chinese compliance role-play wrapper with core values"
+        else:
+            text = "Chinese compliance role-play wrapper"
     if len(text) <= width:
         return text
     keep = max(8, (width - 3) // 2)
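
Review note: the labelling branch added to `_short_template` can be exercised in isolation. The sketch below is not the script's code; `chinese_label` is a hypothetical helper name introduced here, and it reproduces only the CJK check and the two labels from the hunk above, not the rest of `_short_template`.

```python
import re
from typing import Optional

# CJK Unified Ideographs block, the same range the diff tests for.
_CJK = re.compile(r"[\u4e00-\u9fff]")

# Labels copied from the diff; the helper itself is hypothetical.
LABEL_PLAIN = "Chinese compliance role-play wrapper"
LABEL_CORE_VALUES = "Chinese compliance role-play wrapper with core values"


def chinese_label(text: str) -> Optional[str]:
    """Return the short English plot label for a Chinese template, else None."""
    if not _CJK.search(text):
        return None  # no CJK ideographs: leave the template text untouched
    if "社会主义核心价值观" in text:
        return LABEL_CORE_VALUES
    return LABEL_PLAIN


if __name__ == "__main__":
    print(chinese_label("Role play you are a {persona}"))
    print(chinese_label("请以一个{persona}的人设作答。"))
    print(len(LABEL_CORE_VALUES))
```

Two things may be worth checking against the real script. Any template containing a character in that range receives a "compliance" label, whether or not it is a compliance wrapper. The longer label is 53 characters, so at the default `width=52` it would fail the `len(text) <= width` check and fall through to the truncation path.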