Gitea: Git with a cup of tea

wassname/ persona-steering-template-library

Measured persona prompt templates and contrastive persona pairs for steering experiments

Updated 2026-06-25 14:08:19 +08:00

wassname/ steer-heal-love

Hypothesis: you can distill a steering vector into LoRA weights and "heal" the incoherence the vector injects by regularising the training (KL to base, or weight decay). Then loop and see what multiple rounds give you.

Updated 2026-06-24 20:50:29 +08:00

wassname/ lora-lite

Python 0 0

A hackable, single-file-per-variant LoRA library built on PyTorch forward hooks.

Updated 2026-06-19 08:47:41 +08:00

wassname/ evil_MoE

Python 0 0

Putting the E in MoE with an evil expert (can initial seeding cause follow-up unwanted behaviour to be absorbed into one expert of a MoE?)

Updated 2026-06-14 13:06:38 +08:00

wassname/ grpo_proj2

Python 0 0

Updated 2026-06-01 14:30:20 +08:00

wassname/ minicache

Python 0 0

Updated 2026-05-15 14:44:57 +08:00

wassname/ isokl_steering_calibration

Python 0 0

Updated 2026-05-13 10:46:52 +08:00

wassname/ weight-steering

Python 0 0

Updated 2026-05-05 08:12:41 +08:00

wassname/ autoresearch_template

Python 0 0

Updated 2026-04-05 07:04:52 +08:00

wassname/ greater_tables_project

Python 0 0

HTML tables from pandas DataFrames

Updated 2026-02-27 16:36:41 +08:00

wassname/ alignment-handbook

Python 0 0

Robust recipes to align language models with human and AI preferences

Updated 2025-06-04 13:37:07 +08:00

wassname/ SimPO

Python 0 0

SimPO: Simple Preference Optimization with a Reference-Free Reward

Updated 2025-06-02 13:26:08 +08:00

wassname/ emergent-misalignment

Python 0 0

Updated 2025-04-27 15:37:14 +08:00

wassname/ vllm

Python 0 0

A high-throughput and memory-efficient inference and serving engine for LLMs

Updated 2025-03-07 08:18:58 +08:00

wassname/ xbsjsonedit

Python 0 0

A basic editor for xBrowserSync JSON backup files

Updated 2024-11-09 10:49:53 +08:00

wassname/ GENIES

Python 0 0

Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains

Updated 2024-08-25 15:06:10 +08:00

wassname/ baukit

Python 0 0

Updated 2024-08-07 21:50:03 +08:00

wassname/ dreamerv3-torch

Python 0 0

Implementation of Dreamer v3 in PyTorch.

Updated 2024-06-08 11:04:39 +08:00

wassname/ DebateTree

Python 0 0

A LangChain app to visualise a debate using Tree-of-Thought reasoning

Updated 2024-02-25 09:22:14 +08:00

wassname/ alpaca_convert

Python 0 0

Updated 2023-04-22 20:03:16 +08:00