- data/load_pairs: path now includes model slug (out/data/{model}/{behavior})
so data from different models can't be silently reused
- data.py, kl_calibrate.py, tinymfv_airisk.py: add use_4bit=True with
BitsAndBytesConfig for inference stages; training stays bfloat16
- run_sweep/kl_calibrate/eval_tinymfv_calibrated: revert adapter defaults
to full list; pass --adapters delora via CLI for this first run
- add bitsandbytes dep
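
A minimal sketch of the two main changes above, for reviewers. It is not the code in data.py: the helper names (model_slug, pairs_dir, quantization_config), the "--" slug separator, and the NF4 quantization type are assumptions made for illustration. Only the out/data/{model}/{behavior} layout, the use_4bit flag, and the bfloat16 dtype come from this commit.

```python
from pathlib import Path


def model_slug(model_id: str) -> str:
    """Turn a hub-style model id into one safe path component (assumed scheme)."""
    return model_id.replace("/", "--")


def pairs_dir(model_id: str, behavior: str, root: str = "out/data") -> Path:
    """Directory for one (model, behavior) pair: out/data/{model}/{behavior}."""
    return Path(root) / model_slug(model_id) / behavior


def quantization_config(use_4bit: bool = True):
    """Return a 4-bit BitsAndBytesConfig for inference, or None for training."""
    if not use_4bit:
        # Training path: no quantization, weights stay in bfloat16.
        return None

    # Imported lazily so the path helpers work without torch/transformers.
    import torch
    from transformers import BitsAndBytesConfig

    # NF4 is an assumption; the commit does not name a quantization type.
    return BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
```

Because the slug is part of the directory, pairs generated for two different models land in different directories and cannot collide. The value returned by quantization_config would typically be passed as the quantization_config argument of from_pretrained.
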
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Qwen3.5-4B requires linear_attention mask support, which transformers<5.6 lacks.
Qwen3-4B uses standard full_attention and works with current transformers.
flash-attn added as URL dep so uv sync keeps it in .venv.
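
The version constraint above can be sketched as a small gate. This is illustrative only and not part of the change: pick_model and parse_major_minor are hypothetical helpers, and the 5.6 threshold is taken directly from this message.

```python
import re


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.6.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def pick_model(transformers_version: str) -> str:
    """Use Qwen3.5-4B only when transformers is new enough, else Qwen3-4B."""
    if parse_major_minor(transformers_version) >= (5, 6):
        return "Qwen3.5-4B"
    return "Qwen3-4B"
```

In practice the installed version could be read with importlib.metadata.version("transformers") and passed to pick_model.
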
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>