wassname 10444385aa readme
2025-08-20 14:21:57 +08:00

An experiment to see how a model's rating changes along its chain of thought.

It turns out the rating is quite unstable and depends on where the chain of thought goes, at least in models with around 8B parameters.
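
One way to picture the measurement is to ask for a rating after each successive prefix of the chain of thought and see how far the numbers move. The sketch below is only an illustration under assumptions, not the notebook's actual code: `rating_trajectory`, `instability`, and `toy_rate` are hypothetical names, and `toy_rate` is a keyword-counting stand-in for a real model call.

```python
from statistics import pstdev
from typing import Callable, Dict, List


def rating_trajectory(
    question: str, cot_steps: List[str], rate: Callable[[str], float]
) -> List[float]:
    """Return the rating after 0, 1, ..., n steps of the chain of thought."""
    ratings = []
    for k in range(len(cot_steps) + 1):
        # Prompt = the question plus the first k reasoning steps.
        prefix = question + "\n" + "\n".join(cot_steps[:k])
        ratings.append(rate(prefix))
    return ratings


def instability(ratings: List[float]) -> Dict[str, float]:
    """Summarise how much the rating moves along the chain."""
    return {
        "std": pstdev(ratings),
        "max_swing": max(ratings) - min(ratings),
    }


def toy_rate(prompt: str) -> float:
    # Hypothetical stand-in for a model: start at 5,
    # add 1 per "good" and subtract 1 per "bad" in the prompt.
    return 5.0 + prompt.count("good") - prompt.count("bad")


steps = ["The plan seems good.", "But the cost is bad.", "And the risk is bad."]
traj = rating_trajectory("Rate this proposal from 1 to 10.", steps, toy_rate)
print(traj)
print(instability(traj))
```

To try this against a real model, `toy_rate` would be replaced by a function that sends the prompt to the model and parses the rating out of its reply.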

