folklore: promote Spinning Up to main; add a Research-taste section

- Promote the general (non-RL-specific) Spinning Up lessons up to the main folklore: "broken code fails silently", "you can't tell it's broken if you can't see that it's breaking", and test on more than one setup.
- Add gwern's "Unseeing" to the data theme: you can't read what you actually wrote, hence fresh eyes / a fresh-eyes subagent.
- New "Research taste (adjacent to debugging)" section with verbatim quotes, each cached: Neel Nanda (your research is false by default; excitement is evidence of bullshit; read your data), Ulisse Mini (understand the system to shrink the search space), John Wentworth (gears-level models are capital investments vs cheap black boxes).

All quotes verbatim from cached sources; 25/25 footnotes resolve.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-27 01:00:14 +08:00 · 2026-06-02 21:08:49 +08:00
parent a602ea5a0e
commit 8509ec3c30
5 changed files with 93 additions and 0 deletions
@@ -0,0 +1,13 @@
+# Unseeing — Gwern Branwen
+
+Source: https://gwern.net/unseeing . Verbatim excerpts cached for the skill.
+
+---
+
+From "Learning To Unsee" (on why you can't see your own work/data clearly):
+
+> For example, you can't find typos in your own writing without a great deal of effort because you know what it's *supposed* to say; so copyediting advice runs like 'read it out loud' or 'print it out and read it' or 'wait a week' or recite until gibberish or even 'read it upside down' (easier than it sounds). That's the sort of thing it takes to force you to read what you actually wrote, and not what you thought you wrote. Similar tricks are used for learning drawing: a face is too familiar, so instead you can flip it in a mirror and try to copy it.
+
+From the "Confirmation Bias" section (on anomalies):
+
+> Even a single 'anomaly', apparently trivial in itself, can indicate the everyday mental model is not just a little bit wrong, but *fundamentally* wrong
@@ -0,0 +1,15 @@
+# How to Become a Mechanistic Interpretability Researcher — Neel Nanda
+
+Source: https://www.alignmentforum.org/posts/jP9KDyMkchuv6tHwm/how-to-become-a-mechanistic-interpretability-researcher (also on LessWrong, same post id). Verbatim excerpts cached for the research-taste section.
+
+---
+
+> **Skepticism/Truth-seeking:** The default state of the world is that your research is false, because doing research is hard. Your north star should always be to find *true* insights
+
+> **Excitement is evidence of bullshit**: Generally, most true results are not exciting, but a fair amount of false results are. So from a Bayesian perspective, if a result is exciting and cool, it's even more likely to be false than normal!
+
+> **Read your data**: A fantastic use of time, especially during the exploration phase, is just actually reading the data you're working with, or model chains of thought and responses. [...] Often, the quality of the data is a crucial driver of the results of your experiments. Often, it is quite bad.
+
+> A useful exercise is imagining you're talking to a really obnoxious skeptic who keeps complaining that they don't believe you and coming up with arguments for why your thing is wrong. What could you do such that they don't have a leg to stand on?
+
+> **Do ablations on your fancy method**: It's easy for people to have a fancy method with lots of moving parts, when many actually are unnecessary. You should always try removing one part and see if the method breaks. Do this for each part.
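
Reviewer note on the "Do ablations" excerpt above (not part of the cached file): the "remove one part, for each part" procedure can be sketched as a leave-one-out loop. This is only an illustrative sketch; `run_ablations`, `toy_evaluate`, and the component names are hypothetical and do not come from the quoted post.

```python
def run_ablations(components, evaluate):
    """Leave-one-out ablation: drop each component in turn and record the score drop."""
    baseline = evaluate(components)
    drops = {}
    for name in sorted(components):
        without = components - {name}          # same method, minus one part
        drops[name] = baseline - evaluate(without)
    return baseline, drops


def toy_evaluate(enabled):
    """Hypothetical stand-in for a real training/eval run."""
    score = 0.5
    if "normalization" in enabled:
        score += 0.25
    if "augmentation" in enabled:
        score += 0.125
    # "fancy_scheduler" is deliberately a no-op in this toy setup
    return score


parts = frozenset({"normalization", "augmentation", "fancy_scheduler"})
baseline, drops = run_ablations(parts, toy_evaluate)
unnecessary = [name for name, drop in drops.items() if drop == 0]
print(baseline, drops, unnecessary)
```

In a real experiment the evaluation is noisy, so a drop would need to be compared against run-to-run variance rather than tested for exact zero as in this toy.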
@@ -0,0 +1,11 @@
+# How to get good at programming — Ulisse Mini
+
+Source: https://www.lesswrong.com/posts/LTypqBMTSmRrrhb2v/how-to-get-good-at-programming . Verbatim excerpts cached for the skill.
+
+---
+
+> When good programmers debug hard problems fast, it's usually because they understand the system well enough to *track the important internal state* in their head, letting them drastically *reduce the solution space they're searching over.*
+
+> you must **notice** when you're going into brute-force search mode, and then **take action** by investing time in understanding the underlying system, until both the problem and solution make sense.
+
+> It is higher value to white-box *leaky abstractions*. Autograd for ML is a great example of a leaky abstraction, if you mix up `permute` and `view` your gradients can be subtly wrong.
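
Reviewer note on the `permute`/`view` excerpt above (not part of the cached file): the confusion is easy to fall into because both operations can produce the same shape while arranging the elements differently. The sketch below shows this with plain Python lists as a stand-in for tensors; `reshape_row_major` and `transpose` are hypothetical helpers that mimic a `view`-style reshape and a `permute(1, 0)`-style axis swap, not PyTorch calls.

```python
def reshape_row_major(flat, rows, cols):
    """Reinterpret a flat row-major buffer as rows x cols (a `view`-style reshape)."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def transpose(matrix):
    """Swap the two axes (a `permute(1, 0)`-style operation)."""
    return [list(col) for col in zip(*matrix)]


x = [[1, 2, 3],
     [4, 5, 6]]
flat = [v for row in x for v in row]       # row-major storage of x

viewed = reshape_row_major(flat, 3, 2)     # [[1, 2], [3, 4], [5, 6]]
permuted = transpose(x)                    # [[1, 4], [2, 5], [3, 6]]

same_shape = (len(viewed), len(viewed[0])) == (len(permuted), len(permuted[0]))
same_values = viewed == permuted
print(same_shape, same_values)  # True False
```

Because the shapes match, downstream code runs without error either way, which is why a shape check alone does not catch the mistake.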
@@ -0,0 +1,11 @@
+# Gears-Level Models are Capital Investments — John Wentworth
+
+Source: https://www.lesswrong.com/posts/nEBbw2Bc2CnN2RMxy/gears-level-models-are-capital-investments . Verbatim excerpts cached for the skill.
+
+---
+
+> This is a general feature of gears-level models: figuring out a system's gears takes extra work up-front, but yields dividends forever. The alternative, typically, is a black-box strategy: use a method which works without needing to understand the internals of the system. The black-box approach is cheaper for one-off tasks, but usually doesn't yield any insights which will generalize to new tasks using the same system - it's context-dependent.
+
+On the "valley of bad theory" experiment (optimizing without understanding):
+
+> Given the opportunity to test things out, subjects would often iterate their way to optimal settings - but they didn't iterate their way to correct theories. [...] This is black-box optimization: optimization was achieved, but insight into the system was not.