gpt5.5/fable

wassname
2026-06-12 09:30:25 +08:00
parent 160bd040cc
commit b8c3ffcf11
3 changed files with 20 additions and 23 deletions
+2 -2
@@ -38,7 +38,7 @@ with torch.no_grad():
# If init loss >> expected: wrong loss fn, bad init, or data pipeline broken
```
**Overfit-one-batch test**
**Overfit-one-batch test** [Ng / torch lightning]
```python
model.train()
batch = next(iter(train_loader))
@@ -112,7 +112,7 @@ with torch.no_grad():
# If very different: model sees real signal. Problem is elsewhere.
```
**NaN poisoning (leakage tracer)** [Wassname; forward-pass dual of Karpathy's gradient check below]
**NaN poisoning (leakage tracer)** [Wassname]
```python
# Leakage can hide anywhere: normalization fit on the full dataset, target
# leaking into features, window functions peeking ahead, bad splits. Instead