readme, replication, req, test

2026-06-27 18:03:39 +08:00 · 2020-02-01 20:30:54 +08:00
parent 0c54987b28
commit 38de61e2a3
6 changed files with 516 additions and 3602 deletions
@@ -4,7 +4,10 @@ This project uses [Attentive Neural Process](https://arxiv.org/abs/1901.05761) (

 ![](docs/anp.png)

-This repository also includes a pytorch implementation that has been tweaked to be more flexible and stable. It may be usefull if you are looking for a ANP model in pytorch, and seems more stable than others available now (as of 2019-11-01).
+I'm using them in an unusual way, since I'm predicting ahead instead of infilling, but they perform well.
+
+I've also made lots of tweaks for flexibility and stability and [replicated the DeepMind results](anp_1d_regression.ipynb) in pytorch. This seems better than the other pytorch versions of ANP (as of 2019-11-01).
+

 ## Usage

@@ -37,7 +40,7 @@ I chose a a difficult example below, it's a window in the test set that deviates

 ![](docs/19.png)

-### Baseline
+### LSTM Baseline

 Compare this to a quick LSTM baseline below, which didn't predict this divergence from the pattern. (Bear in mind that I didn't tweak this model as much.) The uncertainty and prediction are also less smooth and the log probability is lower.

@@ -72,10 +75,11 @@ Changes for stability:

 ## See also:

-A list of projects I used as reference, is modified to make this one:
+A list of projects I used as reference or modified to make this one:

 - Original code in tensorflow from hyunjik11 (author of the original paper) : https://github.com/deepmind/neural-processes/blob/master/attentive_neural_process.ipynb
 - First pytorch implementation by soobinseo: https://github.com/soobinseo/Attentive-Neural-Process/blob/master/network.py
 - Second pytorch implementation by KurochkinAlexey (has some bugs currently): https://github.com/KurochkinAlexey/Attentive-neural-processes/blob/master/anp_1d_regression.ipynb
 - If you want to try vanilla neural processes: https://github.com/EmilienDupont/neural-processes/blob/master/example-1d.ipynb

+I'm very grateful to all these authors for sharing their work. It was a pleasure to dive deep into these models and compare the different implementations.
@@ -1,7 +1,8 @@
-torch==1.2.0
+torch>=1.3.0
 tqdm
 pandas
 numpy
-torchsummaryX
-pytorch_lightning
+torchsummaryX>=2.0
+pytorch_lightning==0.6.0
+tensorboardX
 optuna
@@ -75,6 +75,12 @@ class LatentModelPL(pl.LightningModule):

        return {"avg_val_loss": avg_loss, "log": tensorboard_logs}

+    def test_step(self, *args, **kwargs):
+        return self.validation_step(*args, **kwargs)
+
+    def test_end(self, *args, **kwargs):
+        return self.validation_end(*args, **kwargs)
+
    def configure_optimizers(self):
        optim = torch.optim.Adam(self.parameters(), lr=self.hparams["learning_rate"])
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optim, patience=2, verbose=True, min_lr=1e-5) # note: early stopping has patience 3
@@ -24,7 +24,7 @@ class NPBlockRelu2d(nn.Module):
        x = self.act(self.linear(x))

        # Now we want to apply batchnorm and dropout to the channels. So we put it in shape
-        # (Batch, Channels, Sequence, None) so we can use Dropout2d
+        # (Batch, Channels, Sequence, None) so we can use Dropout2d & BatchNorm2d
        x = x.permute(0, 2, 1)[:, :, :, None]

        if self.norm:
@@ -256,10 +256,11 @@ class LatentEncoder(nn.Module):
        mean = self._mean(mean_repr)
        log_var = self._log_var(mean_repr)

-        # Clip it in the log domain, so it can only approach self.min_std, this helps avoid mode collapase
-        # 2 ways, a better but untested way using the more stable log domain, and the way from the deepmind repo
        if self._use_lvar:
-            log_var = torch.clamp(F.logsigmoid(log_var), np.log(self._min_std))
+            # Clip in the log domain so it can only approach self.min_std; this helps avoid mode collapse.
+            # Two ways: a better but untested way using the more stable log domain, and the way from the deepmind repo.
+            log_var = F.logsigmoid(log_var)
+            log_var = torch.clamp(log_var, np.log(self._min_std), -np.log(self._min_std))
            sigma = torch.exp(0.5 * log_var)
        else:
            sigma = self._min_std + (1 - self._min_std) * torch.sigmoid(log_var * 0.5)
@@ -382,7 +383,7 @@ class Decoder(nn.Module):

        # Bound or clamp the variance
        if self._use_lvar:
-            log_sigma = torch.clamp(log_sigma, math.log(self._min_std))
+            log_sigma = torch.clamp(log_sigma, math.log(self._min_std), -math.log(1e-5))
            sigma = torch.exp(log_sigma)
        else:
            sigma = self._min_std + (1 - self._min_std) * F.softplus(log_sigma)
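The two ways of bounding the latent standard deviation in the `LatentEncoder` hunk above can be compared on plain scalars. This is a minimal sketch using only the standard library, not the repository's code: the function names `latent_sigma_log_domain` and `latent_sigma_bounded` are hypothetical, and `min_std=0.01` is an assumed value chosen for illustration. One thing it makes visible is that the clamp in the log-domain branch is applied to the log-variance, so the resulting floor on sigma works out to `sqrt(min_std)` rather than `min_std`.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log_sigmoid(x: float) -> float:
    # log(sigmoid(x)) == -softplus(-x); always <= 0.
    return -softplus(-x)


def latent_sigma_log_domain(raw: float, min_std: float) -> float:
    # Hypothetical scalar version of the `_use_lvar` branch:
    # squash into (-inf, 0], clamp the log-variance, then exponentiate.
    log_var = log_sigmoid(raw)
    lo, hi = math.log(min_std), -math.log(min_std)
    log_var = min(max(log_var, lo), hi)
    return math.exp(0.5 * log_var)


def latent_sigma_bounded(raw: float, min_std: float) -> float:
    # Hypothetical scalar version of the other branch:
    # sigma lies in (min_std, 1) by construction.
    return min_std + (1.0 - min_std) * sigmoid(raw * 0.5)


if __name__ == "__main__":
    min_std = 0.01  # assumed value, for illustration only
    for raw in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
        print(raw,
              latent_sigma_log_domain(raw, min_std),
              latent_sigma_bounded(raw, min_std))
```

In both variants sigma stays at or below 1, so the upper clamp in the log-domain branch never binds after `logsigmoid`. It acts only as a safety bound.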