mirror of
https://github.com/wassname/attentive-neural-processes.git
synced 2026-06-27 18:03:39 +08:00
readme, replication, req, test
File diff suppressed because one or more lines are too long
+493
-2979
File diff suppressed because one or more lines are too long
@@ -4,7 +4,10 @@ This project uses [Attentive Neural Process](https://arxiv.org/abs/1901.05761) (

This repository also includes a pytorch implementation that has been tweaked to be more flexible and stable. It may be useful if you are looking for an ANP model in pytorch, and seems more stable than others available now (as of 2019-11-01).
I'm using them in a weird way since I'm predicting ahead instead of infilling, however they perform well.

I've also made lots of tweaks for flexibility and stability and [replicated the deepmind results](anp_1d_regression.ipynb) in pytorch. This seems better than the other pytorch versions of ANP (as of 2019-11-01).

## Usage

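The README text above contrasts "predicting ahead" with "infilling". A minimal sketch of the difference in how a window is split into context and target points; the helper names and the plain-list data are hypothetical and are not taken from this repository's data loader:

```python
import random


def split_predict_ahead(window, num_context):
    """Context is the start of the window; targets are the steps that follow it."""
    return window[:num_context], window[num_context:]


def split_infill(window, num_context, seed=0):
    """Context is a random subset of the window; targets are the gaps in between."""
    rng = random.Random(seed)
    context_idx = sorted(rng.sample(range(len(window)), num_context))
    chosen = set(context_idx)
    context = [window[i] for i in context_idx]
    target = [window[i] for i in range(len(window)) if i not in chosen]
    return context, target


window = list(range(10))  # stand-in for 10 consecutive time steps
ahead_context, ahead_target = split_predict_ahead(window, 6)
infill_context, infill_target = split_infill(window, 6)
print(ahead_context, ahead_target)
print(infill_context, infill_target)
```

In the predict-ahead split every target lies after the last context point, so the model extrapolates; in the infill split the targets are interleaved with the context.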
@@ -37,7 +40,7 @@ I chose a difficult example below, it's a window in the test set that deviates

### Baseline
### LSTM Baseline

Compare this to a quick LSTM baseline below, which didn't predict this divergence from the pattern. (Bear in mind that I didn't tweak this model as much.) The uncertainty and prediction are also less smooth and the log probability is lower.

@@ -72,10 +75,11 @@ Changes for stability:

## See also:

A list of projects I used as reference, is modified to make this one:
A list of projects I used as reference or modified to make this one:

- Original code in tensorflow from hyunjik11 (author of the original paper): https://github.com/deepmind/neural-processes/blob/master/attentive_neural_process.ipynb
- First pytorch implementation by soobinseo: https://github.com/soobinseo/Attentive-Neural-Process/blob/master/network.py
- Second pytorch implementation by KurochkinAlexey (has some bugs currently): https://github.com/KurochkinAlexey/Attentive-neural-processes/blob/master/anp_1d_regression.ipynb
- If you want to try vanilla neural processes: https://github.com/EmilienDupont/neural-processes/blob/master/example-1d.ipynb

I'm very grateful to all these authors for sharing their work. It was a pleasure to dive deep into these models and compare the different implementations.

+4
-3
@@ -1,7 +1,8 @@
torch==1.2.0
torch>=1.3.0
tqdm
pandas
numpy
torchsummaryX
pytorch_lightning
torchsummaryX>=2.0
pytorch_lightning==0.6.0
tensorboardX
optuna

@@ -75,6 +75,12 @@ class LatentModelPL(pl.LightningModule):

        return {"avg_val_loss": avg_loss, "log": tensorboard_logs}

    def test_step(self, *args, **kwargs):
        return self.validation_step(*args, **kwargs)

    def test_end(self, *args, **kwargs):
        return self.validation_end(*args, **kwargs)

    def configure_optimizers(self):
        optim = torch.optim.Adam(self.parameters(), lr=self.hparams["learning_rate"])
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optim, patience=2, verbose=True, min_lr=1e-5)  # note early stopping has patience 3

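The `test_step` and `test_end` methods in the hunk above simply forward to their validation counterparts. A minimal sketch of that delegation pattern on a plain class, so it runs without `pytorch_lightning`; the class name and the toy loss are assumptions for illustration only:

```python
class EvalLoopSketch:
    """Stand-in for a LightningModule; only the delegation pattern is shown."""

    def validation_step(self, batch, batch_idx):
        # Hypothetical loss: mean squared value of the batch.
        loss = sum(v * v for v in batch) / len(batch)
        return {"val_loss": loss}

    def validation_end(self, outputs):
        avg_loss = sum(o["val_loss"] for o in outputs) / len(outputs)
        return {"avg_val_loss": avg_loss, "log": {"val_loss": avg_loss}}

    def test_step(self, *args, **kwargs):
        # Reuse the validation logic for the test loop.
        return self.validation_step(*args, **kwargs)

    def test_end(self, *args, **kwargs):
        return self.validation_end(*args, **kwargs)


model = EvalLoopSketch()
batches = [[1.0, 2.0], [3.0]]
outputs = [model.test_step(batch, i) for i, batch in enumerate(batches)]
summary = model.test_end(outputs)
print(summary)
```

One consequence of delegating this way is that test metrics are reported under the validation key names (`val_loss`, `avg_val_loss`).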
@@ -24,7 +24,7 @@ class NPBlockRelu2d(nn.Module):
        x = self.act(self.linear(x))

        # Now we want to apply batchnorm and dropout to the channels. So we put it in shape
        # (Batch, Channels, Sequence, None) so we can use Dropout2d
        # (Batch, Channels, Sequence, None) so we can use Dropout2d & BatchNorm2d
        x = x.permute(0, 2, 1)[:, :, :, None]

        if self.norm:
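The comment in the hunk above describes reshaping to `(Batch, Channels, Sequence, 1)` so that the 2d layers act per channel. A minimal sketch of such a block, assuming a `(batch, sequence, features)` input; the class name, constructor arguments and the reshape back at the end are assumptions and may differ from the repository's `NPBlockRelu2d`:

```python
import torch
from torch import nn


class BlockSketch(nn.Module):
    """Linear + ReLU, then channel-wise BatchNorm2d and Dropout2d (illustrative only)."""

    def __init__(self, in_channels, out_channels, dropout=0.0, norm=True):
        super().__init__()
        self.linear = nn.Linear(in_channels, out_channels)
        self.act = nn.ReLU()
        self.norm = nn.BatchNorm2d(out_channels) if norm else None
        self.dropout = nn.Dropout2d(dropout)

    def forward(self, x):
        # x: (batch, sequence, in_channels)
        x = self.act(self.linear(x))
        # Move channels to dim 1 and add a dummy width: (batch, channels, sequence, 1)
        x = x.permute(0, 2, 1)[:, :, :, None]
        if self.norm is not None:
            x = self.norm(x)
        x = self.dropout(x)
        # Drop the dummy width and restore (batch, sequence, channels)
        return x[:, :, :, 0].permute(0, 2, 1)


block = BlockSketch(3, 8, dropout=0.1).eval()
out = block(torch.randn(4, 10, 3))
print(out.shape)
```

With this layout `Dropout2d` zeroes whole channels across the sequence rather than individual elements, and `BatchNorm2d` keeps one set of statistics per channel.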
@@ -256,10 +256,11 @@ class LatentEncoder(nn.Module):
        mean = self._mean(mean_repr)
        log_var = self._log_var(mean_repr)

        # Clip it in the log domain, so it can only approach self.min_std, this helps avoid mode collapse
        # 2 ways, a better but untested way using the more stable log domain, and the way from the deepmind repo
        if self._use_lvar:
            log_var = torch.clamp(F.logsigmoid(log_var), np.log(self._min_std))
            # Clip it in the log domain, so it can only approach self.min_std, this helps avoid mode collapse
            # 2 ways, a better but untested way using the more stable log domain, and the way from the deepmind repo
            log_var = F.logsigmoid(log_var)
            log_var = torch.clamp(log_var, np.log(self._min_std), -np.log(self._min_std))
            sigma = torch.exp(0.5 * log_var)
        else:
            sigma = self._min_std + (1 - self._min_std) * torch.sigmoid(log_var * 0.5)
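The two branches in the `LatentEncoder` hunk above bound the latent standard deviation in different ways. A minimal scalar sketch of both, using only Python's `math` module instead of tensors; the function names and the default `min_std=0.01` are assumptions for illustration, not values taken from the repository:

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)); the result is never positive."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigma_log_domain(raw: float, min_std: float = 0.01) -> float:
    """Scalar version of the `_use_lvar` branch (hypothetical helper name)."""
    log_var = log_sigmoid(raw)
    # Clamp to [log(min_std), -log(min_std)]; the upper bound never binds
    # because log_sigmoid is never positive.
    log_var = min(max(log_var, math.log(min_std)), -math.log(min_std))
    return math.exp(0.5 * log_var)


def sigma_bounded_sigmoid(raw: float, min_std: float = 0.01) -> float:
    """Scalar version of the `else` branch, following the formula in the hunk."""
    return min_std + (1.0 - min_std) * sigmoid(0.5 * raw)


for raw in (-1000.0, 0.0, 1000.0):
    print(raw, sigma_log_domain(raw), sigma_bounded_sigmoid(raw))
```

Under this reading, the log-domain branch floors sigma at `sqrt(min_std)`, because the clamp is applied to the log variance and sigma is `exp(0.5 * log_var)`. The sigmoid branch floors sigma at `min_std` itself. Both stay at or below 1.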
@@ -382,7 +383,7 @@ class Decoder(nn.Module):

        # Bound or clamp the variance
        if self._use_lvar:
            log_sigma = torch.clamp(log_sigma, math.log(self._min_std))
            log_sigma = torch.clamp(log_sigma, math.log(self._min_std), -math.log(1e-5))
            sigma = torch.exp(log_sigma)
        else:
            sigma = self._min_std + (1 - self._min_std) * F.softplus(log_sigma)

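The `Decoder` hunk above bounds the predicted standard deviation either by clamping `log_sigma` on both sides or by a shifted softplus. A minimal scalar sketch of the two formulas, again with plain `math` rather than tensors; the helper names and the default `min_std=0.01` are assumptions:

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def decoder_sigma_clamped(log_sigma: float, min_std: float = 0.01) -> float:
    """Scalar version of the `_use_lvar` branch with the two-sided clamp."""
    lo, hi = math.log(min_std), -math.log(1e-5)
    return math.exp(min(max(log_sigma, lo), hi))


def decoder_sigma_softplus(log_sigma: float, min_std: float = 0.01) -> float:
    """Scalar version of the `else` branch: floor at min_std, no upper bound."""
    return min_std + (1.0 - min_std) * softplus(log_sigma)


for raw in (-100.0, 0.0, 100.0):
    print(raw, decoder_sigma_clamped(raw), decoder_sigma_softplus(raw))
```

With these formulas the clamped branch keeps sigma between `min_std` and `1e5`, so the added upper bound only guards against overflow in `exp`. The softplus branch keeps sigma above `min_std` and grows roughly linearly for large inputs.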