Attention and Augmented Recurrent Neural Networks

We’ve seen a growing number of attempts to augment RNNs with new properties. Four directions stand out as particularly exciting: Neural Turing Machines, attentional interfaces, Adaptive Computation Time, and Neural Programmers.

Individually, these techniques are all potent extensions of RNNs, but the really striking thing is that they can be combined, and seem to be points in a broader space. Further, they all rely on the same underlying trick, something called attention, to work.
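To make the word "attention" concrete, here is a minimal sketch of one common form, soft attention: rather than picking a single memory slot, the model reads a weighted average of all of them. The function names (`softmax`, `attend`), the dot-product scoring rule, and the toy `memory` values are illustrative assumptions, not details taken from any of the four techniques above.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, memory):
    """Soft attention: return (weights, weighted average of memory rows)."""
    # Assumption: score each memory row by its dot product with the query.
    scores = [sum(q * x for q, x in zip(query, row)) for row in memory]
    # Turn scores into a distribution that sums to 1.
    weights = softmax(scores)
    # Read "a little from everywhere": blend the rows by their weights.
    dim = len(memory[0])
    read = [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(dim)]
    return weights, read

# Toy example: three memory rows, and a query aligned with the first axis.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, read = attend([2.0, 0.0], memory)
```

Because every step here is a smooth function of the query and the memory, the whole read is differentiable, which is what lets a mechanism like this be trained with gradient descent.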

Neural Turing Machines


Acknowledgments

Thank you to Maithra Raghu, Dario Amodei, Cassandra Xia, Luke Vilnis, Anna Goldie, Jesse Engel, Dan Mané, Natasha Jaques, Emma Pierson and Ian Goodfellow for their feedback and encouragement. We’re also very grateful to our team, Google Brain, for being extremely supportive of our project.

Author Contributions

Augustus and Chris recognized the connection between deconvolution and artifacts. Augustus ran the GAN experiments. Vincent ran the artistic style transfer experiments. Chris ran the DeepDream experiments, created the visualizations and wrote most of the article.