* pong on pixels working (not cleaned up)
* make training compatible with all atari games
* cartpole runs
* Update documentation and usage for policy gradients.
* add policy gradient example
* fix typos
* Minor changes plus some documentation.
* Minor fixes.