ray/doc/source
Eric Liang 9ea57c2a93 [rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504)
Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer
Add AsyncSamplesOptimizer that implements the IMPALA architecture
Integrate V-trace with the A3C policy graph
Audit the V-trace integration
Benchmark against A3C, and with V-trace on/off
IMPALA on PongNoFrameskip-v4, scaling from 16 to 128 workers, solves Pong in under 10 minutes. For reference, solving this environment takes about 40 minutes with Ape-X and several hours with A3C.
2018-08-01 20:53:53 -07:00