9 Commits

Author SHA1 Message Date
Sven Mika b2bcab711d [RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika 0df55a139c [RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447) 2020-11-27 16:25:47 -08:00
* WIP.
* Fix.
* Fix.
* Fix.
Sven Mika 805dad3bc4 [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika e968b52cb7 [RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika c9435cad43 WIP. (#8456) 2020-05-15 21:43:27 +02:00
Fix multi-GPU histogram metrics for > 0D tensors.
Sven Mika 66df8b8c35 [RLlib] Working/learning example: PPO + torch + LSTM. (#7797) 2020-03-31 22:00:28 -07:00
Eric Liang 9a590ac6a5 [rllib] Fix custom model metrics in multi-device case (#7640) 2020-03-23 12:40:22 -07:00
* fix example
* add example test
* lin
Sven Mika d537e9f0d8 [RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) 2020-02-19 12:18:45 -08:00
Eric Liang 2fb53396ad [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00