Commit Graph

12 Commits

Author SHA1 Message Date
Sven Mika d5604eaba3 [RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika b2bcab711d [RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika e40b14d255 [RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika 99c81c6795 [RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika 19c8033df2 [RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika 3ad9365e1d [RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika 0df55a139c [RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika 95175a822f [RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. (#12407)
* WIP.

* Fix and LINT.
2020-11-25 11:26:29 -08:00
Sven Mika dab241dcc6 [RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063) 2020-11-19 19:01:14 +01:00
Sven Mika 5b788ccb13 [RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717) 2020-11-03 12:53:34 -08:00
Sven Mika d9f1874e34 [RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00