Commit Graph

144 Commits

Author SHA1 Message Date
Sven Mika 4db86404ad [RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037) 2021-02-11 18:58:46 +01:00
Sven Mika eb0038612f [RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) 2021-02-08 15:02:19 +01:00
Sven Mika d001af3e59 [RLlib] Allow rllib rollout to run distributed via evaluation workers. (#13718) 2021-02-08 12:05:16 +01:00
Sven Mika 0a0d9183fe [RLlib] Trajectory view API example script (enhancements and tf2 support). (#13786) 2021-02-02 18:42:18 +01:00
Sven Mika 52c94b7ee9 [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika 9423930bcc [RLlib] MAML: Add cartpole mass test for PyTorch. (#13679) 2021-01-25 12:32:41 +01:00
Sven Mika e74947cc94 [RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika 93c0a5549b [RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397) 2021-01-19 09:51:35 +01:00
Sven Mika d98235cc84 [RLlib] Deflake 2x remote & local inference tests (external env). (#13459) 2021-01-14 20:44:26 +01:00
Sven Mika 56878221ed [RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363) 2021-01-14 14:44:33 +01:00
Kai Fricke 25f10a947a Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
2021-01-12 12:33:57 +01:00
Sven Mika e2b2abb88b [RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339) 2021-01-11 22:42:30 +01:00
Sven Mika 9dd9f72111 [RLlib] Add more detailed Documentation on Model building API (#13261) 2021-01-09 12:38:29 +01:00
Sven Mika 6f342a2221 [RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Basu Jindal 4e569ee20b Update multi_agent_independent_learning.py (#13196)
pettingzoo.utils.error.DeprecatedEnv: waterworld_v0 is now depreciated, use waterworld_v2 instead
2021-01-05 13:44:54 -08:00
Sven Mika 9eba1871bb [RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika 391cdfae8c [RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika c524f86785 [RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika 99ae7bae05 [RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika 670d083a56 [RLlib] Fix broken unity3d_env import in example server script. (#13040) 2020-12-23 11:29:58 -05:00
Sven Mika d5604eaba3 [RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika b2bcab711d [RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika e40b14d255 [RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika 99c81c6795 [RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika 3ad9365e1d [RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika 0df55a139c [RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika 592c161032 [RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika 841d93d366 [RLlib] Issue 12233 shared tf layers example not really shared (only works for tf1.x, not tf2.x). (#12399) 2020-11-25 11:27:19 -08:00
Raoul Khouri d07ffc152b [rllib] Rrk/12079 custom filters (#12095)
* travis reformatted
2020-11-19 13:20:20 -08:00
Sven Mika dab241dcc6 [RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063) 2020-11-19 19:01:14 +01:00
Michael Luo 6e6c680f14 MBMPO Cartpole (#11832)
* MBMPO Cartpole Done

* Added doc
2020-11-12 10:30:41 -08:00
Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Benjamin Black 1999266bba Updated pettingzoo env to accommodate api changes and fixes (#11873)
* Updated pettingzoo env to accommodate api changes and fixes

* fixed test failure

* fixed linting issue

* fixed test failure
2020-11-09 16:09:49 -08:00
Pierre TASSEL 66605cfcbd [RLLib] Random Parametric Trainer (#11366) 2020-11-04 11:12:51 +01:00
desktable 5af745c90d [RLlib] Implement the SlateQ algorithm (#11450) 2020-11-03 09:52:04 +01:00
Lara Codeca e735add268 [RLlib] Integration with SUMO Simulator (#11710) 2020-11-03 09:45:03 +01:00
Sven Mika 54d85a6c2a [RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720) 2020-11-02 11:18:41 +01:00
Sven Mika 8ea1bc5ff9 [RLlib] Allow for more than 2^31 policy timesteps. (#11301) 2020-10-12 13:49:11 -07:00
Sven Mika ce96b03b07 [RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika c17169dc11 [RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Eric Liang daa03ba6e6 [rllib] Add execution module to package ref (#10941)
* add init

* add

* update
2020-09-21 23:03:06 -07:00
Sven Mika d7c42d6d92 [RLlib] Unity blogpost final fixes. (#10894) 2020-09-20 14:13:20 +02:00
Sven Mika 805dad3bc4 [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Benjamin Black f2408b719c Fixed PettingZooEnv (#10847) 2020-09-17 11:28:42 -07:00
Sven Mika 4b278c36fc [RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Michael Luo 8e613652af [RLLib] MBMPO Fixes (#10296) 2020-09-09 09:34:34 +02:00
Sven Mika 28ab797cf5 [RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Richard Liaw 551c597312 [tune] API revamp fix (#10518) 2020-09-05 15:34:53 -07:00
Sven Mika 244aafdcf8 [RLlib] Curiosity enhancements. (#10373) 2020-09-05 13:14:24 +02:00