568 Commits

Author SHA1 Message Date
Sven Mika 4db86404ad [RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037) 2021-02-11 18:58:46 +01:00
Sven Mika a2f7998026 [RLlib] Issue #13342: Add validate_spaces to MB-MPO. (#14038) 2021-02-11 11:36:53 +01:00
Sven Mika 81e7434091 [RLlib] TFPolicy.export_model: Add timestep placeholder to model's signature, if needed. (#13988) 2021-02-10 15:21:46 +01:00
Sven Mika 37c7daa3c0 [RLlib] DDPG: Support simplex action space. (#14011) 2021-02-10 15:10:01 +01:00
Sven Mika d7301a51f4 [RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. (#13555) 2021-02-09 17:05:26 +01:00
Sven Mika eb0038612f [RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) 2021-02-08 15:02:19 +01:00
Chace Ashcraft ebeee1d59a [RLlib] Pytorch MAML fix for more than two workers with discrete actions (#13835) 2021-02-08 12:06:02 +01:00
Sven Mika d001af3e59 [RLlib] Allow rllib rollout to run distributed via evaluation workers. (#13718) 2021-02-08 12:05:16 +01:00
Sven Mika 9ac731558b [RLlib] Unify fcnet initializers for the value output layer (std=1.0 in torch, but 0.01 in tf). (#13733) 2021-02-02 18:42:49 +01:00
Sven Mika 0a0d9183fe [RLlib] Trajectory view API example script (enhancements and tf2 support). (#13786) 2021-02-02 18:42:18 +01:00
Stanislav Chekmenev b9c15a2551 [RLlib] Issue #13761: Fix get action shape (#13764) 2021-02-02 13:13:43 +01:00
Raoul Khouri 714c367b9d [RLlib] Trainer._validate_config idempotentcy correction (issue 13427) (#13556) 2021-02-02 13:11:57 +01:00
Sven Mika 52c94b7ee9 [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika 4bc257f4fb [RLlib] Fix custom multi action distr (#13681) 2021-01-28 19:28:48 +01:00
Yuri Rocha b01b0f80aa [RLlib] Fix multiple Unity3DEnvs trying to connect to the same custom port (#13519) 2021-01-28 13:28:08 +01:00
cathrinS d4ef5c5993 [RLlib] Atari-RAM-Preprocessing, unsigned observation vector results in a false preprocessed observation (#13013) 2021-01-28 12:07:00 +01:00
Maltimore b4702de1c2 [RLlib] move evaluation to trainer.step() such that the result is properly logged (#12708) 2021-01-25 12:56:00 +01:00
Jan Blumenkamp 964689b280 [RLlib] Fix bug in ModelCatalog when using custom action distribution (#12846)
* return tuple returned from _get_multi_action_distribution when using custom action dict
* Always return dst_class and required_model_output_shape in _get_multi_action_distribution
* pass model config to _get_multi_action_distribution
2021-01-25 12:42:39 +01:00
Sven Mika 9423930bcc [RLlib] MAML: Add cartpole mass test for PyTorch. (#13679) 2021-01-25 12:32:41 +01:00
Sven Mika d629292d63 [RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. (#13634) 2021-01-22 19:36:02 +01:00
Michael Luo 587f207c2f [RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
Saeid d11e62f9e6 [RLlib] Fix problem in preprocessing nested MultiDiscrete (#13308) 2021-01-21 16:36:11 +01:00
Sven Mika daf0bef285 [RLlib] Dreamer: Fix broken import and add compilation test case. (#13553) 2021-01-21 16:30:26 +01:00
Sven Mika 2e3655e8a9 [RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika e74947cc94 [RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika 93c0a5549b [RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397) 2021-01-19 09:51:35 +01:00
Sven Mika a65ee92b69 [RLlib] MARWIL loss function test case and cleanup. (#13455) 2021-01-19 09:51:05 +01:00
Sven Mika 1f00f834ac [RLlib] Solve PyTorch/TF-eager A3C async race condition between calling model and its value function. (#13467) 2021-01-18 10:29:03 -08:00
Sven Mika d98235cc84 [RLlib] Deflake 2x remote & local inference tests (external env). (#13459) 2021-01-14 20:44:26 +01:00
Sven Mika 56878221ed [RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363) 2021-01-14 14:44:33 +01:00
Sven Mika d49c3fae0b [RLlib] Trajectory View API: Atari framestacking. (#13315) 2021-01-13 08:53:34 +01:00
Maltimore 3a3e4aed86 [RLlib] Add __len__() method to SampleBatch (#13371) 2021-01-12 20:15:23 +01:00
Kai Fricke 25f10a947a Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
2021-01-12 12:33:57 +01:00
Sven Mika e2b2abb88b [RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339) 2021-01-11 22:42:30 +01:00
Sven Mika 5d50d37f45 [RLlib] Issue 13330: No TF installed causes crash in ModelCatalog.get_action_shape() (#13332) 2021-01-11 13:19:46 +01:00
Sven Mika 9dd9f72111 [RLlib] Add more detailed Documentation on Model building API (#13261) 2021-01-09 12:38:29 +01:00
Sven Mika 6f342a2221 [RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Sven Mika a5b39ef8e2 [RLlib] Fix missing "info_batch" arg (None) in compute_actions calls. (#13237) 2021-01-07 21:25:02 +01:00
Sven Mika bcaff63909 [RLlib] SquashedGaussians should throw error when entropy or kl are called. (#13126) 2021-01-07 15:07:35 +01:00
Basu Jindal 4e569ee20b Update multi_agent_independent_learning.py (#13196)
pettingzoo.utils.error.DeprecatedEnv: waterworld_v0 is now depreciated, use waterworld_v2 instead
2021-01-05 13:44:54 -08:00
Sven Mika 9eba1871bb [RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika 8726521604 [RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). (#13091) 2020-12-30 22:30:52 -05:00
Sven Mika 391cdfae8c [RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika 28ac4243f4 [RLlib] Deflake test case: 2-step game MADDPG. (#13121) 2020-12-30 18:37:37 -05:00
Michael Luo 42cd414e5b [RLlib] New Offline RL Algorithm: CQL (based on SAC) (#13118) 2020-12-30 10:11:57 -05:00
Michael Luo eae7a1f433 [RLLib] Readme.md Documentation for Almost All Algorithms in rllib/agents (#13035) 2020-12-29 18:45:55 -05:00
Sven Mika d811d65920 [RLlib] run_regression_tests.py: --framework flag (instead of --torch). (#13097) 2020-12-29 15:27:59 -05:00
Sven Mika c524f86785 [RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika a5318961de [RLlib] Preprocessor fixes (multi-discrete) and tests. (#13083) 2020-12-26 20:14:36 -05:00
Sven Mika 99ae7bae05 [RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00