147 Commits

Author SHA1 Message Date
Sven Mika 9ac731558b [RLlib] Unify fcnet initializers for the value output layer (std=1.0 in torch, but 0.01 in tf). (#13733) 2021-02-02 18:42:49 +01:00
Sven Mika 0a0d9183fe [RLlib] Trajectory view API example script (enhancements and tf2 support). (#13786) 2021-02-02 18:42:18 +01:00
Sven Mika 52c94b7ee9 [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika 4bc257f4fb [RLlib] Fix custom multi action distr (#13681) 2021-01-28 19:28:48 +01:00
cathrinS d4ef5c5993 [RLlib] Atari-RAM-Preprocessing, unsigned observation vector results in a false preprocessed observation (#13013) 2021-01-28 12:07:00 +01:00
Jan Blumenkamp 964689b280 [RLlib] Fix bug in ModelCatalog when using custom action distribution (#12846)
* return tuple returned from _get_multi_action_distribution when using custom action dict

* Always return dst_class and required_model_output_shape in _get_multi_action_distribution

* pass model config to _get_multi_action_distribution
2021-01-25 12:42:39 +01:00
Saeid d11e62f9e6 [RLlib] Fix problem in preprocessing nested MultiDiscrete (#13308) 2021-01-21 16:36:11 +01:00
Sven Mika 56878221ed [RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363) 2021-01-14 14:44:33 +01:00
Sven Mika d49c3fae0b [RLlib] Trajectory View API: Atari framestacking. (#13315) 2021-01-13 08:53:34 +01:00
Kai Fricke 25f10a947a Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
2021-01-12 12:33:57 +01:00
Sven Mika e2b2abb88b [RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339) 2021-01-11 22:42:30 +01:00
Sven Mika 5d50d37f45 [RLlib] Issue 13330: No TF installed causes crash in ModelCatalog.get_action_shape() (#13332) 2021-01-11 13:19:46 +01:00
Sven Mika 6f342a2221 [RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Sven Mika bcaff63909 [RLlib] SquashedGaussians should throw error when entropy or kl are called. (#13126) 2021-01-07 15:07:35 +01:00
Sven Mika 9eba1871bb [RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika 8726521604 [RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). (#13091) 2020-12-30 22:30:52 -05:00
Sven Mika 391cdfae8c [RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika c524f86785 [RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika a5318961de [RLlib] Preprocessor fixes (multi-discrete) and tests. (#13083) 2020-12-26 20:14:36 -05:00
Sven Mika 99ae7bae05 [RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika d5604eaba3 [RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika b2bcab711d [RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika 124c8318a8 [RLlib] Fix broken test_distributions.py (test_categorical) (#12915) 2020-12-17 17:44:26 -06:00
Sven Mika abb1eefdc2 [RLlib] Issue 12483: Discrete observation space error: "ValueError: ('Observation ({}) outside given space ..." when doing Trainer.compute_action. (#12787) 2020-12-11 22:43:30 +01:00
Sven Mika 99c81c6795 [RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika 3f4bc16276 [RLlib] Add a minimal JAX ModelV2 (FCNet) to RLlib. (#12502) 2020-12-03 15:51:30 +01:00
Sven Mika 19c8033df2 [RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* rmeove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika 3ad9365e1d [RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika 592c161032 [RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Michael Luo 59ccbc0fc7 [RLlib] Model Annotations: Tensorflow (#11964) 2020-11-12 12:18:50 +01:00
Michael Luo b2984d1c34 [RLlib] Model Annotations to Torch Models (#9749) 2020-11-12 12:16:12 +01:00
Sven Mika 291c172d83 [RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909) 2020-11-11 18:45:28 +01:00
Sven Mika 5b788ccb13 [RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717) 2020-11-03 12:53:34 -08:00
Jiajie Xiao 0b07af374a allow tuple action space (#11429)
Co-authored-by: Jiajie Xiao <jj@Jiajies-MBP-2.attlocal.net>
2020-10-29 16:05:38 +01:00
Sven Mika d9f1874e34 [RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Sven Mika 1ebcdf236f [RLlib] Add support for custom MultiActionDistributions. (#11311) 2020-10-12 13:50:43 -07:00
Sven Mika 0c0f67c14d [RLlib] ARS/ES eval workers not working: Issue 9933. (#11308) 2020-10-12 13:49:48 -07:00
Sven Mika d3bc20b727 [RLlib] ConvTranspose2D module (#11231) 2020-10-12 15:00:42 +02:00
Sven Mika 957877ad3f Tf version of VisionNet (ray/rllib/model/tf/vision_net.py) crashes iff len(conv-filters)=1. (#11330) 2020-10-11 12:49:47 +02:00
Sumanth Ratna 14d8826e43 Fix overriden typo (#11227) 2020-10-07 19:11:07 -07:00
Sven Mika ce96b03b07 [RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika c17169dc11 [RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
internetcoffeephone 840fb5543b Change get_action_shape so that it uses the dtype of the Discrete object, rather than overwriting it with tf.int64. (#8424) 2020-09-21 17:08:31 -07:00
Sven Mika 805dad3bc4 [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika 28ab797cf5 [RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Sven Mika 8a891b3c30 [RLlib] SAC n_step > 1. (#10567) 2020-09-05 22:26:42 +02:00
Sven Mika 244aafdcf8 [RLlib] Curiosity enhancements. (#10373) 2020-09-05 13:14:24 +02:00
Sven Mika ef18893fb5 [RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420) 2020-09-02 14:03:01 +02:00