84 Commits

Author SHA1 Message Date
Sven Mika c524f86785 [RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika 407a3523f3 [RLlib] eval_workers after restore not generated in Trainer due to unintuitive config handling. (#12844) 2020-12-20 09:37:31 -05:00
Sven Mika e40b14d255 [RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika 19c8033df2 [RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika bb03e2499b [RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209) 2020-11-30 12:41:24 +01:00
Pierre TASSEL 60a545ab57 [RLLib] Fix HyperOptSearch tuple to list conversion (#12462)
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2020-11-28 10:07:54 -08:00
Sven Mika 592c161032 [RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Eric Liang e8c77e2847 Remove memory quota enforcement from actors (#11480)
* wip

* fix

* deprecate
2020-10-21 14:29:03 -07:00
Sven Mika 414041c6dd [RLlib] Do not create env on driver iff num_workers > 0. (#11307) 2020-10-15 18:21:30 +02:00
Sven Mika 0c0f67c14d [RLlib] ARS/ES eval workers not working: Issue 9933. (#11308) 2020-10-12 13:49:48 -07:00
Sumanth Ratna 14d8826e43 Fix overriden typo (#11227) 2020-10-07 19:11:07 -07:00
Sven Mika ce96b03b07 [RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika 805dad3bc4 [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Eric Liang f83c588f08 [rllib] Remove broken no eager on workers mode (#10745)
* remove no eager

* Update trainer.py
2020-09-15 17:25:20 -07:00
Sven Mika 4b278c36fc [RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Sven Mika 28ab797cf5 [RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
krfricke c31876002d [tune/rllib] made wandb compatible with rllib trainables (#10252) 2020-08-21 17:25:52 -07:00
Sven Mika e968b52cb7 [RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika d14b501692 [RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115) 2020-08-20 17:05:57 +02:00
Sven Mika 2cbe29a7fa [RLlib] Curiosity minor fixes, do-overs, and testing. (#10143) 2020-08-19 17:49:50 +02:00
Tomasz Wrona aff7f19360 [tune] Added logger_config field (#8521)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 11:10:22 -07:00
Sven Mika 2256047876 [RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
Sven Mika b0b0463161 [RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678) 2020-07-29 21:15:09 +02:00
Eric Liang 590943a499 [rllib] Type annotations for model classes (#9646) 2020-07-24 12:01:46 -07:00
Sven Mika 03ab86567f [RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). (#9269) 2020-07-14 04:27:49 +02:00
Sven Mika fcdf410ae1 [RLlib] Tf2.x native. (#8752) 2020-07-11 22:06:35 +02:00
Hao Chen d49dadf891 Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
Eric Liang 4b62a888cc [rllib] Remove deprecated policy optimizer package. (#9262) 2020-07-02 14:39:40 -07:00
Richard Liaw d35f0e40d0 [tune] Use public methods for trainable (#9184) 2020-07-01 11:00:00 -07:00
Sven Mika 43043ee4d5 [RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika 4fd8977eaf [RLlib] Minor cleanup in preparation to tf2.x support. (#9130)
* WIP.

* Fixes.

* LINT.

* Fixes.

* Fixes and LINT.

* WIP.
2020-06-25 19:01:32 +02:00
Eric Liang 1e0e1a45e6 [rllib] Add type annotations for evaluation/, env/ packages (#9003) 2020-06-19 13:09:05 -07:00
Ian Rodney 2e972c2a77 RLLIB and pylintrc (#8995) 2020-06-17 18:14:25 +02:00
Ian Rodney 265ddfc2e4 blacklist to remove (#8994) 2020-06-17 18:02:28 +02:00
Joseph Suarez c6ee3cdff4 Refactor #8792 to integrate latest master (#8956) 2020-06-17 10:55:52 +02:00
Eric Liang 34bae27ac7 [rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
Sven Mika a90cd0fcbb [RLlib] Unity3d soccer benchmarks (#8834) 2020-06-11 14:29:57 +02:00
Eric Liang 831b2fe51d [rllib] Set framework to tf by default and remove import checks; "Auto" option (#8748)
* tf by default

* Update rllib/agents/trainer.py

Co-authored-by: Sven Mika <sven@anyscale.io>

* remove it

* fix

* remove

* fix

* lint

Co-authored-by: Sven Mika <sven@anyscale.io>
2020-06-08 23:04:50 -07:00
Eric Liang 1e4a1360fd [rllib] Add type annotations to Trainer class (#8642)
* type trainer

* type it

* fix
2020-06-03 12:47:35 -07:00
Sven Mika b37a162076 [RLlib] Make envs specifiable in configs by their class path. (#8750) 2020-06-03 08:14:29 +02:00
Sven Mika 2746fc0476 [RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Eric Liang 9a83908c46 [rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
mehrdadn ebf060d484 Make more tests run on Windows (#8446)
* Remove worker Wait() call due to SIGCHLD being ignored

* Port _pid_alive to Windows

* Show PID as well as TID in glog

* Update TensorFlow version for Python 3.8 on Windows

* Handle missing Pillow on Windows

* Work around dm-tree PermissionError on Windows

* Fix some lint errors on Windows with Python 3.8

* Simplify torch requirements

* Quiet git clean

* Handle finalizer issues

* Exit with the signal number

* Get rid of wget

* Fix some Windows compatibility issues with tests

Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Eric Liang 9d012626e5 [rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Eric Liang b14cc16616 [rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Eric Liang f48da50e1c [rllib] observation function api for multi-agent (#8236) 2020-05-04 22:13:49 -07:00
Eric Liang baadbdf8d4 [rllib] Execute PPO using training workflow (#8206)
* wip

* add kl

* kl

* works now

* doc update

* reorg

* add ddppo

* add stats

* fix fetch

* comment

* fix learner stat regression

* test fixes

* fix test
2020-04-30 01:18:09 -07:00
Eric Liang 2298f6fb40 [rllib] Port DQN/Ape-X to training workflow api (#8077) 2020-04-23 12:39:19 -07:00