Commit Graph

20 Commits

Author SHA1 Message Date
Sven Mika 19c8033df2 [RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* rmeove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika 244aafdcf8 [RLlib] Curiosity enhancements. (#10373) 2020-09-05 13:14:24 +02:00
Sven Mika 97d524c075 [RLlib] Issue 8769 broken OOM tests_dir cases (R & S). (#8770) 2020-06-05 08:34:21 +02:00
Sven Mika 2746fc0476 [RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Sven Mika c7a2e3f309 [RLlib] Removed config["sample_async"] restriction for A3C-torch. (#8617) 2020-05-27 10:22:49 +02:00
Eric Liang 9a83908c46 [rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Eric Liang b14cc16616 [rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Eric Liang 31b40b00f6 [rllib] Pull out experimental dsl into rllib.execution module, add initial unit tests (#7958) 2020-04-10 00:56:08 -07:00
Eric Liang dd70720578 [rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest comptaible

* fix test

* fi sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang c3a8ba399f [rllib] Enable distributed exec api for A2C, A3C, PG by default (#7580) 2020-03-13 18:48:41 -07:00
Eric Liang f5d12a958b [rllib] Port Ape-X to distributed execution API (#7497) 2020-03-12 00:54:08 -07:00
Eric Liang 0f88444686 [rllib] Support multi-agent training in pipeline impls, add easy flag to enable (#7338) 2020-03-02 15:16:37 -08:00
Sven Mika d537e9f0d8 [RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) 2020-02-19 12:18:45 -08:00
roireshef 3c60caa448 [rllib] implemented compute_advantages without gae (#6941) 2020-01-31 22:25:45 -08:00
Sven Mika e6227082bd [RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Sven f1b56fa5ee PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Eric Liang 5d7afe8092 [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00