Commit Graph

21 Commits

Author SHA1 Message Date
Sven Mika ce96b03b07 [RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
mvindiola1 2b893d1bb5 fix incorrect critic loss in TD3 (#10775)
Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>
2020-09-20 20:01:51 -07:00
Barak Michener 8e76796fd0 ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika fcdf410ae1 [RLlib] Tf2.x native. (#8752) 2020-07-11 22:06:35 +02:00
Sven Mika 4da0e542d5 [RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204) 2020-07-08 16:12:20 +02:00
Sven Mika b4c0b942fe [RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) (#9237) 2020-07-01 17:31:44 +02:00
Sven Mika 43043ee4d5 [RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika 4ed796a7d6 [RLlib] Add testing Policy.compute_single_action() for all agents. (#8903) 2020-06-13 17:51:50 +02:00
Sven Mika 2746fc0476 [RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Sven Mika baa053496a [RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414) 2020-05-26 11:10:27 +02:00
Eric Liang 9a83908c46 [rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Sven Mika 754290daad [RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356) 2020-05-08 16:31:31 +02:00
Sven Mika d7eaacb5fe [RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. (#8324) 2020-05-08 08:26:32 +02:00
Eric Liang b14cc16616 [rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Sven Mika b95e28faea [RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288)
APEX_DDPG (PyTorch) test case and docs.
2020-05-04 09:36:27 +02:00
Sven Mika 7ec2223c84 [RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188)
Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).
2020-04-26 23:08:13 +02:00
Sven Mika d0fab84e4d [RLlib] DDPG PyTorch version. (#7953)
The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib.
This PR:
- Depends on the re-factor PR for DDPG (Functional Algorithm API).
- Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch)
- Updates the documentation to reflect that DDPG and TD3 now support PyTorch.

* Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20% than tf).
* Fix GPU target model problem.
2020-04-16 10:20:01 +02:00
Sven Mika 1b31c11806 [RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934) 2020-04-09 14:04:21 -07:00
Sven Mika 1d4823c0ec [RLlib] Add testing framework_iterator. (#7852)
* Add testing framework_iterator.

* LINT.

* WIP.

* Fix and LINT.

* LINT fix.
2020-04-03 12:24:25 -07:00
Sven Mika 20ef4a8603 [RLlib] Cleanup/unify all test cases. (#7533) 2020-03-11 20:39:47 -07:00
Sven Mika 83e06cd30a [RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix

* WIP.

* Add TD3 quick Pendulum regresison.

* Cleanup.

* Fix.

* LINT.

* Fix.

* Sort quick_learning test cases, add TD3.

* Sort quick_learning test cases, add TD3.

* Revert test_checkpoint_restore.py (debugging) changes.

* Fix old soft_q settings in documentation and test configs.

* More doc fixes.

* Fix test case.

* Fix test case.

* Lower test load.

* WIP.
2020-03-01 11:53:35 -08:00