[rllib] Behavior Cloning (#1400)

* Behavior Cloning

* episode_reward_mean -> mean_loss

* removing vestigial code

* punctuation

* unnecessary

* Behavior Cloning

* Behavior Cloning

* Update __init__.py
Roy Fox
2018-01-23 10:50:45 -08:00
committed by Eric Liang
parent ee36effd8e
commit 4b0ef5eb2c
11 changed files with 390 additions and 83 deletions
+2 -2
@@ -7,8 +7,8 @@ class Optimizer(object):
"""RLlib optimizers encapsulate distributed RL optimization strategies.
For example, AsyncOptimizer is used for A3C, and LocalMultiGPUOptimizer is
- used for PPO. These optimizers are all pluggable however, it is possible
- to mix as match as needed.
+ used for PPO. These optimizers are all pluggable, and it is possible
+ to mix and match as needed.
In order for an algorithm to use an RLlib optimizer, it must implement
the Evaluator interface and pass a number of Evaluators to its Optimizer