Mirror of https://github.com/wassname/ray.git (synced 2026-07-04 18:14:55 +08:00)
[rllib] Behavior Cloning (#1400)
* Behavior Cloning
* episode_reward_mean -> mean_loss
* removing vestigial code
* punctuation
* unnecessary
* Behavior Cloning
* Behavior Cloning
* Update __init__.py
@@ -7,8 +7,8 @@ class Optimizer(object):
     """RLlib optimizers encapsulate distributed RL optimization strategies.
 
     For example, AsyncOptimizer is used for A3C, and LocalMultiGPUOptimizer is
-    used for PPO. These optimizers are all pluggable however, it is possible
-    to mix as match as needed.
+    used for PPO. These optimizers are all pluggable, and it is possible
+    to mix and match as needed.
 
     In order for an algorithm to use an RLlib optimizer, it must implement
     the Evaluator interface and pass a number of Evaluators to its Optimizer
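
The docstring in this hunk describes a design in which an algorithm implements an Evaluator interface and hands a set of Evaluators to an Optimizer. The following is a minimal sketch of that pattern in plain Python, not RLlib's actual API: every name in it (ToyEvaluator, QuadraticEvaluator, SyncToyOptimizer, run_toy, and the method names) is a hypothetical stand-in chosen for illustration, and the "policy" is just a single scalar weight.

```python
"""Toy sketch: an optimizer driving objects that share one interface.

All class, function, and method names here are hypothetical; they are
not taken from the diff and are not RLlib's real Evaluator/Optimizer API.
"""
from abc import ABC, abstractmethod


class ToyEvaluator(ABC):
    """Hypothetical stand-in for the Evaluator interface."""

    @abstractmethod
    def compute_gradient(self):
        """Return a gradient estimate at the current weight."""

    @abstractmethod
    def get_weight(self):
        """Return the current weight."""

    @abstractmethod
    def set_weight(self, weight):
        """Overwrite the current weight."""


class QuadraticEvaluator(ToyEvaluator):
    """Minimizes (w - target)^2; stands in for a real policy evaluator."""

    def __init__(self, target, weight=0.0):
        self.target = target
        self.weight = weight

    def compute_gradient(self):
        return 2.0 * (self.weight - self.target)

    def get_weight(self):
        return self.weight

    def set_weight(self, weight):
        self.weight = weight


class SyncToyOptimizer:
    """Averages gradients from all evaluators, then broadcasts the result."""

    def __init__(self, evaluators, lr=0.1):
        self.evaluators = list(evaluators)
        self.lr = lr

    def step(self):
        grads = [ev.compute_gradient() for ev in self.evaluators]
        avg_grad = sum(grads) / len(grads)
        # Evaluators are kept in sync, so any one of them holds the weight.
        new_weight = self.evaluators[0].get_weight() - self.lr * avg_grad
        for ev in self.evaluators:
            ev.set_weight(new_weight)
        return new_weight


def run_toy(num_steps=200):
    """Run the toy optimizer over two evaluators with different targets."""
    evaluators = [QuadraticEvaluator(target=t) for t in (1.0, 3.0)]
    optimizer = SyncToyOptimizer(evaluators, lr=0.1)
    weight = evaluators[0].get_weight()
    for _ in range(num_steps):
        weight = optimizer.step()
    return weight
```

Under these assumptions the optimizer only ever calls the three interface methods, so a different optimization strategy exposing the same step() method could be swapped in without changing the evaluators. That is the sense of "mix and match" this sketch is meant to convey.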