Mirror of https://github.com/wassname/ray.git (synced 2026-07-01 09:27:40 +08:00)
7ec2223c84
Fix DDPG PyTorch: add the missing sigmoid layer that squashes the deterministic action outputs.
Implementation of Deep Deterministic Policy Gradient (DDPG, https://arxiv.org/abs/1509.02971), including an Ape-X variant.
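
The squashing step named in the commit message can be illustrated with a minimal sketch. It is not the code from this repository: the function names `stable_sigmoid` and `squash_action`, the per-dimension `low`/`high` bounds, and the linear rescaling into those bounds are all assumptions made for illustration, and the sketch uses plain Python rather than PyTorch so that it runs without dependencies.

```python
import math
from typing import List, Sequence


def stable_sigmoid(x: float) -> float:
    """Logistic sigmoid written so that large |x| cannot overflow math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def squash_action(
    raw: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[float]:
    """Map unbounded deterministic policy outputs into [low, high].

    Hypothetical helper: each raw output goes through a sigmoid (range 0..1)
    and is then rescaled linearly to the bounds of its action dimension.
    """
    if not (len(raw) == len(low) == len(high)):
        raise ValueError("raw, low and high must have the same length")
    return [
        lo + (hi - lo) * stable_sigmoid(r)
        for r, lo, hi in zip(raw, low, high)
    ]


if __name__ == "__main__":
    # An unbounded network output for a 2-D action space bounded by [-2, 2].
    print(squash_action([0.0, 3.5], [-2.0, -2.0], [2.0, 2.0]))
```

Without a squashing step like this, a deterministic policy head can emit values outside the environment's action bounds, which is presumably the failure the fix addresses.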