Mirror of https://github.com/wassname/pytorch-soft-actor-critic.git
synced 2026-06-27 18:06:10 +08:00
Description
Reimplementation of Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.
Contributions are welcome. If you find a mistake (very likely) or know how to make training more stable, don't hesitate to send a pull request.
Requirements
Run
Use the default hyperparameters.
For SAC (Gaussian Policy):
python main.py --algo SAC --env-name HalfCheetah-v2
For SAC (Gaussian Mixture Policy):
python main.py --algo "SAC(GMM)" --env-name HalfCheetah-v2 --k 4
TODO
- Gaussian Policy
- Reparameterization
- Gaussian Mixture Model
- Use 2 Q-functions
- Evaluate the trained Policy
- Deterministic Policy (hard target update)
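
Several of the items above (reparameterization, two Q-functions, hard target update) can be illustrated with a small standard-library sketch. This is not the repository's code: the function names, the scalar action, the `alpha=0.2` temperature, and the `1e-6` stabilizer are all assumptions chosen for illustration.

```python
import math
import random


def sample_tanh_gaussian(mean, log_std, rng=random):
    """Reparameterized sample a = tanh(mean + std * eps), with eps ~ N(0, 1).

    Returns the squashed action and its log-density.
    """
    std = math.exp(log_std)
    eps = rng.gauss(0.0, 1.0)   # noise drawn independently of the parameters
    u = mean + std * eps        # pre-squash sample, a smooth function of mean/log_std
    action = math.tanh(u)
    # Gaussian log-density of u, written in terms of eps = (u - mean) / std
    log_prob = -0.5 * eps * eps - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change of variables for tanh; 1e-6 is an assumed numerical stabilizer
    log_prob -= math.log(1.0 - action * action + 1e-6)
    return action, log_prob


def soft_value_target(q1, q2, log_prob, alpha=0.2):
    """Soft value estimate using the smaller of two Q estimates."""
    return min(q1, q2) - alpha * log_prob


def update_target(target, source, tau=None):
    """Return new target parameters.

    tau=None is a hard update (full copy of source);
    otherwise a soft (Polyak) update with mixing rate tau.
    """
    if tau is None:
        return list(source)
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]
```

In a real implementation these would operate on tensors so that gradients flow through `u` into the policy parameters; the scalar version only shows the arithmetic.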