mirror of https://github.com/wassname/pytorch-soft-actor-critic.git (synced 2026-06-27 18:06:10 +08:00)
Update README.md
@@ -34,13 +34,10 @@ python main.py --env-name Humanoid-v2 --scale_R 20 --tau 1 --value_update 1000
python main.py --env-name Humanoid-v2 --scale_R 20 --deterministic True --tau 1 --value_update 1000
```

### TODO
### Results
------------
My results on Humanoid-v2 environment using SAC and SAC(deterministic, hard update).
This is a plot of average rewards at every 10000 steps interval

- [x] Gaussian Policy
- [x] Reparameterization
- [x] Gaussian Mixture Model
- [x] Use 2 Q-functions
- [x] Evaluate the trained Policy
- [ ] Deterministic Policy (hard target update)