Update README.md

Pranjal Tandon
2020-02-03 13:55:23 +05:30
committed by GitHub
parent 42d2ff08cb
commit 5189f44caa
+11 -14
@@ -6,14 +6,12 @@ Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf).
Added another branch for [Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf) -> [SAC_V](https://github.com/pranz24/pytorch-soft-actor-critic/tree/SAC_V)
### Requirements
------------
-- [mujoco-py](https://github.com/openai/mujoco-py)
-- [TensorboardX](https://github.com/lanpa/tensorboardX)
-- [PyTorch](http://pytorch.org/)
+- [mujoco-py](https://github.com/openai/mujoco-py)
+- [TensorboardX](https://github.com/lanpa/tensorboardX)
+- [PyTorch](http://pytorch.org/)
### Default Arguments and Usage
-------------
@@ -31,26 +29,25 @@ usage: main.py [-h] [--env-name ENV_NAME] [--policy POLICY] [--eval EVAL]
(Note: There is no need for setting Temperature(`--alpha`) if `--automatic_entropy_tuning` is True.)
-##### For SAC :
+##### For SAC
```
python main.py --env-name Humanoid-v2 --alpha 0.05
```
-##### For SAC (Hard Update):
+##### For SAC (Hard Update)
```
python main.py --env-name Humanoid-v2 --alpha 0.05 --tau 1 --target_update_interval 1000
```
-##### For SAC (Deterministic, Hard Update):
+##### For SAC (Deterministic, Hard Update)
```
python main.py --env-name Humanoid-v2 --policy Deterministic --tau 1 --target_update_interval 1000
```
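The `--tau 1 --target_update_interval 1000` combination in the hard-update commands can be read through the usual Polyak (soft) target update: with `tau = 1` the target parameters are overwritten by the online ones, and the interval controls how often that happens. The following is a minimal sketch under that assumption, using plain Python floats rather than PyTorch tensors. `soft_update` and `maybe_update_target` are hypothetical names for illustration and are not taken from `main.py`; the value `0.005` is only an example of a small `tau`.

```python
def soft_update(target, source, tau):
    """Polyak averaging: target <- tau * source + (1 - tau) * target."""
    return [tau * s + (1.0 - tau) * t for t, s in zip(target, source)]


def maybe_update_target(target, source, tau, step, target_update_interval):
    """Apply the update only every `target_update_interval` steps (assumed schedule)."""
    if step % target_update_interval == 0:
        return soft_update(target, source, tau)
    return target


online = [1.0, 2.0]
target = [0.0, 0.0]

# A small tau with interval 1 nudges the target toward the online values on every step.
soft = maybe_update_target(target, online, tau=0.005, step=1, target_update_interval=1)

# tau = 1 with interval 1000: an exact copy, but only on every 1000th step.
skipped = maybe_update_target(target, online, tau=1.0, step=999, target_update_interval=1000)
hard = maybe_update_target(target, online, tau=1.0, step=1000, target_update_interval=1000)
```

Under this reading, the soft and hard variants share one code path and differ only in the two flag values.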
### Arguments
```
PyTorch Soft Actor-Critic Args
@@ -86,9 +83,9 @@ optional arguments:
| Environment **(`--env-name`)**| Temperature **(`--alpha`)**|
| --------------- | ------------- |
-| HalfCheetah-v2 | 0.2 |
-| Hopper-v2 | 0.2 |
-| Walker2d-v2 | 0.2 |
-| Ant-v2 | 0.2 |
-| Humanoid-v2 | 0.05 |
+| HalfCheetah-v2 | 0.2 |
+| Hopper-v2 | 0.2 |
+| Walker2d-v2 | 0.2 |
+| Ant-v2 | 0.2 |
+| Humanoid-v2 | 0.05 |
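The temperature table can also be kept as a small lookup when launching several environments. This is only a sketch: the values are copied from the table, the `--env-name` and `--alpha` flags come from the usage text, and `ALPHA_BY_ENV` and `build_command` are hypothetical helpers that are not part of the repository.

```python
# Temperatures copied from the table in the README.
ALPHA_BY_ENV = {
    "HalfCheetah-v2": 0.2,
    "Hopper-v2": 0.2,
    "Walker2d-v2": 0.2,
    "Ant-v2": 0.2,
    "Humanoid-v2": 0.05,
}


def build_command(env_name):
    """Return the main.py argument list for an environment listed in the table."""
    alpha = ALPHA_BY_ENV[env_name]  # raises KeyError for environments the table does not cover
    return ["python", "main.py", "--env-name", env_name, "--alpha", str(alpha)]


print(" ".join(build_command("Humanoid-v2")))
```

The returned list can be passed directly to `subprocess.run` if the runs are launched from a script.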