mirror of
https://github.com/wassname/pytorch-soft-actor-critic.git
synced 2026-06-27 18:06:10 +08:00
Update README.md
@@ -6,14 +6,12 @@ Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf).
Added another branch for [Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf) -> [SAC_V](https://github.com/pranz24/pytorch-soft-actor-critic/tree/SAC_V)

### Requirements
------------

- [mujoco-py](https://github.com/openai/mujoco-py)
- [TensorboardX](https://github.com/lanpa/tensorboardX)
- [PyTorch](http://pytorch.org/)

### Default Arguments and Usage
-------------
@@ -31,26 +29,25 @@ usage: main.py [-h] [--env-name ENV_NAME] [--policy POLICY] [--eval EVAL]
(Note: There is no need to set the temperature (`--alpha`) when `--automatic_entropy_tuning` is True.)
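
To illustrate why a fixed `--alpha` becomes redundant, here is a minimal sketch of automatic temperature tuning in plain Python rather than PyTorch. The function name `tune_log_alpha`, the loss form, the learning rate, and the choice of target entropy as minus the action dimension are assumptions made for illustration; the repository's actual implementation may differ.

```python
import math


def tune_log_alpha(log_alpha, log_probs, target_entropy, lr=3e-4):
    """Take one gradient-descent step on an assumed temperature loss.

    Assumed loss: L(log_alpha) = -log_alpha * mean(log_pi + target_entropy),
    so dL/dlog_alpha = -mean(log_pi + target_entropy).
    """
    grad = -sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    return log_alpha - lr * grad


# Hypothetical numbers: a 17-dimensional action space and two sampled
# log-probabilities of actions drawn from the current policy.
action_dim = 17
target_entropy = -float(action_dim)
log_probs = [20.0, 22.0]  # policy entropy estimate is -21, below the target

log_alpha = 0.0
log_alpha = tune_log_alpha(log_alpha, log_probs, target_entropy, lr=0.1)
alpha = math.exp(log_alpha)  # temperature used in the actor and critic losses
print(log_alpha, alpha)
```

Under these assumptions the temperature rises when the policy's entropy falls below the target and drops when it exceeds it, which is why no hand-picked `--alpha` is required.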

##### For SAC

```
python main.py --env-name Humanoid-v2 --alpha 0.05
```

##### For SAC (Hard Update)

```
python main.py --env-name Humanoid-v2 --alpha 0.05 --tau 1 --target_update_interval 1000
```
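
The effect of `--tau 1 --target_update_interval 1000` can be sketched as follows. This is a simplified illustration over plain lists of floats, not the repository's code: the helper names `soft_update` and `maybe_update_target` are hypothetical, and a real implementation would iterate over network parameters instead.

```python
def soft_update(target, source, tau):
    """Polyak-average source into target: t <- (1 - tau) * t + tau * s."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]


def maybe_update_target(target, source, step, tau, target_update_interval):
    """Apply the update only on steps that are a multiple of the interval."""
    if step % target_update_interval == 0:
        return soft_update(target, source, tau)
    return target


critic = [1.0, 2.0]  # stand-in for the online critic's parameters
target = [0.0, 0.0]  # stand-in for the target critic's parameters

# With tau = 1 the averaging collapses to a plain copy, so the target
# stays frozen for 999 steps and is overwritten on step 1000.
frozen = maybe_update_target(target, critic, step=999, tau=1.0,
                             target_update_interval=1000)
copied = maybe_update_target(target, critic, step=1000, tau=1.0,
                             target_update_interval=1000)
print(frozen, copied)
```

A value of `tau` below 1 combined with a short interval gives the soft-update behavior instead, where the target drifts toward the critic a little on each update.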

##### For SAC (Deterministic, Hard Update)

```
python main.py --env-name Humanoid-v2 --policy Deterministic --tau 1 --target_update_interval 1000
```

### Arguments

```
PyTorch Soft Actor-Critic Args
@@ -86,9 +83,9 @@ optional arguments:

| Environment **(`--env-name`)** | Temperature **(`--alpha`)** |
| --------------- | ------------- |
| HalfCheetah-v2 | 0.2 |
| Hopper-v2 | 0.2 |
| Walker2d-v2 | 0.2 |
| Ant-v2 | 0.2 |
| Humanoid-v2 | 0.05 |
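
As a convenience, the table above can be turned into a small lookup that assembles the matching command line. This is only a sketch: the names `ALPHA_BY_ENV` and `build_command` are hypothetical and not part of the repository; the values are copied from the table, and the flags are the ones used in the usage examples.

```python
# Temperature values copied from the table above.
ALPHA_BY_ENV = {
    "HalfCheetah-v2": 0.2,
    "Hopper-v2": 0.2,
    "Walker2d-v2": 0.2,
    "Ant-v2": 0.2,
    "Humanoid-v2": 0.05,
}


def build_command(env_name):
    """Return the argv list for running main.py with the tabulated alpha."""
    alpha = ALPHA_BY_ENV[env_name]  # raises KeyError for unlisted environments
    return ["python", "main.py", "--env-name", env_name, "--alpha", str(alpha)]


print(" ".join(build_command("Humanoid-v2")))
```

The resulting list can be passed to `subprocess.run` unchanged if the script is to be launched from Python.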