Update README.md

2026-06-27 18:06:10 +08:00 · 2020-02-03 13:55:23 +05:30
parent 42d2ff08cb
commit 5189f44caa
1 changed file with 11 additions and 14 deletions
@@ -6,14 +6,12 @@ Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf).
 Added another branch for [Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
 Learning with a Stochastic Actor](https://arxiv.org/pdf/1801.01290.pdf) -> [SAC_V](https://github.com/pranz24/pytorch-soft-actor-critic/tree/SAC_V)

-
 ### Requirements
 ------------

- [mujoco-py](https://github.com/openai/mujoco-py)
- [TensorboardX](https://github.com/lanpa/tensorboardX)
- [PyTorch](http://pytorch.org/)
-
+-  [mujoco-py](https://github.com/openai/mujoco-py)
+-  [TensorboardX](https://github.com/lanpa/tensorboardX)
+-  [PyTorch](http://pytorch.org/)

 ### Default Arguments and Usage
 -------------
@@ -31,26 +29,25 @@ usage: main.py [-h] [--env-name ENV_NAME] [--policy POLICY] [--eval EVAL]

 (Note: There is no need for setting Temperature(`--alpha`) if `--automatic_entropy_tuning` is True.)

-##### For SAC :
+##### For SAC

 ```
 python main.py --env-name Humanoid-v2 --alpha 0.05
 ```

-##### For SAC (Hard Update):
+##### For SAC (Hard Update)

 ```
 python main.py --env-name Humanoid-v2 --alpha 0.05 --tau 1 --target_update_interval 1000
 ```

-##### For SAC (Deterministic, Hard Update):
+##### For SAC (Deterministic, Hard Update)

 ```
 python main.py --env-name Humanoid-v2 --policy Deterministic --tau 1 --target_update_interval 1000
 ```

 ### Arguments
-
 ```
 PyTorch Soft Actor-Critic Args

@@ -86,9 +83,9 @@ optional arguments:

 | Environment **(`--env-name`)**| Temperature **(`--alpha`)**|
 | --------------- | ------------- |
-| HalfCheetah-v2  | 0.2 |
-| Hopper-v2       | 0.2 |
-| Walker2d-v2     | 0.2 |
-| Ant-v2          | 0.2 |
-| Humanoid-v2     | 0.05 |
+| HalfCheetah-v2  |      0.2      |
+| Hopper-v2       |      0.2      |
+| Walker2d-v2     |      0.2      |
+| Ant-v2          |      0.2      |
+| Humanoid-v2     |     0.05      |