mirror of https://github.com/wassname/ray.git (synced 2026-06-30 08:27:21 +08:00)
Move meta-learning algorithms into their own section in the TOC (#10727)
@@ -96,22 +96,14 @@ Algorithms
- |pytorch| :ref:`Decentralized Distributed Proximal Policy Optimization (DD-PPO) <ddppo>`
- |pytorch| :ref:`Single-Player AlphaZero (contrib/AlphaZero) <alphazero>`
* Gradient-based
- |pytorch| |tensorflow| :ref:`Advantage Actor-Critic (A2C, A3C) <a3c>`
- |pytorch| |tensorflow| :ref:`Deep Deterministic Policy Gradients (DDPG, TD3) <ddpg>`
- |pytorch| :ref:`Dreamer <dreamer>`
- |pytorch| |tensorflow| :ref:`Deep Q Networks (DQN, Rainbow, Parametric DQN) <dqn>`
- |pytorch| |tensorflow| :ref:`Model-Agnostic Meta-Learning (MAML) <maml>`
- |pytorch| :ref:`Model-Based Meta-Policy-Optimization (MBMPO) <mbmpo>`
- |pytorch| |tensorflow| :ref:`Policy Gradients <pg>`
- |pytorch| |tensorflow| :ref:`Proximal Policy Optimization (PPO) <ppo>`
@@ -124,7 +116,17 @@ Algorithms
- |pytorch| |tensorflow| :ref:`Evolution Strategies <es>`
* Multi-agent specific
* Model-based / Meta-learning
- |pytorch| :ref:`Single-Player AlphaZero (contrib/AlphaZero) <alphazero>`
- |pytorch| |tensorflow| :ref:`Model-Agnostic Meta-Learning (MAML) <maml>`
- |pytorch| :ref:`Model-Based Meta-Policy-Optimization (MBMPO) <mbmpo>`
- |pytorch| :ref:`Dreamer (DREAMER) <dreamer>`
* Multi-agent
- |pytorch| :ref:`QMIX Monotonic Value Factorisation (QMIX, VDN, IQN) <qmix>`
- |tensorflow| :ref:`Multi-Agent Deep Deterministic Policy Gradient (contrib/MADDPG) <maddpg>`