Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir ( #5324 )
2019-08-05 23:25:49 -07:00
Eric Liang
955154a19d
Reduce Ray / RLlib startup messages ( #5368 )
2019-08-05 13:23:54 -07:00
Kristian Hartikainen
13fb9fe3db
[rllib] Feature/soft actor critic v2 ( #5328 )
...
* Add base for Soft Actor-Critic
* Pick changes from old SAC branch
* Update sac.py
* First implementation of sac model
* Remove unnecessary SAC imports
* Prune unnecessary noise and exploration code
* Implement SAC model and use that in SAC policy
* runs but doesn't learn
* clear state
* fix batch size
* Add missing alpha grads and vars
* -200 by 2k timesteps
* doc
* lazy squash
* one file
* ignore tfp
* revert done
2019-08-01 23:37:36 -07:00
Eric Liang
20450a4e82
[rllib] Add rock paper scissors multi-agent example ( #5336 )
2019-08-01 13:03:59 -07:00
jichan3751
bd6dfc994f
[sgd] Replaced class Resources in sgd with use_gpu ( #5252 )
2019-08-01 01:03:10 -07:00
Jaroslaw Rzepecki
b3c8091a35
Fix Tuple spaces in rollout.py ( #5332 )
...
Make sure that the initial action is also properly flattened.
2019-07-31 11:38:49 -07:00
Michael Luo
1337c98f02
[rllib] Importance Sampling and KL Loss for APPO ( #5051 )
2019-07-29 15:02:32 -07:00
Eric Liang
3bdd114282
[rllib] Better example rnn envs ( #5300 )
2019-07-28 14:07:18 -07:00
Eric Liang
a62c5f40f6
[rllib] Document ModelV2 and clean up the models/ directory ( #5277 )
2019-07-27 02:08:16 -07:00
Richard Liaw
5e15b36d6e
[tune] experiment_analysis split to Analysis ( #5115 )
2019-07-27 01:10:52 -07:00
Antoine Galataud
827618254a
[rllib] Configure learner queue timeout ( #5270 )
...
* configure learner queue timeout
* lint
* use config
* fix method args order, add unit test
* fix wrong param name
2019-07-25 21:18:05 -07:00
Eric Liang
bf9199ad77
[rllib] ModelV2 support for pytorch ( #5249 )
2019-07-25 11:02:53 -07:00
Eric Liang
60f59639c1
[rllib] Port DDPG to the build_tf_policy pattern ( #5242 )
2019-07-24 13:55:55 -07:00
Eric Liang
690b374581
[rllib] Add Keras LSTM example with ModelV2 ( #5258 )
2019-07-24 13:09:41 -07:00
Eric Liang
97c43284a6
[rllib] Fix trainer state restore ( #5257 )
2019-07-23 21:18:58 -07:00
Eric Liang
f9043cc49a
[rllib] Remove experimental eager support
2019-07-21 12:27:17 -07:00
Eric Liang
d58b986858
[rllib] MultiCategorical shouldn't return array for kl or entropy ( #5215 )
...
* wip
* fix
2019-07-19 12:12:04 -07:00
Jones Wong
da7676c925
Removed the implicit sync barrier at the end of each training iteration ( #5217 )
...
* removed sync barrier at the end of each training iteration
* formatted
* modify the comment according to current semantics
* lint check
* Update trainer.py
2019-07-18 22:59:52 -07:00
Eric Liang
28e5c5555d
[rllib] Move some inline defs to avoid deserialization errors ( #5228 )
...
* fix bug
* move metrics too
2019-07-18 21:01:16 -07:00
Jones Wong
0af07bd493
Enable seeding actors for reproducible experiments ( #5197 )
...
* enable graph-level worker-specific seed
* lint checked
* revised according to eric's suggestions
* revised accordingly and added a test case
* formatted
* Update test_reproducibility.py
* Update trainer.py
* Update rollout_worker.py
* Update run_rllib_tests.sh
* Update worker_set.py
2019-07-17 23:31:34 -07:00
Jones Wong
81d297f87e
Remove redundant scaler of l2 reg ( #5172 )
...
* remove redundant scaler of l2 reg
* lint formatted
* Update ddpg_policy.py
2019-07-17 15:11:27 -07:00
Jones Wong
ae03c42dd6
Fixed inconsistent action placeholder ( #5213 )
2019-07-17 10:55:14 -07:00
Sam Toyer
214f09d969
[rllib] Make RLlib handle zero-length observation arrays ( #5208 )
...
* [rllib] Make _summarize handle zero-len arrays
Fixes #5207
* [rllib] Make aligned_array() handle empty arrays
* [rllib] Conform with old yapf
2019-07-16 22:37:57 -07:00
Eric Liang
4fa2a6006c
[rllib] Remove nested import ( #5204 )
...
* remove nested import
* Update metrics.py
2019-07-16 10:52:56 -07:00
Eric Liang
047f4ccd61
[rllib] Fix rollout.py with tuple action space ( #5201 )
...
* fix it
* update doc too
* fix rollout
2019-07-16 10:52:35 -07:00
Jones Wong
5b13a7eb90
Keep parameter space noise consistent with action space noise (Fix 5173) ( #5193 )
...
* make parameter space noise consistent with action space noise
* modified according to lint check
* indent
2019-07-14 12:20:35 -07:00
Richard Liaw
691c9733f9
[tune] Document trainable attributes and enable user-checkpoint… ( #4868 )
2019-07-10 18:51:11 -07:00
Eric Liang
5ab5017c67
[rllib] Fix impala stress test ( #5101 )
...
* add copy
* upgrade to tf 1.14
* update
* reduce count to workaround https://github.com/ray-project/ray/issues/5125
* Update impala.py
* placeholder
* comments
* update
2019-07-09 20:22:30 -07:00
Stefan Pantic
dfc94ce7bc
[rllib] Add entropy coeff decay ( #5043 )
2019-07-08 18:30:32 -07:00
Eric Liang
893744b3be
[rllib] Revert "use make template" which seems to break DQN/Atari ( #5134 )
...
* Revert "use make template"
This reverts commit 291e9e0031c6e315fe24e5b4973dea375fe73918.
* debug vars
2019-07-07 19:51:26 -07:00
Eric Liang
932d6b2517
[rllib] Port IMPALA to ModelV2/build_tf_policy ( #5130 )
...
* port vtrace
* fix vf
* fix vs
* fix the example
* wip ddpg
* fix tests
* fix tests
* remove ddpg model
* comments
* set vf share layers True by default
* typo
* fix test
2019-07-07 15:06:41 -07:00
Eric Liang
445bcb29b0
[hotfix] fix backward compat with older yaml libraries
2019-07-06 20:41:28 -07:00
Eric Liang
c15ed3ac55
[rllib] Shuffle RNN sequences in PPO as well ( #5129 )
...
* shuffle seq
* fix test
2019-07-06 20:40:49 -07:00
Brandon Bertelsen
c04b69902c
Updates for #5072 ( #5091 )
2019-07-06 16:05:50 -07:00
Aleksei Petrenko
09bde397c9
Multiagent experiment resume ( #5102 )
...
* Fixed problem with multiagent experiment resume
* Applied format script
* fix lint
2019-07-06 11:38:17 -07:00
Dušan Josipović
e9b88dcbed
[wingman -> tune] Add system performance tracking ( #4924 )
2019-07-06 00:57:35 -07:00
Eric Liang
34d054ff19
[rllib] ModelV2 API ( #4926 )
2019-07-03 15:59:47 -07:00
Kristian Hartikainen
9e0192bc0b
[tune] Change the log syncing behavior ( #4450 )
...
* Change the log syncing behavior
* fix up abstractions for syncer
* Finished checkpoint syncing
* Code
* Set of changes to get things running
* Fixes for log syncing
* Fix parts
* Lint and other fixes
* fix some test
* Remove extra parsing functionality
* some test fixes
* Fix up cloud syncing
* Another thing to do
* Fix up tests and local sync
Changes LogSync into a mixin, and adds tests for different
functionalities.
* Fix up tests, start on local migration
* fix distributed migrations
* comments
* formatting
* Better checkpoint directory handling
* fix tests
* fix tests
* fix click
* comments
* formatting comments
* formatting and comments
* sync function deprecations
* syncfunction
* Add documentation for Syncing and Uploading
* nit
* BaseSyncer as base for Mixin in edge case
* more docs
* clean up assertions
* validate
* nit
* Update test_cluster.py
* betterdoc
* Update tune-usage.rst
* cleanup
* nit
2019-07-02 20:46:00 -07:00
Philipp Moritz
bbe3e5b4ed
[rllib] Give error if sample_async is used with pytorch for A3C ( #5000 )
...
* give error if sample_async is used with pytorch
* update
* Update a3c.py
2019-06-25 22:06:35 -07:00
Eric Liang
aa5fc52e32
[rllib] Add QMIX mixer parameters to optimizer param list ( #5014 )
...
* add mixer params
* Update qmix_policy.py
2019-06-25 19:02:40 -07:00
Eric Liang
1d17125333
temp fix for build ( #5006 )
2019-06-20 18:07:44 -07:00
Eric Liang
fa1d4c9807
[rllib] Fix DDPG example ( #4973 )
2019-06-13 15:07:46 -07:00
Eric Liang
4f8e100fe0
fix ( #4950 )
2019-06-10 10:20:55 +08:00
Eric Liang
77689d1116
[rllib] Port remainder of algorithms to build_trainer() pattern ( #4920 )
2019-06-07 16:45:36 -07:00
Eric Liang
9e328fbe6f
[rllib] Add docs on how to use TF eager execution ( #4927 )
2019-06-07 16:42:37 -07:00
Eric Liang
7501ee51db
[rllib] Rename PolicyEvaluator => RolloutWorker ( #4820 )
2019-06-03 06:49:24 +08:00
Eric Liang
99eae05cf6
[tune] Disallow setting resources_per_trial when it is already configured ( #4880 )
...
* disallow it
* import fix
* fix example
* fix test
* fix tests
* Update mock.py
* fix
* make less convoluted
* fix tests
2019-06-03 06:47:39 +08:00
Eric Liang
665d081fe9
[rllib] Rough port of DQN to build_tf_policy() pattern ( #4823 )
2019-06-02 14:14:31 +08:00
Eric Liang
9aa1cd613d
[rllib] Allow Torch policies access to full action input dict in extra_action_out_fn ( #4894 )
...
* fix torch extra out
* preserve setitem
* fix docs
2019-06-01 16:58:49 +08:00
Eric Liang
1c073e92e4
[rllib] Fix documentation on custom policies ( #4910 )
...
* wip
* add docs
* lint
* todo sections
* fix doc
2019-06-01 16:13:21 +08:00