Commit Graph

65 Commits

Author SHA1 Message Date
Eugene Vinitsky 3cb499632e (Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475) 2019-12-13 14:42:30 -08:00
Eric Liang be5dd8eb5e Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced tesT

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeatS

* disable heartbeatS

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor diied callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflits

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Zack Polizzi 9e9c524823 Update pong-apex tuned example (#6462) 2019-12-12 10:57:55 -08:00
Victor Le 4e24c805ee AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00
Eric Liang 4c6739476b [rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
Stephanie Wang da41180dc0 [direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306)
* multinode failures direct

* Add number of retries allowed for tasks

* Retry tasks

* Add failing test for object reconstruction

* Handle return status and debug

* update

* Retry task unit test

* update

* update

* todo

* Fix max_retries decorator, fix test

* Fix test that flaked

* lint

* comments
2019-12-02 10:20:57 -08:00
Eric Liang 77b5098e7d [rllib] Warn about dict action spaces 2019-11-27 12:57:38 -08:00
Eric Liang ddc8855f41 Fix wrap (#6293) 2019-11-26 17:47:47 -08:00
Ameer Haj Ali 71316fa8d0 wrap models with DistributionalQModel when running DQN (#6258)
* wrap models with DistributionalQModel when running DQN

* wrap only for tensorflow models

* Update custom_keras_model.py
2019-11-25 00:11:24 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* seprate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Eric Liang 7559fdb141 [rllib/tune] Cache get_preprocessor() calls, default max_failur… (#6211) 2019-11-21 15:55:56 -08:00
Eric Liang 8fc2272f43 [rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181) 2019-11-18 10:39:07 -08:00
Philipp Moritz fc655acfee Fix linting on master branch (#6174) 2019-11-16 10:02:58 -08:00
Eric Liang a68cda0a33 [rllib] remove exists call (#6168) 2019-11-15 21:59:40 -08:00
Eric Liang 243b1b7281 [rllib] Add microbatch optimizer with A2C example (#6161) 2019-11-14 12:14:00 -08:00
waldroje e4c0843f60 Allow EntropyCoeffSchedule to accept custom schedule (#6158)
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation

* Update custom_metrics_and_callbacks.py

* Update tf_policy.py
2019-11-14 00:45:43 -08:00
Eric Liang e4565c9cc6 Reduce RLlib log verbosity (#6154) 2019-11-13 18:50:45 -08:00
Eric Liang b924299833 Add large scale regression test for RLlib (#6093) 2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang f48293f96d Fix deprecated warning (#6142) 2019-11-11 17:49:15 -08:00
Miguel Morales d17ae5ad7a Update hyperband-cartpole.yaml (#6121)
Typo
2019-11-09 19:39:03 -08:00
Eric Liang 1f043daf69 [rllib] Fix and add test for LR annealing config 2019-11-07 12:17:27 -08:00
David Bignell 3f83b2daa9 [rllib] Rollout extensions (#6065)
* Rollout improvements

* Make info-saving optional, to avoid breaking change.

* Store generating ray version in checkpoint metadata

* Keep the linter happy

* Add small rollout test

* Terse.

* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang 2a0225dd25 [rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 (#6087) 2019-11-05 11:36:29 -08:00
Eric Liang 16891e9379 [rllib] Don't use flat weights in non-eager mode (#6001) 2019-10-31 15:16:02 -07:00
Eric Liang a0dcb45dc3 [rllib] Fix APEX priorities returning zero all the time (#5980)
* fix

* move example tests to end

* level err

* guard against none

* no trace test

* ignore thumbs

* np

* fix multi node

* fix
2019-10-26 13:23:42 -07:00
Eric Liang 34fbc7fb4c [rllib] Fix leak of TensorFlow assign operations in DQN/DDPG 2019-10-23 00:28:15 -07:00
Eric Liang f7bda0abad [rllib] Fix rnn shape with multi-dimensional data (#5939)
* fix shape

* add test

* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte d70abcfd70 Fix typo in examples/centralized_critic.py (#5943)
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright 0110941de5 rllib: use pytorch's fn to see if gpu is available (#5890) 2019-10-12 00:13:00 -07:00
Matthew A. Wright 4aa06918ae Qmix on gpu and with non-stacked-obs environment state support (#5751) 2019-10-08 13:18:07 -07:00
Eric Liang 04e997fe0d Fix TF2 / rllib test (#5846) 2019-10-07 14:25:16 -07:00
Eric Liang fb33160df8 Fix obs space lo/hi (#5826) 2019-10-04 09:28:06 -07:00
Eric Liang c6919d315d [rllib] Remove TorchPolicy locks (#5764)
* remove torch lock

* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics 7e214fd95e [tune] TensorBoard HParams for TF2.0 (#5678) 2019-09-21 11:06:34 -07:00
Kilian Batzner 79b9c70ad6 Add local_tf_session_args to unknown subkeys whitelist (#5742)
* Add local_tf_session_args to unknown subkeys whitelist

* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang fb3b232c0e [rllib] Properly flatten 2-d observations as input to FCnet (#5733) 2019-09-19 12:10:31 -07:00
Matthew A. Wright 3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)
* Qmix fix.

-Current version of double Q learning is incorrect; it selects actions
at timestep t instead of t+1 when computing the t+1 Q value.

* Allow extra obs dict keys

* Move Q-value-computing replay code to own function

* Run the autoformatter

* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring 8903bcd0c3 [rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Edward Oakes 07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Ashwinee Panda 946ebfaa3c [rllib] Validate that entropy coeff is not an integer (#5687)
* Validate that entropy coeff is not an integer

Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.

* Cast to float instead

Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang bc6a95deb0 [rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683) 2019-09-11 12:15:34 -07:00
Richard Liaw 0010f54378 Update Cloudpickle (#5643) 2019-09-09 17:17:29 -07:00
Eric Liang 74abeab057 [rllib] Improve accessing model state docs (#5656)
* [rllib] better model docs

* fix

* s
2019-09-08 23:01:26 -07:00
Eric Liang cf90394a09 [rllib] Fix TF2 import of EagerVariableStore (#5625) 2019-09-07 12:10:03 -07:00
Eric Liang 1455a19c85 Consolidate and clean up documentation (#5645) 2019-09-07 11:50:18 -07:00
Eric Liang 19bbf1eb4d [rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern (#5626) 2019-09-04 21:39:22 -07:00
Eric Liang a101812b9f Replace --redis-address with --address in test, docs, tune, rllib (#5602)
* wip

* add tests and tune

* add ci

* test fix

* lint

* fix tests

* wip

* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang daf38c8723 [tune] Deprecate tune.function (#5601)
* remove tune function

* remove examples

* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Philipp Moritz 747daff2cb Fix impala stress test (#5596) 2019-08-31 01:20:53 -07:00
Eric Liang 38231907f3 [rllib] Forgot to register param noise layer variables 2019-08-29 18:12:31 -07:00