45 Commits

Author SHA1 Message Date
Eric Liang 0d94f3eeef [rllib] Improve datapath throughput of IMPALA / APPO (#4324) 2019-03-31 12:25:52 -07:00
bjg2 77005d1814 [rllib] Make batch timeout for remote workers tunable (#4435) 2019-03-29 13:19:42 -07:00
Eric Liang 2ffe67c5c3 [rllib] Minor cleanups to TFPolicyGraph: add init args, constants for loss inputs (#4478) 2019-03-29 12:44:23 -07:00
Robert Nishihara c6f12e5219 Update documentation from 0.7.0.dev1 to 0.7.0.dev2. (#4485) 2019-03-26 17:32:53 -07:00
Eric Liang 8ee240f40e [rllib] Use 64-byte aligned memory when concatenating arrays (#4408) 2019-03-25 23:56:51 -07:00
William Ma 11580fb7dc Changes where actor resources are assigned (#4323) 2019-03-24 15:49:36 -07:00
Hao Chen 80cd9c9c1a [travis] Add back '-v' option to pytest and install psutil (#4430) 2019-03-22 17:45:55 +08:00
Eric Liang 57c1aeb427 [rllib] Use suppress_output instead of run_silent.sh script for tests (#4386)
* fix

* enable custom loss

* Update run_rllib_tests.sh

* enable tests

* fix action prob

* Update suppress_output

* fix example

* fix
2019-03-21 00:15:24 -07:00
Eric Liang a45019d98c [rllib] Add option to proceed even if some workers crashed (#4376) 2019-03-16 13:34:09 -07:00
justinwyang db9fe6619d Run only relevant tests in Travis based on git diff. (#4271) 2019-03-15 22:23:54 -07:00
Hao Chen 93d9867290 Fix linting error on master (#4377) 2019-03-15 10:31:09 -07:00
Yuhong Guo 1a1027b3ab Update git-clang-format to support Python 3. (#4339) 2019-03-14 13:57:11 -07:00
Philipp Moritz b0c4e60ffb Build wheels for Linux with Bazel (#4281) 2019-03-13 15:57:33 -07:00
Eric Liang d5f4698305 [tune] Avoid scheduler blocking, add reuse_actors optimization (#4218) 2019-03-12 23:49:31 -07:00
Stefan Pantic 2202a81773 Fix multi discrete (#4338)
* Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)"

This reverts commit 3c41cb9b60.

* Fix a bug with log rhos for vtrace

* Reformat

* lint
2019-03-12 20:32:11 -07:00
Eric Liang 3c41cb9b60 Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)
This reverts commit 962b17f567.
2019-03-11 22:51:26 -07:00
William Ma f423909aec Temporary fix for many_actor_task.py (#4315) 2019-03-09 00:07:45 -08:00
Richard Liaw 6630a35353 [tune] Initial Commit for Tune CLI (#3983)
This introduces a light CLI for Tune.
2019-03-08 16:46:05 -08:00
Eric Liang c7f74dbdc7 [rllib] Add async remote workers (#4253) 2019-03-08 15:39:48 -08:00
Robert Nishihara fd2d8c2c06 Remove Jenkins backend tests and add new long running stress test. (#4288) 2019-03-08 15:29:39 -08:00
Yuhong Guo d5fb7b70a9 Update arrow version to fix plasma bugs (#4127)
* Update arrow

* Change to 2c511979b13b230e73a179dab1d55b03cd81ec02 which is rebased on Arrow 46f75d7

* Update to fix comment

* disable tests which use python/ray/rllib/tests/data/cartpole_small

* Fix get order of meta and data in MockObjectStore.java
2019-03-08 18:03:58 +08:00
Robert Nishihara 4c80177d6f Unpin gym in Python 2 since gym 0.12 was released. (#4291) 2019-03-07 15:59:30 -08:00
Eric Liang 437459f40a [build] Make travis logs not as long (#4213)
* clean it up

* Update .travis.yml

* Update .travis.yml

* update

* fix example

* suppress

* timeout

* print periodic progress

* Update suppress_output

* Update run_silent.sh

* Update suppress_output

* Update suppress_output

* manually do timeout

* sleep 300

* fix test

* Update run_silent.sh

* Update suppress_output

* Update .travis.yml
2019-03-07 12:09:03 -08:00
Eric Liang b0332551dd [rllib] Fix APPO + continuous spaces, feed prev_rew/act to A3C properly (#4286) 2019-03-06 21:36:26 -08:00
Philipp Moritz 39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Robert Nishihara f151aa8723 Update long running stress tests and add actor death test. (#4275) 2019-03-06 14:26:45 -08:00
Eric Liang 30bf8e46c7 [rllib] Use nested scope in custom loss example 2019-03-04 18:29:22 -08:00
Eric Liang 6e3384a719 [rllib] Add three new long-running stress tests {APEX, IMPALA, PBT} (#4215) 2019-03-04 14:05:42 -08:00
Philipp Moritz fbdd5da9c1 Fix application stress tests (#4228)
Fixes https://github.com/ray-project/ray/issues/4227
2019-03-02 21:57:27 -08:00
Richard Liaw a27cb225b6 Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Robert Nishihara 4b89eebfc7 Move test folders under rllib/tune from test -> tests. (#4214) 2019-03-02 13:37:16 -08:00
Robert Nishihara c4aa90314d Add script for shutting down tests. (#4203) 2019-03-01 19:56:30 -08:00
bjg2 962b17f567 [wingman -> rllib] IMPALA MultiDiscrete changes (#3967) 2019-03-01 19:47:06 -08:00
Eric Liang b809ef0107 [rllib] Silent tests (#4151) 2019-02-28 16:32:22 -08:00
Philipp Moritz 4dc683d39e Use latest arrow wheels (#4182) 2019-02-28 12:17:32 -08:00
Philipp Moritz 9ca9691cdc Fix mnist sgd jenkins tests on master (#4168) 2019-02-27 16:02:18 -08:00
Robert Nishihara 75504b9586 Add script for running infinitely long stress tests. (#4163)
Running `./ci/long_running_tests/start_workloads.sh` will start several workloads, each on its own EC2 instance.
- The workloads run forever.
- The workloads all simulate multiple nodes but use a single machine.
- You can get the tail of each workload by running `./ci/long_running_tests/check_workloads.sh`.
- You have to manually shut down the instances.

As discussed with @ericl @richardliaw, the idea here is to optimize for the debuggability of the tests. If one of them fails, you can ssh to the relevant instance and see all of the logs.
2019-02-27 14:33:06 -08:00
Robert Nishihara 641f703879 Update installation instructions to include bazel and remove outdated… (#4171) 2019-02-26 23:07:43 -08:00
Richard Liaw f7450dbdd7 [tests] Stress tests for Jenkins (#3789)
Stress testing for Jenkins.

TODO:
 - [x] Enable a common keypair for autoscaling 
 - [x] Add automatic timeouts?
 - [x] Switch out key pair one last time before merge
2019-02-26 14:24:37 -08:00
John Liagouris 89ce4c56aa Initial Skeleton for Streaming API (#4126) 2019-02-26 12:15:08 -08:00
Kristian Hartikainen 524e69a82d [autoscaler] Change the get behavior of node providers' _get_node (#4132)
* Change the get behavior of GCPNodeProvider._get_node

* Add lock around the GCPNodeProvider._get_node call

* rename nodes

* lint

* Update GCPNodeProvider._get_node to match aws implementation

* assert

* log

* log highest heartbeats

* rename

* bringup to connected

* prune heartbeat times

* fix bringup
2019-02-24 18:43:35 -08:00
Eric Liang d9da183c7d [rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
Philipp Moritz ba52caff37 Make Bazel the default build system (#3898) 2019-02-23 11:58:59 -08:00
Eric Liang f1239a7a63 Lint script link broken, also lint filter was broken for generated py files (#4133) 2019-02-22 17:33:08 -08:00
William Ma c7a4c74f55 Moving tests from test/ to python/ray/tests/ (#3950) 2019-02-21 11:09:08 -08:00