2332 Commits

Author SHA1 Message Date
mehrdadn ac1ed293e3 Patch redis-py bug for Windows (#8386) 2020-05-12 10:41:45 -05:00
Edward Oakes b84fe56bed Split test_basic to avoid timeouts in CI (#8405) 2020-05-12 10:18:21 -05:00
Eric Liang 9d012626e5 [rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Simon Mo 501b936114 [Serve] Improve error message when result is not a list (#8378) 2020-05-10 17:18:06 -07:00
Stephanie Wang 3a25f5f5b4 Clean up actor state from the GCS (#8261)
* parametrize test

* Regression test and logging

* Test no restart after actor deletion

* Unit tests

* Refactor to subscribe to and lookup from worker failure table

* Refactor ActorManager to remove dependencies

* Revert "Regression test and logging"

This reverts commit 835e1a9091b51ca8efb00392d4cc4a665145de24.

* Revert "parametrize test"

This reverts commit f31272082831ba1a494816dd5511d87b24eca4c9.

* Revert "Test no restart after actor deletion"

This reverts commit 114a83de14329aa6ab787c80cd5757cf074a9072.

* doc

* merge

* Revert "Refactor to subscribe to and lookup from worker failure table"

This reverts commit 6aa13a05178d0b9aa1db9dee5c978c911b74fa3a.

* Revert "Revert "Test no restart after actor deletion""

This reverts commit 1bd92d09172aa8ab42632551cf9c56463f9598fe.

* Revert "Revert "parametrize test""

This reverts commit 639ba4d3b02167fb2b05e9878f9aa600bcec95b3.

* Revert "Revert "Regression test and logging""

This reverts commit f18b5f0db699a23cbccde32789e3639425e99ca4.

* Clean up actors that have gone out of scope

* Use actor ID instead of shared_ptr

* Clean up actors owned by dead workers

* Use actor ID instead of shared_ptr

* TODO and lint

* Fix unit tests

* Add unit tests for supervision and docs

* xx

* Fix tests

* Fix tests

* fix build
2020-05-09 18:43:49 -07:00
Thomas Lecat 4421f3a000 [tune] Close loggers after updating trial (#8307) (#8366) 2020-05-09 13:26:59 -07:00
Edward Oakes 2677b71003 Implement named actors using the GCS service (#8328) 2020-05-09 08:58:10 -05:00
Eric Liang 1126fe4d23 [tune] Add UUID back to trial names (#8377) 2020-05-08 20:20:36 -07:00
Eric Liang 9f04a65922 [rllib] Add PPO+DQN two trainer multiagent workflow example (#8334) 2020-05-07 23:40:29 -07:00
Eric Liang 413db0902d Trigger global GC when resources may be occupied by deleted actors 2020-05-07 14:57:21 -07:00
Edward Oakes f2f118df9e [serve] Clear serve cluster state between tests. (#8357) 2020-05-07 16:45:20 -05:00
Philipp Moritz 325aec81bd Hide aliased autoscaler commands (#8348) 2020-05-07 10:17:59 -07:00
Simon Mo c5a5a5de89 [Serve] Refactor Metric System: Counter + Measure Support (#8114) 2020-05-06 17:44:02 -07:00
Eric Liang 1f312debbe Document all ray commands. (#8340) 2020-05-06 16:49:37 -07:00
SangBin Cho e631827a9f [Core] Show_webui segfault fix. (#8323) 2020-05-06 11:45:07 -05:00
Alex Wu 04813c2ef5 [Parallel Iterator] Foreach concur (#8140) 2020-05-06 10:00:01 -05:00
Thomas Desrosiers ec9357b486 [autoscaler] Fix filesystem permission race conditions (#8327) 2020-05-05 17:22:03 -07:00
mehrdadn 4bdef78e2e Various CI fixes and cleanup (#8289) 2020-05-05 10:47:49 -07:00
fangfengbin 97430b2d0f GCS adapts to node table pub sub (#8209) 2020-05-05 18:34:41 +08:00
Eric Liang ee0eb44a32 Rename async_queue_depth -> num_async (#8207)
* rename

* lint
2020-05-05 01:38:10 -07:00
Simon Mo 1480bf4295 [Serve] Improve batch size inconsistency error (#8315) 2020-05-04 20:32:12 -07:00
Simon Mo ca929671b6 [Serve] Simplify Validation (#8316) 2020-05-04 20:31:23 -07:00
ijrsvt cc7bd6650a [core] Enabling Remote Task Cancelation (#8225) 2020-05-04 15:24:22 -07:00
Eric Liang 1228369a87 Remove "This tab is experimental" (#8281) 2020-05-02 22:41:28 -07:00
Simon Mo ec6631ae58 Pin redis-py version (#8290) 2020-05-02 22:09:02 -07:00
SangBin Cho 0f54d5ab65 Async actor microbenchmark Script (#8275) 2020-05-02 21:51:00 -07:00
Richard Liaw 40dfb337bf [tune] Hotfix Ax breakage when fixing backwards-compat (#8285) 2020-05-02 20:42:50 -07:00
Xianyang Liu eda526c154 [SGD] Support multiple input model (#8246) 2020-05-02 16:49:09 -07:00
Maksim Smolin c2acb7ffe2 [SGD] Add imagenet example CI (#8150) 2020-05-02 16:48:35 -07:00
Edward Oakes 518ef4c0b3 [serve] Increase timeout waiting for HTTP server (#8286) 2020-05-02 16:55:13 -05:00
Edward Oakes 8d3236f1d0 Lower test_utils.wait_for_condition default timeout to 30s (#8283) 2020-05-02 10:19:00 -05:00
Edward Oakes d4e64709ba Shorten test_joblib (#8273) 2020-05-01 17:11:32 -05:00
Edward Oakes 13f718846d [serve] Always use internal KV store (#8270) 2020-05-01 14:18:18 -05:00
Richard Liaw 07daff8794 [tune] Avoid breakage - soft deprecation warning for search algs (#8258) 2020-05-01 10:36:43 -07:00
Edward Oakes 3aec683f61 Avoid fate sharing with owner for detached actors (#8267) 2020-05-01 11:58:47 -05:00
Edward Oakes 63bc7dc522 service -> endpoint in router (#8269) 2020-05-01 11:55:34 -05:00
Edward Oakes 421b3c9d8b Fix serve long running test (#8268) 2020-05-01 11:54:27 -05:00
Edward Oakes 6373c70661 [serve] Refactor BackendConfig (#8202) 2020-04-30 22:31:07 -05:00
Edward Oakes 95d187e556 [serve] Add delete_endpoint call (#8256) 2020-04-30 20:59:07 -05:00
Edward Oakes 43be73e4cf [serve] Add delete_backend call (#8252) 2020-04-30 13:10:39 -05:00
Richard Liaw 05df80afad Extend timeout for test_tune_server (#8233) 2020-04-30 08:39:46 -05:00
Richard Liaw 35eac2671e [sgd] Resource limit lift for GPU test (#8238) 2020-04-30 00:24:48 -07:00
mehrdadn 254b1ec370 Set up testing and wheels for Windows on GitHub Actions (#8131)
* Move some Java tests into ci.sh

* Move C++ worker tests into ci.sh

* Define run()

* Prepare to move Python tests into ci.sh

* Fix issues in install-dependencies.sh

* Reload environment for GitHub Actions

* Move wheels to ci.sh and fix related issues

* Don't bypass failures in install-ray.sh anymore

* Make CI a little quieter

* Move linting into ci.sh

* Add vitals test right after build

* Fix os.uname() unavailability on Windows

Co-authored-by: Mehrdad <noreply@github.com>
2020-04-29 21:19:02 -07:00
Edward Oakes 17f0d50f1a [serve] Temporarily disable test_master_crashes (#8230) 2020-04-29 14:36:09 -05:00
Xianyang Liu fbf23eb6ff [SGD] Fix IterableDataset errors (#8208) 2020-04-29 10:51:31 -07:00
ijrsvt c393b6d165 Remove logging (#8211) 2020-04-29 09:15:43 -07:00
chaokunyang 91f630f709 [Streaming] Streaming Cross-Lang API (#7464) 2020-04-29 13:42:08 +08:00
Simon Mo 101255f782 [Serve] RayServe TF, PyTorch, Sklearn Examples (#8156) 2020-04-28 22:24:55 -07:00
Richard Liaw 4d639354cd [tune] Hotfix for test_ls (#8215) 2020-04-28 14:06:12 -07:00
Edward Oakes 7c0200c93b [serve] Master actor fault tolerance (#8116) 2020-04-28 15:52:29 -05:00