2317 Commits

Author SHA1 Message Date
Alex Wu 04813c2ef5 [Parallel Iterator] Foreach concur (#8140) 2020-05-06 10:00:01 -05:00
Thomas Desrosiers ec9357b486 [autoscaler] Fix filesystem permission race conditions (#8327) 2020-05-05 17:22:03 -07:00
mehrdadn 4bdef78e2e Various CI fixes and cleanup (#8289) 2020-05-05 10:47:49 -07:00
fangfengbin 97430b2d0f GCS adapts to node table pub sub (#8209) 2020-05-05 18:34:41 +08:00
Eric Liang ee0eb44a32 Rename async_queue_depth -> num_async (#8207)
* rename

* lint
2020-05-05 01:38:10 -07:00
Simon Mo 1480bf4295 [Serve] Improve batch size inconsistency error (#8315) 2020-05-04 20:32:12 -07:00
Simon Mo ca929671b6 [Serve] Simplify Validation (#8316) 2020-05-04 20:31:23 -07:00
ijrsvt cc7bd6650a [core] Enabling Remote Task Cancelation (#8225) 2020-05-04 15:24:22 -07:00
Eric Liang 1228369a87 Remove "This tab is experimental" (#8281) 2020-05-02 22:41:28 -07:00
Simon Mo ec6631ae58 Pin redis-py version (#8290) 2020-05-02 22:09:02 -07:00
SangBin Cho 0f54d5ab65 Async actor microbenchmark Script (#8275) 2020-05-02 21:51:00 -07:00
Richard Liaw 40dfb337bf [tune] Hotfix Ax breakage when fixing backwards-compat (#8285) 2020-05-02 20:42:50 -07:00
Xianyang Liu eda526c154 [SGD] Support multiple input model (#8246) 2020-05-02 16:49:09 -07:00
Maksim Smolin c2acb7ffe2 [SGD] Add imagenet example CI (#8150) 2020-05-02 16:48:35 -07:00
Edward Oakes 518ef4c0b3 [serve] Increase timeout waiting for HTTP server (#8286) 2020-05-02 16:55:13 -05:00
Edward Oakes 8d3236f1d0 Lower test_utils.wait_for_condition default timeout to 30s (#8283) 2020-05-02 10:19:00 -05:00
Edward Oakes d4e64709ba Shorten test_joblib (#8273) 2020-05-01 17:11:32 -05:00
Edward Oakes 13f718846d [serve] Always use internal KV store (#8270) 2020-05-01 14:18:18 -05:00
Richard Liaw 07daff8794 [tune] Avoid breakage - soft deprecation warning for search algs (#8258) 2020-05-01 10:36:43 -07:00
Edward Oakes 3aec683f61 Avoid fate sharing with owner for detached actors (#8267) 2020-05-01 11:58:47 -05:00
Edward Oakes 63bc7dc522 service -> endpoint in router (#8269) 2020-05-01 11:55:34 -05:00
Edward Oakes 421b3c9d8b Fix serve long running test (#8268) 2020-05-01 11:54:27 -05:00
Edward Oakes 6373c70661 [serve] Refactor BackendConfig (#8202) 2020-04-30 22:31:07 -05:00
Edward Oakes 95d187e556 [serve] Add delete_endpoint call (#8256) 2020-04-30 20:59:07 -05:00
Edward Oakes 43be73e4cf [serve] Add delete_backend call (#8252) 2020-04-30 13:10:39 -05:00
Richard Liaw 05df80afad Extend timeout for test_tune_server (#8233) 2020-04-30 08:39:46 -05:00
Richard Liaw 35eac2671e [sgd] Resource limit lift for GPU test (#8238) 2020-04-30 00:24:48 -07:00
mehrdadn 254b1ec370 Set up testing and wheels for Windows on GitHub Actions (#8131)
* Move some Java tests into ci.sh

* Move C++ worker tests into ci.sh

* Define run()

* Prepare to move Python tests into ci.sh

* Fix issues in install-dependencies.sh

* Reload environment for GitHub Actions

* Move wheels to ci.sh and fix related issues

* Don't bypass failures in install-ray.sh anymore

* Make CI a little quieter

* Move linting into ci.sh

* Add vitals test right after build

* Fix os.uname() unavailability on Windows

Co-authored-by: Mehrdad <noreply@github.com>
2020-04-29 21:19:02 -07:00
Edward Oakes 17f0d50f1a [serve] Temporarily disable test_master_crashes (#8230) 2020-04-29 14:36:09 -05:00
Xianyang Liu fbf23eb6ff [SGD] Fix IterableDataset errors (#8208) 2020-04-29 10:51:31 -07:00
ijrsvt c393b6d165 Remove logging (#8211) 2020-04-29 09:15:43 -07:00
chaokunyang 91f630f709 [Streaming] Streaming Cross-Lang API (#7464) 2020-04-29 13:42:08 +08:00
Simon Mo 101255f782 [Serve] RayServe TF, PyTorch, Sklearn Examples (#8156) 2020-04-28 22:24:55 -07:00
Richard Liaw 4d639354cd [tune] Hotfix for test_ls (#8215) 2020-04-28 14:06:12 -07:00
Edward Oakes 7c0200c93b [serve] Master actor fault tolerance (#8116) 2020-04-28 15:52:29 -05:00
Edward Oakes ebdccde030 Fetch internal config from raylet (#8195) 2020-04-28 13:12:11 -05:00
aannadi eb790bf3a3 [Dashboard] Set logdir in Tune Dashboard and TensorBoard Opt-in (#8074) 2020-04-27 20:17:52 -07:00
Richard Liaw be5235d982 [tune] Clarify Intro Tune Documentation (#8201) 2020-04-27 18:01:00 -07:00
ijrsvt a77e5a8cbf [Doc] Fix Docstring for Task Cancellation (#8198) 2020-04-27 17:06:08 -07:00
Neil Lugovoy 8cf598deab [sgd] Fix GPU Reservations in LocalDistributedRunner (#8157) 2020-04-27 16:03:33 -07:00
Robert Nishihara 48250217ac Fix API documentation formatting. (#8197) 2020-04-27 10:48:42 -07:00
Philipp Moritz d7da25eee1 Use RAY_ADDRESS to connect to an existing Ray cluster if present (#7977) 2020-04-27 09:59:37 -07:00
Richard Liaw 87557a00fa [tune] Refactor search algorithms (#7037)
* start refactoring of search algorithms

* format

* needs tests

* fix

* suggestions

* Fix PBT

* lint

* refactoring

* hyperopt_working

* dragonfly

* hyperopt

* change_half_of_algs

* save

* code-removed

* remove_lots_of_unneccessary

* changes

* formatting

* suggest

* reset

* rm

* tests

* search-change

* exception

* refactor-doc

* search

* py

* moredocs

* Update doc/source/tune-searchalg.rst

* concurrency

* max

* tune

* betterwarning

* bohb

* tests

* test-change

Co-authored-by: ujvl <misraujval@gmail.com>
2020-04-27 08:51:13 -07:00
Richard Liaw 5bc6e32c0a [autoscaler] latest_dlami update (#8178) 2020-04-26 00:25:46 -07:00
ijrsvt 69ff7e3e35 TaskCancellation (#7669)
* Smol comment

* WIP, not passing ray.init

* Fixed small problem

* wip

* Pseudo interrupt things

* Basic prototype operational

* correct proc title

* Mostly done

* Cleanup

* cleaner raylet error

* Cleaning up a few loose ends

* Fixing Race Conds

* Prelim testing

* Fixing comments and adding second_check for kill

* Working_new_impl

* demo_ready

* Fixing my english

* Fixing a few problems

* Small problems

* Cleaning up

* Response to changes

* Fixing error passing

* Merged to master

* fixing lock

* Cleaning up print statements

* Format

* Fixing Unit test build failure

* mock_worker fix

* java_fix

* Canel

* Switching to Cancel

* Responding to Review

* FixFormatting

* Lease cancellation

* FInal comments?

* Moving exist check to CoreWorker

* Fix Actor Transport Test

* Fixing task manager test

* chaning clock repr

* Fix build

* fix white space

* lint fix

* Updating to medium size

* Fixing Java test compilation issue

* lengthen bad timeouts
2020-04-25 16:04:52 -07:00
Richard Liaw 9dd3490c38 [tune] Safer try-catch for TensorboardX (#8174)
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-04-25 13:08:37 -07:00
Simon Mo 13c14eac07 [Asyncio] Remove async init legacy code (#8177)
* [Asyncio] Remove async init legacy code

* Fix places that call async_init
2020-04-25 09:32:38 -07:00
Edward Oakes 9dc625318f [serve] Add basic test for specifying the method in a serve call (#8172) 2020-04-24 20:15:27 -05:00
Scott Graham 0dc01d8c1e [autoscaler] Azure versioning (#8168) 2020-04-24 17:03:55 -07:00
Nick Matthews a9d8d16b6b Change memory monitor warning to a logging call (#8137) 2020-04-22 21:29:18 -07:00