2420 Commits

Author SHA1 Message Date
Ian Rodney 09f89ff49d [autoscaler] Improve SSH Command Failure Logging (#8751) 2020-06-04 12:38:20 -07:00
SangBin Cho e372c06257 Hotfix dashboard broken tests. (#8757) 2020-06-04 09:44:00 -07:00
fangfengbin 84a8f2ccb5 Support reloading storage data when gcs server restarts (#8650) 2020-06-04 14:53:20 +08:00
Eric Liang a24d117c68 [autoscaler] Refactor code in preparation for multi instance type support (#8632)
* wip refactor

* add util

* wip

* fix

* fix

* remove

* remove extraneous string type for sg
2020-06-03 12:53:55 -07:00
Ian Rodney 474bbc28bf Warn if Autoscaling-config flag not set. (#8677) 2020-06-03 12:21:07 -07:00
Ian Rodney 7a2c9524d1 [Core] Randomize and 'Reserve' Port Generated for Node Manager (#8628) 2020-06-03 12:19:03 -07:00
Siyuan (Ryans) Zhuang 7fa64f2b24 Clean up unused Python code (#8755) 2020-06-03 12:09:19 -07:00
Max Fitton b9f0f7ae5b Dashboard minor refactor and first unit tests (#8705) 2020-06-03 11:04:55 -05:00
krfricke f4ee3e76d8 [tune] last-n-avg
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-06-02 20:06:04 -07:00
SangBin Cho 7c43991100 [GCS] Monitor.py bug fix (#8725)
* comment.

* Fix bugs.

* Used pubsub message instead.

* Added a ray.actors test
2020-06-02 16:06:36 -07:00
Edward Oakes 0306e4d589 [serve] Refer to serve "instances," not "clusters" (#8746) 2020-06-02 15:16:29 -07:00
Edward Oakes 2e82e05e4b [serve] Add list_backends and list_endpoints (#8737) 2020-06-02 15:14:10 -07:00
Simon Mo c5544eb070 [Async] Remove Monitor + Cleanup Code (#8691) 2020-06-02 14:11:16 -07:00
Edward Oakes e91f095d98 [Serve] Remove ray_init_kwargs in serve.init (#8747) 2020-06-02 14:05:35 -07:00
krfricke 4d0e9f3c71 [Dashboard] Sort IDLE workers to bottom in dashboard (#8708)
* Sort IDLE workers to bottom in dashboard

* Fixed linting error

Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-06-02 14:00:59 -07:00
Stephanie Wang aa06c3b15a Eager eviction even when object pinning is disabled (#8561)
* Eager eviction even when object pinning is disabled, add regression test

* Make test more robust

* lint
2020-06-02 11:48:03 -07:00
Edward Oakes ae312af435 Remove accidental passes in rllib, tune (#8742) 2020-06-02 12:29:17 -05:00
Edward Oakes 57bf0e43f0 fix docstring (#8736) 2020-06-02 08:55:20 -07:00
Lingxuan Zuo 4cbbc15ca7 [GCS] Global state accessor from node resource table (#8658) 2020-06-02 14:01:00 +08:00
Alec Brickner 207ab44129 Raise major version limit for msgpack (#8466) 2020-06-01 20:00:36 -07:00
Alex Wu a2ec282033 [Doc] Dataset lint fix (#8719) 2020-06-01 19:43:06 -07:00
Simon Mo 4cef1ee591 [Serve] Cleanup Router Implementation (#8718) 2020-06-01 19:21:28 -07:00
Alex Wu dcf58a43dc [SGD] Dataset API (#7839) 2020-06-01 15:48:15 -07:00
SangBin Cho cd5a207d69 [Dashboard] Frontend Lint Fix. (#8696) 2020-06-01 11:29:01 -07:00
fangfengbin 016337d4eb Heartbeat table uses gcs pub-sub instead of redis accessor (#8655) 2020-05-30 23:17:25 +08:00
Siyuan (Ryans) Zhuang ebea5c4111 Update cloudpickle to version 1.4.1 (#8577) 2020-05-29 17:55:48 -07:00
SangBin Cho 3ee3e64de0 [Dashboard] Ray memory frontend (#8563) 2020-05-29 19:02:09 -05:00
SangBin Cho 1115231e7c [Test] Fix test_reconstruction OOM error (#8636) 2020-05-29 18:56:19 -05:00
Edward Oakes 5bec951ece [docs] [serve] Deployment as a service on k8s docs (#8663) 2020-05-29 14:53:42 -07:00
krfricke e5b6566d28 Remove blocking flag from serve.init() (#8654) 2020-05-29 13:25:35 -07:00
Thomas Desrosiers 457a66ae9c Reverts setup.py changes from 76450c8d4 (#8670) 2020-05-29 13:24:32 -07:00
Edward Oakes 30ed20405a [autoscaler] Support creating services in k8s backend (#8659) 2020-05-29 15:19:21 -05:00
Simon Mo 6b04664645 [Serve] Add Tutorial for Batch Inference (#8490) 2020-05-29 09:55:47 -07:00
fangfengbin 35eeec5647 Add C++ global state for actor table (#8501)
* add global state actors

* fix code style

* fix GcsActorManagerTest bug

* rebase master

* add jni code

* add get checkpoint id code

* add debug code

* add debug code

* change log level

* fix compile bug

* return null in jni

* fix crash bug

* change import seq

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2020-05-29 21:10:42 +08:00
Sven Mika d483ed28ba [RLlib] Fix broken tune tests in master due to framework=auto errors. (#8672) 2020-05-29 11:55:47 +02:00
Edward Oakes c64b694560 Update RaySGD test to use ray.kill instead of __ray_kill__ (#8662) 2020-05-28 22:38:05 -05:00
Patrick Ames 76450c8d47 [autoscaler] Honor separate head and worker node subnet IDs (#8374) 2020-05-28 18:16:46 -07:00
Ian Rodney 99cc2e28b3 [Core] Fix test_cancel (#8644) 2020-05-28 09:40:37 -07:00
Hao Chen 08fee00bc8 Increase raylet client connect timeout to fix test_debug_tools (#8605) 2020-05-28 20:57:30 +08:00
Lingxuan Zuo e594524ed3 [GCS] global state query node info table from GCS. (#8498) 2020-05-28 16:39:13 +08:00
Ujval Misra e958d261b6 Fix ray.available_resources bug (#8537) 2020-05-27 17:55:08 -07:00
Simon Mo 38399c9885 [Hotfix] [Serve] Disable Deployment Tutorial Test (#8641) 2020-05-27 10:40:40 -07:00
Bill Chambers fadd47e44e [docs] Ray Serve Documentation Overhaul (#8524) 2020-05-27 11:03:28 -05:00
fyrestone b0bb0584fb Fix fix_test_actor_method_metadata_cache (#8633) 2020-05-27 15:49:40 +08:00
mehrdadn 79a4eac48c Make more tests run on Windows (#8553) 2020-05-26 18:43:34 -05:00
Edward Oakes 137519e19d [serve] Remove start_server flag (#8620) 2020-05-26 14:34:18 -05:00
Amog Kamsetty ae2e1f0883 [Parallel Iterators] Batching + Pipelining optimizations (#7931)
* batching + get_shard pipelining

* duplicate fix

* formatting

* adding performance benchmark

* minor changes

* turn batching off by default
2020-05-26 00:37:57 -07:00
fyrestone f39760a4d3 Use uuid4() for actor creation function id hash (#8589) 2020-05-26 15:20:03 +08:00
fangfengbin 765d470c40 Add gcs object manager (#8298) 2020-05-25 17:21:35 +08:00
Edward Oakes 860eb6f13a Update named actor API (#8559) 2020-05-24 20:08:03 -05:00