6189 Commits

Author SHA1 Message Date
Kai Fricke 1a1ff28d18 [tune] allow tune search spaces to be passed to search algorithms (#11503) 2020-10-26 12:33:13 -07:00
Richard Liaw 4ad8af9b0d [tune] More PTL example cleanup (#11585) 2020-10-26 12:26:14 -07:00
Richard Liaw b02e61f672 [minor] fix up docs (#11596)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-26 12:19:03 -07:00
Ian Rodney 2da6ad2176 [core] Better error message for named actor not found (#11604) 2020-10-26 09:46:02 -07:00
Tao Wang 0fbee4da0c [GCS] Remove unused ReportBatchHeartbeat/SubscribeHeartbeat (#11567)
* Remove unused message ReportBatchHeartbeat

* add up
2020-10-25 21:06:28 -07:00
Sumanth Ratna 11f1bbf03c [tune] use isinstance instead of type for TBXLogger (#11595) 2020-10-25 16:12:44 -07:00
Richard Liaw 1b357533b1 [tune] Try to enable PTL, SKlearn tests (#11542) 2020-10-24 01:08:46 -07:00
Eric Liang d3ee83205b Remove crashing assert in actor creation for old scheduler (#11577)
* remove assert

* warn log
2020-10-24 00:05:26 -07:00
Siyuan (Ryans) Zhuang 5ad5cb61ca Remove outdated numpy serializer (#11587) 2020-10-23 22:58:05 -07:00
Raoul Khouri c3c72db69b [tune] fixed validation for search metrics (#11583)
* fixed validation for search metrics

* formatting

* made error report better

* if only one metric is missing extract it from list

* any can take a generator
2020-10-23 17:04:21 -07:00
Clark Zinzow 0979589c7c [dask-on-ray] Convert tuple of object refs to list before ray.get() call. (#11582) 2020-10-23 16:39:22 -07:00
Ian Rodney d3405e74da [autoscaler] SDK fixes (#11517)
* [autoscaler] SDK Fxies

* add docs

* remove all_nodes
2020-10-23 14:09:47 -07:00
Ian Rodney aef96d17bf [yaml] HotFix for correct example full (#11584) 2020-10-23 15:55:07 -05:00
Max Fitton caf3b04b27 [Dashboard] Turn on new dashboard by default pt 2 (#11510) 2020-10-23 15:52:14 -05:00
Kai Fricke 8ee4f7eca3 [tune] fix pbt ptl example (#11573)
* [tune] fix pbt ptl example

* wider smoke test
2020-10-23 12:42:13 -07:00
Ian Rodney 7a0184e081 [docker] Push to DockerHub in CI (#11442) 2020-10-23 12:02:15 -07:00
architkulkarni 1ce0c4965b [Serve] Update front page of serve doc (#11421) 2020-10-23 12:01:04 -07:00
DK.Pino 9f804ade5f [Placement Group]Add get all placement group api (#11460)
* add get all interface for placement group

* add get all interface for placement group

* make it work

* fix lint

* fix lint

* fix comment

* add cpp test

* fix python lint
2020-10-23 11:46:48 -07:00
Richard Liaw e7aa6441b7 [tune] a tiny ptl example (#11497) 2020-10-22 18:50:34 -07:00
Barak Michener 4348ecf850 Clean up release tests (#11420) 2020-10-22 17:04:41 -07:00
Gekho457 2d1f52c21c [autoscaler] Removed .cleanup() from NodeProvider and commands.py (#11543) 2020-10-22 14:46:49 -07:00
dHannasch 47531ac7e6 Resolve Issue #11556 by changing the docs to reference _temp_dir. (#11562) 2020-10-22 16:24:46 -05:00
Frank Gu 73fa94731f [tune] Add HDFS as Cloud Sync Client (#11524) 2020-10-22 14:12:51 -07:00
Eric Liang 083737c63c Deprecate rsync to all nodes (#11563) 2020-10-22 13:45:42 -07:00
Amog Kamsetty d87c186721 [RaySGD] Docs for SGD+Tune usage (#11479) 2020-10-22 13:32:27 -07:00
Kingsley Kuan d1dd5d578e [RLlib] Fix PyTorch A3C / A2C loss function using mixed reduced sum / mean (#11449) 2020-10-22 12:39:34 -07:00
Allen cf2ee94e0c [Autoscaler] Allow users to set the names for security groups created by ray (#11405) 2020-10-22 12:28:59 -07:00
Simon Mo 7111a424af [Serve] Add regression test for #11437 (#11539) 2020-10-22 10:45:18 -07:00
Alex Wu d1182b827a [Autoscaler] Do not count unmanaged nodes in load metrics (#11458)
* fixedd

* lint

* fixed other test case

* .

Co-authored-by: Alex Wu <alex@anyscale.com>
2020-10-21 22:14:21 -07:00
Max Fitton 44fb60b4dd [hotfix] Pin node version (fix linux wheel build) (#11532)
Co-authored-by: Max Fitton <max@semprehealth.com>
2020-10-21 19:10:09 -07:00
Richard Liaw af0fde4efd [hotfix] disable sklearn again (#11541)
This reverts commit 9522918fa2.
2020-10-21 19:04:48 -07:00
Gekho457 155687e0c3 [autoscaler/AWS] Updated AWS Node Provider threading logic (#11422) 2020-10-21 18:42:38 -07:00
Philsik Chang ede9347127 [rllib] Add torch_distributed_backend flag for DDPPO (#11362) (#11425) 2020-10-21 18:30:42 -07:00
Richard Liaw a4b418d30c [docs] update cloud docs (#11262)
* update-cloud-docs

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/cluster/config.rst

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>

* fix

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

* fix

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2020-10-21 16:37:26 -07:00
Alex Wu e02f4c0157 [New scheduler] queue by shape (#11381) 2020-10-21 15:56:06 -07:00
Eric Liang 920e4b2ef8 Try to raise ulimit for file descriptors to max allowed; warn if ulimit is still too low (#11515) 2020-10-21 14:29:43 -07:00
Eric Liang e8c77e2847 Remove memory quota enforcement from actors (#11480)
* wip

* fix

* deprecate
2020-10-21 14:29:03 -07:00
Alan Guo 8c82369cad [autoscaler] Add rsync_exclude and rsync_filter options to cluster config (#11512) 2020-10-21 14:28:33 -07:00
Richard Liaw 9522918fa2 [tune] reenable sklearn (#11192) 2020-10-21 14:21:38 -07:00
Edward Oakes 5d7f271e7d Add --worker-port-list option to ray start (#11481) 2020-10-21 14:46:45 -05:00
Tao Wang da2d3fbcfc Remove unused field in heartbeat message (#11459) 2020-10-21 10:49:16 -07:00
Kai Yang 078a22d676 [Core] Allow creating tasks/actors in a detached actor when driver has exited (#11493)
* Allow creating tasks/actors in a detached actor when driver has exited

* lint

* Address comment
2020-10-21 10:45:29 -07:00
Xuxue1 7200ddb72d Fix code_search_path failed in java (#11406)
Co-authored-by: xujiqiang eigen <xujiqiang@hpc1.ipa.aidigger.com>
2020-10-21 18:10:48 +08:00
Servon aeea168940 [tune] Update for ZOOpt (#11491)
Co-authored-by: Servon <zewen.li@polixir.ai>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-20 23:56:20 -07:00
fangfengbin a075e37695 [GCS]Fix TestActorTableResubscribe bug (#11463)
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-10-20 22:32:41 -07:00
Allen 2fc3237045 [Dashboard] Update dashboard port checking so that we can instantly reuse the dashboard port (#11487)
Co-authored-by: Allen Yin <allenyin@anyscale.io>
2020-10-20 19:19:50 -07:00
Kai Fricke 6d11fb8bc6 [tune] validate function callable in tune.with_parameters() (#11504) 2020-10-20 16:03:24 -07:00
Simon Mo 2c5cb95b42 [Serve] Get ServeHandle on the same node (#11477) 2020-10-20 10:44:23 -07:00
Simon Mo ef96793d3f [Serve] [Doc] Clarify custom method call (#11485) 2020-10-20 10:41:30 -07:00
Sumanth Ratna e663b524ae Enable highlighting (#11500) 2020-10-20 09:34:39 -07:00