Commit Graph

2885 Commits

Author SHA1 Message Date
Ian Rodney a13c83d7f0 Add WorkerCrashedError to cancel docs (#10534) 2020-09-03 13:23:04 -07:00
Clark Zinzow 0c0b0d0a73 [Core] Added support for submission-time task names. (#10449)
* Added support for submission-time task names.

* Suggestions from code review: add missing consts

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Add num_returns arg to actor method options docstring example.

* Add process name line and proctitle assertion to submission-time task name section of advanced docs.

* Add submission-time task name --> proctitle test for Python worker.

* Added Python actor options tests for num_returns and name.

* Added Java test for submission-time task names.

* Add dashboard image to task name docs section.

* Move to fstrings.

Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-09-03 11:45:24 -07:00
Edward Oakes 71274954d1 Remove unnecessary output when connecting to a cluster. (#10512) 2020-09-03 13:30:33 -05:00
Edward Oakes e4d80e1b0f fix passing sys config to start (#10514) 2020-09-03 11:18:21 -07:00
krfricke 91535e9102 [tune] Refactored Keras integration callbacks (#10509) 2020-09-03 10:16:08 -07:00
krfricke 06af62ba91 [tune] refactor tune search space (#10444)
* Added basic functionality and tests

* Feature parity with old tune search space config

* Convert Optuna search spaces

* Introduced quantized values

* Updated Optuna resolving

* Added HyperOpt search space conversion

* Convert search spaces to AxSearch

* Convert search spaces to BayesOpt

* Re-factored samplers into domain classes

* Re-added base classes

* Re-factored into list comprehensions

* Added `from_config` classmethod for config conversion

* Applied suggestions from code review

* Removed truncated normal distribution

* Set search properties in tune.run

* Added test for tune.run search properties

* Move sampler initializers to base classes

* Add tune API sampling test, fixed includes, fixed resampling bug

* Add to API docs

* Fix docs

* Update metric and mode only when set. Set default metric and mode to experiment analysis object.

* Fix experiment analysis tests

* Raise error when delimiter is used in the config keys

* Added randint/qrandint to API docs, added additional check in tune.run

* Fix tests

* Fix linting error

* Applied suggestions from code review. Re-added tune.function for the time being

* Fix sampling tests

* Fix experiment analysis tests

* Fix tests and linting error

* Removed unnecessary default_config attribute from OptunaSearch

* Revert to set AxSearch default metric

* fix-min-max

* fix

* nits

* Added function check, enhanced loguniform error message

* fix-print

* fix

* fix

* Raise if unresolved values are in config and search space is already set

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-03 09:06:13 -07:00
chaokunyang ea95e6f7cc [Java] lint java code (#10494) 2020-09-03 10:39:14 +08:00
Ian Rodney b9633a2b67 [docker] Support multiple node types (#10504) 2020-09-02 18:27:59 -07:00
SangBin Cho dc7fe1a4c5 [Placement Group] Atomic Placement Group Part 1, Basic Structure. (#10482)
* Write a test.

* Basic structure done.

* Reduce flakiness of tests.

* Addressed code review.

* Skipping tests because they are flaky for now.

* Fix linting issues.

* Increase sleep time to see lint messages.

* Lint issue fixed.
2020-09-02 18:14:46 -07:00
Ian Rodney 4324dd5929 [docker] Refactor "autoscaler" image into "-autoscaler" tag and "ray-ml" image. (#10351) 2020-09-02 13:03:35 -07:00
krfricke 57c4183724 [tune] add xgboost callbacks to integration module (#10502) 2020-09-02 11:16:09 -07:00
Vysybyl 6fa0edfbef [gcp] Update config.py for safe dir creation (#9645)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-01 21:41:44 -07:00
fyrestone b04222dbd9 [xlang] Cross language serialization for ActorHandle (#10335) 2020-09-02 10:11:53 +08:00
Simon Mo 65f17f2e14 [Serve] Refactor RequestMetadata and Query objects (#10483) 2020-09-01 18:15:31 -07:00
raoul-khour-ts 3b10b67a15 [tune] SigOpt multi-objective search + experiments (#10457) 2020-09-01 16:22:29 -07:00
Yiran Wang 2b95b613f2 [Autoscaler] Retry create_instances properly in AWSNodeProvider (#10479) 2020-09-01 16:17:11 -07:00
Alex Wu 23bbe0f36a [Autoscaler] Reload config (#10450) 2020-09-01 14:37:04 -07:00
krfricke 1dd55f4b07 [tune] remove callbacks from config in wandb logger initialization (#10441) 2020-09-01 14:26:39 -07:00
Richard Liaw 3f98a8bfcb [docs] Fix warnings for sphinx 1.8 (#10476)
* fix-build-for-sphinx18

* jnilit
2020-09-01 13:37:35 -07:00
Ian Rodney 283f4d1060 [docker] Use tmp paths for rsync and fix file_mounts on docker (#10368) 2020-09-01 13:14:35 -07:00
Ian Rodney c644650818 [docker] Run docker stop in teardown_cluster (#10407) 2020-09-01 11:25:37 -07:00
Richard Liaw 09d4a3241f [tune] Support true pooling and batched concurrency (#10352) 2020-09-01 10:33:49 -07:00
Simon Mo ddd62a177f Revert "[Serialization] Update CloudPickle to 1.6.0 (#9694)" (#10460)
This reverts commit f0c3910d59.
2020-08-31 20:41:37 -07:00
krfricke f3f698816d [tune] Added PyTorch Lightning callbacks to integrations (#10220)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-31 15:30:48 -07:00
Richard Liaw d8e7b144e4 [tune] avoid distributed error message (#10387) 2020-08-31 15:29:43 -07:00
Jan Blumenkamp 5774e1ce88 [tune/rllib/wandb] Flatten result dict so that nested result dicts are shown in W&B logger (#10429) 2020-08-31 15:28:46 -07:00
Sumanth Ratna bac8f14739 [tune] Use isinstance instead of type in assert in HyperOptSearch (#10454) 2020-08-31 15:26:55 -07:00
krfricke efa1d51aea [tune] Added external PyTorch tutorial test (#10192) 2020-08-31 15:24:31 -07:00
Zhe Zhang 4356abeb1c [Doc] Fix errors in ActorPool documentation (#10410)
* Fix errors in ActorPool documentation

1. map() and map_unordered() name
2. The print statement doesn't work

* Correctly change lines

* Address comments from pr
2020-08-31 13:57:10 -07:00
raoul-khour-ts 25f5614691 [tune] Rrk/sigopt_searcher_improvements (#10446) 2020-08-31 13:15:12 -07:00
PidgeyBE 6917efabc4 [k8s] Replace k8s service name everywhere in ingress manifest (#10445)
Co-authored-by: Pieterjan <pieterjan.soetaert@robovision.eu>
2020-08-31 13:14:40 -07:00
Robert Nishihara 0bba5485d3 [cli] Add prompt command for CLI logger. (#9897)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-31 11:03:53 -07:00
Amog Kamsetty afde3db4f0 [Tune] Synchronous Mode for PBT (#10283) 2020-08-31 00:00:47 -07:00
SangBin Cho 3e5cac80d8 [Tests] Fix Broken GCS restart test. (#10417) 2020-08-30 00:44:34 -07:00
SangBin Cho c8b14fd7e9 [Tests] Enable large test (#10391)
* Enable large tests.

* Lint.

* Fix issue.

* Skip all tests.
2020-08-30 00:44:05 -07:00
Max Fitton 63ad2e3340 [Dashboard] Fix Issue #10319 - Dashboard autoscaler crash (#10323)
* Patch error that occurred when there was an entry in the dashboard logs or errors internal data structures, and a worker was removed from the cluster. This would crash the cluster with a KeyError.

* lint

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-29 23:18:23 -07:00
Richard Liaw 8c753818ab update-scripts (#10425) 2020-08-29 23:16:25 -07:00
Siyuan (Ryans) Zhuang f0c3910d59 [Serialization] Update CloudPickle to 1.6.0 (#9694)
* update cloudpickle to 1.6.0

* fix CI timeout
2020-08-29 23:11:28 -07:00
fyrestone e9b046306a [Dashboard] Dashboard basic modules (#10303)
* Improve reporter module

* Add test_node_physical_stats to test_reporter.py

* Add test_class_method_route_table to test_dashboard.py

* Add stats_collector module for dashboard

* Subscribe actor table data

* Add log module for dashboard

* Only enable test module in some test cases

* CI run all dashboard tests

* Reduce test timeout to 10s

* Use fstring

* Remove unused code

* Remove blank line

* Fix dashboard tests

* Fix asyncio.create_task not available in py36; Fix lint

* Add format_web_url to ray.test_utils

* Update dashboard/modules/reporter/reporter_head.py

Co-authored-by: Max Fitton <mfitton@berkeley.edu>

* Add DictChangeItem type for Dict change

* Refine logger.exception

* Refine GET /api/launch_profiling

* Remove disable_test_module fixture

* Fix test_basic may fail

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
Richard Liaw cb438be146 [core] Move log_to_driver back to public (#10422) 2020-08-29 16:35:14 -07:00
Stephanie Wang 9a31166050 Option to disable profiling and task timeline (#10414) 2020-08-29 11:35:22 -07:00
Eric Liang 910d5d2550 [hotfix] Bad merge with num_return_vals (#10418) 2020-08-28 22:45:35 -07:00
Alex Wu bd92cefbf7 [Autoscaler] Move Resource Demand Scheduler Test to Small (#10399) 2020-08-28 21:26:29 -07:00
Ian Rodney d6f2b0d933 [docker] Run profiling without sudo (#10388)
* fix profiling for docker

* small fixes

* use name

* do not import pwd on windows
2020-08-28 21:25:10 -07:00
Eric Liang f6a1698bab [autoscaler] Add documentation for multi node type autoscaling (#10405) 2020-08-28 19:57:21 -07:00
Eric Liang 2a204260a8 [api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
Alex Wu b1f3c9e10e [Autoscaler] Fix resource passing bug (#10397) 2020-08-28 15:43:18 -07:00
Kishan Sagathiya 2afb54c99c Validate non-integral args to ray.remote (#10221) 2020-08-28 17:18:01 -05:00
Eric Liang 519354a39a [api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
Edward Oakes 9c25ca6f5e [hotfix] Fix test_cli.py (#10403) 2020-08-28 15:43:58 -05:00