2758 Commits

Author SHA1 Message Date
Alex Wu 0b5d5ec17d [Autoscaler] Pass custom resources to "ray start" multi instance autoscaling (#9986) 2020-08-17 22:34:07 -07:00
Max Fitton 71f6f83f1d [Dashboard] Group by Actor Class (#10147)
* Update dashboard API to be able to pass actors in a flat structure in addition to nested.

* Working on adapting front-end to display UI w/ new actor class grouping

* wip

* Group logical view by actor class.

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-17 22:03:50 -07:00
SangBin Cho 8cedcdf2df [Tests] Fix test output (#10162)
* Trial 1.

* Fix.

* Revert "Fix."

This reverts commit 26ad970f753d581f340857be30054d6954df8255.

* Revert "Trial 1."

This reverts commit 63f7aca5162bb40f2d5e28fb9647598cbde7ad41.

* Another fix try.

* Last trial.

* Remove unnecessary comment.

* Small fix.

* Use better units.

* Lint.
2020-08-17 21:24:20 -07:00
Robert Nishihara d45418936c Skip failing tests on Windows. (#10139) 2020-08-17 18:56:17 -07:00
Amog Kamsetty d3bac298d5 [Tune] PBT Error if metric not available (#9957) 2020-08-17 16:12:14 -07:00
Alex Wu 4b14bf85e4 [Autoscaler] Resource demand vector (heartbeat -> autoscaler plumbing) (#10127) 2020-08-17 13:57:15 -07:00
Ian Rodney a079f46c25 [autoscaler]/[docker] Cleanup YAMLs & Use RAY docker images (#10108) 2020-08-17 09:49:28 -07:00
SangBin Cho 053188dfbe [Placement Group] Support Placement Group state table. (#10090)
* Done.

* Addressed code review.

* Linting.

* Fix lint.

* Fix lint.

* Fix a test.

* Lint.

* Add a lint sleep to test.

* Fix the lint issue.

* Fixed doc build error.
2020-08-17 09:24:50 -07:00
fangfengbin edd783bc32 [Placement Group]Add soft pack strategy (#10099) 2020-08-17 12:01:34 +08:00
krfricke 8f0f7371a0 [tune] Added Kubernetes syncer and sync client (#10097)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
Philipp Moritz c7adb464e4 [autoscaler] Fix run_env='host' for initialization commands (#10137) 2020-08-15 15:25:54 -07:00
Philipp Moritz e95f0afe4c [autoscaler] Expand key path for hashing with expanduser (#10125) 2020-08-14 18:50:27 -07:00
Amog Kamsetty f87a4aa45d [Tune] Pbt Function API (#9958)
* adding function convnet example

* add unit test

* update test

* update example

* wip

* move error from experiment to tune

* wip

* Fix checkpoint deletion

* updating code

* adding smoke test

* updating pbt guide

* formatting

* fix build

* add best checkpoint analysis util

* update test

* add comments

* remove class api

* fix example

* add setup and teardown to tests

* formatting

* Update python/ray/tune/tests/test_trial_scheduler_pbt.py

Co-authored-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-14 17:52:30 -07:00
Siyuan (Ryans) Zhuang 17ca1d8ff4 [Core] Object spilling prototype (#9818) 2020-08-14 15:39:10 -07:00
Robert Nishihara 36e626e95d Revert "[Dashboard] Start the new dashboard (#9860)" (#10116)
This reverts commit 739933e5b8.
2020-08-14 14:06:57 -07:00
Philipp Moritz 6b53df9599 Hash contents of SSH key instead of key path (#10103) 2020-08-14 00:10:31 -07:00
Lixin Wei 0fe5722744 [Core] Add cached memory to unused memory in Linux/BSD (#10084)
* add cached memory to available memory

* format

* bug fixed

* bug fixed

* fixed

* lint
2020-08-13 23:47:52 -07:00
Ian Rodney a252aa29da [docker] Wrap more internal items with run_env="host" (#10078) 2020-08-13 20:35:06 -07:00
SangBin Cho 55fe7f65a5 [Tests] Make test_output debugging easy (#10091)
* Fix.

* Fix.
2020-08-13 18:45:26 -07:00
Eric Liang c9f13b0833 [Placement Groups] Support CUDA_VISIBLE_DEVICES (#10053) 2020-08-13 18:00:04 -07:00
Simon Mo 01f38bc5d1 CoreWorker correctly push metrics to agent (#10031) 2020-08-13 16:44:53 -07:00
Amog Kamsetty 5898248645 [Tune] Update PBT Transformer Test (#10081) 2020-08-13 12:23:03 -07:00
SangBin Cho 8b689224a5 [Tests] Make test_multi_driver light. (#10086) 2020-08-13 10:00:42 -07:00
architkulkarni fe5fcb6b9c [serve] backend and endpoint validation (#9954) 2020-08-13 11:56:50 -05:00
Richard Liaw 7a56c3b71a [cli] create_or_update_cluster fix (#10085) 2020-08-13 00:54:45 -07:00
SangBin Cho 86b1db3f11 [Stats] Make metrics report time configurable (#10036)
* Done.

* Lint.

* Address code review.

* Address code review.

* Remove wrong commit.

* Fix a test error.
2020-08-13 00:30:24 -07:00
fyrestone 739933e5b8 [Dashboard] Start the new dashboard (#9860) 2020-08-13 11:01:46 +08:00
krfricke 16486a8df3 [tune] Add OptunaSearcher wrapper around Optuna samplers (#10044)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-12 16:13:22 -07:00
Richard Liaw 7a8b922841 [tune] hotfix log_once (#10069) 2020-08-12 12:40:22 -07:00
SangBin Cho 2cb79632e4 Revert "[Core] Add cached memory to available memory (#10020)" (#10064)
This reverts commit 71d2bde458.
2020-08-12 11:24:16 -05:00
Richard Liaw 5560272556 [cli] install nightly wheels via ray install-nightly (#10054) 2020-08-11 20:08:22 -07:00
Simon Mo f1ede1099f [Hotfix] Pin opencv-python-headless==4.3.0.36 (#10049) 2020-08-11 15:58:18 -07:00
Ameer Haj Ali 82cdcff898 Removing kwargs & SSHOptions args from command runners (#10014) 2020-08-11 15:09:49 -07:00
Lixin Wei 71d2bde458 [Core] Add cached memory to available memory (#10020)
* add cached memory to available memory

* format

* bug fixed
2020-08-11 15:07:00 -07:00
Zhuohan Li a6fed4820e [Core] Preliminary implementation of ownership-based object directory (#9735) 2020-08-11 15:04:13 -07:00
krfricke 221fdc0774 [tune] fix flaky PBT replay test (#10047)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-11 14:17:31 -07:00
yncxcw 32cd94b750 [Core] Do not convert gpu id to int (#9744)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Maksim Smolin d6226b80bb [cli] CliLogger typing (#10027) 2020-08-11 12:00:57 -07:00
Ian Rodney 98126a84af [docker] Remove port flags (#9992) 2020-08-11 11:43:56 -07:00
Simon Mo 4c52adddfa [Core] Type check ObjectRef (#9856)
* Type check ObjectRef

* Bug fix

* Port typing tests to bazel test
2020-08-11 10:38:29 -07:00
Richard Liaw 98df612010 [tune] option to raise on error (#10030) 2020-08-11 09:59:04 -07:00
Maksim Smolin 40b8e35d61 [cli] New logging for the rest of the ray commands (#9984)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 09:58:23 -07:00
Kai Yang 3bc17fa62a [Core] Multi-tenancy: Pass env variables from job config to worker processes (#10022) 2020-08-10 14:31:37 -07:00
Amog Kamsetty 856d4a0533 [Tune] Better error when using checkpoint_freq (#9998) 2020-08-10 13:52:46 -07:00
Richard Liaw be8e63d477 [tune] support resume for search algorithms (#9972) 2020-08-10 13:43:14 -07:00
krfricke 7301733a1f [tune] Close logfile contexts (#10026)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-10 12:40:40 -07:00
PidgeyBE 6ad2fc4831 [autoscaler] Service and Ingress per worker pod (#9359) 2020-08-10 14:13:52 -05:00
SangBin Cho 10baecb8c2 Try to fix. (#10005) 2020-08-10 10:05:43 -07:00
Maksim Smolin 0392bb7a72 [Autoscaler] Command Output Improvement (#9699)
* cross-platform prototype

* checkpoint

* Address PR comments

* format

* prepare to push

* format

* PR comments

* fixes

* fixtest

* Revert "fixtest"

This reverts commit d6f54353e1b891c784417bb8d0304c18cc5bcdd8.

* return-result

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-10 09:49:24 -07:00
Kai Yang 37821f0b4c Support unlimited JVM options (#9910) 2020-08-10 16:08:33 +08:00