2808 Commits

Author SHA1 Message Date
Edward Oakes cbd9632f3a Fix wait timeout logic (#10199) 2020-08-25 22:41:39 -05:00
fyrestone 08adbb371f Cross language exception (#10023) 2020-08-26 10:46:05 +08:00
Robert Nishihara 79eefbf357 Better checking that ray.init() has been called. (#10261) 2020-08-25 17:13:11 -07:00
Stephanie Wang d4537ac1ce [core] Try to schedule tasks locally before spilling over to remote nodes (#10302)
* Regression test

* Spillback

* Remove check for actor tasks
2020-08-25 15:01:59 -07:00
Richard Liaw 146d91385c [tune] custom trial directory name (#10214) 2020-08-25 12:52:54 -07:00
SangBin Cho 3b3ca96a4e [Placement Group] Wait (#10259)
* Initial progress done.

* Fix wrong test.

* Improve tests.

* Update code.

* Addressed code review and merge conflict.

* Addressed code review.
2020-08-24 20:14:48 -07:00
Richard Liaw 6dc22a6d68 [autoscaler] Fix logging regression (#10280) 2020-08-24 14:25:12 -07:00
fyrestone 05c103af94 [Dashboard] Start the new dashboard (#10131)
* Use new dashboard if environment var RAY_USE_NEW_DASHBOARD exists; new dashboard startup

* Make fake client/build/static directory for dashboard

* Add test_dashboard.py for new dashboard

* Travis CI enable new dashboard test

* Update new dashboard

* Agent manager service

* Add agent manager

* Register agent to agent manager

* Add a new line to the end of agent_manager.cc

* Fix merge; Fix lint

* Update dashboard/agent.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Update dashboard/head.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Fix bug

* Add tests for dashboard

* Fix

* Remove const from Process::Kill() & Fix bugs

* Revert error check of execute_after

* Raise exception from DashboardAgent.run

* Add more tests.

* Fix compile on Linux

* Use dict comprehension instead of dict(generator)

* Fix lint

* Fix windows compile

* Fix lint

* Test Windows CI

* Revert "Test Windows CI"

This reverts commit 945e01051ec95cff5fcc1c0bc37045b46e7ad9a6.

* Fix ParseWindowsCommandLine bug

* Update src/ray/util/util.cc

Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
2020-08-24 13:24:23 -07:00
Max Fitton 832f5cdccb [Dashboard] Memory View Group by Stack Trace and UI Overhaul (#10227) 2020-08-24 14:54:42 -05:00
PidgeyBE a82124d304 Update memory_monitor.py (#9212) 2020-08-24 10:29:01 -07:00
Eric Liang 4761eacc3e [autoscaler] Also account for head node resources in multi node type autoscaling (#10230) 2020-08-24 10:26:22 -07:00
Ian Rodney f051c2852e [docker] docker cp correctly into container (#10253) 2020-08-24 09:18:34 -07:00
SangBin Cho 1f54acd274 [Tech Debt] Use f-string for python/ray/*.py (#10268)
* In progress.

* Done with critical path.

* Modified cluster_utils.py and log_monitor.py

* Addressed code review.
2020-08-23 22:01:31 -07:00
fangfengbin b61a79efd7 [Placement Group]Fix SigSegv bug (#10262)
* fix SigSegv bug

* fix review comments

* fix ut bug

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-23 11:33:40 -07:00
Richard Liaw 73c4246332 [Core] fix-bad-stack (#10266) 2020-08-23 10:33:29 -07:00
Yu Shan 5264f888e4 fix iterable dataset (issue 9899) (#9952) 2020-08-22 19:40:38 -07:00
Maksim Smolin 245c0a9e43 [cli] Tests (#10057)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-22 13:29:10 -07:00
fangfengbin 8362029dcf [Placement Group]Fix CrossLanguageInvocationTest failure (#10257)
* add part code

* rebase master

* add part code

* rebase master

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-22 12:12:00 -07:00
Richard Liaw 6bd5458bef [tune] cleanup error messaging/diagnose_serialization helper (#10210) 2020-08-22 11:50:49 -07:00
Richard Liaw 24ee496b89 [tune] support rerunning failed trials (#10060) 2020-08-22 09:59:05 -07:00
krfricke c31876002d [tune/rllib] made wandb compatible with rllib trainables (#10252) 2020-08-21 17:25:52 -07:00
Richard Liaw f87669372d [cli] enable log-new-style by default (#10213) 2020-08-21 15:21:43 -07:00
fangfengbin 36c6c4b298 [Placement group] Check if placement group bundle index is valid (#10194)
* add part code

* rebase master

* add java testcase

* fix review comments

* fix lint error

* rebase master

* fix lint error

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-21 11:04:56 -07:00
Max Fitton 17f801dc69 Make get_py_stack return more stack frames (#9512) 2020-08-21 13:02:12 -05:00
SangBin Cho 92664249e8 Partially Use f string (#10218)
* flynt. trial 1.

* Trial 1.

* Addressed code review.
2020-08-20 18:21:16 -07:00
architkulkarni 07cd815e5a [Serve] Type hints for API (#10205) 2020-08-20 15:33:04 -07:00
Stephanie Wang 85e57a7a98 [Object spilling] Look up the location of the primary raylet from the owner's metadata (#10197)
* Get the primary copy from the owner, python test, some node manager fixes

* fixes and todo

* update

* lint

* fix build
2020-08-20 14:46:59 -07:00
Eric Liang 0baf992a4f [hotfix] [autoscaler] Address remaining comments on renaming instance => node (#10229)
* more renaming

* fix import
2020-08-20 14:37:41 -07:00
Eric Liang 85a6876119 [autoscaler] Rename instance_type => node_type, TAG_RAY_INSTANCE_TYPE => TAG_RAY_USER_NODE_TYPE (#10207) 2020-08-20 12:27:11 -07:00
Amog Kamsetty 8d466749ee [Tune] PBT hyperparam_mutations fix (#10217) 2020-08-20 12:02:29 -07:00
fangfengbin a462ae2747 [Placement Group]Add strict spread strategy (#10174)
* support STRICT_SPREAD strategy

* fix review comments

* rebase master

* fix lint error

* fix lint error

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-20 10:18:58 -07:00
SangBin Cho 224933b5e4 [Placement Group] Remove API part 2 (#10215)
* Initial progress done.

* Fix mistake.

* Addressed code review.

* Fix cpp build issue.

* Addressed code review.
2020-08-20 09:50:13 -07:00
Eric Liang 538cb802d5 [autoscaler] Refactor multi node type autoscaler config (#10190) 2020-08-19 20:46:00 -07:00
Richard Liaw 2fd59de05d [autoscaler] hotfix - swallowed error for missing yaml (#10212) 2020-08-19 20:02:56 -07:00
Amog Kamsetty 9ff687c093 [SGD][Docs] docs for training/ validation results (#10181) 2020-08-19 17:22:28 -07:00
Amog Kamsetty 44e254788a [Tune] PBT hyperparam_mutations improvements (#10170) 2020-08-19 16:50:19 -07:00
Alex Wu b70dce0d02 [autoscaler] Hotfix bad None check (#10196) 2020-08-19 13:27:20 -07:00
fangfengbin 9734dbca3e [Placement Group]Reschedule bundles when the node of bundles is dead (#10021) 2020-08-19 13:24:42 -07:00
Edward Oakes 888f0a2c60 [serve] Use ray.experimental.metrics (#10185) 2020-08-19 13:03:22 -05:00
architkulkarni de46464aa3 [Experimental] Queue: replace polling with async actor (#10120) 2020-08-19 11:55:42 -05:00
Sven Mika 2cbe29a7fa [RLlib] Curiosity minor fixes, do-overs, and testing. (#10143) 2020-08-19 17:49:50 +02:00
Max Fitton 9c5e5a9757 [Dashboard] Fix and Recommit Reverted Group by Actor Class PR (#10186)
* Revert "Revert "[Dashboard] Group by Actor Class (#10147)" (#10180)"

This reverts commit e4d2ca620a.

* Fix metrics test to agree with the new logical view API

* lint2

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-18 20:55:58 -07:00
Edward Oakes ba0f531da0 [serve] Remove SLO code and blist dependency (#10075) 2020-08-18 17:52:36 -05:00
SangBin Cho 263df6163c [Placement Group] Placement group remove api part 1 (#10063)
* Added basic rpc calls.

* fix issues.

* Fix the gcs server not getting request issue.

* In Progress.

* Basic logic done. Tests are required.

* In progress.

* In progress in refactoring context.

* Revert "In progress in refactoring context."

This reverts commit 38236256cf1306c60dd203e75d45ceb4509c8106.

* Working now.

* Python test works.

* Lint.

* Addressed code review.

* Addressed code review.

* Lint.

* Added unit tests.

* Done, but one of unit tests fail

* Addressed code review.

* Addressed the last code review.

* Fix the wrong test case.
2020-08-18 12:44:00 -07:00
Lixin Wei d188becec2 [Python Worker] Add pid to log file name (#10149)
Co-authored-by: Alex Wu <alex@anyscale.io>
2020-08-18 11:48:48 -07:00
Simon Mo bedc2c24c8 Export Metrics in OpenCensus Protobuf Format (#10080) 2020-08-18 11:32:42 -07:00
Max Fitton 8d06e30a06 [Dashboard] Fix Ray Dashboard command error messages (#10050) 2020-08-18 13:30:51 -05:00
Max Fitton e4d2ca620a Revert "[Dashboard] Group by Actor Class (#10147)" (#10180)
This reverts commit 71f6f83f1d.
2020-08-18 11:27:46 -07:00
Tomasz Wrona aff7f19360 [tune] Added logger_config field (#8521)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 11:10:22 -07:00
Richard Liaw eacf7dddba update-code (#10106) 2020-08-18 09:28:32 -07:00