204 Commits

Author SHA1 Message Date
Amog Kamsetty 20acc3b05e Revert "Inline small objects in GetObjectStatus response. (#13309)" (#13615)
This reverts commit a82fa80f7b.
2021-01-21 16:10:34 -08:00
Clark Zinzow a82fa80f7b Inline small objects in GetObjectStatus response. (#13309) 2021-01-21 09:15:18 -08:00
Clark Zinzow 9a658b568f [Core] Ownership-based Object Directory: Consolidate location table and reference table. (#13220)
* Added owned object reference before Plasma put on Create() + Seal() path.

* Consolidated location table and reference table in reference counter.

* Restore type in definition.

* Clean up owned reference on failed Seal().

* Added RemoveOwnedObject test for reference counter.

* Guard against ref going out of scope before location RPCs.

* Add 'owner must have ref in scope' precondition to documentation for object location methods.

* Move to separate Create() + Seal() methods for existing objects.

* Clearer distinction between Create() and Seal() methods.

* Make it clear that references will normally be cleaned up by reference counting.
2021-01-14 13:48:10 -08:00
Hao Chen 77cd0d5a21 Fix a crash problem caused by GetActorHandle in ActorManager (#13164) 2021-01-08 12:11:08 +08:00
Barak Michener c4e273920f [ray_client]: Insert decorators into the real ray module to allow for client mode (#13031) 2020-12-22 22:51:45 -08:00
Kai Yang 5a6801dde7 [Core] Remove delete_creating_tasks (#12962) 2020-12-22 00:01:27 +08:00
SangBin Cho 9d939e6674 [Object Spilling] Implement level triggered logic to make streaming shuffle work + additional cleanup (#12773) 2020-12-18 19:31:14 -08:00
Yi Cheng 40032541dc [core] Introduce fetch_local to ray.wait (#12526) 2020-12-16 23:44:28 -08:00
fangfengbin 91878d18b5 [PlacementGroup]Fix placement group wait api disorder bug (#12827)
* [PlacementGroup]Fix placement group wait api disorder bug

* fix review comment

* fix review comment

* fix review comment

* fix review comments

* increase num_heartbeats_timeout

Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2020-12-16 18:45:53 +08:00
Edward Oakes 03d869d51c Hold GIL while submitting (actor) tasks (#12803) 2020-12-11 21:47:16 -06:00
Simon Mo 68d7fa2137 Fix exit_actor in asyncio mode (#12693) 2020-12-11 09:35:17 -08:00
Philipp Moritz 343b479ae2 [TEST] Fix Ray windows build for debugger (#12671)
* Fix Ray windows build for debugger

* update
2020-12-08 18:12:48 -08:00
Philipp Moritz 73a1a232b9 Ray debugger stepping between tasks (#12075) 2020-12-06 21:50:18 -08:00
fangfengbin 260b07cf0c [PlacementGroup]Add PlacementGroup wait java api (#12499)
* add part code

* add part code

* add part code

* add part code

* fix review comments

* fix compile bug

* fix compile bug

* fix review comments

* fix review comments

* fix code style

* add part code

* fix review comments

* fix review comments

* fix code style

* rebase master

* fix bug

* fix lint error

* fix compile bug

* fix newline issue

Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2020-12-05 16:40:04 +08:00
SangBin Cho 0e892908f7 [Object Spilling] Delete spilled objects when references are gone out of scope. (#12341) 2020-12-01 13:10:39 -08:00
Simon Mo ef1b0c13c3 Async Future Throws RayError as well (#12419) 2020-12-01 13:07:43 -08:00
SangBin Cho f6f3cc9af1 [Core]Remove checkpoint table (#12235)
* Delete an actor entry from node manager.

* Remove checkpoint table

* remove checkpoint interface

* remove checkpoint interface

* fix ExitActorTest

Co-authored-by: chaokunyang <shawn.ck.yang@gmail.com>
2020-12-01 08:58:36 -08:00
Philipp Moritz cf73ccddae Allow more fields for object metadata (#12484) 2020-11-29 21:50:18 -08:00
Ian Rodney 679492a235 [serve] Use Long Polling in Backend Worker (#12093) 2020-11-25 12:11:38 -08:00
SangBin Cho 2e4e285ef0 [Object Spilling] Fusion small objects (#12087) 2020-11-25 10:13:32 -08:00
Ian Rodney e086ddc18f [core] Add Recursive task cancelation (#11923) 2020-11-18 15:18:40 -08:00
Simon Mo c476037c97 [Core] Async API should raise on all RayError (#12043)
Before this PR we raised only RayTaskError, which means errors
like RayActorError (actor died) were not propagated and thrown at
`await object_ref`. This PR fixes that.
2020-11-17 17:20:30 -08:00
Stephanie Wang c49554fb7a Abstract plasma store creation request queue (#12039) 2020-11-16 17:09:15 -08:00
SangBin Cho f56d7c1a76 [Logging] Remove per worker job log file / support worker log rotation (#11927)
* In progress.

* MVP done.

* In Progress.

* Remove unnecessary code.

* Fix some issues.

* Fix test failures.

* Addressed code review + fix object spilling test failure.
2020-11-16 11:29:43 -08:00
SangBin Cho f80d812799 [Object Spilling] Introduce SpillWorker & RestoreWorker Pool to avoid IO worker deadlock. (#11885) 2020-11-11 18:20:14 -08:00
Siyuan (Ryans) Zhuang b8dda0e3d0 [Serialization] Fix buffer alignment issues (#11888)
* fix buffer alignment issues

* remove unused fields

* aligned memory allocation

* windows compat

* license. fix compiler warnings

* fix compilation error

* reinterpret_cast
2020-11-10 23:44:16 -08:00
Philipp Moritz 39ce0eadbe Ray PDB support (#11739) 2020-11-03 09:49:23 -08:00
architkulkarni 4175569d96 [Core] Add option to override environment variables for tasks and actors (#11619) 2020-10-29 14:22:44 -05:00
Eric Liang e8c77e2847 Remove memory quota enforcement from actors (#11480)
* wip

* fix

* deprecate
2020-10-21 14:29:03 -07:00
Ameer Haj Ali a10e36ca04 Make the logging of gc.collect() freed refs appear in DEBUG not INFO (#11353) 2020-10-14 13:14:35 -07:00
Simon Mo 0d09a17c64 Skip set_result if the future is done (#11256) 2020-10-11 22:33:58 -07:00
Stephanie Wang ada58abcd9 [Object spilling] Update object directory and reload spilled objects automatically (#11021)
* Fix pytest...

* Release objects that have been spilled

* GCS object table interface refactor

* Add spilled URL to object location info

* refactor to include spilled URL in notifications

* improve tests

* Add spilled URL to object directory results

* Remove force restore call

* Merge spilled URL and location

* fix

* CI

* build

* osx

* Fix multitenancy issues

* Skip windows tests
2020-10-02 15:52:42 -07:00
Siyuan (Ryans) Zhuang f0dba6bd2b Code cleanup about python3 asyncio compat (#11134)
* cleanup python3 compat and others
2020-09-30 14:22:25 -07:00
SangBin Cho 1e39c40370 [Placement Group] Capture child tasks by default. (#11025)
* In progress.

* Finished up.

* Improve comment.

* Addressed code review.

* Fix test failure.

* Fix ci failures.

* Fix CI issues.
2020-09-27 19:33:00 -07:00
DK.Pino db7097fb1f [Refactor] Rename ClientId to NodeId (#10992)
* rename ClientId to NodeId

* format lint

* format lint

* fix conflicts

* rename new ClientId to NodeId

* update lint

* make same version of clang-format with travis ci
2020-09-27 10:24:21 -07:00
SangBin Cho 5e6b887f2d [Placement Group] Capture Child Task Part 1 (#10968)
* In progress.

* In progress.

* Done.

* Addressed code review.

* Increase timeout to make a test less flaky.

* Addressed code review.

* Addressed code review.
2020-09-24 09:02:03 -07:00
fyrestone 50784e2496 [Dashboard] Dashboard node grouping (#10528)
* Add RAY_NODE_ID environment var to agent

* Node related data uses node id as key

* ray.init() returns node id; Pass test_reporter.py

* Fix lint & CI

* Fix comments

* Minor fixes

* Fix CI

* Add const to ClientID in AgentManager::Options

* Use fstring

* Add comments

* Fix lint

* Add test_multi_nodes_info

Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-09-16 10:17:29 -07:00
Clark Zinzow 0c0b0d0a73 [Core] Added support for submission-time task names. (#10449)
* Added support for submission-time task names.

* Suggestions from code review: add missing consts

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Add num_returns arg to actor method options docstring example.

* Add process name line and proctitle assertion to submission-time task name section of advanced docs.

* Add submission-time task name --> proctitle test for Python worker.

* Added Python actor options tests for num_returns and name.

* Added Java test for submission-time task names.

* Add dashboard image to task name docs section.

* Move to fstrings.

Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-09-03 11:45:24 -07:00
fyrestone b04222dbd9 [xlang] Cross language serialization for ActorHandle (#10335) 2020-09-02 10:11:53 +08:00
Stephanie Wang 9a31166050 Option to disable profiling and task timeline (#10414) 2020-08-29 11:35:22 -07:00
Eric Liang 2a204260a8 [api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
Eric Liang 519354a39a [api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
fyrestone 08adbb371f Cross language exception (#10023) 2020-08-26 10:46:05 +08:00
Max Fitton 17f801dc69 Make get_py_stack return more stack frames (#9512) 2020-08-21 13:02:12 -05:00
Stephanie Wang 85e57a7a98 [Object spilling] Look up the location of the primary raylet from the owner's metadata (#10197)
* Get the primary copy from the owner, python test, some node manager fixes

* fixes and todo

* update

* lint

* fix build
2020-08-20 14:46:59 -07:00
fangfengbin a462ae2747 [Placement Group]Add strict spread strategy (#10174)
* support STRICT_SPREAD strategy

* fix review comments

* rebase master

* fix lint error

* fix lint error

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-20 10:18:58 -07:00
SangBin Cho 263df6163c [Placement Group] Placement group remove api part 1 (#10063)
* Added basic rpc calls.

* fix issues.

* Fix the gcs server not getting request issue.

* In Progress.

* Basic logic done. Tests are required.

* In progress.

* In progress in refactoring context.

* Revert "In progress in refactoring context."

This reverts commit 38236256cf1306c60dd203e75d45ceb4509c8106.

* Working now.

* Python test works.

* Lint.

* Addressed code review.

* Addressed code review.

* Lint.

* Added unit tests.

* Done, but one of the unit tests fails

* Addressed code review.

* Addressed the last code review.

* Fix the wrong test case.
2020-08-18 12:44:00 -07:00
fangfengbin edd783bc32 [Placement Group]Add soft pack strategy (#10099) 2020-08-17 12:01:34 +08:00
Siyuan (Ryans) Zhuang 17ca1d8ff4 [Core] Object spilling prototype (#9818) 2020-08-14 15:39:10 -07:00
Zhuohan Li a6fed4820e [Core] Preliminary implementation of ownership-based object directory (#9735) 2020-08-11 15:04:13 -07:00