48 Commits

Author SHA1 Message Date
Eric Liang afca6d3d87 Object store full with cyclic python references (#6114) 2019-11-08 14:08:24 -08:00
Edward Oakes ca53af4d0f Add pending task dependencies to ObjectID ref counting (#6054) 2019-11-07 18:37:10 -08:00
Edward Oakes 043d1f4094 Return RayObjects to core worker (#6052) 2019-11-04 20:27:57 -08:00
Eric Liang 8485304e83 Support concurrent Actor calls in Ray (#6053) 2019-11-04 01:14:35 -08:00
Eric Liang fb34928a2a [minor] Perf optimizations for direct actor task submission (#6044)
* merge optimizations

* fix

* fix memory err

* optimize

* fix tests

* fix serialization of method handles

* document weakref

* fix check

* bazel format

* disable on 2
2019-11-01 14:41:14 -07:00
Simon Mo 7f5b3502da Implement Detached Actor (#6036)
* Arg propagation works

* Implement persistent actor

* Add doc

* Initialize is_persistent_

* Rename persistent->detached

* Address comment

* Make test pass

* Address comment

* Python2 compatibility

* Fix naming, py2

* Lint
2019-11-01 10:28:23 -07:00
Eric Liang 8ebba202df [minor] Reduce perf overhead of object ref tracking (#6041) 2019-10-29 18:14:51 -07:00
Eric Liang b89cac976a Basic direct actor call support in Python (#5991) 2019-10-28 22:09:04 -07:00
Edward Oakes c1418b04df Remove CoreWorkerObjectInterface (#6023) 2019-10-28 10:48:41 -07:00
Philipp Moritz 80c01617a3 Optimize python task execution (#6024) 2019-10-27 00:43:34 -07:00
Eric Liang a5523466a2 Enable memstore by default (#6003) 2019-10-25 21:59:12 -07:00
Edward Oakes d4055d70e3 Remove CoreWorkerTaskExecutionInterface (#6009) 2019-10-25 16:33:44 -07:00
Edward Oakes 1ce521a7f3 Remove task context from python worker (#5987)
Removes duplicated state between the Python and C++ workers. Also cleans up the serialization codepaths a bit.
2019-10-25 07:38:33 -07:00
Eric Liang 4edae7ea2b Speed up task submissions a bit (#5992) 2019-10-25 00:10:37 -07:00
Philipp Moritz 09d05bb3fa Reduce actor submission python overhead (#5949) 2019-10-23 00:11:32 -07:00
Edward Oakes 02931e08f3 [core worker] Python core worker task execution (#5783)
Executes tasks via the event loop in the C++ core worker. Also properly handles signals (including KeyboardInterrupt), so Ctrl-C in a Python interactive shell now works (if connecting to an existing cluster).
2019-10-22 20:15:59 -07:00
Edward Oakes fc56872012 Send active object IDs to the raylet (#5803)
* Send active object IDs to the raylet

* comment

* comments

* dedup

* signed int in config

* comments

* Remove object ID from monitor

* Fix test

* re-add check

* fix cast

* check if core worker

* Add comment

* Reservoir sampling

* Fix lint

* Pointer return

* tmp

* Fix merge

* Initialize object ids properly

* Fix lint
2019-10-20 22:05:28 -07:00
Stephanie Wang 697f765efc Refactor CoreWorker to remove TaskInterface (#5924)
* Remove TaskInterface

* Remove Status return value

* Remove CActorHandle, some return values, TaskSubmitter

* lint

* doc

* doc

* fix build

* lint

* Return Status, guarded by annotation, fail tasks for RECONSTRUCTING actors

* fix

* move annotation

* revert

* Fix core worker test

* nits
2019-10-18 00:03:57 -04:00
Stephanie Wang 3ac8592dcf Remove actor handle IDs (#5889)
* Remove actor handle ID from main ActorHandle constructor

* Set the actor caller ID when calling submit task instead of in the actor handle

* Remove ActorHandle::Fork, remove actor handle ID from protobuf

* Make inner actor handle const, remove new_actor_handles

* Move caller ID into the common task spec, start refactoring raylet

* Some fixes for forking actor handles

* Store ActorHandle state in CoreWorker, only expose actor ID to Python

* Remove some unused fields

* lint

* doc

* fix merge

* Remove ActorHandleID from python/cpp

* doc

* Fix core worker test

* Move actor table subscription to CoreWorker, reset actor handles on actor failure

* lint

* Remove GCS client from direct actor

* fix tests

* Fix

* Fix tests for raylet codepath

* Fix local mode

* Fix multithreaded test

* Fix AsyncSubscribe issue...

* doc

* fix serve

* Revert bazel
2019-10-17 12:36:34 -04:00
Edward Oakes 08e4e3a153 [core worker] Submit Python actor tasks through core worker (#5750)
* Submit actor tasks through core worker

* Fix java

* add comment

* Remove task builder

* Check negative

* Increase -> Increment

* pass by reference

* fix signal

* Clean up c++ actor handle

* more cleanup

* Clean up headers

* Fix unique_ptr construction

* Fix java

* Move profiling to c++

* dedup

* fix error

* comments

* fix java

* Fix tests

* wait for actor to exit

* Start after constructor

* ignore java build

* fix comment

* always init logging

* Fix logging

* fix logging issue

* shared_ptr for profiler

* DEBUG -> WARNING

* fix killed_ init

* Fix flaky checkpointing tests

* -v flag for tune tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix test exception info

* Fix import

* fix build

* Fix test

* shared_ptr
2019-10-07 15:42:19 -07:00
Si-Yuan 2fb7d7846f Initial implementation of Cython pickle5 support (#5725) 2019-10-03 09:20:26 -07:00
Edward Oakes 963bbe8bbd Move profiling to c++ (#5771)
* Move profiling to c++

* comments

* Fix tests

* Start after constructor

* fix comment

* always init logging

* Fix logging

* fix logging issue

* shared_ptr for profiler

* DEBUG -> WARNING

* fix killed_ init

* Fix flaky checkpointing tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix import

* fix build

* use boost::asio

* fix double const

* Properly reset async_wait

* remove SIGINT

* Change error message

* increase timeout

* small nits

* Don't trap on SIGINT

* -v for tune

* Fix test
2019-10-01 10:06:25 -07:00
Edward Oakes 61e5d674be Push driver task in core worker (#5752) 2019-09-23 10:53:55 -05:00
Edward Oakes a5d7de6aaf [core worker] Python core worker normal task submission (#5566) 2019-09-14 13:02:53 -07:00
Edward Oakes 07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Philipp Moritz 599cc2be60 Revert raylet to worker GRPC communication back to asio (#5450) 2019-08-17 19:11:32 -07:00
Philipp Moritz 8d6c50c821 Fix compiler warnings and make warnings fatal (#5375) 2019-08-07 14:04:05 -07:00
Qing Wang d372f24e3c [ID Refactor] Refactor ActorID, TaskID and ObjectID (#5286)
* Refactor ActorID, TaskID on the Java side.

Left a TODO comment

WIP for ObjectID

ADD test

Fix

Add java part

Fix Java test

Fix

Refine test.

Enable test in CI

* Extract a helper function.

* Resolve TODOs

* Fix Python CI

* Fix Java lint

* Update .travis.yml

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address some comments.

Address some comments.

Add id_specification.rst

Rename id_specification.rst to id_specification.md

typo

Address zhijun's comments.

Fix test

Address comments.

Fix lint

Address comments

* Fix test

* Address comments.

* Fix build error

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Address comments.

* Update C++ part to make sure task ID is generated deterministically

* WIP

* Fix core worker

* Fix Java part

* Fix comments.

* Add Python side

* Fix python

* Address comments

* Fix linting

* Fix

* Fix C++ linting

* Add JobId() method to TaskID

* Fix linting

* Update src/ray/common/id.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ActorId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments

* Add DriverTaskId embedding job id

* Fix tests

* Add python dor_fake_driver_id

* Address comments and fix linting

* Fix CI
2019-08-07 11:04:51 +08:00
Robert Nishihara 63a6b0e710 Fix bug in passing large arguments to tasks. (#5325) 2019-07-30 22:28:35 -07:00
Joey Jiang 40395acadf [gRPC] Migrate raylet client implementation to grpc (#5120) 2019-07-25 14:48:56 +08:00
Richard Liaw 53fb876a5f Improved KeyboardInterrupt Exception Handling (#5237) 2019-07-22 02:29:56 -07:00
Edward Oakes e5be5fd46d Remove dependencies from TaskExecutionSpecification (#5166) 2019-07-15 18:15:21 -07:00
Hao Chen 8a30b93e42 Define common data structures with protobuf. (#5121) 2019-07-08 22:41:37 +08:00
Qing Wang 62e4b591e3 [ID Refactor] Rename DriverID to JobID (#5004)
* WIP

WIP

WIP

Rename Driver -> Job

Fix compilation

Fix

Rename in Java

In py

WIP

Fix

WIP

Fix

Fix test

Fix

Fix C++ linting

Fix

* Update java/runtime/src/main/java/org/ray/runtime/config/RayConfig.java

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/core_worker/core_worker.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Fix

* Fix CI

* Fix cpp linting

* Fix py lint

* Fix

* Address comments and fix

* Address comments

* Address

* Fix import_threading
2019-06-28 00:44:51 +08:00
Yuhong Guo 1f0809e2b4 Refactor ID Serial 2: change all ID functions to CamelCase (#4896) 2019-05-31 11:31:18 +08:00
Yuhong Guo 1a39fee9c6 Refactor ID Serial 1: Separate ObjectID and TaskID from UniqueID (#4776)
* Enable BaseId.

* Change TaskID and make python test pass

* Remove unnecessary functions and fix test failure and change TaskID to 16 bytes.

* Java code change draft

* Refine

* Lint

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ObjectId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comment

* Lint

* Fix SINGLE_PROCESS

* Fix comments

* Refine code

* Refine test

* Resolve conflict
2019-05-22 14:46:30 +08:00
Romil Bhardwaj 004440f526 Dynamic Custom Resources - create and delete resources (#3742) 2019-05-11 20:06:04 +08:00
Wang Qing fe07a5b4b1 Add delete_creating_tasks option for internal.free() (#4588)
* add delete creating task objects.

* format code style

* Fix lint

* add tests and address comments.

* Refine test

* Refine java test

* Fix CI

* Refine

* Fix lint

* Fix CI
2019-04-12 13:38:31 +08:00
Yuhong Guo b9ea821d16 Use strongly typed IDs in C++. (#4185)
* Use strongly typed IDs for C++.

* Avoid heap allocation in cython.

* Fix JNI part

* Fix rebase conflict

* Refine

* Remove type check from __init__

* Remove unused constructor declarations.
2019-03-07 21:43:01 +08:00
Hao Chen 48f6cd3e5d Release GIL in prepare_actor_checkpoint (#4208) 2019-03-01 19:43:28 -08:00
Robert Nishihara 9c5fdbb63c Release gil when doing ray.wait. (#4190) 2019-02-28 00:32:07 -08:00
Hao Chen 62055cc01c Clean up duplicated code of Cython ID types (#4162) 2019-02-26 16:19:12 +08:00
Hao Chen 49dc85e54b Fix wrong ID type in prepare_checkpoint (#4124)
* Fix wrong ID type in prepare_checkpoint

* fix

* fix eq
2019-02-25 11:53:09 -08:00
Philipp Moritz bcd5af78c7 Lint Cython files (#4097) 2019-02-20 22:29:25 -08:00
Hao Chen 042ad84573 Simplify Cython ID types and fix bug of ActorCheckpointID (#4045) 2019-02-15 20:15:16 +08:00
Hao Chen f31a79f3f7 Implement actor checkpointing (#3839)
* Implement Actor checkpointing

* docs

* fix

* fix

* fix

* move restore-from-checkpoint to HandleActorStateTransition

* Revert "move restore-from-checkpoint to HandleActorStateTransition"

This reverts commit 9aa4447c1e3e321f42a1d895d72f17098b72de12.

* resubmit waiting tasks when actor frontier restored

* add doc about num_actor_checkpoints_to_keep=1

* add num_actor_checkpoints_to_keep to Cython

* add checkpoint_expired api

* check if actor class is abstract

* change checkpoint_ids to long string

* implement java

* Refactor to delay actor creation publish until checkpoint is resumed

* debug, lint

* Erase from checkpoints to restore if task fails

* fix lint

* update comments

* avoid duplicated actor notification log

* fix unintended change

* add actor_id to checkpoint_expired

* small java updates

* make checkpoint info per actor

* lint

* Remove logging

* Remove old actor checkpointing Python code, move new checkpointing code to FunctionActionManager

* Replace old actor checkpointing tests

* Fix test and lint

* address comments

* consolidate kill_actor

* Remove __ray_checkpoint__

* fix non-ascii char

* Loosen test checks

* fix java

* fix sphinx-build
2019-02-13 19:39:02 +08:00
Philipp Moritz 20162ce159 Compile raylet cython bindings with bazel (#3842) 2019-01-25 00:57:31 -08:00
Si-Yuan 48139cf861 Migrate Python C extension to Cython (#3541) 2019-01-24 09:17:14 -08:00