Edward Oakes
c2be794f10
Remove try/except import asyncio for python 2 ( #6947 )
2020-01-29 09:17:07 -08:00
Simon Mo
396d7fafc8
UI improvement for asyncio ( #6905 )
2020-01-27 12:45:51 -08:00
Yunzhi Zhang
0834bda8c1
[Dashboard] Display actor task execution info ( #6705 )
...
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2020-01-22 22:33:55 -08:00
Simon Mo
8f246c17b5
Initialize async plasma for async actors ( #6813 )
...
* Initialize async plasma for async actors
* Address comment
2020-01-17 14:58:06 -08:00
Edward Oakes
2a4d2c6e9e
Basic reference counting & pinning ( #6554 )
2020-01-06 17:30:26 -06:00
Simon Mo
9fe90cdafc
Fix async actor recursion limitation ( #6672 )
...
* Do not start threadpool when using async
* Turn function_executor into a generator
* Add new test for high concurrency and bump the default
* Set direct call
2020-01-02 19:45:13 -06:00
Eric Liang
46acb02aa4
Fix verbose shutdown error and test_env_with_subprocesses ( #6614 )
2019-12-26 22:43:39 -08:00
Edward Oakes
6b1a57542e
Add actor.__ray_kill__() to terminate actors immediately ( #6523 )
2019-12-23 23:12:57 -06:00
Yunzhi Zhang
bac6f3b61e
[Dashboard] Collecting worker stats in node manager and implement webui display in the backend ( #6574 )
2019-12-22 17:50:23 -08:00
Edward Oakes
e50aa99be1
Reference counting for direct call submitted tasks ( #6514 )
...
Co-authored-by: Zhijun Fu <37800433+zhijunfu@users.noreply.github.com>
2019-12-20 17:06:33 -08:00
Eric Liang
e556b729c2
[direct call] Fix max_calls interaction with background tasks. ( #6536 )
2019-12-19 13:48:32 -08:00
Simon Mo
26ec500ef9
Implement async get for direct actor call ( #6339 )
2019-12-18 11:50:21 -08:00
Edward Oakes
e2b7459bfc
Fix worker exit cleanup ( #6450 )
...
* working but ugly
* comments
* proper but hanging in grpc server destructor
* grpc server shutdown deadline
* fix disconnect
* lint
* shutdown_only in test
* replace shutdown
2019-12-13 16:52:50 -08:00
Stephanie Wang
c57dcc82d1
Port actor creation to use direct calls ( #6375 )
2019-12-12 19:50:51 -08:00
Edward Oakes
044527adb8
Remove ref counting dependencies on ray.get() ( #6412 )
...
* Remove ref counting dependencies on Get()
* comment
* don't send IDs when disabled
* pass through internal config
* fix
* allow reinit
* remove flag
2019-12-10 18:11:34 -08:00
Chaokun Yang
6272907a57
[Streaming] Streaming data transfer and python integration ( #6185 )
2019-12-10 20:33:24 +08:00
Zhijun Fu
b88b8202cc
fix java build failure ( #6062 )
2019-12-06 14:38:43 +08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py ( #6306 )
...
* multinode failures direct
* Add number of retries allowed for tasks
* Retry tasks
* Add failing test for object reconstruction
* Handle return status and debug
* update
* Retry task unit test
* update
* update
* todo
* Fix max_retries decorator, fix test
* Fix test that flaked
* lint
* comments
2019-12-02 10:20:57 -08:00
Eric Liang
b7b655c851
Also use NotifyDirectCallTaskBlock/Unblocked for plasma store accesses ( #6249 )
...
* wip
* fix it
* lint
* wip
* fix
* unblock
* flaky
* use fetch only flag
* Revert "use fetch only flag"
This reverts commit 56e938a0ee2024f5c99c9ab2d55fd35558fb15e1.
* restore error resolution
* use worker task id
* proto comments
* fix if
2019-11-27 22:46:15 -08:00
Stephanie Wang
2797c11b69
[direct task] For serialized object IDs, check with owner before declaring object unreconstructable ( #6286 )
...
* Track borrowed vs owned objects
* Serialize owner address with object ID
* serialize owner task id
* Deserialize object IDs
* Pass direct task ID instead of plasma ID
* it works
* Fix ref count test
* Add unit test
* update warning
* we own ray.put objects
* missing file
* doc
* Fix unit test
* comments
* Fix py2
* lint
* update
2019-11-27 15:31:44 -08:00
Simon Mo
1ca8c427e3
Consistent Name for Process Title ( #6276 )
...
* Consistent naming for setprotitle
* Address comments
* Add debug/verbose mode
* Fix test
2019-11-26 11:56:28 -08:00
Simon Mo
aa8d5d2f6c
Rate limit asyncio actor ( #6242 )
2019-11-24 11:39:28 -08:00
Simon Mo
29ba6bfc64
Basic Async Actor Call ( #6183 )
...
* Start trying to figure out where to put fibers
* Pass is_async flag from python to context
* Just running things in fiber works
* Yield implemented, need some debugging to make it work
* It worked!
* Remove debug prints
* Lint
* Revert the clang-format
* Remove unnecessary log
* Remove unnecessary import
* Add attribution
* Address comment
* Add test
* Missed a merge conflict
* Make test pass and compile
* Address comment
* Rename async -> asyncio
* Move async test to py3 only
* Fix ignore path
2019-11-21 11:56:46 -08:00
Stephanie Wang
db77595298
Fix segfault for task arguments passed by value ( #6214 )
...
* Fix null data
* rename
2019-11-20 22:02:18 -08:00
Stephanie Wang
66edebce3a
Spillback scheduling for direct task calls ( #6164 )
...
* add dac
* remove caching
* rename return buffer
* cleanup
* add tests
* add perf
* fix
* flip
* remove
* remove it
* lint
* remove fork safety
* lint
* comments
* s/core/client
* wip
* remove
* fmt
* consistently return direct naming
* basic pass by ref
* fix bugs
* wip
* wip
* wip
* wip
* add test
* works now
* fix constructor
* fix merge
* add todo for perf
* fix single client test
* use lower n
* bazel
* faster
* fix core worker test
* init
* fix tests
* no plasma for direct call
* Update worker.py
* add order test
* fixes
* comments
* remove old assert
* lint
* add test
* Very wip
* wip
* add options for tasks
* add test
* fmt
* add backpressure
* remove idle prof event
* lint
* Fix 0 returns
* Set memcopy threads globally
* add benchmark
* Fix object exists
* Fix reference
* Remove return_buffer
* Add check
* add exit handler
* update benchmarks
* Fix compile error
* Fix NoReturn
* Use is instead of == for NoReturn
* fix
* Remove list comprehension
* Fix core worker test
* comment
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* fix merge error
* lint
* wip
* fix merge
* wip
* finish
* lint
* task interface
* add file
* add
* wip
* now works!
* updated
* wip
* dep resolution
* remove remote dep handling
* comments
* fix test_multithreading
* fix merge
* fix exit handling
* fix merge
* comments
* get fallback fetch working
* handle contains
* fix typo
* Skeleton for SubmitTask proto
* Update src/ray/common/id.h
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* comments
* rename to core worker service
* lint
* fix compile
* wip
* update
* error code
* fix up and rename
* clean up call manager
* comments
* add test and cleanup deserialization
* fix pickle
* fix comments, lint
* test todo
* comments
* use shared ptr
* rename
* Update src/ray/protobuf/gcs.proto
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* require transport type for ids; lint
* cleanup
* comments 1
* use worker available for real
* wip
* fix test
* resolve local dependencies test
* add num pending metric
* client factory
* unit test task submission
* wip
* fix bug
* rename
* Pass through node manager port, connect in raylet client
* finish rename
* Switch submit task to grpc
* fix crash
* Check port in use
* fix merge
* comments more
* doc
* Remove default port, set port randomly from driver
* add unique_ptr comment about TaskSpec
* lint
* fix test
* update
* fix lint
* GetMessageMutable should not be const
* iwyu
* fix const
* Update direct_task_transport_test.cc
* fix segfault
* Fix test
* Add RpcAddress, set in actor table data
* fix serialization
* fix lint
* Pass through task caller address
* Fix object manager test
* RpcAddress -> Address
* merge
* Port WorkerLease to grpc
* wip
* fix test
* add mem test
* update
* comments
* fix core worker tests
* fix
* remove old worker lease code
* First pass on spillback
* lint
* crash?
* Debug
* Fix task spec copy, extend test basic
* lint
* Port return worker to grpc
* lint
* Return worker to the correct raylet
* Only request worker if queued tasks
* A bit better failure handling
* Fix unit test
* Add unit test for spillback
* fix
* python test multinode
* update
* updates
* fix
2019-11-17 20:29:32 -08:00
Eric Liang
7d33e9949b
Integrate ref count module into local memory store ( #6122 )
2019-11-15 10:52:19 -08:00
Eric Liang
8ff393a7bd
Handle exchange of direct call objects between tasks and actors ( #6147 )
2019-11-14 17:32:04 -08:00
Ujval Misra
e3e3ad4b25
Add timeout param to ray.get ( #6107 )
2019-11-14 00:50:04 -08:00
Eric Liang
f3f86385d6
Minimal implementation of direct task calls ( #6075 )
2019-11-12 11:45:28 -08:00
Stephanie Wang
35d177f459
Use grpc for communication from worker to local raylet (task submission and direct actor args only) ( #6118 )
...
* Skeleton for SubmitTask proto
* Pass through node manager port, connect in raylet client
* Switch submit task to grpc
* Check port in use
* doc
* Remove default port, set port randomly from driver
* update
* Fix test
* Fix object manager test
2019-11-11 21:17:25 -08:00
Philipp Moritz
decaa65cd6
Use pickle by default for serialization ( #5978 )
2019-11-10 18:12:18 -08:00
Eric Liang
afca6d3d87
Object store full with cyclic python references ( #6114 )
2019-11-08 14:08:24 -08:00
Edward Oakes
ca53af4d0f
Add pending task dependencies to ObjectID ref counting ( #6054 )
2019-11-07 18:37:10 -08:00
Edward Oakes
043d1f4094
Return RayObjects to core worker ( #6052 )
2019-11-04 20:27:57 -08:00
Eric Liang
8485304e83
Support concurrent Actor calls in Ray ( #6053 )
2019-11-04 01:14:35 -08:00
Eric Liang
fb34928a2a
[minor] Perf optimizations for direct actor task submission ( #6044 )
...
* merge optimizations
* fix
* fix memory err
* optimize
* fix tests
* fix serialization of method handles
* document weakref
* fix check
* bazel format
* disable on 2
2019-11-01 14:41:14 -07:00
Simon Mo
7f5b3502da
Implement Detached Actor ( #6036 )
...
* Arg propagation works
* Implement persistent actor
* Add doc
* Initialize is_persistent_
* Rename persistent->detached
* Address comment
* Make test pass
* Address comment
* Python2 compatibility
* Fix naming, py2
* Lint
2019-11-01 10:28:23 -07:00
Eric Liang
8ebba202df
[minor] Reduce perf overhead of object ref tracking ( #6041 )
2019-10-29 18:14:51 -07:00
Eric Liang
b89cac976a
Basic direct actor call support in Python ( #5991 )
2019-10-28 22:09:04 -07:00
Edward Oakes
c1418b04df
Remove CoreWorkerObjectInterface ( #6023 )
2019-10-28 10:48:41 -07:00
Philipp Moritz
80c01617a3
Optimize python task execution ( #6024 )
2019-10-27 00:43:34 -07:00
Eric Liang
a5523466a2
Enable memstore by default ( #6003 )
2019-10-25 21:59:12 -07:00
Edward Oakes
d4055d70e3
Remove CoreWorkerTaskExecutionInterface ( #6009 )
2019-10-25 16:33:44 -07:00
Edward Oakes
1ce521a7f3
Remove task context from python worker ( #5987 )
...
Removes duplicated state between the python and C++ workers. Also cleans up the serialization codepaths a bit.
2019-10-25 07:38:33 -07:00
Eric Liang
4edae7ea2b
Speed up task submissions a bit ( #5992 )
2019-10-25 00:10:37 -07:00
Philipp Moritz
09d05bb3fa
Reduce actor submission python overhead ( #5949 )
2019-10-23 00:11:32 -07:00
Edward Oakes
02931e08f3
[core worker] Python core worker task execution ( #5783 )
...
Executes tasks via the event loop in the C++ core worker. Also properly handles signals (including KeyboardInterrupt), so ctrl-C in a python interactive shell works now (if connecting to an existing cluster).
2019-10-22 20:15:59 -07:00
Edward Oakes
fc56872012
Send active object IDs to the raylet ( #5803 )
...
* Send active object IDs to the raylet
* comment
* comments
* dedup
* signed int in config
* comments
* Remove object ID from monitor
* Fix test
* re-add check
* fix cast
* check if core worker
* Add comment
* Reservoir sampling
* Fix lint
* Pointer return
* tmp
* Fix merge
* Initialize object ids properly
* Fix lint
2019-10-20 22:05:28 -07:00
Stephanie Wang
697f765efc
Refactor CoreWorker to remove TaskInterface ( #5924 )
...
* Remove TaskInterface
* Remove Status return value
* Remove CActorHandle, some return values, TaskSubmitter
* lint
* doc
* doc
* fix build
* lint
* Return Status, guarded by annotation, fail tasks for RECONSTRUCTING actors
* fix
* move annotation
* revert
* Fix core worker test
* nits
2019-10-18 00:03:57 -04:00
Stephanie Wang
3ac8592dcf
Remove actor handle IDs ( #5889 )
...
* Remove actor handle ID from main ActorHandle constructor
* Set the actor caller ID when calling submit task instead of in the actor handle
* Remove ActorHandle::Fork, remove actor handle ID from protobuf
* Make inner actor handle const, remove new_actor_handles
* Move caller ID into the common task spec, start refactoring raylet
* Some fixes for forking actor handles
* Store ActorHandle state in CoreWorker, only expose actor ID to Python
* Remove some unused fields
* lint
* doc
* fix merge
* Remove ActorHandleID from python/cpp
* doc
* Fix core worker test
* Move actor table subscription to CoreWorker, reset actor handles on actor failure
* lint
* Remove GCS client from direct actor
* fix tests
* Fix
* Fix tests for raylet codepath
* Fix local mode
* Fix multithreaded test
* Fix AsyncSubscribe issue...
* doc
* fix serve
* Revert bazel
2019-10-17 12:36:34 -04:00