122 Commits

Author SHA1 Message Date
Clark Zinzow d4cae5f632 [Core] Added ability to specify different IP addresses for a core worker and its raylet. (#7985) 2020-04-16 10:32:24 -05:00
ijrsvt 44825d81e9 Change Proctitle to IDLE after an Error (#7863) 2020-04-08 11:33:43 -07:00
fyrestone fc6259a656 Cross language serialization for primitive types (#7711)
* Cross language serialization for Java and Python

* Use strict types when Python serializing

* Handle recursive objects in Python; Pin msgpack >= 0.6.0, < 1.0.0

* Disable gc for optimizing msgpack loads

* Fix merge bug

* Java call Python use returnType; Fix ClassLoaderTest

* Fix RayMethodsTest

* Fix checkstyle

* Fix lint

* prepare_args raises exception if try to transfer a non-deserializable object to another language

* Fix CrossLanguageInvocationTest.java, Python msgpack treat float as double

* Minor fixes

* Fix compile error on linux

* Fix lint in java/BUILD.bazel

* Fix test_failure

* Fix lint

* Class<?> to Class<T>; Refine metadata bytes.

* Rename FST to Fst; sort java dependencies

* Change Class<?>[] to Optional<Class<?>>; sort requirements in setup.py

* Improve CrossLanguageInvocationTest

* Refactor MessagePackSerializer.java

* Refactor MessagePackSerializer.java; Refine CrossLanguageInvocationTest.java

* Remove unnecessary dependencies for Java; Add getReturnType() for RayFunction in Java

* Fix bug

* Remove custom cross language type support

* Replace Serializer.Meta with MutableBoolean

* Remove @SuppressWarnings support from checkstyle.xml; Add null test in CrossLanguageInvocationTest.java

* Refine MessagePackSerializer.pack

* Ray.get support RayObject as input

* Improve comments and error info

* Remove classLoader argument from serializer

* Separate msgpack from pickle5 in Python

* Pair<byte[], MutableBoolean> to Pair<byte[], Boolean>

* Remove public static <T> T get(RayObject<T> object), use RayObject.get() instead

* Refine test

* small fixes

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2020-04-08 21:10:57 +08:00
Kai Yang 48b48cc8c2 Support multiple core workers in one process (#7623) 2020-04-07 11:01:47 +08:00
ijrsvt 9bfc2c4b54 Moving Local Mode to C++ (#7670) 2020-04-01 15:50:57 -05:00
Robert Nishihara b011c604d7 Remove ray.tasks() from API. (#7807) 2020-04-01 10:10:40 -05:00
Simon Mo dc9b62e007 Deserialize Args in Event Loop Thread (#7806) 2020-03-30 18:28:13 -07:00
Edward Oakes 8b4f5a9431 Remove non-direct-call code from core worker (#7625) 2020-03-22 19:20:08 -05:00
Simon Mo 89d959fd6a Stop gap solution for cython functions breaking in memory monitor (#7687) 2020-03-21 15:16:12 -07:00
Edward Oakes 58dc70f90e [minor] Remove get_global_worker(), RuntimeContext (#7638) 2020-03-20 15:45:29 -05:00
Eric Liang 745b9d643d First pass at ray memory command for memory debugging (#7589) 2020-03-17 20:45:07 -07:00
Edward Oakes c1b0f9ccdf Add failure tests to test_reference_counting (#7400) 2020-03-17 10:30:21 -05:00
ijrsvt 46953c53b1 Cleanup Plasma Async Callback (#7452) 2020-03-16 10:12:44 -07:00
Kai Yang d6e8f47065 Add a flag to disable reconstruction for a killed actor (#7346) 2020-03-13 19:10:21 +08:00
Stephanie Wang fdb528514b [core] Ref counting for actor handles (#7434)
* tmp

* Move Exit handler into CoreWorker, exit once owner's ref count goes to 0

* fix build

* Remove __ray_terminate__ and add test case for distributed ref counting

* lint

* Remove unused

* Fixes for detached actor, duplicate actor handles

* Remove unused

* Remove creation return ID

* Remove ObjectIDs from python, set references in CoreWorker

* Fix crash

* Fix memory crash

* Fix tests

* fix

* fixes

* fix tests

* fix java build

* fix build

* fix

* check status

* check status
2020-03-10 17:45:07 -07:00
Edward Oakes 0c254295b0 Remove experimental.signal API (#7477)
* Remove experimental.signal API

* fix test
2020-03-09 16:03:36 -07:00
Edward Oakes b4e2d5317e Remove experimental.NoReturn (#7475) 2020-03-09 11:09:36 -07:00
Edward Oakes 0abcca258f Add entries to in-memory store on Put() (#7085) 2020-03-04 10:17:27 -08:00
ijrsvt fb76092d75 Re-route asyncio plasma code path through raylet instead of direct plasma connection (#7234) 2020-03-03 15:43:46 -05:00
Edward Oakes bd9411f849 Call TriggerGlobalGC when the plasma store is full (#7337) 2020-02-27 11:01:49 -08:00
Edward Oakes d9027acaf2 Deprecate non-direct-call API (#7336) 2020-02-27 10:37:23 -08:00
Edward Oakes 2ad9bc5684 Move plasma retry logic into plasma store provider (#7328) 2020-02-26 16:57:02 -08:00
Eric Liang b310661338 Add internal_api.global_gc() method, which triggers gc.collect() on all workers (#7327) 2020-02-26 14:09:29 -08:00
Edward Oakes 44b4394afa Remove unused AddContainedObjectIDs (#7323) 2020-02-25 16:42:20 -08:00
Edward Oakes d190e73727 Use our own implementation of parallel_memcopy (#7254) 2020-02-21 11:03:50 -08:00
Simon Mo b804d40c04 Stop vendoring pyarrow (#7233) 2020-02-19 19:01:26 -08:00
Simon Mo 7bef7031c2 Revert "Revert "Revert "Removing Pyarrow dependency (#7146)" (#7209) (#7214)" (#7232) 2020-02-19 13:35:29 -08:00
Simon Mo e8941b1b79 Revert "Revert "Removing Pyarrow dependency (#7146)" (#7209) (#7214) 2020-02-19 10:08:52 -08:00
Stephanie Wang f76ce836b2 Distributed ref counting for serialized ObjectIDs (#6945)
* Skeleton plus a unit test for simple borrower case

* First unit test passes - forward an ID and task returns with 1 submitted task pending on the inner ID

* Invariant for contained_in

* Unit test passes for testing task return without creating a borrower

* Wrap ref count functionality in test case

* Fix bad delete

* Unit test and fix for borrowers creating more borrowers

* Unit test and fix for simple borrowing, but owner sends call after borrower's ref count goes to 0

* Refactor:
- keep a sentinel ref count for task argument IDs
- keep contained_in_borrowed in addition to contained_in_owned

* Unit test for nested IDs passes

* Refactor so that an object ID can only be contained in 1 borrowed ID at a time

* Add check

* Fix

* Unit test (passes) to test nesting object IDs but no borrowers created

* Unit test for nested objects from different owners passes, refactor to unset contained_in when popping refs

* Unit tests for borrowers receiving an ObjectID from multiple sources,
skip adding ownership info if we already have it to handle duplicate
refs

* Unit test for returning object ID passes

* More unit tests for returning object IDs pass

* Add serialized ID tests

* fix serialization issue

* remove swap

* It builds!

* debugging and some fixes:
- register handler for WaitForRefRemoved
- don't create a python reference for arg IDs
- pass in client factory into ReferenceCounter
- fix bad decrement in PopBorrowerRefs

* Fix accounting for serialized IDs:
- don't decrement for IDs on dependency resolution, wait until task finished
- add object IDs that were inlined when building the arguments to the task spec, pin these on the task executor until task finishes

* mu_ -> mutex_

* lint

* fix build

* clear outer_object_id

* add direct call type check

* Fix test for direct call IDs and return IDs for actor calls

* Fix CoreWorkerClient.Addr()

* Remove unneeded lock

* Remove unnecessary ObjectID refs

* Fix worker holding serialized refs test

* Fix hex IDs

* fix

* fix tests

* fix tests

* refactor and cleanups

* lint

* Put inlined Ids in task args and some cleanup

* Add back gc.collect() line for test case

* Refactor and fixes:
- store inlined IDs in RayObject
- allow storing objects with inlined IDs in memory store
- pin objects that were promoted to plasma

* oops

* make sure worker ID is set in address, pass in rpc::Address to CoreWorkerClient

* todos

* cleanups and test builds

* Fix tests

* Add feature flag

* cleanups

* address comments and some cleanups

* cleanup

* fix recursive test

* Comments for tests

* Turn off ref counting by default

* Skip tests

* Fix some bugs for test_array.py, java build

* Don't include nested objects in the ref count when the feature flag is off

* C++ feature flag does not work...

* Remove

* Turn on python tests and add a warning when plasma objects are evicted before being pinned

* Fix build and remove irrelevant test

* Fix for java

* Revert "Fix build and remove irrelevant test"

This reverts commit 056cca9b263ed05b0f9ab2250907338edcbca2d5.

* Fix ray.internal.free

* Fixes and skip some flaky tests

* fix java build

* fix windows build

* Add IDs contained in owned objects

* Update src/ray/protobuf/core_worker.proto

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/reference_count.cc

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/protobuf/core_worker.proto

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/protobuf/core_worker.proto

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/reference_count.h

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/reference_count.h

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/reference_count.cc

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* update

* Try to fix ::test_direct_call_serialized_id_eviction

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-18 18:21:34 -08:00
Eric Liang 0aa9373d62 Revert "Removing Pyarrow dependency (#7146)" (#7209)
This reverts commit 2116fd3bca.
2020-02-18 14:12:06 -08:00
ijrsvt 2116fd3bca Removing Pyarrow dependency (#7146) 2020-02-17 18:00:13 -08:00
fyrestone a6b8bd47b0 [xlang] Cross language serialize ActorHandle (#7134) 2020-02-17 20:44:56 +08:00
Qing Wang f3703bafa3 [Java] Support concurrent actor calls API. (#7022)
* WIP

Temp change

Attach native thread to jvm

* Fix run mode

* Address comments.
2020-02-14 13:02:39 +08:00
Edward Oakes e904711e74 Add python tests for serialized object ID reference counting (#7038) 2020-02-12 16:52:07 -08:00
Simon Mo 0e94e1dc2a [Asyncio] Increase recursion limit manually (#7142) 2020-02-12 14:15:36 -08:00
chaokunyang 247a4d022a Fix passing empty bytes in python tasks (#7045)
* ensure data_ won't be null_ptr when size == 0

* when data_sizes[i] == 0, we should Allocate an empty buffer

* work around for pyarrow.py_buffer

* fix comments

* add null ptr check

* add test for bytes

* lint
2020-02-10 12:07:29 +08:00
fyrestone 0648bd28ef [xlang] Cross language Python support (#6709) 2020-02-08 13:01:28 +08:00
Stephanie Wang 3333ee84a5 Fix ref counting (#7075) 2020-02-06 14:35:08 -08:00
Edward Oakes 844f607c93 Collect contained ObjectIDs during deserialization (#7029) 2020-02-03 22:49:14 -08:00
Edward Oakes 984490d2be Collect object IDs during serialization (#6946) 2020-02-03 18:38:11 -08:00
Siyuan (Ryans) Zhuang 42cbf801e1 workaround for python3.5 fast numpy serialization (#6675) 2020-02-03 13:08:18 -08:00
Edward Oakes 92525f35d1 Remove raylet client from Python worker (#6018) 2020-01-31 18:23:01 -08:00
Edward Oakes 341a921d81 Remove vanilla pickle serialization for task arguments (#6948) 2020-01-31 16:52:43 -08:00
Edward Oakes c2be794f10 Remove try/except import asyncio for python 2 (#6947) 2020-01-29 09:17:07 -08:00
Simon Mo 396d7fafc8 UI improvement for asyncio (#6905) 2020-01-27 12:45:51 -08:00
Yunzhi Zhang 0834bda8c1 [Dashboard] Display actor task execution info (#6705)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2020-01-22 22:33:55 -08:00
Simon Mo 8f246c17b5 Initialize async plasma for async actors (#6813)
* Initialize async plasma for async actors

* Address comment
2020-01-17 14:58:06 -08:00
Edward Oakes 2a4d2c6e9e Basic reference counting & pinning (#6554) 2020-01-06 17:30:26 -06:00
Simon Mo 9fe90cdafc Fix async actor recursion limitation (#6672)
* Do not start threadpool when using async

* Turn function_executor into a generator

* Add new test for high concurrency and bump the default

* Set direct call
2020-01-02 19:45:13 -06:00
Eric Liang 46acb02aa4 Fix verbose shutdown error and test_env_with_subprocesses (#6614) 2019-12-26 22:43:39 -08:00