Commit Graph

71 Commits

Author SHA1 Message Date
Barak Michener c4e273920f [ray_client]: Insert decorators into the real ray module to allow for client mode (#13031) 2020-12-22 22:51:45 -08:00
Barak Michener 80f6dd16b2 [ray_client] Implement optional arguments to ray.remote() and f.options() (#12985) 2020-12-20 15:43:48 -08:00
Barak Michener 7ab9164f1b [ray_client] Integrate with test_basic, test_basic_2 and test_actor (#12964) 2020-12-20 14:54:18 -08:00
Edward Oakes aec5c9879e Add tests for atexit handler behavior (#12808) 2020-12-11 21:47:05 -06:00
Clark Zinzow 0c0b0d0a73 [Core] Added support for submission-time task names. (#10449)
* Added support for submission-time task names.

* Suggestions from code review: add missing consts

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Add num_returns arg to actor method options docstring example.

* Add process name line and proctitle assertion to submission-time task name section of advanced docs.

* Add submission-time task name --> proctitle test for Python worker.

* Added Python actor options tests for num_returns and name.

* Added Java test for submission-time task names.

* Add dashboard image to task name docs section.

* Move to fstrings.

Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-09-03 11:45:24 -07:00
Eric Liang 2a204260a8 [api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
SangBin Cho 326a470bc2 [Test] Reduce the wait for condition timeout. (#9971) 2020-08-07 11:44:53 -07:00
Barak Michener 21994c594b python/test: Faster tests and better BUILD (#9791) 2020-08-06 10:58:42 -07:00
SangBin Cho 685182923c [Core] Fix detached actor local mode when gcs actor management is on. (#9839)
* Fix local mode detached actor.

* Revert changes.
2020-08-05 09:04:24 -07:00
ZhuSenlin a269ae9bc4 [GCS] Fix actor task hang when its owner exits before local dependencies resolved (#8045) 2020-07-27 10:56:52 +08:00
Hao Chen d49dadf891 Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
fyrestone b0bb0584fb Fix fix_test_actor_method_metadata_cache (#8633) 2020-05-27 15:49:40 +08:00
mehrdadn ebf060d484 Make more tests run on Windows (#8446)
* Remove worker Wait() call due to SIGCHLD being ignored

* Port _pid_alive to Windows

* Show PID as well as TID in glog

* Update TensorFlow version for Python 3.8 on Windows

* Handle missing Pillow on Windows

* Work around dm-tree PermissionError on Windows

* Fix some lint errors on Windows with Python 3.8

* Simplify torch requirements

* Quiet git clean

* Handle finalizer issues

* Exit with the signal number

* Get rid of wget

* Fix some Windows compatibility issues with tests

Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Stephanie Wang 3a25f5f5b4 Clean up actor state from the GCS (#8261)
* parametrize test

* Regression test and logging

* Test no restart after actor deletion

* Unit tests

* Refactor to subscribe to and lookup from worker failure table

* Refactor ActorManager to remove dependencies

* Revert "Regression test and logging"

This reverts commit 835e1a9091b51ca8efb00392d4cc4a665145de24.

* Revert "parametrize test"

This reverts commit f31272082831ba1a494816dd5511d87b24eca4c9.

* Revert "Test no restart after actor deletion"

This reverts commit 114a83de14329aa6ab787c80cd5757cf074a9072.

* doc

* merge

* Revert "Refactor to subscribe to and lookup from worker failure table"

This reverts commit 6aa13a05178d0b9aa1db9dee5c978c911b74fa3a.

* Revert "Revert "Test no restart after actor deletion""

This reverts commit 1bd92d09172aa8ab42632551cf9c56463f9598fe.

* Revert "Revert "parametrize test""

This reverts commit 639ba4d3b02167fb2b05e9878f9aa600bcec95b3.

* Revert "Revert "Regression test and logging""

This reverts commit f18b5f0db699a23cbccde32789e3639425e99ca4.

* Clean up actors that have gone out of scope

* Use actor ID instead of shared_ptr

* Clean up actors owned by dead workers

* Use actor ID instead of shared_ptr

* TODO and lint

* Fix unit tests

* Add unit tests for supervision and docs

* xx

* Fix tests

* Fix tests

* fix build
2020-05-09 18:43:49 -07:00
Stephanie Wang fdb528514b [core] Ref counting for actor handles (#7434)
* tmp

* Move Exit handler into CoreWorker, exit once owner's ref count goes to 0

* fix build

* Remove __ray_terminate__ and add test case for distributed ref counting

* lint

* Remove unused

* Fixes for detached actor, duplicate actor handles

* Remove unused

* Remove creation return ID

* Remove ObjectIDs from python, set references in CoreWorker

* Fix crash

* Fix memory crash

* Fix tests

* fix

* fixes

* fix tests

* fix java build

* fix build

* fix

* check status

* check status
2020-03-10 17:45:07 -07:00
fyrestone a6b8bd47b0 [xlang] Cross language serialize ActorHandle (#7134) 2020-02-17 20:44:56 +08:00
Edward Oakes d91d3ea936 Split half of test_actor into test_actor_advanced (#7143) 2020-02-12 15:17:25 -08:00
Eric Liang 305eaaabe9 Fix hang if actor object id is returned from a task that exits (#6885) 2020-02-11 20:28:13 -08:00
Ziyad Edher c480d1d1e4 Treat static methods as class methods instead of instance methods in actors (#6756)
* Treat static methods as class methods rather than instance methods

* Add tests for static methods in actors

* Revert formatting changes

* Readd future imports

* Restructure static method check

* Documentation enhancements

* Fix linting issues
2020-01-15 19:38:41 -06:00
Edward Oakes a950e95c7d Use exit() in __kill_actor__ (#6760) 2020-01-13 11:37:59 -06:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara 39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Robert Nishihara 480206eef8 Remove some Python 2 compatibility code. (#6624) 2019-12-31 17:14:58 -08:00
Robert Nishihara ff82613b66 Fix test_actor.py test_kill. (#6623) 2019-12-27 22:39:17 -08:00
Zhijun Fu 088ce2d1e1 Fix hang on actor creation task failure (#6617) 2019-12-27 10:48:17 -08:00
Eric Liang d3db9e9c1e By default, reconstruction should only be enabled for actor creation. (#6613)
* wip

* fix

* fix
2019-12-26 19:57:50 -08:00
Zhijun Fu d2bba596ab Fix actor reconstruction with direct call (#6570) 2019-12-26 10:59:50 +08:00
Edward Oakes 6b1a57542e Add actor.__ray_kill__() to terminate actors immediately (#6523) 2019-12-23 23:12:57 -06:00
Eric Liang 5a5c94939f [direct call] Retry failed tasks with delay (#6453)
* retry failed tasks with delay

* set to 0 for direct tests
2019-12-12 17:12:38 -08:00
Eric Liang 304b4f0d3d Shard unit tests into medium sized files for test stability (#6398) 2019-12-09 13:15:29 -08:00
Eric Liang 6223d2ed0b [direct call] Assign resource ids for direct call tasks (#6364) 2019-12-05 10:16:04 -08:00
Eric Liang 64a3a7239e Set RAY_FORCE_DIRECT=1 for run_rllib_tests, test_basic (#6171) 2019-11-25 14:12:11 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* separate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Simon Mo eb6a93c0f0 [hotfix] fix lint (#6236) 2019-11-21 18:30:57 -08:00
Stephanie Wang d3227f2f2d Fix bug in direct task calls for objects that were evicted (#6216)
* Fix bug and add some checks

* rename
2019-11-21 15:38:31 -08:00
Simon Mo 29ba6bfc64 Basic Async Actor Call (#6183)
* Start trying to figure out where to put fibers

* Pass is_async flag from python to context

* Just running things in fiber works

* Yield implemented, need some debugging to make it work

* It worked!

* Remove debug prints

* Lint

* Revert the clang-format

* Remove unnecessary log

* Remove unnecessary import

* Add attribution

* Address comment

* Add test

* Missed a merge conflict

* Make test pass and compile

* Address comment

* Rename async -> asyncio

* Move async test to py3 only

* Fix ignore path
2019-11-21 11:56:46 -08:00
Philipp Moritz fc655acfee Fix linting on master branch (#6174) 2019-11-16 10:02:58 -08:00
Eric Liang fb34928a2a [minor] Perf optimizations for direct actor task submission (#6044)
* merge optimizations

* fix

* fix memory err

* optimize

* fix tests

* fix serialization of method handles

* document weakref

* fix check

* bazel format

* disable on 2
2019-11-01 14:41:14 -07:00
Simon Mo 7f5b3502da Implement Detached Actor (#6036)
* Arg propagation works

* Implement persistent actor

* Add doc

* Initialize is_persistent_

* Rename persistent->detached

* Address comment

* Make test pass

* Address comment

* Python2 compatibility

* Fix naming, py2

* Lint
2019-11-01 10:28:23 -07:00
Edward Oakes 1ce521a7f3 Remove task context from python worker (#5987)
Removes duplicated state between the python and C++ workers. Also cleans up the serialization codepaths a bit.
2019-10-25 07:38:33 -07:00
Philipp Moritz 785670bc18 Fix class attributes and methods for actor classes (#5802) 2019-10-07 23:56:07 -07:00
Edward Oakes 08e4e3a153 [core worker] Submit Python actor tasks through core worker (#5750)
* Submit actor tasks through core worker

* Fix java

* add comment

* Remove task builder

* Check negative

* Increase -> Increment

* pass by reference

* fix signal

* Clean up c++ actor handle

* more cleanup

* Clean up headers

* Fix unique_ptr construction

* Fix java

* Move profiling to c++

* dedup

* fix error

* comments

* fix java

* Fix tests

* wait for actor to exit

* Start after constructor

* ignore java build

* fix comment

* always init logging

* Fix logging

* fix logging issue

* shared_ptr for profiler

* DEBUG -> WARNING

* fix killed_ init

* Fix flaky checkpointing tests

* -v flag for tune tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix test exception info

* Fix import

* fix build

* Fix test

* shared_ptr
2019-10-07 15:42:19 -07:00
Edward Oakes 963bbe8bbd Move profiling to c++ (#5771)
* Move profiling to c++

* comments

* Fix tests

* Start after constructor

* fix comment

* always init logging

* Fix logging

* fix logging issue

* shared_ptr for profiler

* DEBUG -> WARNING

* fix killed_ init

* Fix flaky checkpointing tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix import

* fix build

* use boost::asio

* fix double const

* Properly reset async_wait

* remove SIGINT

* Change error message

* increase timeout

* small nits

* Don't trap on SIGINT

* -v for tune

* Fix test
2019-10-01 10:06:25 -07:00
Edward Oakes 86610a30c9 [flaky test] Fix flaky checkpointing tests (#5791)
* Fix flaky checkpointing tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix import

* fix build
2019-09-27 11:03:07 -07:00
Robert Nishihara 18ce7bda2b Fix flaky test_actors_and_tasks_with_gpus_version_two test. (#5756) 2019-09-25 11:47:47 -07:00
Edward Oakes d499601bd7 Fix flaky checkpoint tests (#5778) 2019-09-25 10:55:17 -07:00
Simon Mo fc9f03cd96 Fix queue actor init in setup_queue_actor fixture (#5676) 2019-09-13 12:35:44 -07:00
Edward Oakes 07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Simon Mo 147e7d46ec [Flaky tests] Fix test fork (#5671)
* Start testing test_fork

The queue actor may take too long to initialize, which would explain
why we are seeing "Many python processes started": most of the Python
tasks are blocked on ray.get

* Add a comment
2019-09-09 19:21:20 -07:00
Robert Nishihara 87adb5a3c8 Remove timeout in test_actor_lifetime_load_balancing. (#5659) 2019-09-08 16:10:59 -07:00