31 Commits

Author SHA1 Message Date
SangBin Cho 4ad79ca963 [Object Spilling] Remove LRU eviction (#13977)
* done.

* formatting.

* done.

* done.
2021-02-15 14:24:53 -08:00
architkulkarni 75568f856c skip restart and multi restart test on win (#14084) 2021-02-14 15:17:54 -08:00
Hao Chen e1a5e5bad4 Fix test_actor_restart (#13901) 2021-02-05 14:08:43 -08:00
Tao Wang 44aa9c173f Rename timeout to period with heartbeat interval (#13872) 2021-02-04 10:37:28 +08:00
Edward Oakes 62d6b0a558 Fix max_task_retries for named actors (#12762) 2020-12-10 18:24:55 -06:00
Eric Liang 48dee789b3 Add random actor placement; fix cancellation callback; update test skips (#11684) 2020-10-30 18:36:35 -07:00
Eric Liang 2a204260a8 [api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
Eric Liang 519354a39a [api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
Eric Liang bd245a1c18 [api] Clean up and document Actor name / lifetime API (#10332) 2020-08-27 13:38:39 -07:00
Stephanie Wang f75dfd60a3 [api] API deprecations and cleanups for 1.0 (internal_config and Checkpointable actor) (#10333)
* remove

* internal config updates, remove Checkpointable

* Lower object timeout default

* remove json

* Fix flaky test

* Fix unit test
2020-08-27 10:19:53 -07:00
kisuke95 28b1f7710c [Core] Error info pubsub (Remove ray.errors API) (#9665) 2020-08-04 14:04:29 +08:00
SangBin Cho 914cc96c91 Fix broken actor failure tests. (#9737) 2020-07-27 18:59:44 -07:00
Robert Nishihara db0d6e8efa Make wait_for_condition raise exception when timing out. (#9710) 2020-07-26 22:56:32 -07:00
ZhuSenlin a269ae9bc4 [GCS] Fix actor task hang when its owner exits before local dependencies resolved (#8045) 2020-07-27 10:56:52 +08:00
Stephanie Wang baf4be245d Fix flaky test_actor_failures::test_actor_restart (#9509)
* Fix flaky test

* os exit
2020-07-16 10:48:33 -07:00
Stephanie Wang 6d99aa34a5 [core] Handle out-of-order actor table notifications (#9449)
* Drop stale actor table notifications

* build

* Add num_restarts to disconnect handler

* Unit test and increment num_restarts on ALIVE, not RESTARTING

* Wait for pid to exit
2020-07-14 22:55:04 -07:00
Hao Chen d49dadf891 Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
Ian Rodney 9172f8c3a6 [core] Store Internal Config in GCS (#8921) 2020-07-08 11:22:08 -05:00
SangBin Cho 557da7044f Fix flaky test that says ray.init is called twice. (#9234) 2020-07-06 15:19:00 -07:00
fangfengbin 8fcfcc4100 GCS server error handling for actor creation (#8899) 2020-07-02 16:27:32 +08:00
SangBin Cho 7af6c69672 [Test] Cluster util fix (#8929) 2020-06-29 14:15:41 -05:00
Lingxuan Zuo e594524ed3 [GCS] global state query node info table from GCS. (#8498) 2020-05-28 16:39:13 +08:00
mehrdadn ebf060d484 Make more tests run on Windows (#8446)
* Remove worker Wait() call due to SIGCHLD being ignored

* Port _pid_alive to Windows

* Show PID as well as TID in glog

* Update TensorFlow version for Python 3.8 on Windows

* Handle missing Pillow on Windows

* Work around dm-tree PermissionError on Windows

* Fix some lint errors on Windows with Python 3.8

* Simplify torch requirements

* Quiet git clean

* Handle finalizer issues

* Exit with the signal number

* Get rid of wget

* Fix some Windows compatibility issues with tests

Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Stephanie Wang bd169749e0 Option to retry failed actor tasks (#8330)
* Python

* Consolidate state in the direct actor transport, set the caller starts at

* todo

* Remove unused

* Update and unit tests

* Doc

* Remove unused

* doc

* Remove debug

* Update src/ray/core_worker/transport/direct_actor_transport.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* Update src/ray/core_worker/transport/direct_actor_transport.cc

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* lint and fix build

* Update

* Fix build

* Fix tests

* Unit test for max_task_retries=0

* Fix java?

* Fix bad test

* Cross language fix

* fix java

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-05-15 20:15:15 -07:00
Max Fitton 00325eb2b2 Rename max_reconstructions to max_restarts and use -1 for infinite (#8274)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-05-14 10:30:29 -05:00
Zhijun Fu a7a5d172b1 [core] fix bug that actor tasks from reconstructed actor are ignored by scheduling queue (#7637) 2020-03-21 13:05:24 +08:00
Edward Oakes c1b0f9ccdf Add failure tests to test_reference_counting (#7400) 2020-03-17 10:30:21 -05:00
mehrdadn a0700e2f86 Change /tmp to platform-specific temporary directory (#7529) 2020-03-16 18:10:14 -07:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara 39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* separate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00