37 Commits

Author SHA1 Message Date
Eric Liang 8c8af2616e Minimal version of piping autoscaler events to driver logs (#13434) 2021-01-16 10:06:20 -08:00
Eric Liang 602c103eae Make request_resources() use internal kv instead of redis pub sub (#13410) 2021-01-13 17:30:43 -08:00
Philipp Moritz 9872fc1801 Start ray client server with 'ray start' (#13217) 2021-01-06 21:04:14 -08:00
Tao Wang 35f7d84dbe Revert heartbeat interval to keep ci stable (#12836)
* Revert heartbeat interval to keep ci stable

* fix missing one
2020-12-14 16:58:40 +08:00
Tao Wang 295b6e5ce4 Split heartbeat message (#12535)
* first

* xxx

* Split heartbeat message

* only report resource usage when changed

* Fix GetAllResourceUsage

* Fix report resource usage

* Increase default heartbeat interval

* regularize heartbeat interval in test case
2020-12-11 21:19:57 +08:00
Eric Liang fd8ae0697b [autoscaler] Fix test heartbeats single test (#12513)
* update

* update

* update
2020-11-30 21:24:45 -08:00
Eric Liang 569eee5e71 Enable more new scheduler tests (#12421) 2020-11-27 16:10:38 -08:00
Eric Liang e72abcd0aa Enable even more new scheduler tests (#12096) 2020-11-19 16:47:18 -08:00
Ameer Haj Ali 85197deece [autoscaler] Remove legacy autoscaler (#11802) 2020-11-11 13:36:48 -08:00
Eric Liang f9f372c327 [autoscaler] Clean up monitoring loop code (#11677) 2020-10-30 13:48:43 -07:00
Tao Wang 1d5694ddea [GCS]Use direct getting instead of pub-sub to update load metrics in monitor.py (#11339) 2020-10-28 11:23:18 -07:00
Alex Wu 8906c1a59f [Autoscaler] Demand autoscaler take into account utilized resources (#10464) 2020-09-06 20:54:44 -07:00
Stephanie Wang f75dfd60a3 [api] API deprecations and cleanups for 1.0 (internal_config and Checkpointable actor) (#10333)
* remove

* internal config updates, remove Checkpointable

* Lower object timeout default

* remove json

* Fix flaky test

* Fix unit test
2020-08-27 10:19:53 -07:00
Ian Rodney 9172f8c3a6 [core] Store Internal Config in GCS (#8921) 2020-07-08 11:22:08 -05:00
Siyuan (Ryans) Zhuang 4b31b383f3 [Core] Run Plasma Store as a Raylet thread (with a feature flag) (#8897)
* integrate plasma store as a thread (C++)

* integrate plasma store as a thread (Python)

* fix config issues

* remove plasma component fail tests

* without forcefully kill the plasma store thread
2020-06-11 22:54:08 -07:00
fangfengbin 016337d4eb Heartbeat table uses gcs pub-sub instead of redis accessor (#8655) 2020-05-30 23:17:25 +08:00
Edward Oakes cbe494ab13 [flaky test] Fix flaky test_heartbeats_single (#7857) 2020-04-02 16:23:28 -05:00
mehrdadn 4d42664b2a Use prctl(PR_SET_PDEATHSIG) on Linux instead of reaper (#7150) 2020-03-03 11:45:42 -06:00
Eric Liang 5df801605e Add ray.util package and move libraries from experimental (#7100) 2020-02-18 13:43:19 -08:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara 39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Simon Mo e530c37b0e Use localhost and set redis password by default (#6481) 2019-12-17 19:41:19 -08:00
Eric Liang 304b4f0d3d Shard unit tests into medium sized files for test stability (#6398) 2019-12-09 13:15:29 -08:00
Edward Oakes e4f9b3b7d9 Use process reaper for cleanup (#6253) 2019-11-26 22:00:08 -06:00
Eric Liang 30b2fc1d81 Fix actor creation hang due to race in SWAP queue (#6280) 2019-11-26 15:21:03 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* seprate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Simon Mo c8d7065bf3 [CI] Use rerunfailures instead of flaky (#6061)
* Use rerunfailures instead of flaky

* Lint
2019-11-01 13:59:03 -07:00
Edward Oakes 02931e08f3 [core worker] Python core worker task execution (#5783)
Executes tasks via the event loop in the C++ core worker. Also properly handles signals (including KeyboardInterrupt), so ctrl-C in a python interactive shell works now (if connecting to an existing cluster).
2019-10-22 20:15:59 -07:00
Eric Liang 6843a01a7f Automatically create custom node id resource (#5882)
* node id

* comment

* comments

* fix tests
2019-10-15 21:31:11 -07:00
Eric Liang faeaa34bdd Deflake cluster heartbeat test (#5552) 2019-09-11 12:26:04 -07:00
Eric Liang a101812b9f Replace --redis-address with --address in test, docs, tune, rllib (#5602)
* wip

* add tests and tune

* add ci

* test fix

* lint

* fix tests

* wip

* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang e2e30ca507 Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
Richard Liaw 3e0ad11ae0 Add heartbeat test + Fix monitor.py (#5191) 2019-07-16 21:59:48 -07:00
Robert Nishihara 6703519144 Move global state API out of global_state object. (#4857) 2019-05-26 11:27:53 -07:00
Yuhong Guo c2c548bdfd Fix broken pipe callback (#4513) 2019-04-02 17:42:18 +08:00
Yuhong Guo 8ce7565530 Refactor pytest fixtures for ray core (#4390) 2019-03-20 11:48:32 +08:00
William Ma c7a4c74f55 Moving tests from test/ to python/ray/tests/ (#3950) 2019-02-21 11:09:08 -08:00