41 Commits

Author SHA1 Message Date
Simon Mo 5d2e2532e7 Fix Serve long running test (#8223) 2020-04-29 14:09:33 -07:00
Eric Liang dd70720578 [rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest compatible

* fix test

* fi sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Stephanie Wang 7c174d0ffe Make the ref counting test more stressful (#7473) 2020-03-05 20:51:24 -08:00
Simon Mo 29b08ddc09 Improve release process from 0.8.2 (#7303) 2020-02-24 21:18:53 -08:00
Stephanie Wang 2c1f4fd82c [core] Add long running regression test for distributed ref counting and fix memory leak (#7302)
* Add long running test for serialized IDs and fix mem leak

* comment
2020-02-24 17:58:42 -08:00
Eric Liang 5df801605e Add ray.util package and move libraries from experimental (#7100) 2020-02-18 13:43:19 -08:00
Simon Mo bec92a8946 [Hotfix] Fix flake8 lint failing (#7118) 2020-02-10 19:57:21 -08:00
Simon Mo f6c09ff614 Add serve stress test (#7076) 2020-02-10 09:37:39 -08:00
Edward Oakes b750bd7fc9 Use 2xlarge instances in long running tests (#6802) 2020-01-15 19:47:59 -06:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Philipp Moritz 735f282494 Use 0.9.0.dev0 as the version tag (#6630) 2019-12-30 10:14:07 -08:00
Eric Liang 1a1324d2a2 Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508) 2019-12-16 23:57:42 -08:00
Philipp Moritz f5d10eea0b [Projects] Refactor cluster specification (#6488) 2019-12-14 22:43:06 -08:00
Edward Oakes 032e8553c7 use numpy in long-running tests (#6448) 2019-12-11 17:53:30 -08:00
Edward Oakes f63b64310a Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Philipp Moritz a454c815f1 Fix long running stress tests (#6374) 2019-12-05 18:29:41 -08:00
Philipp Moritz dd27bfbb75 Rename .rayproject to ray-project (#6278) 2019-12-05 16:15:42 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* separate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Edward Oakes abbfe7392f Bump dev version to 0.8.0.dev6 (#5906) 2019-10-14 11:36:13 +01:00
Eric Liang b5da32df78 Bump Ray version in documentation to dev5 (#5794) 2019-09-27 00:19:17 -07:00
Philipp Moritz 57a5871ea6 Convert long running stress tests to projects (#5641) 2019-09-26 11:25:09 -07:00
Eric Liang a101812b9f Replace --redis-address with --address in test, docs, tune, rllib (#5602)
* wip

* add tests and tune

* add ci

* test fix

* lint

* fix tests

* wip

* sugar dep
2019-09-01 16:53:02 -07:00
Robert Nishihara 851c5b2dae Add a script for benchmarking performance for Ray developers. (#5472) 2019-08-19 23:41:23 -07:00
Philipp Moritz ccee77aafd fix node_failures.py (#5167) 2019-07-11 11:40:13 -07:00
Eric Liang 5ab5017c67 [rllib] Fix impala stress test (#5101)
* add copy

* upgrade to tf 1.14

* update

* reduce count to work around https://github.com/ray-project/ray/issues/5125

* Update impala.py

* placeholder

* comments

* update
2019-07-09 20:22:30 -07:00
Eric Liang 904dcf081d Switch cluster longevity tests to DLAMI, fix ray up verbosity (#5084)
* fix

* add branch commit

* comments

* Update ci/long_running_tests/.gitignore

Co-Authored-By: Robert Nishihara <robertnishihara@gmail.com>
2019-07-02 00:19:05 -07:00
Robert Nishihara bcc379556b Make some fixes to long running stress tests. (#5056) 2019-06-28 15:42:54 -07:00
Hersh Godse 89722ff003 [tune] Directional metrics for components (#4120) (#4915) 2019-06-02 22:13:40 -07:00
Robert Nishihara 7a78e1e320 Install bazel in autoscaler development configs. (#4874) 2019-05-26 16:13:50 -07:00
Devin Petersohn fb2655fa93 Update Release Process documentation (#4670) 2019-04-25 00:05:19 -07:00
Philipp Moritz b0f6ddf6d1 Remove CMake files (#4493) 2019-04-02 22:17:33 -07:00
bjg2 77005d1814 [rllib] Make batch timeout for remote workers tunable (#4435) 2019-03-29 13:19:42 -07:00
Robert Nishihara c6f12e5219 Update documentation from 0.7.0.dev1 to 0.7.0.dev2. (#4485) 2019-03-26 17:32:53 -07:00
William Ma 11580fb7dc Changes where actor resources are assigned (#4323) 2019-03-24 15:49:36 -07:00
William Ma f423909aec Temporary fix for many_actor_task.py (#4315) 2019-03-09 00:07:45 -08:00
Robert Nishihara fd2d8c2c06 Remove Jenkins backend tests and add new long running stress test. (#4288) 2019-03-08 15:29:39 -08:00
Philipp Moritz 39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Robert Nishihara f151aa8723 Update long running stress tests and add actor death test. (#4275) 2019-03-06 14:26:45 -08:00
Eric Liang 6e3384a719 [rllib] Add three new long-running stress tests {APEX, IMPALA, PBT} (#4215) 2019-03-04 14:05:42 -08:00
Robert Nishihara c4aa90314d Add script for shutting down tests. (#4203) 2019-03-01 19:56:30 -08:00
Robert Nishihara 75504b9586 Add script for running infinitely long stress tests. (#4163)
Running `./ci/long_running_tests/start_workloads.sh` starts several workloads, each on its own EC2 instance.
- The workloads run forever.
- The workloads all simulate multiple nodes but use a single machine.
- You can get the tail of each workload by running `./ci/long_running_tests/check_workloads.sh`.
- You have to manually shut down the instances.

As discussed with @ericl @richardliaw, the idea here is to optimize for the debuggability of the tests. If one of them fails, you can ssh to the relevant instance and see all of the logs.
2019-02-27 14:33:06 -08:00
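
The workflow described in commit 75504b9586 can be sketched roughly as follows. This is an illustrative outline, not part of the commit: it assumes the current directory is the root of a Ray checkout from around that time and that AWS credentials are already configured. The `run_step` helper and the `TESTS_DIR` variable are hypothetical names introduced here; only the two script paths come from the commit message.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the current directory is the root of a Ray checkout
# that still contains these scripts, with AWS credentials already set up.

TESTS_DIR="./ci/long_running_tests"   # path taken from the commit message

# Hypothetical helper: run a script from TESTS_DIR, or report why it was skipped.
run_step() {
    local script="$TESTS_DIR/$1"
    if [ -x "$script" ]; then
        "$script"
    else
        echo "skipped: $script not found or not executable"
    fi
}

# Start the workloads (each on its own EC2 instance; they run until stopped).
run_step start_workloads.sh

# Print the tail of each workload's output to check on its progress.
run_step check_workloads.sh

# Nothing here stops the instances; per the commit message they must be
# shut down manually.
```

The guard in `run_step` only keeps the sketch harmless outside a matching checkout; inside one, the two calls are equivalent to invoking the scripts directly.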