Commit Graph

  • ea4154df80 [Hotfix] Master compilation error on MacOS. (#13946) Simon Mo 2021-02-05 16:07:45 -08:00
  • cbd3598970 [tune] Fixed wait_for_gpu to handle str representations of ordinal IDs (#13936) Travis Addair 2021-02-05 15:41:24 -08:00
  • e1a5e5bad4 Fix test_actor_restart (#13901) Hao Chen 2021-02-06 06:08:43 +08:00
  • 4a3dd6858d Buildkite determine-to-run support (#13866) Simon Mo 2021-02-05 12:58:07 -08:00
  • 81f0796841 xgboost cpu small autoscaler yaml Alex 2021-02-05 20:28:48 +00:00
  • 189f38c22b [Tune] Add try-except to FailureInjectorCallback (#13939) Amog Kamsetty 2021-02-05 11:02:42 -08:00
  • 5f61ace191 autoscaler yaml for long running distributed Alex 2021-02-05 19:40:02 +00:00
  • f44f368eae [Tune] Add try-except to FailureInjectorCallback (#13939) Amog Kamsetty 2021-02-05 11:02:42 -08:00
  • f782ed59a0 Ray client version check strict eq (#13926) Eric Liang 2021-02-05 00:06:10 -08:00
  • eee624cf5f Revert "Fix passing env on windows (#13253)" (#13828) fyrestone 2021-02-05 13:03:16 +08:00
  • 8a5999c12a [GCS]Fix bug that gcs client does not set last_resource_usage_ (#13856) fangfengbin 2021-02-05 11:51:25 +08:00
  • fb89f9c2c8 [Placement Group] Support named placement group (#13755) DK.Pino 2021-02-05 11:04:51 +08:00
  • 75886c8e78 Merge branch 'releases/1.2.0' of github.com:ray-project/ray into releases/1.2.0 Alex 2021-02-05 02:45:12 +00:00
  • 40beec569c long running distributed fails Alex 2021-02-05 02:44:45 +00:00
  • c2a46846f2 long running distributed fails Alex 2021-02-05 02:43:55 +00:00
  • 40bad86c7a [hotfix][test][windows] Exclude k8s operator mock test from build. (#13924) Dmitri Gekhtman 2021-02-04 18:35:10 -08:00
  • 982c606b86 Add more user-friendly error message upon async def remote task (#13915) Kathryn Zhou 2021-02-04 21:33:33 -05:00
  • a0ff0defac scalability tests run Alex 2021-02-05 01:18:50 +00:00
  • e89bbcbd44 [Serve] Revert "Revert "[Serve] Fix ServeHandle serialization"" and disable failing Windows test (#13771) architkulkarni 2021-02-04 14:50:01 -08:00
  • 7af0c999f3 [serve] Built-in support for imported backends (#13867) Edward Oakes 2021-02-04 15:09:12 -06:00
  • db59736b1a [autoscaler][kubernetes] Add ability to not copy cluster config to head node when calling create_or_update_head_node. (#13720) Dmitri Gekhtman 2021-02-04 10:30:03 -08:00
  • 1e113d2e6e [tune/xgboost] Update release test docs (#13880) Kai Fricke 2021-02-04 13:10:56 +01:00
  • 6c77aeb98a [docs] ray slack remove banners (#13898) Richard Liaw 2021-02-04 01:14:34 -08:00
  • 0fc81e2393 [tune] fix gpu check (#13825) Richard Liaw 2021-02-04 01:13:58 -08:00
  • e79a380a7e Check in shuffle code as experimental (#13899) Eric Liang 2021-02-04 00:24:16 -08:00
  • 243f678ffd Fall back to random port instead of default port for non-primary Redis shards; attempt to cluster Redis shard ports close to each other. (#13847) Clark Zinzow 2021-02-03 23:00:15 -07:00
  • a13208f113 Scalability envelope readme typo (#13874) Alex Wu 2021-02-03 21:43:45 -08:00
  • 44aa9c173f Rename timeout to period with heartbeat interval (#13872) Tao Wang 2021-02-04 10:37:28 +08:00
  • e0d9c8f0a8 Always replace DEL with UNLINK (#13832) Tao Wang 2021-02-04 10:30:00 +08:00
  • 1187d1dd3e [autoscaler][kubernetes][operator] Rudimentary error handling, make "MODIFIED" -> update event work. (#13756) Dmitri Gekhtman 2021-02-03 18:07:11 -08:00
  • e8fce9f1f3 Check Ray client protocol version (#13886) Eric Liang 2021-02-03 16:44:09 -08:00
  • 4c71f76b25 [Release] Fix SGD+Tune long running distributed release test (#13812) Amog Kamsetty 2021-01-31 21:05:50 -08:00
  • 34e0dfe934 [Core] Put raylet ip's in resource usage report (#13871) Alex Wu 2021-02-03 11:28:56 -08:00
  • 407302f93a [Core] Ownership-based Object Directory - Changed infinite short-poll location subscription to long-poll. (#13841) Clark Zinzow 2021-02-03 15:16:42 -07:00
  • cb9fa90203 [Object Spilling] Add consumed bytes to detect thrashing. (#13853) SangBin Cho 2021-02-03 14:16:26 -08:00
  • 77ee2c569f [ray_client] convert things registered for ray into ray_client (#13639) Barak Michener 2021-02-03 13:30:05 -08:00
  • f14171ced9 [Core] Put raylet ip's in resource usage report (#13871) Alex Wu 2021-02-03 11:28:56 -08:00
  • 79310452e7 Enabling the cancellation of non-actor tasks in a worker's queue 2 (#13244) Gabriele Oliaro 2021-02-03 13:20:12 -05:00
  • 875ea3fe1d [docs] Update actors.rst (#13873) Haoyuan Ge 2021-02-04 01:51:53 +08:00
  • a695c651ee [serve] Small cleanups for BackendState (#13870) Edward Oakes 2021-02-03 11:46:25 -06:00
  • 2a903b904a [joblib] Log once the context warning argument. (#13865) Ameer Haj Ali 2021-02-03 10:23:20 +02:00
  • c8c20ca73c Scalability envelope readme typo wuisawesome-patch-1 Alex Wu 2021-02-02 19:34:15 -08:00
  • d335ce2aab Move the tune driver into a remote task (#13778) Eric Liang 2021-02-02 18:41:45 -08:00
  • b4684cf37a Fix bug that otal_commands_queued_ is not initialized (#13852) fangfengbin 2021-02-03 10:00:15 +08:00
  • c8e1f07c52 remove starlette install instruction (#13869) architkulkarni 2021-02-02 14:37:55 -08:00
  • 32fc649f39 [serve] Add example code for custom status code response (#13868) architkulkarni 2021-02-02 14:30:45 -08:00
  • fc956e084a [Hotfix] Lint (#13864) Edward Oakes 2021-02-02 14:56:50 -06:00
  • 863c1b8282 Add podman support (#13633) James 2021-02-02 14:09:43 -05:00
  • 9ac731558b [RLlib] Unify fcnet initializers for the value output layer (std=1.0 in torch, but 0.01 in tf). (#13733) Sven Mika 2021-02-02 18:42:49 +01:00
  • 0a0d9183fe [RLlib] Trajectory view API example script (enhancements and tf2 support). (#13786) Sven Mika 2021-02-02 18:42:18 +01:00
  • a6138ca31f [serve] Support batches for ImportedBackends (#13843) Edward Oakes 2021-02-02 09:44:01 -06:00
  • d29fcfb45c [tune] catch SIGINT signal and trigger experiment checkpoint (#13767) Kai Fricke 2021-02-02 14:52:09 +01:00
  • b9c15a2551 [RLlib] Issue #13761: Fix get action shape (#13764) Stanislav Chekmenev 2021-02-02 13:13:43 +01:00
  • 714c367b9d [RLlib] Trainer._validate_config idempotentcy correction (issue 13427) (#13556) Raoul Khouri 2021-02-02 07:11:57 -05:00
  • 0c93bb77cb [RLlib] Update Documentation for Curiosity's support of continuous actions (#13784) QuantumMecha 2021-02-02 22:40:09 +10:30
  • 52c94b7ee9 [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) Sven Mika 2021-02-02 13:05:58 +01:00
  • fa4290090d Add Ray client protocol version (#13846) Eric Liang 2021-02-02 00:19:08 -08:00
  • 26beb3b67b Revert "Revert "Enable Ray client server by default (#13350)" (#13429)" (#13442) Eric Liang 2021-02-02 00:17:29 -08:00
  • 88ab887cc4 Unconditionally retry all RPC errors on client connect (#13845) Eric Liang 2021-02-02 00:10:35 -08:00
  • d71eeac2d6 remove lru evict docs (#13849) Eric Liang 2021-02-02 00:07:47 -08:00
  • 886217c333 [Object Spilling] Skip normal ray.get path when spilling objects. (#13831) SangBin Cho 2021-02-01 16:03:34 -08:00
  • e4d30430c0 Fix naming of ray_spilled_objects directory Eric Liang 2021-02-01 15:46:40 -08:00
  • 26ba95e96d [python/ray]: add cloudpickle dependency (#13838) Barak Michener 2021-02-01 15:27:39 -08:00
  • 1ee5d5faff [AWS] Fill-in AMI if not provided (#13808) Ian Rodney 2021-02-01 14:30:48 -08:00
  • 55566bc797 [ray_client]: Add python version check and test (and some minor fixes along the way) (#13722) Barak Michener 2021-02-01 13:04:38 -08:00
  • 754bee9282 [core][object spillin] Fix bugs in admission control (#13781) Stephanie Wang 2021-02-01 10:48:21 -08:00
  • 6e53a71978 bug fix for doc (#13834) SongGuyang 2021-02-01 21:13:43 +08:00
  • 361e5f0bef support dynamic library loading in C++ worker (#13734) SongGuyang 2021-02-01 19:24:33 +08:00
  • 1d2ab018b0 Use right reserve size (#13829) Tao Wang 2021-02-01 15:49:34 +08:00
  • 9d7b8b58a2 [autoscaler] Remove min workers from multi node type examples (#13814) Ameer Haj Ali 2021-02-01 09:29:57 +02:00
  • d1ec787d9d [Object Spilling] Turn on by default. (#13745) SangBin Cho 2021-01-31 23:28:37 -08:00
  • 2ba77ae3a2 [Release] Fix SGD+Tune long running distributed release test (#13812) Amog Kamsetty 2021-01-31 21:05:50 -08:00
  • b5f0aed974 [Log] use default stderr logger if no raylog starting (#13762) Lingxuan Zuo 2021-02-01 11:13:06 +08:00
  • 660857ffab Fix windows test (#13811) Ameer Haj Ali 2021-01-30 07:10:59 +02:00
  • ceb60965ae rllib regression Alex 2021-01-30 04:32:18 +00:00
  • 4b60c388ef [Dashboard] fix new dashboard entrance and some table problem (#13790) Dominic Ming 2021-01-30 10:42:16 +08:00
  • 30f82329e3 [core] Add debug information for the PullManager and LocalObjectManager (#13782) Stephanie Wang 2021-01-29 17:55:46 -08:00
  • a3796b3ed5 [CI] Add other Travis Linux builds to buildkite (#13769) Simon Mo 2021-01-29 15:48:02 -08:00
  • 194656731d [CI] Deflake test_basics and skip test_component_failures_3 (#13801) Simon Mo 2021-01-29 15:47:21 -08:00
  • 50808024eb Revert "[autoscaler] Better validation for min_workers and max_workers (#13779)" (#13807) Simon Mo 2021-01-29 15:43:01 -08:00
  • 115afee4c3 stress tests done Alex 2021-01-29 21:24:15 +00:00
  • 9fd198635f stress tests done Alex 2021-01-29 21:24:07 +00:00
  • 9441f85e1a [client] Hook runtime context (#13750) Barak Michener 2021-01-29 12:58:41 -08:00
  • c21a79ae6e [Object Spilling] 100GB shuffle release test (#13729) SangBin Cho 2021-01-29 12:38:06 -08:00
  • 1a9a0024d5 [Wheel] Build Py36 & Py38 in separate deploy (#13797) Ian Rodney 2021-01-29 12:28:40 -08:00
  • 0b598c0f05 [Serialization] API for deregistering serializers; code & doc cleanup (#13471) Siyuan (Ryans) Zhuang 2021-01-29 10:27:05 -08:00
  • b20a38febb [autoscaler] Avoid launching GPU nodes when the workload only has CPU tasks. (#13776) Eric Liang 2021-01-29 09:50:28 -08:00
  • 4d6817c683 [autoscaler] Better validation for min_workers and max_workers (#13779) Ameer Haj Ali 2021-01-29 19:41:56 +02:00
  • 9a413144b1 [tune] dynamic global checkpointing interval (#13736) Kai Fricke 2021-01-29 17:14:46 +01:00
  • 0f3a3e14aa Only delete local object in CoreWorkerPlasmaStoreProvider:::WarmupStore (#13788) Hao Chen 2021-01-29 20:24:09 +08:00
  • 752da83bb7 [Dashboard] Add the new dashboard code and prompt users to try it (#11667) Dominic Ming 2021-01-29 15:22:26 +08:00
  • 42d501d747 [core] Pin arguments during task execution (#13737) Stephanie Wang 2021-01-28 19:07:10 -08:00
  • 813a7ab0e2 [docker] Build Python3.6 & Python3.8 Docker Images (#13548) Ian Rodney 2021-01-28 15:24:50 -08:00
  • 0c906a8b93 [Docker] usage of python-version (#13011) Tanja Bayer 2021-01-28 23:27:54 +01:00
  • b4d87b8fc5 Fix high CPU usage in object manager due to O(n^2) iteration over active pulls list (#13724) Eric Liang 2021-01-27 14:02:22 -08:00
  • 5c2aedc7d9 [CLI] Fix Ray Status with ENV Variable set (#13707) Ian Rodney 2021-01-26 10:29:42 -08:00
  • 942d603d7e [Core] Hotfix Windows Compilation Error for ClusterTaskManager (#13754) Simon Mo 2021-01-27 19:01:56 -08:00
  • 9a40d7b4ee [Core/Autoscaler] Properly clean up resource backlog from (#13727) Alex Wu 2021-01-27 15:30:58 -08:00
  • cb771f263d [Serve] Add ServeHandle metrics (#13640) architkulkarni 2021-01-28 12:40:47 -08:00
  • 4bc257f4fb [RLlib] Fix custom multi action distr (#13681) Sven Mika 2021-01-28 19:28:48 +01:00