Commit Graph

  • c583113d66 [Ax] Align optimization mode and reported SEM with Ax (#13611) Lena Kashtelyan 2021-01-28 13:01:51 -05:00
  • b01b0f80aa [RLlib] Fix multiple Unity3DEnvs trying to connect to the same custom port (#13519) Yuri Rocha 2021-01-28 21:28:08 +09:00
  • d4ef5c5993 [RLlib] Atari-RAM-Preprocessing, unsigned observation vector results in a false preprocessed observation (#13013) cathrinS 2021-01-28 12:07:00 +01:00
  • 56ee6ef55f [GCS]only update states related fields when publish actor table data (#13448) Tao Wang 2021-01-28 11:12:57 +08:00
  • cb95ff1e56 [Serve] Add "endpoint registered" message to router log (#13752) architkulkarni 2021-01-27 19:03:15 -08:00
  • 4f1f558802 [Core] Hotfix Windows Compilation Error for ClusterTaskManager (#13754) Simon Mo 2021-01-27 19:01:56 -08:00
  • c10abbb1bb Revert "[Serve] Fix ServeHandle serialization (#13695)" (#13753) Simon Mo 2021-01-27 17:47:42 -08:00
  • 2e01d5d26e Report failed deserialization of errors in Ray client Eric Liang 2021-01-27 17:37:50 -08:00
  • 0e7343ec19 [docs] Fix MLflow / Tune example in documentation (#13740) Zhe Zhang 2021-01-27 17:16:29 -08:00
  • 40234ad631 [autoscaler][AWS] Make sure subnets belong to same VPC as user-specified security groups (#13558) Dmitri Gekhtman 2021-01-27 17:00:52 -08:00
  • 28cf5f91e3 [docs] change MLFlow to MLflow in docs (#13739) architkulkarni 2021-01-27 16:53:15 -08:00
  • 25fa391193 [Core] Add private on_completed callback for ObjectRef (#13688) Simon Mo 2021-01-27 16:32:00 -08:00
  • 32ec0d205f [Object Spilling] Remove job id from the io worker log name. (#13746) SangBin Cho 2021-01-27 16:26:32 -08:00
  • bdf0c00989 Revert "Revert "[CLI] Fix Ray Status with ENV Variable set (#13707) (#13726) Ian Rodney 2021-01-27 15:33:33 -08:00
  • c0fe816466 [Core/Autoscaler] Properly clean up resource backlog from (#13727) Alex Wu 2021-01-27 15:30:58 -08:00
  • 3644df415a [CI] Add retry to java doc test (#13743) Simon Mo 2021-01-27 14:18:06 -08:00
  • 56a9523020 Fix high CPU usage in object manager due to O(n^2) iteration over active pulls list (#13724) Eric Liang 2021-01-27 14:02:22 -08:00
  • c5209e2dab [Docker] default to /home/ray (#13738) Ian Rodney 2021-01-27 13:46:07 -08:00
  • b4bcb9b60a [Docker] Use Cuda 11 (#13691) Ian Rodney 2021-01-27 13:45:30 -08:00
  • eba698d48e Remove docs for install-nightly (#13744) Eric Liang 2021-01-27 13:10:45 -08:00
  • 202fbdf38c [Serve] Fix ServeHandle serialization (#13695) architkulkarni 2021-01-27 12:11:31 -08:00
  • 06fac785b8 [serve] Fix whacky worker replica failure test (#13696) Edward Oakes 2021-01-27 14:05:37 -06:00
  • 2d34e95c93 Don't gather check_parent_task on Windows, since it's undefined. (#13700) Clark Zinzow 2021-01-27 10:19:58 -07:00
  • c5b645e3da [tune] add type hints to tune.run(), fix abstract methods of ProgressReporter (#13684) Kai Fricke 2021-01-27 16:43:50 +01:00
  • 2664a2a8f6 [tune] fix non-deterministic category sampling by switching back to np.random.choice (#13710) Kai Fricke 2021-01-27 16:42:44 +01:00
  • 7f6d326ad8 [Placement Group]Add detached support for placement group. (#13582) DK.Pino 2021-01-27 18:51:26 +08:00
  • d2963f4ee1 [Object Spilling] Clean up FS storage upon sigint for ray.init(). (#13649) SangBin Cho 2021-01-26 23:10:29 -08:00
  • 8baafacb1e [Logging] Log rotation config (#13375) SangBin Cho 2021-01-26 20:15:55 -08:00
  • 9cf0c49015 [CI] Skip test_multi_node_3 on Windows (#13723) Simon Mo 2021-01-26 16:12:13 -08:00
  • 4db0a31130 [Core] Better error if /dev/shm is too small (#13624) Ian Rodney 2021-01-26 15:26:45 -08:00
  • 4f4e1b664b Fix multiprocessing starmap to allow passing in zip (#13664) Rand Xie 2021-01-26 14:15:35 -08:00
  • 2f482193b9 Revert "[CLI] Fix Ray Status with ENV Variable set (#13707)" (#13719) Simon Mo 2021-01-26 14:14:51 -08:00
  • ab6a634a94 [Serve] Revert "Revert "[Serve] Refactor BackendState" (#13626) (#13697) Ian Rodney 2021-01-26 13:31:01 -08:00
  • f490e2be43 [ray_client] Fix and extend get_actor test to detached actors (#13016) Barak Michener 2021-01-26 13:19:51 -08:00
  • 6b477dd37a [CI] Split test_multi_node to avoid timeouts (#13712) Amog Kamsetty 2021-01-26 12:06:19 -08:00
  • 0c46d09940 [ray_client]: Monitor client stream errors (#13386) Barak Michener 2021-01-26 10:56:56 -08:00
  • 5d82654022 [CLI] Fix Ray Status with ENV Variable set (#13707) Ian Rodney 2021-01-26 10:29:42 -08:00
  • ddcbd229ba Rename the ray.operator module to ray.ray_operator (#13705) Dmitri Gekhtman 2021-01-26 10:29:07 -08:00
  • 4aff86bfa7 [CI] skip failing java tests (#13702) Amog Kamsetty 2021-01-26 10:17:58 -08:00
  • 5d882b062d [Serve] fix k8s doc (#13713) Edward Oakes 2021-01-26 12:09:13 -06:00
  • 148b1022d6 [tune](deps): Bump autogluon-core in /python/requirements (#13698) dependabot[bot] 2021-01-26 11:32:56 +01:00
  • ef1f7e4d42 [tune](deps): Bump smart-open[s3] in /python/requirements (#13699) dependabot[bot] 2021-01-26 11:32:17 +01:00
  • 7a78f4e959 [Collective][PR 4/6] NCCL Communicator caching and preliminary stream management (#13030) Hao Zhang 2021-01-26 04:05:21 -05:00
  • c589de6bc8 Version bump Alex Wu 2021-01-25 19:37:09 -08:00
  • 840987c7af Scalability Envelope Tests (#13464) Alex Wu 2021-01-25 18:48:31 -08:00
  • f2867b0609 [CI] Remove object_manager_test (#13703) Simon Mo 2021-01-25 17:33:41 -08:00
  • fe8262afd0 Add K8s test to release process (#13694) Simon Mo 2021-01-25 16:53:52 -08:00
  • 8b8d6b984b [Buildkite] Add all Python tests (#13566) Simon Mo 2021-01-25 16:05:59 -08:00
  • 0d75f37c1f [tune](deps): Bump distributed in /python/requirements (#13643) dependabot[bot] 2021-01-26 00:03:38 +01:00
  • 9feae90e3b skip test_spill (#13693) Amog Kamsetty 2021-01-25 14:37:07 -08:00
  • d96a9fa192 Revert "Revert "[dashboard] Fix RAY_RAYLET_PID KeyError on Windows (#12948)" (#13572)" (#13685) Amog Kamsetty 2021-01-25 10:35:25 -08:00
  • 1c77cc7e23 [docs] Remove API warning from mp.Pool (#13683) Edward Oakes 2021-01-25 11:59:46 -06:00
  • 79209110c5 [kubernetes][operator][hotfix] Dictionary fix (#13663) Dmitri Gekhtman 2021-01-25 08:40:59 -08:00
  • f9f2bfa778 [Metric] Fix crashed when register metric view in multithread (#13485) Lingxuan Zuo 2021-01-25 20:32:08 +08:00
  • db2c836587 [Placement Group] Move PlacementGroup public method to interface. (#13629) DK.Pino 2021-01-25 20:14:21 +08:00
  • b4702de1c2 [RLlib] move evaluation to trainer.step() such that the result is properly logged (#12708) Maltimore 2021-01-25 12:56:00 +01:00
  • 964689b280 [RLlib] Fix bug in ModelCatalog when using custom action distribution (#12846) Jan Blumenkamp 2021-01-25 11:42:39 +00:00
  • 9423930bcc [RLlib] MAML: Add cartpole mass test for PyTorch. (#13679) Sven Mika 2021-01-25 12:32:41 +01:00
  • e9103eeb6d [Java] [Test] Move multi-worker config to ray.conf file (#13583) Kai Yang 2021-01-25 18:07:45 +08:00
  • 4dabf017ee Close #12031 (Autoscaler is overriding your resource for same quantity) (#13671) Ameer Haj Ali 2021-01-25 02:31:53 +02:00
  • edbb2937d3 [Object Spilling] Multi node file spilling V2. (#13542) SangBin Cho 2021-01-23 23:15:32 -08:00
  • e675e5b75a [ray_client]: Add more retry logic (#13478) Barak Michener 2021-01-23 23:11:39 -08:00
  • b7dd7ddb52 deprecate useless fields in the cluster yaml. (#13637) Ameer Haj Ali 2021-01-23 22:06:51 +02:00
  • 17760e1510 [tune] update Optuna integration to 2.4.0 API (#13631) Kai Fricke 2021-01-23 09:32:37 +01:00
  • 8ef835ff03 Remove idle actor from worker pool. (#13523) Qing Wang 2021-01-23 13:57:30 +08:00
  • 01d74af89d [horovod] Horovod+Ray Pytorch Lightning Accelerator (#13458) Amog Kamsetty 2021-01-22 16:30:10 -08:00
  • 25e1b78eed [Dependencies] Move requirements.txt to requirements directory. (#13636) Amog Kamsetty 2021-01-22 16:29:05 -08:00
  • 0c3d9a3eaa [Metrics] Fix serialization for custom metrics (#13571) architkulkarni 2021-01-22 12:11:59 -08:00
  • c4a710369b Revert "[dashboard] Fix RAY_RAYLET_PID KeyError on Windows (#12948)" (#13572) Amog Kamsetty 2021-01-22 12:10:24 -08:00
  • 7fec19dad2 [kubernetes][operator][minutiae] Backwards compatibility of operator (#13623) Dmitri Gekhtman 2021-01-22 12:07:25 -08:00
  • d629292d63 [RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. (#13634) Sven Mika 2021-01-22 19:36:02 +01:00
  • da5928304a [Metrics] Cache metrics ports in a file at each node (#13501) architkulkarni 2021-01-22 09:59:20 -08:00
  • 90f1e408de [Java] Add fetchLocal parameter in Ray.wait() (#13604) Kai Yang 2021-01-22 17:55:00 +08:00
  • 00c14ce4a4 [Object Spilling] Skip flaky tests (#13628) Amog Kamsetty 2021-01-22 00:31:33 -08:00
  • 39755fdb20 Revert "[Serve] Refactor BackendState" (#13626) Amog Kamsetty 2021-01-21 23:06:15 -08:00
  • aa5d7a5e6c [Dashboard]Don't set node actors when node_id of actor is Nil (#13573) Tao Wang 2021-01-22 12:18:34 +08:00
  • 4ecd29ea2b [dashboard] Fixes dashboard issues when environments have set http_proxy (#12598) Xianyang Liu 2021-01-22 12:10:01 +08:00
  • 1fbb752f42 [autoscaler] remove worker_default_node_type that is useless. (#13588) Ameer Haj Ali 2021-01-22 03:04:38 +02:00
  • 4e01a9ec38 [Autoscaler] Ensure ubuntu is owner of docker host mount folder (#13579) Nikita Vemuri 2021-01-21 17:01:55 -08:00
  • 0998d69968 [core] Admission control for pulling objects to the local node (#13514) Stephanie Wang 2021-01-21 16:46:42 -08:00
  • ccc901f662 add 3.8 (#13608) Amog Kamsetty 2021-01-21 16:38:51 -08:00
  • 20acc3b05e Revert "Inline small objects in GetObjectStatus response. (#13309)" (#13615) Amog Kamsetty 2021-01-21 16:10:34 -08:00
  • 87ca102c93 [Kubernetes] Unit test for cluster launch and teardown using K8s Operator (#13437) Dmitri Gekhtman 2021-01-21 10:00:37 -08:00
  • 68038741ac [serve] Refactor BackendState to use ReplicaState classes (#13406) Ian Rodney 2021-01-21 09:16:02 -08:00
  • a82fa80f7b Inline small objects in GetObjectStatus response. (#13309) Clark Zinzow 2021-01-21 10:15:18 -07:00
  • 92f1e0902e [Java] Fix return of java doc (#13601) Kai Yang 2021-01-21 23:57:20 +08:00
  • 587f207c2f [RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) Michael Luo 2021-01-21 07:43:55 -08:00
  • d11e62f9e6 [RLlib] Fix problem in preprocessing nested MultiDiscrete (#13308) Saeid 2021-01-21 15:36:11 +00:00
  • daf0bef285 [RLlib] Dreamer: Fix broken import and add compilation test case. (#13553) Sven Mika 2021-01-21 16:30:26 +01:00
  • b9ac3878ae [Autoscaler] Display node status tag in autoscaler status (#13561) Alex Wu 2021-01-20 19:20:54 -08:00
  • a09997dc9e [Core] Remove 'PlasmaBuffer' in the buffer header (#13188) Siyuan (Ryans) Zhuang 2021-01-20 12:01:44 -08:00
  • b796de4104 [metrics] Check that all tag_keys are set when recording (#13420) Edward Oakes 2021-01-20 13:09:44 -06:00
  • fd6882176a Fix for operator role definition to add raycluster/finalizer (#13567) dmatch01 2021-01-20 14:02:02 -05:00
  • 8804758409 [xgboost] Add XGBoost release tests (#13456) Kai Fricke 2021-01-20 18:40:23 +01:00
  • e6412efdf5 Extra fix ray client newline (#13577) Eric Liang 2021-01-20 09:23:14 -08:00
  • 2e7c2b774f [Core] add thread name to help performance profiling (#13506) ZhuSenlin 2021-01-20 20:34:28 +08:00
  • 6c23bef2a7 [tune] Allow actor reuse for new trials (#13549) Kai Fricke 2021-01-20 11:25:33 +01:00
  • 800304acfb [tune] wandb - WandbLogger now also accepts wandb.data_types.Video (#13169) Daan Klijn 2021-01-20 10:19:54 +01:00
  • d0f224d5cf Revert "Pipe monitor.err logs to driver" (#13574) Eric Liang 2021-01-20 00:29:19 -08:00
  • b2a6e55289 [GCS]Only publish fields used by sub clients in WorkerTableData (#13508) Tao Wang 2021-01-20 16:14:59 +08:00