6812 Commits

Author SHA1 Message Date
fangfengbin e261b4778e Adjust the state initialization sequence and put it after core worker google logging initialization (#8511) 2020-05-21 11:30:28 +08:00
Simon Mo ed2f434593 [Serve] Start Replicas in Parallel (#8433) 2020-05-20 19:46:03 -07:00
Edward Oakes a76434ccde Add ability to specify worker and driver ports (#8071) 2020-05-20 15:31:13 -05:00
Sven Mika d76578700d [RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). (#8514) 2020-05-20 22:29:08 +02:00
mehrdadn ebf060d484 Make more tests run on Windows (#8446)
* Remove worker Wait() call due to SIGCHLD being ignored

* Port _pid_alive to Windows

* Show PID as well as TID in glog

* Update TensorFlow version for Python 3.8 on Windows

* Handle missing Pillow on Windows

* Work around dm-tree PermissionError on Windows

* Fix some lint errors on Windows with Python 3.8

* Simplify torch requirements

* Quiet git clean

* Handle finalizer issues

* Exit with the signal number

* Get rid of wget

* Fix some Windows compatibility issues with tests

Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Eric Liang aa7a58e92f [rllib] Support training intensity for dqn / apex (#8396) 2020-05-20 11:22:30 -07:00
Ian Rodney f56b3be916 [Docs] Add Cancelation to main docs. (#8508)
* Update walkthrough.rst

* Adding example

* Better example

* Better example

* Adding Ray Kill Info
2020-05-20 10:31:57 -07:00
Lingxuan Zuo cd706f40c4 [Stats] add nodeaddress tag for stats test (#8423) 2020-05-20 12:30:01 -05:00
Luca Cappelletti c9898eff24 [Tune] Added method to integrate previous analysis in BO (#8486) 2020-05-19 23:26:43 -07:00
Bill Chambers f8f7efc24f [Serve] Rename RayServe -> "Ray Serve" in Documentation (#8504) 2020-05-19 19:13:54 -07:00
Edward Oakes 85cb721f19 [serve] Fix worker replica leak (#8506) 2020-05-19 20:51:50 -05:00
Simon Mo c9c84c87f4 [Serve] Add Instructions for GPU (#8495) 2020-05-19 18:33:58 -07:00
Ian Rodney 1163ddbe45 Remove timeouts in test_cancel (#8272) 2020-05-19 12:35:16 -05:00
mehrdadn 8da084bc54 Try to address linting issues (#8485) 2020-05-19 10:29:17 -05:00
internetcoffeephone a73c488c74 Change tf_utils.py get_weights to evaluate all tensors at once rather than calling tensor.eval per-tensor. (#8491) 2020-05-18 22:06:03 -07:00
Hao Chen 6c5ea32857 Fix installing pickle5-backport for Python 3.8.2 (#8453) 2020-05-18 17:03:13 -07:00
Luca Cappelletti 5b330de182 [Tune] Introduced patience to early stopping (#8484) 2020-05-18 13:12:16 -07:00
Luca Cappelletti d1ef70da16 [Tune] Added default values for utility kwargs (#8488) 2020-05-18 13:10:43 -07:00
Robert Nishihara 14aeb30473 [Serve] Require traffic weights to sum more closely to 1. (#8476) 2020-05-18 11:46:34 -07:00
Max Fitton 0fadc11437 [dashboard] Only show workers from the correct cluster (#8434) 2020-05-18 13:30:41 -05:00
Max Fitton 13231ba63b Rename redis-port to port and add default (#8406) 2020-05-18 13:25:34 -05:00
Robert Nishihara 2cff471d2c Don't print Redis connection warning in ray.init(). (#8475) 2020-05-18 11:19:13 -07:00
Richard Liaw b6c4f45ae0 [tune] Fix links (#8477) 2020-05-18 10:08:29 -07:00
Edward Oakes 9a721ed71a Link to serve in tune overview (#8487) 2020-05-18 11:29:38 -05:00
Sven Mika 796a834c48 [RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371) 2020-05-18 17:26:40 +02:00
fangfengbin 9347a5d10c Add global state accessor of jobs (#8401) 2020-05-18 20:32:05 +08:00
Dennis van der Hoff be1f158747 Added Done to MultiAgentExternalEnv. (#8478)
Co-authored-by: devanderhoff <devanderhoff@hotmail.com>
2020-05-17 16:29:47 -07:00
Richard Liaw 87cbf2aedd [docs][tune] Make search algorithm, scheduler docs better! (#8179) 2020-05-17 12:19:44 -07:00
Luca Cappelletti 2ff26f13d2 [tune] Added EarlyStopping and relative test suite (#8459) 2020-05-17 12:18:59 -07:00
Joseph Lucas 42c9fa19d1 [autoscaler] Ray Up url-arg (#8279) 2020-05-17 12:18:00 -07:00
SangBin Cho 2f01776d09 Fix ray memory example (#8462) 2020-05-17 11:34:11 -05:00
Edward Oakes 16f48078d9 Remove use of ObjectID transport flag (#7699) 2020-05-17 11:29:49 -05:00
Tao Wang acffdb2349 [TEST]use cc_test to run core_worker_test, enforce/reuse RedisServiceManagerForTest (#8443) 2020-05-17 18:43:00 +08:00
Edward Oakes fb23bd6fc0 [serve] Optionally namespace serve clusters (#8447) 2020-05-17 00:14:42 -05:00
Richard Liaw 67c01455fe [tune] tune.track -> tune.report (#8388) 2020-05-16 12:55:08 -07:00
mehrdadn c8cd716295 Restore unset -f cd (#8467)
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-16 09:54:59 -07:00
Stephanie Wang bd169749e0 Option to retry failed actor tasks (#8330)
* Python

* Consolidate state in the direct actor transport, set the caller starts at

* todo

* Remove unused

* Update and unit tests

* Doc

* Remove unused

* doc

* Remove debug

* Update src/ray/core_worker/transport/direct_actor_transport.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* Update src/ray/core_worker/transport/direct_actor_transport.cc

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* lint and fix build

* Update

* Fix build

* Fix tests

* Unit test for max_task_retries=0

* Fix java?

* Fix bad test

* Cross language fix

* fix java

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-05-15 20:15:15 -07:00
Robert Nishihara 41d8c2bd0a [CI] Don't uninstall ruby in Travis. (#8463) 2020-05-15 18:10:55 -07:00
SangBin Cho 1b734ba045 Pin sklearn version (#8465) 2020-05-15 16:54:54 -07:00
Edward Oakes ef498e8aa5 [serve] Add basic session affinity via shard key (#8449) 2020-05-15 16:18:52 -05:00
Sven Mika c9435cad43 WIP. (#8456)
Fix multi-GPU histogram metrics for > 0D tensors.
2020-05-15 21:43:27 +02:00
krfricke 4633d81c39 [tune] added average scope to experiment analysis (#8445) 2020-05-14 15:20:43 -07:00
Edward Oakes ef20564d8e [serve] Pin http proxy to the node that serve.init() is run on (#8436) 2020-05-14 16:38:29 -05:00
Max Fitton 00325eb2b2 Rename max_reconstructions to max_restarts and use -1 for infinite (#8274)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-05-14 10:30:29 -05:00
Sven Mika 5f4c196fed [RLlib] Make PyTorch Model forward pass faster in vf-case. (#8422) 2020-05-14 10:15:50 +02:00
Hao Chen 212f78f735 Small improvements on build.sh (#8418) 2020-05-14 15:30:09 +08:00
fangfengbin 08b612052b Add redis store client AsyncGetAll/AsyncBatchDelete/AsyncDeleteByIndex API (#8390) 2020-05-14 14:38:25 +08:00
Eric Liang eabb801a40 less important (#8439) 2020-05-13 22:52:38 -07:00
Simon Mo 122353c392 [Serve] Fix SKLearn example against newest version (#8428) 2020-05-13 14:09:57 -07:00
Eric Liang 6bf1dc0888 [rllib] [hotfix] Build broken due to merge conflict: MixInReplay has no attribute buffer 2020-05-13 12:21:04 -07:00