1169 Commits

Author SHA1 Message Date
Ion 88e14feb53 Reset signal counters when a task finishes (#4173) 2019-02-28 15:15:03 -08:00
Robert Nishihara 9c5fdbb63c Release gil when doing ray.wait. (#4190) 2019-02-28 00:32:07 -08:00
Robert Nishihara 387c98cf01 Make sure dashboard is packaged with wheels. (#4175) 2019-02-27 18:36:49 -08:00
Ion 7395c86a50 A few fixes in receive() signal. (#4142) 2019-02-27 18:00:59 -08:00
Philipp Moritz 9ca9691cdc Fix mnist sgd jenkins tests on master (#4168) 2019-02-27 16:02:18 -08:00
Robert Nishihara 75504b9586 Add script for running infinitely long stress tests. (#4163)
Running `./ci/long_running_tests/start_workloads.sh` will start several workloads running (each in their own EC2 instance).
- The workloads run forever.
- The workloads all simulate multiple nodes but use a single machine.
- You can get the tail of each workload by running `./ci/long_running_tests/check_workloads.sh`.
- You have to manually shut down the instances.

As discussed with @ericl @richardliaw, the idea here is to optimize for the debuggability of the tests. If one of them fails, you can ssh to the relevant instance and see all of the logs.
2019-02-27 14:33:06 -08:00
Yuhong Guo 41b81af11b Downgrade six to 1.0.0 (#4180) 2019-02-27 13:05:25 -08:00
Yuhong Guo 0a11b27971 Fix the case of applying the decorator directly to a raw class and add a test case (#4177) 2019-02-28 00:09:42 +08:00
Wang Qing db5c3b22b7 Fix the issue about starting cross-lang cluster (#4176) 2019-02-27 20:11:58 +08:00
Richard Liaw 5bfcfa8ec8 [autoscaler] Fix Submit (#4174) 2019-02-27 00:02:50 -08:00
Hao Chen d583edb07c Skip test_multithreading in Python 2 (#4107) 2019-02-27 14:06:12 +08:00
Adi Zimmerman 5cf388f29d [tune] Support RESTful API for the Web Server (#4080)
Change the client/server API to a RESTful design. This includes resource modeling, model URIs, and correct HTTP methods.
2019-02-26 21:56:02 -08:00
justinwyang 19b8793b6a Updated test script paths in documentation (#4170) 2019-02-26 16:14:55 -08:00
John Liagouris 89ce4c56aa Initial Skeleton for Streaming API (#4126) 2019-02-26 12:15:08 -08:00
Hao Chen 62055cc01c Cleanup duplicated code of Cython ID types (#4162) 2019-02-26 16:19:12 +08:00
Eric Liang 60dbc771a2 Revert "[autoscaler] Fix redirects, fix submit (#4085)" (#4158)
This reverts commit acf4d53b55.
2019-02-25 17:00:59 -08:00
Eric Liang 3896b726dd Dynamically adjust redis memory usage (#4152)
* f

* Update services.py
2019-02-25 16:21:37 -08:00
Hao Chen 49dc85e54b Fix wrong ID type in prepare_checkpoint (#4124)
* Fix wrong ID type in prepare_checkpoint

* fix

* fix eq
2019-02-25 11:53:09 -08:00
Kristian Hartikainen 524e69a82d [autoscaler] Change the get behavior of node providers' _get_node (#4132)
* Change the get behavior of GCPNodeProvider._get_node

* Add lock around the GCPNodeProvider._get_node call

* rename nodes

* lint

* Update GCPNodeProvider._get_node to match aws implementation

* assert

* log

* log highest heartbeats

* rename

* bringup to connected

* prune heartbeat times

* fix bringup
2019-02-24 18:43:35 -08:00
Eric Liang d9da183c7d [rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
Robert Nishihara 7b04ed059e Move TensorFlowVariables to ray.experimental.tf_utils. (#4145) 2019-02-24 14:26:46 -08:00
Eric Liang 05d96ce81b [rllib] Raise an error if multi-agent envs terminate without a last observation for agents (#4139)
* fix it

* lint

* Update rllib-training.rst
2019-02-23 21:23:40 -08:00
Robert Nishihara 688a0d17e6 Kill dashboard and reporter in ray stop. (#4116) 2019-02-23 12:08:39 -08:00
Philipp Moritz ba52caff37 Make Bazel the default build system (#3898) 2019-02-23 11:58:59 -08:00
Philipp Moritz 9b3ce3e64b Revert inline objects PR (#4125)
* Revert "Inline objects (#3756)"

This reverts commit f987572795.

* fix rebase problems

* more rebase fixes

* add back debug statement
2019-02-22 18:21:01 -08:00
Eric Liang 9896df7799 [rllib] Guard against PPO value function not training with RNN models (#4037)
* better lstm settings

* 1.0

* docs

* warn on truncate

* clarify

* Update ppo_policy_graph.py

* Update ppo_policy_graph.py

* Update ppo_policy_graph.py
2019-02-22 11:18:51 -08:00
Zachary Barry ae4dd1db76 Custom provider_config options for NodeProvider implementations (#4075)
* added a key to send custom provider_config options to NodeProvider implementations

* Update autoscaler.py

* Update autoscaler.py
2019-02-21 21:09:22 -08:00
Stefan Pantic a54386e499 Added custom LSTM detection (#4087)
* Added autodetection of custom LSTM usage

* Reverted line separators

* Added check for LSTM

* Update vtrace_policy_graph.py

* Update appo_policy_graph.py
2019-02-21 21:07:48 -08:00
William Ma fedad488d8 Kills gdb processes with ray stop (#4046) 2019-02-21 11:28:26 -08:00
William Ma c7a4c74f55 Moving tests from test/ to python/ray/tests/ (#3950) 2019-02-21 11:09:08 -08:00
Jones Wong acbe0b4e5f Fix twin q bug (#4108) 2019-02-21 10:47:01 -08:00
Tianming Xu 94eaaed197 [rllib] convert export format to lower case while validating (#4088)
* convert export format to lower case while validating

* fix lint error
2019-02-21 10:40:28 -08:00
Daniel Edgecumbe 2e30f7ba38 Add a web dashboard for monitoring node resource usage (#4066) 2019-02-21 00:10:04 -08:00
Jones Wong 3ac8fd7ee8 Exploration with Parameter Space Noise (#4048)
*  enable parameter space noise for exploration

*  enable parameter space noise for exploration

*  yapf formatted

*  remove the usage of scipy softmax available in the latest version only

*  enable subclass that has no parameter_noise in the config

*  run user specified callbacks and test parameter space noise in multi node setting

*  formatted by yapf

* Update dqn.py

* lint
2019-02-20 22:35:18 -08:00
Philipp Moritz bcd5af78c7 Lint Cython files (#4097) 2019-02-20 22:29:25 -08:00
Richard Liaw acf4d53b55 [autoscaler] Fix redirects, fix submit (#4085) 2019-02-20 21:35:33 -08:00
Yuhong Guo 3549cd8195 Add the Delete function in GCS (#4081)
* Add the Delete function in GCS

* Unify BatchDelete and Delete

* Fix comment

* Lint

* Refine according to comments

* Unify test.

* Address comment

* C++ lint

* Update ray_redis_module.cc
2019-02-21 13:33:37 +08:00
Yuhong Guo 1f864a02bc Add option of load_code_from_local which is required in cross-language ray call. (#3675) 2019-02-21 12:37:17 +08:00
Eric Liang e3066d1fa5 [autoscaler] Try making GCP node provider thread-safe 2019-02-20 16:35:20 -08:00
Csordás Róbert b2677fabc0 [tune] Fix not saving a checkpoint in certain cases (issue #4041) (#4053)
## What do these changes do?

It saves a checkpoint if needed, regardless of what the scheduler returned. Until now, the checkpoint was not saved when the scheduler returned TrialScheduler.PAUSE, which prevented PopulationBasedTraining from saving any checkpoints in certain cases. See issue #4041 for more details.

## Related issue number
#4041
2019-02-20 11:54:28 -08:00
mika 64c95aea85 [rllib] Update README.md for qmix (#4101)
## What do these changes do?

Fixed PyMARL repository path.

## Related issue number

N/A
2019-02-20 10:21:08 -08:00
Robert Nishihara e7651b1117 Fix excessive buffering of worker stdout/stderr. (#4094)
* Start workers with 'python -u' to prevent buffering of prints.

* Set sys.stdout and sys.stderr.

* Add comment.
2019-02-19 20:20:47 -08:00
Eric Liang e9ee38ace2 More compact format for worker logs (#4092) 2019-02-19 19:53:43 -08:00
Robert Nishihara c92a867c8b Fix log monitor CPU utilization. (#4091) 2019-02-19 12:19:21 -08:00
Wang Qing 794a093249 Add runtime_context to get some runtime fields in worker (#4065) 2019-02-19 15:57:30 +08:00
Wang Qing 7574757391 Fix crash for Java task's task.argument() in state. (#4063) 2019-02-19 12:46:07 +08:00
Philipp Moritz cfc7e2c5a9 Fix modin test (#4069) 2019-02-18 12:17:36 -08:00
Eric Liang 6e46d75554 [tune] Remove slow gzip of checkpoints; ignore jupyter stop errors (#4076)
* fix gzip

* ignore jupyter
2019-02-18 01:30:13 -08:00
Eric Liang f8bef004da [rllib] Improve error message for bad envs, add remote env docs (#4044)
* commit

* fix up rew
2019-02-18 01:28:19 -08:00
Philipp Moritz f51969964d Fix linting on master (#4077) 2019-02-17 13:55:40 -08:00