16 Commits

Author SHA1 Message Date
Robert Nishihara 8548f12eb2 Give better error when include_webui=1 and webui can't be started. (#4471) 2019-03-26 14:54:32 -07:00
Robert Nishihara 9c158c6a87 Start dashboard on all nodes and other small fixes. (#4428)
* Start reporter on all nodes.

* More fixes
2019-03-20 13:04:06 -07:00
Philipp Moritz 95254b3d71 Remove the old web UI (#4301) 2019-03-07 23:15:11 -08:00
Robert Nishihara f21e6a2cff Update documentation regarding UI and timeline. (#4189) 2019-03-01 19:54:33 -08:00
Wang Qing db5c3b22b7 Fix the issue about starting cross-lang cluster (#4176) 2019-02-27 20:11:58 +08:00
Daniel Edgecumbe 2e30f7ba38 Add a web dashboard for monitoring node resource usage (#4066) 2019-02-21 00:10:04 -08:00
Yuhong Guo 1f864a02bc Add option of load_code_from_local which is required in cross-language ray call. (#3675) 2019-02-21 12:37:17 +08:00
Robert Nishihara e7651b1117 Fix excessive buffering of worker stdout/stderr. (#4094)
* Start workers with 'python -u' to prevent buffering of prints.

* Set sys.stdout and sys.stderr.

* Add comment.
2019-02-19 20:20:47 -08:00
Robert Nishihara 5f71751891 API cleanups. Remove worker argument. Remove some deprecated arguments. (#4025)
* Remove worker argument from API methods.

* Remove deprecated arguments and deprecate redirect_output and redirect_worker_output.

* Fix
2019-02-15 10:49:16 -08:00
Si-Yuan 2de31eb489 minor fix (#4040) 2019-02-13 17:22:45 -08:00
Si-Yuan 21472b890a Integrate "tempfile_service" into "ray.node.Node" (#3953) 2019-02-12 17:34:04 -08:00
Robert Nishihara ef527f84ab Stream logs to driver by default. (#3892)
* Stream logs to driver by default.

* Fix from rebase

* Redirect raylet output independently of worker output.

* Fix.

* Create redis client with services.create_redis_client.

* Suppress Redis connection error at exit.

* Remove thread_safe_client from redis.

* Shutdown driver threads in ray.shutdown().

* Add warning for too many log messages.

* Only stop threads if worker is connected.

* Only stop threads if they exist.

* Remove unnecessary try/excepts.

* Fix

* Only add new logging handler once.

* Increase timeout.

* Fix tempfile test.

* Fix logging in cluster_utils.

* Revert "Increase timeout."

This reverts commit b3846b89040bcd8e583b2e18cb513cb040e71d95.

* Retry longer when connecting to plasma store from node manager and object manager.

* Close pubsub channels to avoid leaking file descriptors.

* Limit log monitor open files to 200.

* Increase plasma connect retries.

* Add comment.
2019-02-07 19:53:50 -08:00
Wang Qing e1c68a0881 Enable including Java worker for ray start command (#3838) 2019-02-04 16:23:43 +08:00
Yuhong Guo 54cbb4396f Prepare socket file when start ray (#3925) 2019-02-02 12:53:36 +08:00
Robert Nishihara 0b1608a546 Factor out code for starting new processes and test plasma store in valgrind. (#3824)
* Factor out starting Ray processes.

* Detect flags through environment variables.

* Return ProcessInfo from start_ray_process.

* Print valgrind errors at exit.

* Test valgrind in travis.

* Some valgrind fixes.

* Undo raylet monitor change.

* Only test plasma store in valgrind.
2019-01-22 14:59:11 -08:00
Robert Nishihara 8723d6b061 Define a Node class to manage Ray processes. (#3733)
* Implement Node class and move most of services.py into it.

* Wait for nodes as they are added to the cluster.

* Fix Redis authentication bug.

* Fix bug in client table ordering.

* Address comments.

* Kill raylet before plasma store in test.

* Minor
2019-01-11 22:30:38 -08:00