Commit Graph

29 Commits

Author SHA1 Message Date
Alex Wu 46962f5db1 [Core] Log monitor multidriver (#8953) 2020-06-25 11:05:53 -07:00
mehrdadn f93bb008bb Change os.uname()[1] and socket.gethostname() to the portable and faster platform.node_ip() (#8839)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-08 21:29:46 -07:00
mehrdadn 254b1ec370 Set up testing and wheels for Windows on GitHub Actions (#8131)
* Move some Java tests into ci.sh

* Move C++ worker tests into ci.sh

* Define run()

* Prepare to move Python tests into ci.sh

* Fix issues in install-dependencies.sh

* Reload environment for GitHub Actions

* Move wheels to ci.sh and fix related issues

* Don't bypass failures in install-ray.sh anymore

* Make CI a little quieter

* Move linting into ci.sh

* Add vitals test right after build

* Fix os.uname() unavailability on Windows

Co-authored-by: Mehrdad <noreply@github.com>
2020-04-29 21:19:02 -07:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara 39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Eric Liang 09968a3c55 [revert] Disable monitor error logging to stdout #5692 2019-09-14 22:32:48 -07:00
Eric Liang 2fdefe19b7 Take into account queue length in autoscaling (#5684) 2019-09-11 11:31:35 -07:00
Philipp Moritz d0125d4212 Fix log monitor process error when attempting to read raylet PID. (#5655) 2019-09-07 15:51:09 -07:00
Philipp Moritz 04b869678e Fix O(n^2) behavior in the log_monitor (#5569) 2019-08-29 14:46:31 -07:00
Philipp Moritz ddfababb82 Fix log files being opened as unicode files (#5545) 2019-08-27 12:47:00 -07:00
Eric Liang 0a3ff489fa Send raylet error logs through the log monitor (#5351) 2019-08-05 23:35:09 -07:00
Xianyang Liu 3ae54a2b20 Fix log monitor read error (#5221) 2019-08-01 15:47:10 -07:00
Eric Liang e9ee38ace2 More compact format for worker logs (#4092) 2019-02-19 19:53:43 -08:00
Robert Nishihara c92a867c8b Fix log monitor CPU utilization. (#4091) 2019-02-19 12:19:21 -08:00
Robert Nishihara ef527f84ab Stream logs to driver by default. (#3892)
* Stream logs to driver by default.

* Fix from rebase

* Redirect raylet output independently of worker output.

* Fix.

* Create redis client with services.create_redis_client.

* Suppress Redis connection error at exit.

* Remove thread_safe_client from redis.

* Shutdown driver threads in ray.shutdown().

* Add warning for too many log messages.

* Only stop threads if worker is connected.

* Only stop threads if they exist.

* Remove unnecessary try/excepts.

* Fix

* Only add new logging handler once.

* Increase timeout.

* Fix tempfile test.

* Fix logging in cluster_utils.

* Revert "Increase timeout."

This reverts commit b3846b89040bcd8e583b2e18cb513cb040e71d95.

* Retry longer when connecting to plasma store from node manager and object manager.

* Close pubsub channels to avoid leaking file descriptors.

* Limit log monitor open files to 200.

* Increase plasma connect retries.

* Add comment.
2019-02-07 19:53:50 -08:00
Si-Yuan 9295ab8f60 Various Python code cleanups. (#3837) 2019-02-03 10:16:24 -08:00
Richard Liaw d128636bab Ray Logging Configuration (#3691)
* fix logging for autoscaler

* module logging

* try this for logging

* yapf

* fix

* Initial logging setup

* momery

* ok

* remove basicconfig

* catch

* remove package logging

* print

* fix

* try_fix

* fix 1

* revert rllib

* logging level

* flake8

* fix

* fix

* Remove vestigal TODO
2019-01-30 21:01:12 -08:00
Peter Schafhalter a41bbc10ef Add password authentication to Redis ports (#2952)
* Implement Redis authentication

* Throw exception for legacy Ray

* Add test

* Formatting

* Fix bugs in CLI

* Fix bugs in Raylet

* Move default password to constants.h

* Use pytest.fixture

* Fix bug

* Authenticate using formatted strings

* Add missing passwords

* Add test

* Improve authentication of async contexts

* Disable Redis authentication for credis

* Update test for credis

* Fix rebase artifacts

* Fix formatting

* Add workaround for issue #3045

* Increase timeout for test

* Improve C++ readability

* Fixes for CLI

* Add security docs

* Address comments

* Address comments

* Adress comments

* Use ray.get

* Fix lint
2018-10-16 22:48:30 -07:00
Yuhong Guo 0b6e08ebee Separate python logger module-wise (#2703)
## What do these changes do?
1. Separate the log related code to logger.py from services.py.
2. Allow users to modify logging formatter in `ray start`.

## Related issue number
https://github.com/ray-project/ray/pull/2664
2018-08-26 13:46:14 -07:00
Robert Nishihara b90e551b41 [xray] Implement timeline and profiling API. (#2306)
* Add profile table and store profiling information there.

* Code for dumping timeline.

* Improve color scheme.

* Push timeline events on driver only for raylet.

* Improvements to profiling and timeline visualization

* Some linting

* Small fix.

* Linting

* Propagate node IP address through profiling events.

* Fix test.

* object_id.hex() should return byte string in python 2.

* Include gcs.fbs in node_manager.fbs.

* Remove flatbuffer definition duplication.

* Decode to unicode in Python 3 and bytes in Python 2.

* Minor

* Submit profile events in a batch. Revert some CMake changes.

* Fix

* Workaround test failure.

* Fix linting

* Linting

* Don't return anything from chrome_tracing_dump when filename is provided.

* Remove some redundancy from profile table.

* Linting

* Move TODOs out of docstring.

* Minor
2018-07-04 23:23:48 -07:00
Richard Liaw 178346fa16 Printing messages to stderr (#2312)
Move core python code onto logging module.

Addressing #1884.
2018-07-02 16:10:57 -07:00
Melih Elibol bea97b425b Fix python linting (#2076) 2018-05-16 15:04:31 -07:00
Philipp Moritz 74162d1492 Lint Python files with Yapf (#1872) 2018-04-11 10:11:35 -07:00
Robert Nishihara 6a8661a014 Reduce verbosity of log monitor. (#1569) 2018-02-21 22:40:16 -08:00
Robert Nishihara e0867c8845 Switch Python indentation from 2 spaces to 4 spaces. (#726)
* 4 space indentation for actor.py.

* 4 space indentation for worker.py.

* 4 space indentation for more files.

* 4 space indentation for some test files.

* Check indentation in Travis.

* 4 space indentation for some rl files.

* Fix failure test.

* Fix multi_node_test.

* 4 space indentation for more files.

* 4 space indentation for remaining files.

* Fixes.
2017-07-13 21:53:57 +00:00
Robert Nishihara 8bc9c275fa Increase the number of log file names and handle errors better in log monitor. (#693) 2017-06-25 05:20:50 +00:00
alanamarzoev f0339f3386 Expose log files through global state API. (#641)
* added log_table function and a test

* fixed log_files and added task_profiles

* fixed formatting

* fixed linting errors

* fixes

* removed file

* more fixes

* hopefully fixed

* Small changes.

* Fix linting.

* Fix bug in log monitor.

* Small changes.

* Fix bug in travis.
2017-06-08 00:08:10 -07:00
Robert Nishihara ba02fc0eb0 Run flake8 in Travis and make code PEP8 compliant. (#387) 2017-03-21 12:57:54 -07:00
Robert Nishihara f1d4dda8cb Put all log files in redis and visualize them in UI. (#350)
* Start process for monitoring log files and push changes to redis.

* Display log files in UI.

* Bug fix for recent tasks.

* Use flatbuffers to parse local scheduler heartbeats.
2017-03-16 15:27:00 -07:00