Commit Graph

17 Commits

Author SHA1 Message Date
mehrdadn f93bb008bb Change os.uname()[1] and socket.gethostname() to the portable and faster platform.node_ip() (#8839)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-08 21:29:46 -07:00
Nick Matthews a9d8d16b6b Change memory monitor warning to a logging call (#8137) 2020-04-22 21:29:18 -07:00
yncxcw 51559c08b9 Fix mis-memory counting in memory monitor for contaienr environment (#8113)
Co-authored-by: weich <weich@nvidia.com>
2020-04-22 14:32:35 -07:00
Eric Liang 745b9d643d First pass at ray memory command for memory debugging (#7589) 2020-03-17 20:45:07 -07:00
ijrsvt 0826f95e1c Including psutil & setproctitle (#7031) 2020-02-05 14:16:58 -08:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara 39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Eric Liang 4edae7ea2b Speed up task submissions a bit (#5992) 2019-10-25 00:10:37 -07:00
Si-Yuan 0292f99e6c Fix DeprecationWarning (#5608) 2019-09-01 15:21:32 -07:00
Andrey K d41963c546 Fixed: missing brackets when appending proc info on OutOfMemory (#5530)
* Fixed: missing brackets when appending proc info on OutOfMemory

proc_stats.append was missing the set of brackets when adding a tuple to the list, which resulted in runtime error instead of correct Out of Memory message display.

* Update memory_monitor.py
2019-08-24 18:33:20 -07:00
Eric Liang e2e30ca507 Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
Qingqing Mao 63f49f95dd Improve memory check (#5216)
* Improve MemoryMonitor

- Add an env var to control the threshold.
- Use cgroup memory limit and usage for container environment.

* linting

* white space

* add comment
2019-07-17 23:30:02 -07:00
Eric Liang 5aec750107 Add warning/error if object store memory exceeds available memory (#4893)
* exclude

* format

* add warning

* hatch

* reduce mem usage

* reduce object store mem

* set obj mem
2019-07-08 21:37:08 -07:00
Richard Liaw d128636bab Ray Logging Configuration (#3691)
* fix logging for autoscaler

* module logging

* try this for logging

* yapf

* fix

* Initial logging setup

* momery

* ok

* remove basicconfig

* catch

* remove package logging

* print

* fix

* try_fix

* fix 1

* revert rllib

* logging level

* flake8

* fix

* fix

* Remove vestigal TODO
2019-01-30 21:01:12 -08:00
Eric Liang cffe8f9806 Add option to evict keys LRU from the sharded redis tables (#3499)
* wip

* wip

* format

* wip

* note

* lint

* fix

* flag

* typo

* raise timeout

* fix

* optional get

* fix flag

* increase timeout in test

* update docs

* format
2018-12-09 05:48:52 -08:00
Eric Liang 0d56fc10cc Move setproctitle to ray[debug] package (#3415) 2018-11-27 09:50:59 -08:00
Eric Liang 5723291db6 Raise exception if the node is nearly out of memory (#3323)
* wip

* add

* comment

* escape hatch

* update

* object store too

* .2
2018-11-15 12:55:25 -08:00