fyrestone
a6b8bd47b0
[xlang] Cross language serialize ActorHandle ( #7134 )
2020-02-17 20:44:56 +08:00
Richard Liaw
52d9189d5d
[autoscaler] port-forward for attach + redis_port ( #7145 )
...
* port-forward
* fixport
* force redis port in init mode
* test
* Update python/ray/tests/test_ray_init.py
2020-02-14 15:17:00 -08:00
fyrestone
0648bd28ef
[xlang] Cross language Python support ( #6709 )
2020-02-08 13:01:28 +08:00
ijrsvt
0826f95e1c
Including psutil & setproctitle ( #7031 )
2020-02-05 14:16:58 -08:00
Edward Oakes
984490d2be
Collect object IDs during serialization ( #6946 )
2020-02-03 18:38:11 -08:00
Siyuan (Ryans) Zhuang
42cbf801e1
workaround for python3.5 fast numpy serialization ( #6675 )
2020-02-03 13:08:18 -08:00
Eric Liang
8b4b49662b
Force OMP_NUM_THREADS=1 if unset ( #6998 )
...
* force omp
* update
* set
* workers
* link
2020-02-01 11:46:11 -08:00
Edward Oakes
92525f35d1
Remove raylet client from Python worker ( #6018 )
2020-01-31 18:23:01 -08:00
Edward Oakes
341a921d81
Remove vanilla pickle serialization for task arguments ( #6948 )
2020-01-31 16:52:43 -08:00
SangBin Cho
df518849ed
Remove ray.wait timeout warning for milliseconds ( #6980 )
2020-01-30 19:07:52 -08:00
Simon Mo
1e3a34b223
Rewrite the async api documentation ( #6936 )
...
* Rewrite the async api documentation
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* clarify comment
* Add quickstart
* Add reference for async in ray.get ray.wait docstring
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-01-30 09:34:09 -08:00
Simon Mo
396d7fafc8
UI improvement for asyncio ( #6905 )
2020-01-27 12:45:51 -08:00
Simon Mo
4dd41844d0
Ignore blocking ray.wait if timeout is zero ( #6891 )
2020-01-22 16:05:34 -08:00
Sven Mika
4ee566129f
Ignore io.UnsupportedOperation error when "Enabling nice stack traces on SIGSEGV etc." in worker.py::connect(). ( #6771 )
...
- Fixes RLlib tf-eager test cases for all agents when run locally on Ubuntu and Mac.
2020-01-13 14:31:13 -08:00
Sven
60d4d5e1aa
Remove future imports ( #6724 )
...
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Eric Liang
69c5a2bc3c
Warn if OMP_NUM_THREADS is set ( #6729 )
2020-01-08 14:59:07 -08:00
Robert Nishihara
5e43b25e8c
Document fault tolerance behavior. ( #6698 )
2020-01-06 22:34:06 -08:00
Edward Oakes
2a4d2c6e9e
Basic reference counting & pinning ( #6554 )
2020-01-06 17:30:26 -06:00
Robert Nishihara
92e44a5dc8
Deprecate redis_address argument in favor of address. ( #6654 )
2020-01-02 20:18:34 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
Robert Nishihara
480206eef8
Remove some Python 2 compatibility code. ( #6624 )
2019-12-31 17:14:58 -08:00
Eric Liang
e2bc489a18
Port webui nits from original pr that enables it ( #6628 )
...
* backport changes
* Update test_webui.py
2019-12-29 19:19:43 -08:00
Robert Nishihara
8724e5ffd5
Start WebUI by default. ( #6493 )
2019-12-27 13:49:07 -08:00
Edward Oakes
6b1a57542e
Add actor.__ray_kill__() to terminate actors immediately ( #6523 )
2019-12-23 23:12:57 -06:00
Yunzhi Zhang
bac6f3b61e
[Dashboard] Collecting worker stats in node manager and implement webui display in the backend ( #6574 )
2019-12-22 17:50:23 -08:00
Simon Mo
26ec500ef9
Implement async get for direct actor call ( #6339 )
2019-12-18 11:50:21 -08:00
Simon Mo
e530c37b0e
Use localhost and set redis password by default ( #6481 )
2019-12-17 19:41:19 -08:00
Edward Oakes
e2b7459bfc
Fix worker exit cleanup ( #6450 )
...
* working but ugly
* comments
* proper but hanging in grpc server destructor
* grpc server shutdown deadline
* fix disconnect
* lint
* shutdown_only in test
* replace shutdown
2019-12-13 16:52:50 -08:00
Edward Oakes
82f7dbc7a7
Increase TaskID size by 2 bytes, taken from JobID ( #6425 )
...
* Increase TaskID size by 2 bytes, taken from JobID
* comments
* check max job id
* fix doc
* fix local mode
2019-12-11 10:45:14 -08:00
Edward Oakes
044527adb8
Remove ref counting dependencies on ray.get() ( #6412 )
...
* Remove ref counting dependencies on Get()
* comment
* don't send IDs when disabled
* pass through internal config
* fix
* allow reinit
* remove flag
2019-12-10 18:11:34 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py ( #6306 )
...
* multinode failures direct
* Add number of retries allowed for tasks
* Retry tasks
* Add failing test for object reconstruction
* Handle return status and debug
* update
* Retry task unit test
* update
* update
* todo
* Fix max_retries decorator, fix test
* Fix test that flaked
* lint
* comments
2019-12-02 10:20:57 -08:00
Edward Oakes
e4f9b3b7d9
Use process reaper for cleanup ( #6253 )
2019-11-26 22:00:08 -06:00
Simon Mo
1ca8c427e3
Consistent Name for Process Title ( #6276 )
...
* Consistent naming for setproctitle
* Address comments
* Add debug/verbose mode
* Fix test
2019-11-26 11:56:28 -08:00
Philipp Moritz
33c768ebe4
Fix worker signal.SIGTERM handler being installed from outside the main thread ( #6176 )
2019-11-20 11:14:28 -08:00
Ujval Misra
2965dc1b72
[tune] Fault tolerance improvements ( #5877 )
...
* Precede ray.get with ray.wait.
* Trigger checkpoint deletes locally in Trainable
* Clean-up code.
* Minor changes.
* Track best checkpoint so far again
* Pulled checkpoint GC out of Trainable.
* Added comments, error logging.
* Immediate pull after checkpoint taken; rsync source delete on pull
* Minor doc fixes
* Fix checkpoint manager bug
* Fix bugs, tests, formatting
* Fix bugs, feature flag for force sync.
* Fix test.
* Fix minor bugs: clear proc and less verbose sync_on_checkpoint warnings.
* Fix bug: update IP of last_result.
* Fixed message.
* Added a lot of logging.
* Changes to ray trial executor.
* More bug fixes (logging after failure), better logging.
* Fix Richard's bug and logging
* Add comments.
* try-except
* Fix heapq bug.
* .
* Move handling of no available trials to ray_trial_executor (#1)
* Fix formatting bug, lint.
* Addressed Richard's comments
* Revert tests.
* fix rebase
* Fix trial location reporting.
* Fix test
* Fix lint
* Rebase, use ray.get w/ timeout, lint.
* lint
* fix rebase
* Address Richard's comments
2019-11-18 01:14:41 -08:00
Ujval Misra
e3e3ad4b25
Add timeout param to ray.get ( #6107 )
2019-11-14 00:50:04 -08:00
Philipp Moritz
f24d96ec4f
Revert "Try to enable dashboard (again) ( #6069 )" ( #6159 )
...
This reverts commit 4044af8520.
2019-11-13 12:32:12 -08:00
Stephanie Wang
35d177f459
Use grpc for communication from worker to local raylet (task submission and direct actor args only) ( #6118 )
...
* Skeleton for SubmitTask proto
* Pass through node manager port, connect in raylet client
* Switch submit task to grpc
* Check port in use
* doc
* Remove default port, set port randomly from driver
* update
* Fix test
* Fix object manager test
2019-11-11 21:17:25 -08:00
Philipp Moritz
decaa65cd6
Use pickle by default for serialization ( #5978 )
2019-11-10 18:12:18 -08:00
Eric Liang
4044af8520
Try to enable dashboard (again) ( #6069 )
...
* Revert "Revert "Enable the Ray dashboard by default (#5976)" (#6068)"
This reverts commit 1a3e97cf23.
* fix tests that assume the dashboard isn't a job
* travis
2019-11-08 10:48:48 -08:00
Eric Liang
4a28306186
Allow large returns from direct actor calls ( #6088 )
2019-11-07 21:28:55 -08:00
Edward Oakes
043d1f4094
Return RayObjects to core worker ( #6052 )
2019-11-04 20:27:57 -08:00
Eric Liang
1a3e97cf23
Revert "Enable the Ray dashboard by default ( #5976 )" ( #6068 )
...
This reverts commit 6166ef3e09.
2019-11-01 17:08:37 -07:00
Eric Liang
fb34928a2a
[minor] Perf optimizations for direct actor task submission ( #6044 )
...
* merge optimizations
* fix
* fix memory err
* optimize
* fix tests
* fix serialization of method handles
* document weakref
* fix check
* bazel format
* disable on 2
2019-11-01 14:41:14 -07:00
Eric Liang
6166ef3e09
Enable the Ray dashboard by default ( #5976 )
2019-11-01 12:19:01 -07:00
Edward Oakes
e9e78871b9
Remove unused function definition caching ( #6042 )
2019-10-30 16:41:18 -07:00
Eric Liang
b89cac976a
Basic direct actor call support in Python ( #5991 )
2019-10-28 22:09:04 -07:00
Eric Liang
a5523466a2
Enable memstore by default ( #6003 )
2019-10-25 21:59:12 -07:00
Edward Oakes
1ce521a7f3
Remove task context from python worker ( #5987 )
...
Removes duplicated state between the python and C++ workers. Also cleans up the serialization codepaths a bit.
2019-10-25 07:38:33 -07:00
Edward Oakes
6f27d881bd
Fix core worker shutdown errors ( #6004 )
2019-10-24 22:29:05 -07:00