Robert Nishihara
5e76d52868
Improve cluster.wait_for_nodes() API. ( #3712 )
* Separate out functionality for querying client table and improve cluster.wait_for_nodes() API.
* Linting
* Add back logging statements.
* info -> debug
2019-01-07 21:26:58 -08:00
Robert Nishihara
c9d70f0dda
Remove num_local_schedulers argument from ray.worker._init. ( #3704 )
* Remove num_local_schedulers argument from ray.worker._init.
* Fix
* Fix tests.
2019-01-07 12:44:49 -08:00
Robert Nishihara
586a5c9ffa
Limit default redis max memory to 10GB. ( #3630 )
* Limit Redis max memory to 10GB/shard by default.
* Update stress tests.
* Reorganize
* Update
* Add minimum cap size for object store and redis.
* Small test update.
2019-01-03 13:23:54 -08:00
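As a rough illustration of the capping this commit describes, the sketch below clamps a requested memory size between a floor and a ceiling. Only the 10 GB ceiling comes from the commit message; the name `clamp_memory`, the `min_bytes` parameter, and the decimal definition of a gigabyte are assumptions made for illustration and are not Ray's actual code.

```python
GB = 10**9  # assumption: decimal gigabytes; the commit does not say which unit is used


def clamp_memory(requested_bytes, min_bytes, max_bytes=10 * GB):
    """Return requested_bytes limited to the range [min_bytes, max_bytes].

    Hypothetical helper: max_bytes defaults to the 10 GB cap named in the
    commit title; min_bytes stands in for the unspecified minimum cap.
    """
    if min_bytes > max_bytes:
        raise ValueError("min_bytes must not exceed max_bytes")
    return max(min_bytes, min(requested_bytes, max_bytes))
```

With these assumptions, a request above the ceiling is reduced to 10 GB, and a request below the floor is raised to `min_bytes`.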
Yuhong Guo
4b23a34c93
Fix multi-thread problem of function manager and Jenkins test ( #3648 )
2019-01-03 17:05:13 +08:00
Robert Nishihara
b6bcd18d65
Split profile table among many keys in the GCS. ( #3676 )
* Divide profile table among many keys in GCS.
* Fix, and remove --collect-profiling-data arg.
* Remove reference in doc.
2019-01-02 21:33:01 -08:00
Si-Yuan
93d54110f8
Prevent overriding faulthandler settings ( #3668 )
This change ensures that Ray sets up fault handlers only if they have not already been enabled by another application. Otherwise, some applications could face strange issues when using Ray, and some unit tests using XML runners would fail.
2018-12-31 16:36:26 -08:00
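The behavior described in this commit can be sketched with the standard-library `faulthandler` module, which reports whether a handler is already installed. The function name `enable_faulthandler_if_unset` and its `file` parameter are hypothetical; this is a minimal sketch of the idea, not the code from the commit.

```python
import faulthandler
import sys


def enable_faulthandler_if_unset(file=None):
    """Enable faulthandler only if nothing else has enabled it.

    Returns True if this call enabled it, False if it was already on.
    Hypothetical helper for illustration.
    """
    if faulthandler.is_enabled():
        # Another application or test runner installed its own handler;
        # leave its settings untouched.
        return False
    # faulthandler needs a file with a real file descriptor.
    faulthandler.enable(file=file if file is not None else sys.stderr)
    return True
```

Checking `faulthandler.is_enabled()` first means a handler configured by the host application, for example one writing to a custom file, is not silently replaced.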
Yuhong Guo
c9b8ecca51
Add RayParams to refactor the parameters used by ray python. ( #3558 )
2018-12-29 22:04:27 +08:00
Alexey Tumanov
c4cba98c75
Remove deprecation warnings when running actor tests ( #3563 )
* remove deprecation warnings when running actor tests
* replacing logger.warn with logger.warning
* Update worker.py
* Update policy_client.py
* Update compression.py
2018-12-18 17:04:51 -08:00
Yuhong Guo
fb33fa9097
Enable function_descriptor in backend to replace the function_id ( #3028 )
2018-12-18 18:53:59 -05:00
Yuhong Guo
75ddf7cca4
Fix 2 small bugs ( #3573 )
2018-12-18 14:52:21 -05:00
Robert Nishihara
417c7f2d6f
Update arrow and remove plasma_manager references. ( #3545 )
2018-12-15 23:36:02 -08:00
Philipp Moritz
b3bf608608
Update arrow to reduce plasma IPCs. ( #3497 )
2018-12-14 23:49:37 -05:00
Hao Chen
e7b51cbd1b
[xray] Implement Actor Reconstruction ( #3332 )
* Implement Actor Reconstruction
* fix
* fix actor handle __del__
* fix lint
* add comment
* Remove actorCreationDummyObjectId
* address comments
* fix
* address comments
* avoid copy
* change log to debug
* fix error name
2018-12-13 21:28:58 -08:00
Si-Yuan
84fae57ab5
Convert the raylet client (the code in local_scheduler_client.cc) to proper C++. ( #3511 )
* refactoring
* fix bugs
* create client class
* create client class for java; bug fix
* remove legacy code
* improve code by using std::string and std::unique_ptr, renaming private fields, and removing legacy code
* rename class
* improve naming
* fix
* rename files
* fix names
* change name
* change return types
* make a mutex private field
* fix comments
* fix bugs
* lint
* bug fix
* bug fix
* move too short functions into the header file
* Loosen crash conditions for some APIs.
* Apply suggestions from code review
Co-Authored-By: suquark <suquark@gmail.com>
* format
* update
* rename python APIs
* fix java
* more fixes
* change types of cpython interface
* more fixes
* improve error processing
* improve error processing for java wrapper
* lint
* fix java
* make fields const
* use pointers for [out] parameters
* fix java & error msg
* fix resource leak, etc.
2018-12-13 13:39:10 -08:00
Eric Liang
0e00533ed4
Different approach to removing RayGetError ( #3471 )
2018-12-12 20:30:51 -08:00
Eric Liang
cffe8f9806
Add option to evict keys LRU from the sharded redis tables ( #3499 )
* wip
* wip
* format
* wip
* note
* lint
* fix
* flag
* typo
* raise timeout
* fix
* optional get
* fix flag
* increase timeout in test
* update docs
* format
2018-12-09 05:48:52 -08:00
Tianming Xu
f6490f9bef
Resolve no handlers could be found for logger 'ray.worker' when importing ray ( #3483 )
2018-12-06 20:46:53 -08:00
Si-Yuan
2e6f9bedf2
Add the extra fallback for serialization ( #3468 )
* Add the extra fallback for serialization.
* Better comments & warnings. quotes.
* Update test/runtest.py
Co-Authored-By: suquark <suquark@gmail.com>
* Update test/runtest.py
Co-Authored-By: suquark <suquark@gmail.com>
* linting
* Don't hijack too many errors.
* simplify the test
* Update runtest.py
* simplify
2018-12-05 13:09:08 -08:00
Eric Liang
0d56fc10cc
Move setproctitle to ray[debug] package ( #3415 )
2018-11-27 09:50:59 -08:00
Robert Nishihara
3856533065
Fix incompatibility with most recent version of Redis. ( #3379 )
* Fix incompatibility with most recent version of Redis.
* Fix
* Fixes.
2018-11-24 16:36:38 -08:00
Eric Liang
afc48d7b77
Don't setpgid() on actors ( #3347 )
2018-11-19 17:35:26 -08:00
Eric Liang
e0bf9d7305
Add debug string to raylet ( #3317 )
* initial debug string
* format
* wip debug string
* fix compile
* fix
* update
* finished
* to file
* logs dir
* use temp root
* fix
* override
2018-11-15 21:47:50 -08:00
Eric Liang
5723291db6
Raise exception if the node is nearly out of memory ( #3323 )
* wip
* add
* comment
* escape hatch
* update
* object store too
* .2
2018-11-15 12:55:25 -08:00
Eric Liang
1660c9d627
Kill actor child processes on shutdown ( #3297 )
* example
* add env
* test pg
* change to test
* add atexit test
* Update rllib-env.rst
* comment
* revert unnecessary file
* fix title when actor is idle
* Update python/ray/actor.py
Co-Authored-By: ericl <ekhliang@gmail.com>
2018-11-13 19:16:42 -08:00
Stephanie Wang
d950e92f63
Allow multiple threads to call ray.get and ray.wait ( #3244 )
* Handle multiple threads calling ray.get
* Multithreaded ray.wait
* Pass in current task ID in java backend
* Add multithreaded actor to tests, add warning messages to worker for multithreaded ray.get
* Fix test
* Some cleanups
* Improve error message
* Add assertion
* Cleanup, throw error in HandleTaskUnblocked if task not actually blocked
* lint
* Fix python worker reset
* Fix references to reconstruct_objects
* Linting
* java lint
* Fix java
* Fix iterator
2018-11-07 22:39:28 -08:00
Richard Liaw
0bab8ed95c
Expose internal config parameters for starting Ray ( #3246 )
## What do these changes do?
This PR exposes a command-line option for passing internal config parameters. This lets certain tests (e.g., fault-tolerance tests that remove nodes) run quickly.
Note that this is bad practice and should be replaced with GFLAGS or some equivalent as soon as possible.
#3239 depends on this.
TODO:
- [x] Add documentation to method arguments before merging.
- [x] Add test to verify this works?
## Related issue number
2018-11-07 21:46:02 -08:00
Eric Liang
29e3362905
Better errors on process deaths ( #3252 )
2018-11-07 14:08:16 -08:00
Eric Liang
2e04ffe00c
Change dict serialization warning to debug ( #3230 )
2018-11-06 21:23:07 -08:00
Eric Liang
725df3a485
Set the process title in workers and actors ( #3219 )
2018-11-06 14:59:22 -08:00
Peter Schafhalter
f3efcd2342
Fix password authentication in worker ( #3124 )
2018-11-06 13:40:03 -08:00
Eric Liang
8356a01dd6
Remove suppressing duplicate error message (missed a couple)
2018-11-05 23:37:14 -08:00
Wang Qing
ca7d4c2cf5
Enable to specify driver id by user. ( #3084 )
2018-11-02 19:01:50 -07:00
Robert Nishihara
5822aa2388
Rename get_task -> worker_idle in timeline. ( #3179 )
* Rename get_task -> worker_idle in timeline.
* Fix test.
2018-11-02 12:08:46 -07:00
Robert Nishihara
e612e26103
Add use_raylet option for backwards compatibility. ( #3176 )
* Add use_raylet option for backwards compatibility.
* Update message.
2018-11-01 14:16:04 -07:00
Robert Nishihara
32f0d6b77e
Deprecate num_workers argument to ray.init and ray start. ( #3114 )
* Remove num_workers argument.
* Fix
* Fix
2018-10-28 20:12:49 -07:00
Robert Nishihara
9868af4c7c
Use /tmp instead of /dev/shm for object store on Linux if /dev/shm is too small. ( #3149 )
* Use /tmp instead of /dev/shm for object store on Linux if /dev/shm is too small.
* Add logging statement and address comments.
* Fix
2018-10-28 20:09:06 -07:00
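The fallback described in this commit can be sketched as a free-space check on the shared-memory directory. Everything below is an illustrative assumption rather than the commit's code: the function name `choose_object_store_dir`, its parameters, and the use of `shutil.disk_usage` to measure free space.

```python
import shutil


def choose_object_store_dir(required_bytes, shm_dir="/dev/shm",
                            fallback_dir="/tmp", get_free=None):
    """Pick shm_dir if it has at least required_bytes free, else fallback_dir.

    get_free is an injectable callable (path -> free bytes) so the check
    can be tested without touching the real filesystem. Hypothetical helper.
    """
    if get_free is None:
        def get_free(path):
            return shutil.disk_usage(path).free
    try:
        available = get_free(shm_dir)
    except OSError:
        # shm_dir is missing or unreadable; use the fallback directory.
        return fallback_dir
    return shm_dir if available >= required_bytes else fallback_dir
```

A caller would pass the intended object store size as `required_bytes`; when `/dev/shm` is smaller than that, the sketch returns `/tmp`.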
Robert Nishihara
658c14282c
Remove legacy Ray code. ( #3121 )
* Remove legacy Ray code.
* Fix cmake and simplify monitor.
* Fix linting
* Updates
* Fix
* Implement some methods.
* Remove more plasma manager references.
* Fix
* Linting
* Fix
* Fix
* Make sure class IDs are strings.
* Some path fixes
* Fix
* Path fixes and update arrow
* Fixes.
* linting
* Fixes
* Java fixes
* Some java fixes
* TaskLanguage -> Language
* Minor
* Fix python test and remove unused method signature.
* Fix java tests
* Fix jenkins tests
* Remove commented out code.
2018-10-26 13:36:58 -07:00
Robert Nishihara
5aa29613db
Fix linting errors. ( #3127 )
2018-10-24 16:30:00 -07:00
Robert Nishihara
9c1826ed69
Use XRay backend by default. ( #3020 )
* Use XRay backend by default.
* Remove irrelevant valgrind tests.
* Fix
* Move tests around.
* Fix
* Fix test
* Fix test.
* String/unicode fix.
* Fix test
* Fix unicode issue.
* Minor changes
* Fix bug in test_global_state.py.
* Fix test.
* Linting
* Try arrow change and other object manager changes.
* Use newer plasma client API
* Small updates
* Revert plasma client api change.
* Update
* Update arrow and allow SendObjectHeaders to fail.
* Update arrow
* Update python/ray/experimental/state.py
Co-Authored-By: robertnishihara <robertnishihara@gmail.com>
* Address comments.
2018-10-23 12:46:39 -07:00
Richard Liaw
40c4148d4f
Cluster Utilities for Fault Tolerance Tests ( #3008 )
2018-10-20 22:56:29 -07:00
Peter Schafhalter
fa469783d8
Fix bug when connecting to password-secured cluster ( #3083 )
2018-10-18 21:43:03 -07:00
Peter Schafhalter
a41bbc10ef
Add password authentication to Redis ports ( #2952 )
* Implement Redis authentication
* Throw exception for legacy Ray
* Add test
* Formatting
* Fix bugs in CLI
* Fix bugs in Raylet
* Move default password to constants.h
* Use pytest.fixture
* Fix bug
* Authenticate using formatted strings
* Add missing passwords
* Add test
* Improve authentication of async contexts
* Disable Redis authentication for credis
* Update test for credis
* Fix rebase artifacts
* Fix formatting
* Add workaround for issue #3045
* Increase timeout for test
* Improve C++ readability
* Fixes for CLI
* Add security docs
* Address comments
* Address comments
* Address comments
* Use ray.get
* Fix lint
2018-10-16 22:48:30 -07:00
Robert Nishihara
faa31ae018
Introduce concept of resources required for placing a task. ( #2837 )
* Introduce concept of resources required for placement.
* Add placement resources to task spec
* Update java worker
* Update taskinfo.java
2018-10-04 10:35:39 -07:00
Si-Yuan
f2dbd3096c
Minor improvements and fixes in Python code. ( #3022 )
This commit fixes some small defects.
1. Remove a comment that should have been removed in #3003
2. Remove `redis_protected_mode` that is never used in `ray.init()`
3. Fix `object_id_seed`, which was not being passed into `ray._init()`
4. Remove several redundant brackets.
2018-10-03 21:08:20 -07:00
Yuhong Guo
9948e8c11b
Move function/actor exporting & loading code to function_manager.py ( #3003 )
Move function/actor exporting & loading code to function_manager.py to prepare the code change for function descriptor for python.
2018-10-03 16:21:04 -07:00
Si-Yuan
cc7e2ecdd5
Change logfile names and also allow plasma store socket to be passed in. ( #2862 )
2018-10-03 10:03:53 -07:00
Eric Liang
bee743c152
Remove log suppression code
When running in a screen session (or any other time it is hard to scroll up), printing "Suppressing previous error message" is not helpful, since the previous error is lost far above, past the scrollback. Better to just print it repeatedly at the end.
2018-09-11 23:28:45 -07:00
Eric Liang
611259b2c7
Re-raise actor initialization errors on method invocation ( #2843 )
If an actor constructor fails, save that error and re-raise it on any subsequent attempts to interact with the actor. Related to https://github.com/ray-project/ray/issues/282 and https://github.com/ray-project/ray/issues/1093 .
2018-09-10 10:51:19 -07:00
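The save-and-re-raise pattern this commit describes can be sketched in plain Python, without Ray. The class `DeferredInitProxy` and its `call` method are hypothetical names chosen for illustration; the sketch only shows the general idea of capturing a constructor failure and surfacing it on every later interaction.

```python
class DeferredInitProxy:
    """Wrap an object whose constructor may fail (hypothetical sketch).

    A constructor error is stored instead of propagating immediately,
    then re-raised each time a method is invoked through the proxy.
    """

    def __init__(self, cls, *args, **kwargs):
        self._instance = None
        self._init_error = None
        try:
            self._instance = cls(*args, **kwargs)
        except Exception as exc:
            # Remember the failure so later calls can report it.
            self._init_error = exc

    def call(self, method_name, *args, **kwargs):
        if self._init_error is not None:
            # Surface the original constructor error to the caller.
            raise self._init_error
        return getattr(self._instance, method_name)(*args, **kwargs)
```

The design choice here is that the caller sees the original cause on every method invocation, rather than a generic failure unrelated to the constructor.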
Robert Nishihara
bd64c940e9
Push error to driver when monitor raises an exception. ( #2834 )
2018-09-07 17:42:45 -07:00
Robert Nishihara
3f6ed537a4
Add ray.is_initialized() function. ( #2818 )
...
* Add ray.is_initialized() function.
* Add assert.
2018-09-06 21:20:59 -07:00