1839 Commits

Author SHA1 Message Date
Robert Nishihara 240e8f5279 Fix error message when failing to start UI if grpcio not installed. (#6433) 2019-12-11 14:56:13 -08:00
Eric Liang b3eb374817 [tune] Really disable retries by default 2019-12-11 13:12:28 -08:00
Edward Oakes 82f7dbc7a7 Increase TaskID size by 2 bytes, taken from JobID (#6425)
* Increase TaskID size by 2 bytes, taken from JobID

* comments

* check max job id

* fix doc

* fix local mode
2019-12-11 10:45:14 -08:00
Yuhao Yang 3db8faab0d [tune] fix log dir race condition (#6420) 2019-12-10 21:00:19 -08:00
Simon Mo c61db84b8d Bump dev6->dev7 for two files not changed yet. (#6428) 2019-12-10 20:58:14 -08:00
Edward Oakes 044527adb8 Remove ref counting dependencies on ray.get() (#6412)
* Remove ref counting dependencies on Get()

* comment

* don't send IDs when disabled

* pass through internal config

* fix

* allow reinit

* remove flag
2019-12-10 18:11:34 -08:00
Ujval Misra 4e1d1ed00d [tune] Report trials by state fairly (#6395)
* Fairly represented trial states.

* filter test

* Indent

* Add test to BUILD

* Address Eric's comments (show truncation by state).

* Sort trials, only show 20.

* Fix lint
2019-12-10 14:56:54 -08:00
Philipp Moritz 16be483af7 [Projects] Return parameters for a command (#6409) 2019-12-10 10:25:01 -08:00
Chaokun Yang 6272907a57 [Streaming] Streaming data transfer and python integration (#6185) 2019-12-10 20:33:24 +08:00
Rong Rong c1d4ab8bb4 Move top level RayletClient to ray::raylet::RayletClient (#6404) 2019-12-09 21:08:59 -08:00
Eric Liang 304b4f0d3d Shard unit tests into medium sized files for test stability (#6398) 2019-12-09 13:15:29 -08:00
Eric Liang a6bc2b1842 Misc direct call fixes from unit tests (#6394) 2019-12-08 19:34:02 -08:00
visatish e2ba8c1898 [tune] Fixed bug in PBT where initial trial result is empty. (#6351)
* Fixed bug in tune pbt where initial result is empty.

* Updated mock trial executor in test suite.

* Added comment.
2019-12-06 15:30:27 -08:00
Zhijun Fu b88b8202cc fix java build failure (#6062) 2019-12-06 14:38:43 +08:00
Ion 1c638a11a7 Refactor helper methods for new scheduler integration (#6354) 2019-12-05 18:49:25 -08:00
Edward Oakes f63b64310a Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Philipp Moritz dd27bfbb75 Rename .rayproject to ray-project (#6278) 2019-12-05 16:15:42 -08:00
Eric Liang 6223d2ed0b [direct call] Assign resource ids for direct call tasks (#6364) 2019-12-05 10:16:04 -08:00
Eric Liang 4c6739476b [rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
micafan 668ce47360 [GCS]Add abstract interface of actor to GCS Client (#6269) 2019-12-05 13:38:29 +08:00
Eric Liang 1a3b83abf8 [direct call] Fix hang when caller id changes for actor task submission (#6338) 2019-12-04 12:01:35 -08:00
Ujval Misra fa5d62e8ba [tune] Retry restore on timeout (#6284)
* Retry recovery on timeout

* fix bug, revert some code

* Add test for restore timeouts.

* Fix lint

* Address comments

* Don't timeout restores.
2019-12-02 20:01:47 -08:00
Edward Oakes dff6017272 Fix "failed to create head node" issue (#6304)
* Fix failed to create head node issue

* comments
2019-12-02 15:22:00 -08:00
Mitchell Stern 43d20fff62 Refactor dashboard codebase to improve modularity (#6330)
* Refactor dashboard codebase to improve modularity

* Simplify feature interface

* Use arrow notation in makeFeature argument types

* Use separate components for node and worker features rather than a single conditionally-rendered component

* Add comments about Ray worker process titles

* Add comments to non-obvious fields in node info API response
2019-12-02 11:05:40 -08:00
Stephanie Wang da41180dc0 [direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306)
* multinode failures direct

* Add number of retries allowed for tasks

* Retry tasks

* Add failing test for object reconstruction

* Handle return status and debug

* update

* Retry task unit test

* update

* update

* todo

* Fix max_retries decorator, fix test

* Fix test that flaked

* lint

* comments
2019-12-02 10:20:57 -08:00
Philipp Moritz 22fa9b564b fix linting (#6322) 2019-12-01 14:06:35 -08:00
Simon Mo 4033d65e4f Fix redis-server stopping in linux (#6296)
* Cleanup test_calling_start_ray_head

* Kill redis-server with args instead of comm

On Linux, ps -o pid,comm outputs just redis-server instead of the
full executable path
2019-11-29 22:50:05 -08:00
Yuhao Yang ffa043d4b7 [tune] replace self.config (#6313) 2019-11-29 11:09:30 -08:00
Stephanie Wang 724a5e3909 Turn on direct calls for test_failure.py (#6291) 2019-11-28 12:28:30 -08:00
Eric Liang b7b655c851 Also use NotifyDirectCallTaskBlock/Unblocked for plasma store accesses (#6249)
* wip

* fix it

* lint

* wip

* fix

* unblock

* flaky

* use fetch only flag

* Revert "use fetch only flag"

This reverts commit 56e938a0ee2024f5c99c9ab2d55fd35558fb15e1.

* restore error resolution

* use worker task id

* proto comments

* fix if
2019-11-27 22:46:15 -08:00
Simon Mo 22b305223a Build Docker Containers for Linux Wheels (#6233) 2019-11-27 17:05:36 -08:00
Stephanie Wang 2797c11b69 [direct task] For serialized object IDs, check with owner before declaring object unreconstructable (#6286)
* Track borrowed vs owned objects

* Serialize owner address with object ID

* serialize owner task id

* Deserialize object IDs

* Pass direct task ID instead of plasma ID

* it works

* Fix ref count test

* Add unit test

* update warning

* we own ray.put objects

* missing file

* doc

* Fix unit test

* comments

* Fix py2

* lint

* update
2019-11-27 15:31:44 -08:00
Edward Oakes e4f9b3b7d9 Use process reaper for cleanup (#6253) 2019-11-26 22:00:08 -06:00
Eric Liang 30b2fc1d81 Fix actor creation hang due to race in SWAP queue (#6280) 2019-11-26 15:21:03 -08:00
Simon Mo 1ca8c427e3 Consistent Name for Process Title (#6276)
* Consistent naming for setproctitle

* Address comments

* Add debug/verbose mode

* Fix test
2019-11-26 11:56:28 -08:00
Robert Nishihara ffb9c0ecae Fix bug in which remote function redefinition doesn't happen. (#6175) 2019-11-26 11:19:19 -06:00
Edward Oakes 7f8de61441 [hotfix] Remove python/ray/tests/__init__.py (#6279)
* Remove python/ray/tests/__init__.py for bazel

* Comment out checks
2019-11-25 17:04:20 -08:00
Eric Liang 64a3a7239e Set RAY_FORCE_DIRECT=1 for run_rllib_tests, test_basic (#6171) 2019-11-25 14:12:11 -08:00
Edward Oakes e72aef2ba6 [hotfix] Fix building linux wheels 2019-11-25 12:45:31 -07:00
Simon Mo c8b69727cd ray stop only kills process with ray keyword (#6257)
* Use psutil to kill processes

* Psutil as core requirement

* Revert "Psutil as core requirement"

This reverts commit d3235ce3d994d2bb7db39e3ad4a46049703898bb.

* Revert "Use psutil to kill processes"

This reverts commit de0ed874fed673f5e98715950688f418bbcc415c.

* Revert back to subproc

* Add comments, grep for ray as well

* SIGTERM
2019-11-24 16:32:07 -08:00
Eric Liang e5b5c98558 Fix python PATH for build (#6260) 2019-11-24 15:32:06 -08:00
Eric Liang 53641f1f74 Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* separate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Simon Mo aa8d5d2f6c Rate limit asyncio actor (#6242) 2019-11-24 11:39:28 -08:00
Yuhao Yang f6a5baf844 [tune] minor doc fix (#6248) 2019-11-23 21:54:41 -08:00
Stephanie Wang d2662fecea Miscellaneous bug fixes to throw unreconstructable errors for direct calls (#6245)
* Test cases

* Fix InPlasmaError

* raylet fixes to force errors for direct calls

* Disable lineage logging and task pending checks for direct calls

* move todo

* Clean up tests

* Fix bugs in object store for Contains and Delete

* Use direct call in tests

* Fixes, separate actor creation direct call from normal direct call spec
2019-11-23 15:05:49 -08:00
Eric Liang b052bcf1fc Bazelify tune tests in travis (#6219) 2019-11-22 13:58:50 -08:00
Simon Mo eb6a93c0f0 [hotfix] fix lint (#6236) 2019-11-21 18:30:57 -08:00
Eric Liang 7559fdb141 [rllib/tune] Cache get_preprocessor() calls, default max_failur… (#6211) 2019-11-21 15:55:56 -08:00
Stephanie Wang d3227f2f2d Fix bug in direct task calls for objects that were evicted (#6216)
* Fix bug and add some checks

* rename
2019-11-21 15:38:31 -08:00
Simon Mo 29ba6bfc64 Basic Async Actor Call (#6183)
* Start trying to figure out where to put fibers

* Pass is_async flag from python to context

* Just running things in fiber works

* Yield implemented, need some debugging to make it work

* It worked!

* Remove debug prints

* Lint

* Revert the clang-format

* Remove unnecessary log

* Remove unnecessary import

* Add attribution

* Address comment

* Add test

* Missed a merge conflict

* Make test pass and compile

* Address comment

* Rename async -> asyncio

* Move async test to py3 only

* Fix ignore path
2019-11-21 11:56:46 -08:00