2095 Commits

Author SHA1 Message Date
Eric Liang 221d1663c1 [rllib] switch to python logger (#3098)
* logg

* set rllib logger

* comment

* info

* rlib

* comment

* add format

* fix lint

* add file info

* update

* add ts

* lint

* better docs

* fix value error

* soft log level
2018-10-21 23:43:57 -07:00
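The squashed steps above (set a dedicated logger, add a format with file info and a timestamp, apply a log level) can be sketched with Python's standard `logging` module. This is a minimal illustration, not the code from the PR: the logger name `ray.rllib`, the helper `setup_logger`, and the exact format string are assumptions.

```python
import logging

# Hypothetical logger name; the name actually used by RLlib is an assumption.
logger = logging.getLogger("ray.rllib")


def setup_logger(level="INFO"):
    """Attach a timestamped stream handler once and set the log level."""
    if not logger.handlers:
        handler = logging.StreamHandler()
        # Timestamp, level, file and line number, then the message.
        handler.setFormatter(logging.Formatter(
            "%(asctime)s\t%(levelname)s %(filename)s:%(lineno)s -- %(message)s"))
        logger.addHandler(handler)
    # setLevel accepts either a level name such as "DEBUG" or a numeric level.
    logger.setLevel(level)
    # Avoid emitting each record a second time through the root logger.
    logger.propagate = False
    return logger
```

Calling `setup_logger("DEBUG")` and then `logger.debug("...")` would print a timestamped line to stderr; modules that previously used `print` can switch to `logger.info` without further setup.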
Richard Liaw 40c4148d4f Cluster Utilities for Fault Tolerance Tests (#3008) 2018-10-20 22:56:29 -07:00
Wang Qing a4db5bbaea Fill driver id into actor notification when finishing assigned task. (#3080)
## What do these changes do?
Fill driver id into actor notification when finishing assigned task.
It also cleans up the surrounding code.
2018-10-21 11:12:20 +08:00
Eric Liang 59901a88a0 [rllib] Native support for Dict and Tuple spaces; fix Tuple action spaces; add prev a, r to LSTM (#3051) 2018-10-20 15:21:22 -07:00
Robert Nishihara 9a2b5333ef Add links for latest Python 3.7 wheels to documentation. (#3091) 2018-10-19 12:15:22 -07:00
bibabolynn 9a5c273db7 [java] fix check exception type (#3093)
## What do these changes do?
Remove `TaskExecutionException` and use `RayException` instead.
2018-10-19 06:43:42 -07:00
Wang Qing b410ee0d29 [Java] Support dynamically defining resources when submitting task. (#3070)
## What do these changes do?
Before this PR, the only way to specify resources was to declare them statically, as in the following code:
```java
@RayRemote(Resources={ResourceItem("CPU", 10)})
public static void f1() {
// do sth
}

@RayRemote(Resources={ResourceItem("CPU", 10)})
class Demo {
// sth
}
```
Unfortunately, this gave us no way to create another actor or task with different resource requirements.

After this PR, resources can be set when the actor or task is submitted:
```java
ActorCreationOptions option = new ActorCreationOptions(); 
option.resources.put("CPU", 4.0);
RayActor<Echo> echo1 = Ray.createActor(Echo::new, option);
option.resources.put("Res-A", 4.0);
RayActor<Echo> echo2 = Ray.createActor(Echo::new, option);

// If no resources are specified, they default to `{"cpu":0.0}`.
Ray.call(Echo::echo, echo2, 100);
```


## Related issue number
N/A
2018-10-19 06:22:32 -07:00
Eric Liang 9d23fa03c9 [xray] All messages on main asio event loop should be written asynchronously (#3023)
* copy over ref code

* wip async writes

* compiles

* fix error handling

* add test

* amend

* fix test

* clang fmgt

* clang format

* wip

* yapf

* rename format script

* test error

* clangfmt

* add test to list

* warn

* ref test

* fix test

* comment

* add capture

* Update client_connection.cc

* wip

* fix compile
2018-10-18 21:56:22 -07:00
Peter Schafhalter fa469783d8 Fix bug when connecting to password-secured cluster (#3083) 2018-10-18 21:43:03 -07:00
Devin Petersohn 8fcdafc6ea Adding Python3.7 wheels support (#2546)
* Adding Python3.7 wheels support

* Adding Mac wheels update

* fix

* numpy version

* choose different numpy versions depending on python version

* fix
2018-10-18 17:58:39 -07:00
Yuhong Guo 653c5b114a [c++] Refine Log Code (#2816)
* Support setting logging level from env variable

* Remove Env Variable related code

* lint
2018-10-18 10:51:36 -07:00
Peter Schafhalter b82fd157a7 Remove Redis protected mode (#3073)
Follow-up to #2925 and #2952. Removes the Redis protected mode implementation from Ray which was replaced by Redis port authentication.
2018-10-17 22:48:14 -07:00
Philipp Moritz 2c52d9dfa0 Fix actor handle id creation when actor handle was pickled (#3074) 2018-10-17 18:00:52 -07:00
Richard Liu 3c0803e7e9 [rllib] use ray.wait to get next worker result in async sample optimizer (#2993) 2018-10-17 17:44:51 -07:00
Peter Schafhalter a41bbc10ef Add password authentication to Redis ports (#2952)
* Implement Redis authentication

* Throw exception for legacy Ray

* Add test

* Formatting

* Fix bugs in CLI

* Fix bugs in Raylet

* Move default password to constants.h

* Use pytest.fixture

* Fix bug

* Authenticate using formatted strings

* Add missing passwords

* Add test

* Improve authentication of async contexts

* Disable Redis authentication for credis

* Update test for credis

* Fix rebase artifacts

* Fix formatting

* Add workaround for issue #3045

* Increase timeout for test

* Improve C++ readability

* Fixes for CLI

* Add security docs

* Address comments

* Address comments

* Address comments

* Use ray.get

* Fix lint
2018-10-16 22:48:30 -07:00
Eric Liang a9e454f6fd [rllib] Include config dicts in the sphinx docs (#3064) 2018-10-16 15:55:11 -07:00
Wang Qing 64e5eb305e [Java] Add jvm-parameters in Config. (#3065) 2018-10-16 15:03:18 -07:00
Praveen Palanisamy 4d8cfc0bf5 [tune] Fix (some more) misleading comments in tune/results.py (#3068)
## What do these changes do?

Fix the misleading comments in code for:
 - `EPISODES_THIS_ITER`
 - `EPISODES_TOTAL`

I had noted this before and planned to fix it along with some other changes, but it seemed relevant enough to land next to #3058, so I am sending it now.
2018-10-16 11:07:53 -07:00
Eric Liang 6240ccbc6e [rllib] Add more warnings when multi-agent envs might not be set up right (#3061) 2018-10-15 13:42:56 -07:00
Eric Liang 3c891c6ece [rllib] Parallel-data loading and multi-gpu support for IMPALA (#2766) 2018-10-15 11:02:50 -07:00
Marlon 4dc78b735b [tune] Fix misleading comment (#3058) 2018-10-14 22:25:39 -07:00
Eric Liang 866c7a574c [rllib] Don't crash printing out error message (#3054)
* fix er

* update
2018-10-13 19:50:23 -07:00
Eric Liang 473ee4eb3f [rllib] Add unit test and some better error messages for custom policy states (#3032) 2018-10-13 00:03:52 -07:00
Hanwei Jin 87639b9e26 move make clean before cmake command, avoid always running mvn install plasma java lib (#3047) 2018-10-12 09:03:30 -07:00
Richard Liaw f9b58d7b02 [tune] Tweaks to Trainable and Verbosity (#2889) 2018-10-11 23:42:13 -07:00
Wang Qing 828fe24b39 [Java] Fix loading driver resources issue. (#3046)
## What do these changes do?
Fix an issue with how driver resources are loaded from a specified path.
This also addresses the comments from the related PR [#3044](https://github.com/ray-project/ray/pull/3044).

## Related PRs:
 [#3044](https://github.com/ray-project/ray/pull/3044) and [#3001](https://github.com/ray-project/ray/pull/3001).
2018-10-11 09:45:21 -07:00
Wang Qing 4a2ed47b6c [Java] Improve some Java code (#3040)
This PR improves some Java code and removes some duplicated code.
2018-10-10 17:30:23 -07:00
Hanwei Jin 060891a9c9 [cmake] avoid to re-build pyarrow (#2963)
* bugfix: env exists check error

* support to avoid re-build pyarrow in project

* bugfix: adapt gtest for centos lib64

* bugfix: check gtest lib exists in the directory

* bugfix: find gtest with checking all libs exists

* prefix RAY_ to thirdparty env variables to avoid conflicts with other module

* arrow use glog from ray

* change the glog and gtest install dir
2018-10-10 14:33:15 -07:00
Wang Qing ef1f2fde95 Fix the uniqueId toString format. (#3035) 2018-10-08 13:12:14 -07:00
Wang Qing 84bf5fc8f3 [Java] Load driver resources from local path. (#3001)
## What do these changes do?
1. Add a configuration item `driver.resource-path`.
2. Load driver resources from the local path which is specified in the `ray.conf`.

Before this change, all driver resources (such as the user's jar package, dependency packages, and config files) had to be added to the `classpath`.

After this change, driver resources go into the mount path configured in `ray.conf`, and the `classpath` no longer needs to be configured for them.

## Related issue number
N/A
2018-10-08 21:05:26 +01:00
Kristian Hartikainen 2d35a97a76 Bug/log syncer fails with parentheses (#2653)
* Update rsync command

* Escape rsync locations

* Fix the accidental variable move

* Update rsync to use -s flag
2018-10-06 00:34:53 -07:00
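The fix above concerns paths containing shell metacharacters such as parentheses. A minimal sketch of the idea, assuming a helper named `build_rsync_command` (hypothetical, not the function in the PR) and using the standard library's `shlex.quote` to escape both locations; the `-s` flag additionally asks rsync not to let the remote shell re-split the arguments:

```python
import shlex


def build_rsync_command(source, target):
    """Return an rsync command line with both locations shell-quoted.

    Hypothetical helper for illustration; the flag set is an assumption
    beyond the -s flag named in the commit message.
    """
    # shlex.quote wraps the path in single quotes when it contains
    # characters the shell would otherwise interpret, e.g. "(" or spaces.
    return "rsync -savz {} {}".format(shlex.quote(source), shlex.quote(target))


# Example with a trial directory whose name contains parentheses.
command = build_rsync_command(
    "/tmp/ray_results/exp (lr=0.1)/", "user@host:/tmp/ray_results/exp (lr=0.1)/")
```

Quoting protects the path from the local shell, while `-s` covers the remote side; relying on only one of the two would still leave one shell free to misparse the parentheses.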
Richard Liaw ecd8f39580 [core] Improve logging message when plasma store is started. (#3029)
Improve logging message when plasma store is started.
2018-10-05 15:24:24 -07:00
Richard Liaw 0651d3b629 [tune/core] Use Global State API for resources (#3004) 2018-10-04 17:23:17 -07:00
Robert Nishihara faa31ae018 Introduce concept of resources required for placing a task. (#2837)
* Introduce concept of resources required for placement.
* Add placement resources to task spec
* Update java worker
* Update taskinfo.java
2018-10-04 10:35:39 -07:00
Richard Liaw 01bb073569 Suppress errors when worker or driver intentionally disconnects. (#2935) 2018-10-04 00:06:34 -07:00
Si-Yuan f2dbd3096c Minor improvements and fixes in Python code. (#3022)
This commit fixes some small defects.
1. Remove a comment that should have been removed in #3003.
2. Remove `redis_protected_mode`, which is never used in `ray.init()`.
3. Fix `object_id_seed` not being passed into `ray._init()`.
4. Remove several redundant brackets.
2018-10-03 21:08:20 -07:00
Yuhong Guo 9948e8c11b Move function/actor exporting & loading code to function_manager.py (#3003)
Move the function/actor exporting and loading code to function_manager.py in preparation for the function descriptor change for Python.
2018-10-03 16:21:04 -07:00
Robert Nishihara d73ee36e60 Update links to use latest 0.5.3 wheels instead of 0.5.2. (#3018) 2018-10-03 13:43:40 -07:00
Si-Yuan cc7e2ecdd5 Change logfile names and also allow plasma store socket to be passed in. (#2862) 2018-10-03 10:03:53 -07:00
bibabolynn 9c606ea06c fix bug: (#3000)
Before the fix, `RAY_FUN_CACHE` was only ever read with `get`, so lookups could only return null.
Fix: put the entry into the cache after creating it.
2018-10-02 22:53:54 -07:00
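The bug described above is a cache that is read but never written. The change itself is in the Java worker; the sketch below only illustrates the get-then-put pattern in Python, and the class `FunctionCache` and its method names are hypothetical.

```python
class FunctionCache:
    """Illustrative get-or-create cache; not the Java worker's implementation."""

    def __init__(self):
        self._entries = {}

    def get_or_create(self, key, factory):
        """Return the cached value for key, creating and storing it on a miss."""
        value = self._entries.get(key)
        if value is None:
            value = factory()
            # The step the fix adds: without this put, every lookup misses
            # and the cache can only ever return nothing.
            self._entries[key] = value
        return value
```

With the put in place, a second `get_or_create` call for the same key returns the stored object instead of invoking the factory again.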
Robert Nishihara 3ce8eb2d4c Test dying_worker_get and dying_worker_wait for xray. (#2997)
This tests the case in which a worker is blocked in a call to ray.get or ray.wait, and then the worker dies. Then later, the object that the worker was waiting for becomes available. We need to make sure not to try to send a message to the dead worker and then die. Related to #2790.
2018-10-02 00:08:47 -07:00
Eric Liang 2019b4122b [rllib] Remove legacy multiagent support (#2975)
* remove legacy

* remove reshaper
2018-10-01 13:07:11 -07:00
Wang Qing fcef4edd46 [Java] Fix the required-resources issue of actor member function in Java worker. (#3002)
This fixes a bug in which Java actor methods inherit the resource requirements of the actor creation task.
2018-10-01 12:56:36 -07:00
Eric Liang b45bed4bce [rllib] Propagate model options correctly in ARS / ES, to action dist of PPO (#2974)
* fix

* fix

* fix it

* propagate conf to action dist

* move carla example too

* rr

* Update policies.py

* wip

* lint
2018-10-01 12:49:39 -07:00
Eric Liang e4bea8d10e [rllib] Default to truncate_episodes and add some more config validators (#2967)
* update

* link it

* warn about truncation

* fix

* Update rllib-training.rst

* deprecate tests failing
2018-09-30 18:37:55 -07:00
Eric Liang 814c35b7d7 [rllib] Simplify sample batch size and num envs config, n_step adjustment (#2995)
* simplify vec batch requirements

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-models.rst
2018-09-30 18:36:22 -07:00
old-bear 8aa736572b [tune] Fix hyperband edge case for None entries (#2964) 2018-09-30 09:57:43 -07:00
Robert Nishihara ed6289771a Convert runtest.py to use pytest. (#2966)
* Convert runtest.py to use pytest.

* Linting.

* Fix

* Fix

* Fix

* Fix
2018-09-30 07:59:44 -07:00
Eric Liang 65dcafdc3f [rllib] Refactor save() / restore() code of agents and avoid O(n_workers) save size (#2982) 2018-09-30 01:15:13 -07:00
Eric Liang 747253e0f6 [rllib] Don't shuffle samples in PPO when using lstm 2018-09-30 01:13:56 -07:00