2984 Commits

Author SHA1 Message Date
architkulkarni f371eb334c [Serve] Only install dataclasses on Python 3.6 (#10936) 2020-09-23 00:07:28 +00:00
Amog Kamsetty c636f5bd40 [Ray SGD] FP16 Hotfix (#10931) 2020-09-23 00:07:18 +00:00
Alex Wu 7422f64ea9 [hotfix] CPU Detection (#10821) 2020-09-23 00:07:07 +00:00
Philipp Moritz 2beb12e9bf Don't pin boto instead set lower limit on its version (#10711) 2020-09-23 00:03:44 +00:00
architkulkarni 17c9837abc [Doc] Fix RayServeHandle doc (#10896) 2020-09-23 00:02:58 +00:00
Kai Fricke 938de003ca [tune] update pt tutorial docs (#10925)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-22 23:55:01 +00:00
Sumanth Ratna d52b6b0e36 Use master for links to docs in source (#10866)
Resolved Conflicts:
        .github/PULL_REQUEST_TEMPLATE.md
        rllib/agents/dqn/apex.py
        rllib/agents/dqn/dqn.py
2020-09-22 23:54:44 +00:00
Sumanth Ratna 5610828fe8 Update max_failures kwarg docstring (#10953) 2020-09-22 23:50:15 +00:00
Richard Liaw 45addcf6bd [minor] fix warning about docker cpus (#10768) 2020-09-22 23:48:23 +00:00
Siyuan (Ryans) Zhuang 720cd82667 Fix typo in ray start output (#10667) 2020-09-22 23:48:09 +00:00
Keqiu Hu 83814342ad [cli][ray] update ray cli message (#10823) 2020-09-22 23:47:59 +00:00
Alex Wu 43bf1641a0 Add java yaml example (#10835)
* java example

* Update python/ray/autoscaler/aws/example-java.yaml

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-09-22 23:47:51 +00:00
Eric Liang 1d520bf796 Add accelerator-type to multi node type example YAML (#10871) 2020-09-22 23:47:40 +00:00
SangBin Cho f1fed7f662 [Doc] Document options method (#10830)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-21 17:56:03 +00:00
SangBin Cho c79eb7984d [docs] Placement group documentation (#10555)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-21 17:55:24 +00:00
Hao Chen 96ab025e66 [Java] rename config ray.redis.address to ray.address (#10772)
Resolved Conflicts:
        java/test.sh
2020-09-21 17:48:03 +00:00
Alex Wu 3205119ccb [autoscaler] hotfix calculate_node_resources (#10874) 2020-09-21 17:45:28 +00:00
Eric Liang 5f190b4e18 [autoscaler] Usability improvements in logging (#10764) 2020-09-21 17:44:39 +00:00
Richard Liaw a9830b4dd3 [cli] make test failure less verbose + print ssh (#10767) 2020-09-21 17:44:28 +00:00
Richard Liaw 3ef55578af [cli] Remove extra wording + fix travis (#10726) 2020-09-21 17:44:19 +00:00
Yiran Wang bdac0ac380 [Autoscaler] Change poll interval to 5 sec when checking VMs status (#10462) 2020-09-21 17:44:07 +00:00
Alex Wu 5bbfc548c1 [1.0] Remove args from ray start (#10659)
Resolved Conflicts:
        java/test.sh
        python/ray/tests/test_multi_node.py
2020-09-21 17:43:22 +00:00
Eric Liang ce671b3a94 Restore plasma directory option (#10784) 2020-09-17 18:50:33 +00:00
Ameer Haj Ali 9db51d21c4 Fix abstraction violations in command_runner interface (#10715)
* Fix abstraction violations in command_runner interface

* user guide

* lint

* breaking abstraction in commands

* extra initialization commands

* more cleanup

* small fixes

* fix test_integration_kubernetes.py

* lint

Co-authored-by: root <root@ip-172-31-28-155.us-west-2.compute.internal>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2020-09-17 18:49:36 +00:00
Kai Fricke 2d08b2bb1c [tune] convert fallback representation to numbers in wandb integration (#10799) 2020-09-17 18:49:17 +00:00
Amog Kamsetty 7bf5f1af8b [Ray SGD] use_local flag + Worker group abstraction (#10539)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-17 18:48:57 +00:00
Barak Michener 49ee123c85 Remove superfluous execution of java (#10750) 2020-09-14 17:19:30 +00:00
Alex Wu df77a31242 [Autoscaler] Unmanaged nodes (#10513) 2020-09-14 17:14:06 +00:00
Ian Rodney 049b7b2017 [docker] Revert to rsync & cp instead of file mount for bootstrap config/key (#10734) 2020-09-14 17:13:06 +00:00
Ian Rodney 0592dcacba [autoscaler] Fix rsync file mounts (#10721) 2020-09-14 17:12:33 +00:00
Ian Rodney 686c389562 [autoscaler] use default value (#10706) 2020-09-14 17:12:15 +00:00
Ian Rodney 826a9253c6 [docker] Detect CPUs in container correctly (#10507)
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-14 17:11:47 +00:00
Richard Liaw fe23f23680 [tune/rllib] revert removal of queue-trials (#10744) 2020-09-14 17:11:20 +00:00
Eric Liang 70305267d2 Remove colorful from ray core (#10723) 2020-09-14 17:09:56 +00:00
Alex Wu 72e19ede28 [hotfix] accelerator_types (#10725)
* .

* .
2020-09-14 17:09:28 +00:00
Alex Wu c4aaeab256 [Autoscaler] Fix utilization calc (#10728) 2020-09-14 17:08:58 +00:00
Alex Wu c2156c3ffa [hotfix] Autoscaler's K8 support (#10766)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-14 17:08:28 +00:00
Richard Liaw cb4ebb86c0 [autoscaler] make commands very explicit on logs (#10713) 2020-09-10 21:08:40 +00:00
Richard Liaw 9c6ab77d54 [autoscaler] Create provider exactly once (#10703)
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-10 21:08:23 +00:00
Kai Fricke 9bae286f42 [tune] wandb log cleaning to use yaml representer (#10680)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-10 21:07:49 +00:00
Barak Michener fa304a90ee Bump version number everywhere to 1.0.0 2020-09-09 18:35:14 +00:00
Max Fitton 3e8164ff8a [Dashboard] Logical View Actor Class Grouping Details (#10453)
* wip

* wip

* wip

* wip

* Need to track the timestamp actors are created for the dashboard. This adds that functionality back in and deletes unused code

* Add the materialui lab packages to get access to the Alert component and fix up some vulnerabilities with npm audit.

* Finish supporting information on a per-actor-class basis in the logical view, add bug fixes around timestamps and infeasible task names, and add a new warning popup that shows if there are infeasible actors around.

* lint and add seconds annotation to actor lifetime values

* real lint

* remove typo

* Somehow missed something last lint

* Add new comments for actor states

* Add underscores to some private functions

* Add tooltips to the actor states on the logical view

* change test metrics to be aligned with new changes.

* lint

* Remove some unnecessary log lines and catch error that happens when we try to decode data from an unexpected source

* Re-add a function I had removed. It is used in the Java codebase.

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-09-09 10:34:54 -07:00
Richard Liaw 153813936b [tune] auto infer metrics (#10663)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-09-09 09:53:47 -07:00
Richard Liaw 3501ea396c [tune] All examples to use ConcurrencyLimiter (#10662)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-09-09 09:52:15 -07:00
Alex Wu cd5b99e5e0 [hotfix] redis_password -> _redis_password (#10672) 2020-09-09 09:40:49 -07:00
Kai Yang afa0216280 Remove the '--include-java' option (#10594) 2020-09-09 17:01:17 +08:00
Hao Chen d22980a5c3 [Hotfix] fix bug about code_search_path in JobConfig (#10666) 2020-09-09 15:28:45 +08:00
Kai Fricke d7c7aba99c [tune] Tune experiment analysis improvements (#10645)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-08 21:00:52 -07:00
Alex Wu d9c68fca5c [Core] Logging improvements (#10625)
* other stuff
:

* lint

* .

* .

* lint

* comment

* lint

* .
2020-09-08 20:58:05 -07:00
Kai Fricke 756a9ea641 [tune] add mode/metric parameters to tune.run (#10627)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-08 17:06:21 -07:00