2858 Commits

Author SHA1 Message Date
krfricke efa1d51aea [tune] Added external PyTorch tutorial test (#10192) 2020-08-31 15:24:31 -07:00
Zhe Zhang 4356abeb1c [Doc] Fix errors in ActorPool documentation (#10410)
* Fix errors in ActorPool documentation

1. map() and map_unordered() name
2. The print statement doesn't work

* Correctly change lines

* Address comments from pr
2020-08-31 13:57:10 -07:00
raoul-khour-ts 25f5614691 [tune] Rrk/sigopt_searcher_improvements (#10446) 2020-08-31 13:15:12 -07:00
PidgeyBE 6917efabc4 [k8s] Replace k8s service name everywhere in ingress manifest (#10445)
Co-authored-by: Pieterjan <pieterjan.soetaert@robovision.eu>
2020-08-31 13:14:40 -07:00
Robert Nishihara 0bba5485d3 [cli] Add prompt command for CLI logger. (#9897)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-31 11:03:53 -07:00
Amog Kamsetty afde3db4f0 [Tune] Synchronous Mode for PBT (#10283) 2020-08-31 00:00:47 -07:00
SangBin Cho 3e5cac80d8 [Tests] Fix Broken GCS restart test. (#10417) 2020-08-30 00:44:34 -07:00
SangBin Cho c8b14fd7e9 [Tests] Enable large test (#10391)
* Enable large tests.

* Lint.

* Fix issue.

* Skip all tests.
2020-08-30 00:44:05 -07:00
Max Fitton 63ad2e3340 [Dashboard] Fix Issue #10319 - Dashboard autoscaler crash (#10323)
* Patch error that occurred when there was an entry in the dashboard logs or errors internal data structures, and a worker was removed from the cluster. This would crash the cluster with a KeyError.

* lint

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-29 23:18:23 -07:00
Richard Liaw 8c753818ab update-scripts (#10425) 2020-08-29 23:16:25 -07:00
Siyuan (Ryans) Zhuang f0c3910d59 [Serialization] Update CloudPickle to 1.6.0 (#9694)
* update cloudpickle to 1.6.0

* fix CI timeout
2020-08-29 23:11:28 -07:00
fyrestone e9b046306a [Dashboard] Dashboard basic modules (#10303)
* Improve reporter module

* Add test_node_physical_stats to test_reporter.py

* Add test_class_method_route_table to test_dashboard.py

* Add stats_collector module for dashboard

* Subscribe actor table data

* Add log module for dashboard

* Only enable test module in some test cases

* CI run all dashboard tests

* Reduce test timeout to 10s

* Use fstring

* Remove unused code

* Remove blank line

* Fix dashboard tests

* Fix asyncio.create_task not available in py36; Fix lint

* Add format_web_url to ray.test_utils

* Update dashboard/modules/reporter/reporter_head.py

Co-authored-by: Max Fitton <mfitton@berkeley.edu>

* Add DictChangeItem type for Dict change

* Refine logger.exception

* Refine GET /api/launch_profiling

* Remove disable_test_module fixture

* Fix test_basic may fail

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
Richard Liaw cb438be146 [core] Move log_to_driver back to public (#10422) 2020-08-29 16:35:14 -07:00
Stephanie Wang 9a31166050 Option to disable profiling and task timeline (#10414) 2020-08-29 11:35:22 -07:00
Eric Liang 910d5d2550 [hotfix] Bad merge with num_return_vals (#10418) 2020-08-28 22:45:35 -07:00
Alex Wu bd92cefbf7 [Autoscaler] Move Resource Demand Scheduler Test to Small (#10399) 2020-08-28 21:26:29 -07:00
Ian Rodney d6f2b0d933 [docker] Run profiling without sudo (#10388)
* fix profiling for docker

* small fixes

* use name

* do not import pwd on windows
2020-08-28 21:25:10 -07:00
Eric Liang f6a1698bab [autoscaler] Add documentation for multi node type autoscaling (#10405) 2020-08-28 19:57:21 -07:00
Eric Liang 2a204260a8 [api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
Alex Wu b1f3c9e10e [Autoscaler] Fix resource passing bug fix (#10397) 2020-08-28 15:43:18 -07:00
Kishan Sagathiya 2afb54c99c Validate non-integral args to ray.remote (#10221) 2020-08-28 17:18:01 -05:00
Eric Liang 519354a39a [api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
Edward Oakes 9c25ca6f5e [hotfix] Fix test_cli.py (#10403) 2020-08-28 15:43:58 -05:00
SangBin Cho 68c2dcd12b Fix. (#10390) 2020-08-28 08:22:23 -07:00
Edward Oakes c3ed403def fix typo (#10382) 2020-08-28 09:57:04 -05:00
SangBin Cho 7b29eb7949 [Build] Try parallel Python builds. (#10291)
* Trial 1.

* Parallelize even more.
2020-08-28 00:06:52 -07:00
SongGuyang cb70864c04 [cpp worker] support cluster mode and object Put/Get works (#9682) 2020-08-28 13:53:36 +08:00
Richard Liaw 0d22c0b653 [tune] Avoid recreating actor multiple times (#10374) 2020-08-27 18:02:26 -07:00
Richard Liaw 922bf9f45a [cli] improve error handling, don't swallow errors (#10370) 2020-08-27 17:59:44 -07:00
Richard Liaw ed5de89470 FIX: Lint (#10384) 2020-08-27 17:56:39 -07:00
SangBin Cho f35339b5ff [Dashboard] Change default ip address for the dashboard to ipv4 (#10287)
* Done.

* Add todo.

* Addressed code review.

* Fix issue.

* Fix test failure.

* Fix a test.
2020-08-27 14:43:10 -07:00
Alex Wu 7dbc1f439c [hotfix] Autoscaler monitor fix unit tests 2020-08-27 14:26:41 -07:00
Alex Wu 76898d4ebc [Autoscaler][hotfix] Remove additionalProperties from available_node_types schema (#10366) 2020-08-27 13:56:44 -07:00
Eric Liang bd245a1c18 [api] Clean up and document Actor name / lifetime API (#10332) 2020-08-27 13:38:39 -07:00
Clark Zinzow 0178d6318e [Core] Expand job ID to 4 bytes by removing object flag bytes. (#10187) 2020-08-27 14:08:17 -05:00
Philipp Moritz b8673e5697 [autoscaler] Make KeyName optional in AWS autoscaler (#10336) 2020-08-27 11:08:44 -07:00
architkulkarni eea7a86163 [Serve] add type hints for controller and backend_worker (#10288) 2020-08-27 10:20:36 -07:00
Stephanie Wang f75dfd60a3 [api] API deprecations and cleanups for 1.0 (internal_config and Checkpointable actor) (#10333)
* remove

* internal config updates, remove Checkpointable

* Lower object timeout default

* remove json

* Fix flaky test

* Fix unit test
2020-08-27 10:19:53 -07:00
Amog Kamsetty 0aec4cbccb [Tune] Update PBT Transformers Example (#10289)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
2020-08-27 08:25:05 -07:00
krfricke 53ab228b75 [tune] Fix log to file on actor reuse (#10363) 2020-08-27 08:22:19 -07:00
Alex Wu 6d2af33a01 [Autoscaler] Proper resource demand plumbing (#10329) 2020-08-26 23:36:01 -07:00
Lixin Wei fe6daef85e [Core] Add runtime context for python worker (#10309)
* add runtime context for python

* fixed

* code fixed

* test added

* lint

* lint
2020-08-26 20:11:42 -07:00
Ameer Haj Ali 17c8c63e7e Metadata schema (#10328)
* metadata

* Eric

Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2020-08-26 15:43:03 -07:00
Richard Liaw 29e8a664c4 [cli] make sure old-style works (#10344) 2020-08-26 15:26:24 -07:00
Ian Rodney dc378a80b7 [autoscaler/docker] Docker Initialization Revamp (#9515)
* Basic idea

* Small fixes

* dockerize start commands in Command Runner

* Remove run_init from CommandRunnerInterface

* Add Parens

Co-authored-by: Simon Mo <simon.mo@hey.com>

* Cleaning up

* Response to richards comments

* Further small fixes

* Fix Json

* schema format fix

* cleanup

* run more often

* fix indent

* Fix richards responses

* fix ups

* remove docker_commands from schema

* default to list

* fix docker cmd runner test

* lint fix

Co-authored-by: Simon Mo <simon.mo@hey.com>
2020-08-26 10:29:06 -07:00
Michael Luo 4e9888ce2f [RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
Alex Wu 9ca159aa0b [Autoscaler] Multi node commands (#10236) 2020-08-25 23:35:38 -07:00
Amog Kamsetty 8c0503ddd3 [Tune] Convert PBT DCGAN Example to Function API (#10246)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-25 22:34:19 -07:00
Antoni Baum 87ed20738e [tune] Add on_pause, on_unpause to ConcurrencyLimiter (#10320)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-25 22:33:17 -07:00
Simon Mo ed3fdd2c0b [Serve] Remove register_custom_serializer (#10331) 2020-08-25 21:20:43 -07:00