Commit Graph

96 Commits

Author SHA1 Message Date
Barak Michener 4304aeabc0 set version number to 1.0.0rc2 2020-09-23 00:13:53 +00:00
Barak Michener 9c308f033b Revert "Bump version number everywhere to 1.0.0"
This reverts commit fa304a90ee.
2020-09-23 00:12:18 +00:00
Amog Kamsetty c636f5bd40 [Ray SGD] FP16 Hotfix (#10931) 2020-09-23 00:07:18 +00:00
SangBin Cho c79eb7984d [docs] Placement group documentation (#10555)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-21 17:55:24 +00:00
Amog Kamsetty 7bf5f1af8b [Ray SGD] use_local flag + Worker group abstraction (#10539)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-17 18:48:57 +00:00
Ian Rodney 826a9253c6 [docker] Detect CPUs in container correctly (#10507)
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-14 17:11:47 +00:00
Alex Wu 72e19ede28 [hotfix] accelerator_types (#10725)
* .

* .
2020-09-14 17:09:28 +00:00
Barak Michener fa304a90ee Bump version number everywhere to 1.0.0 2020-09-09 18:35:14 +00:00
Amog Kamsetty 415be78cc0 [RaySGD] Simplify Builder Process (#10321)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-08 15:19:40 -07:00
Clark Zinzow 36e1f20e9c Add Dask-Ray scheduler callbacks. (#10519)
Improve Dask-on-Ray documentation.

Move to RayCallback(s) namedtuples, and use top-level CBS tuple as source-of-truth for callback methods.
2020-09-08 13:00:58 -07:00
Simon Mo fdd3acd492 Promote ray.experimental.queue to ray.util (#10624) 2020-09-08 12:56:53 -07:00
Alex Wu d6a9f0e2e4 [Core] Accelerator type API (#10561) 2020-09-06 20:58:40 -07:00
Eric Liang 8ee7c182f5 [1.0] move placement groups from experimental to util. Note they are still undocumented. (#10554)
* move files

* Update __init__.py

* remove

* Update __init__.py
2020-09-04 19:01:24 -07:00
Eric Liang da83bbd764 [1.0] Move dask scheduler from experimental to util (#10553)
* move dask

* fix dask
2020-09-04 12:16:32 -07:00
krfricke 06af62ba91 [tune] refactor tune search space (#10444)
* Added basic functionality and tests

* Feature parity with old tune search space config

* Convert Optuna search spaces

* Introduced quantized values

* Updated Optuna resolving

* Added HyperOpt search space conversion

* Convert search spaces to AxSearch

* Convert search spaces to BayesOpt

* Re-factored samplers into domain classes

* Re-added base classes

* Re-factored into list comprehensions

* Added `from_config` classmethod for config conversion

* Applied suggestions from code review

* Removed truncated normal distribution

* Set search properties in tune.run

* Added test for tune.run search properties

* Move sampler initializers to base classes

* Add tune API sampling test, fixed includes, fixed resampling bug

* Add to API docs

* Fix docs

* Update metric and mode only when set. Set default metric and mode to experiment analysis object.

* Fix experiment analysis tests

* Raise error when delimiter is used in the config keys

* Added randint/qrandint to API docs, added additional check in tune.run

* Fix tests

* Fix linting error

* Applied suggestions from code review. Re-added tune.function for the time being

* Fix sampling tests

* Fix experiment analysis tests

* Fix tests and linting error

* Removed unnecessary default_config attribute from OptunaSearch

* Revert to set AxSearch default metric

* fix-min-max

* fix

* nits

* Added function check, enhanced loguniform error message

* fix-print

* fix

* fix

* Raise if unresolved values are in config and search space is already set

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-03 09:06:13 -07:00
Zhe Zhang 4356abeb1c [Doc] Fix errors in ActorPool documentation (#10410)
* Fix errors in ActorPool documentation

1. map() and map_unordered() name
2. The print statement doesn't work

* Correctly change lines

* Address comments from pr
2020-08-31 13:57:10 -07:00
Richard Liaw cb438be146 [core] Move log_to_driver back to public (#10422) 2020-08-29 16:35:14 -07:00
Eric Liang 519354a39a [api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
Yu Shan 5264f888e4 fix iterable dataset (issue 9899) (#9952) 2020-08-22 19:40:38 -07:00
SangBin Cho 92664249e8 Partially Use f string (#10218)
* flynt. trial 1.

* Trial 1.

* Addressed code review.
2020-08-20 18:21:16 -07:00
Amog Kamsetty 9ff687c093 [SGD][Docs] docs for training/ validation results (#10181) 2020-08-19 17:22:28 -07:00
SangBin Cho 44826878ff [Core] Remove Legacy Raylet Code (#9936)
* Remove a flag and some methods in node manager including HandleDisconnectedActor, ResubmitTask, and HandleTaskReconstruction

* Make actor creator always required + remove raylet transport

* Remove actor reporter + remove FinishAssignedActorCreationTask

* Remove actor tasks.

* Remove finishactortask and switched it to finishactorcreation task

* Remove reconstruction policy.

* Remove lineage cache.

* Formatting.

* Remove actor frontier code.

* Removed build error.

* Revert "Remove reconstruction policy."

This reverts commit 9d25c9bced4da5fbcac5d484d51013345f16513b.

* Recover HandleReconstruction to mark expired objects as failed.
2020-08-06 16:37:50 -07:00
Richard Liaw 0c3b9ebeef [tune/sgd] Document func_trainable and add checkpoint context (#9739)
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-30 09:46:37 -07:00
Richard Liaw f3fdb5c5db [tune] distributed torch wrapper (#9550)
* changes

* add-working

* checkpoint

* cleanup

* fix

* ok

* formatting

* ok

* tests

* some-good-stuff

* fix-torch

* ddp-torch

* torch-test

* sessions

* add-small-test

* fix

* remove

* gpu-working

* update-tests

* ok

* try-test

* format

* ok

* ok
2020-07-26 09:37:22 -07:00
krfricke 9f3570828a [tune] move jenkins tests to travis (#9609)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-24 21:22:54 -07:00
krfricke ea4797bf38 [RaySGD] revised existing transformer example to work with transformers>=3.0 (#9661)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-23 10:58:09 -07:00
Philipp Moritz a5f4659d9f Support ray task type checking (#9574) 2020-07-21 19:05:42 -07:00
mehrdadn aa8928fac2 Make more tests compatible with Windows (#9303) 2020-07-15 11:34:33 -05:00
Hao Chen d49dadf891 Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
SangBin Cho 8f19f1eafb [Core] Actor handle refactoring (#8895)
* Marking needed changes.

* Resolve basic dependencies.

* In progress.

* linting.

* In progress 2.

* Linting.

* Refactor done. Cleanup needed.

* Linting.

* Recover kill actor in core worker because it is used inside raylet

* Cleanup.

* Use unique pointer instead. Unit tests are broken now.

* Fix the upstream change.

* Addressed code review 1.

* Lint.

* Addressed code review 2.

* Fix weird github history.

* Lint.

* Linting using clang 7.0.

* Use a better check message.

* Revert cpp stuff.

* Fix weird linting errors.

* Manually fix all lint issues.

* Update a newline.

* Refactor some interface.

* Addressed all code review.

* Addressed code review
2020-07-07 11:11:41 -07:00
Richard Liaw d35f0e40d0 [tune] Use public methods for trainable (#9184) 2020-07-01 11:00:00 -07:00
Eric Liang 4522038259 [iter] Add .transform() function for arbitrary generator transforms (#8978) 2020-06-25 11:04:14 -07:00
Xianyang Liu b449ece2ea [SGD] Variable worker CPU requirements (#8963) 2020-06-23 00:43:27 -07:00
Alex Wu 40c15b1ba0 [ParallelIterator] Fix for_each concurrent test cases/bugs (#8964)
* Everything works

* Update python/ray/util/iter.py

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>

* .

* .

* removed print statements

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-06-22 18:26:45 -07:00
Richard Liaw e2330ffc35 [sgd] Cleanup code from last PR (#9076) 2020-06-22 15:17:07 -07:00
Richard Liaw acdd873481 [docs/sgd] Fix test failure + make slack link large (#9051) 2020-06-21 15:55:06 -07:00
Richard Liaw 58efec0f2b [sgd] simplify cuda visible device setting (#8775) 2020-06-12 13:53:32 -07:00
SangBin Cho 731ed8d232 [Core] Fix a detached actor bug when GCS actor management is off. (#8843) 2020-06-09 15:46:17 -07:00
SangBin Cho 3388864768 [Core] Clean up detached actors (#8759) 2020-06-08 11:22:01 -05:00
Alex Wu a2ec282033 [Doc] Dataset lint fix (#8719) 2020-06-01 19:43:06 -07:00
Alex Wu dcf58a43dc [SGD] Dataset API (#7839) 2020-06-01 15:48:15 -07:00
Edward Oakes c64b694560 Update RaySGD test to use ray.kill instead of __ray_kill__ (#8662) 2020-05-28 22:38:05 -05:00
Amog Kamsetty ae2e1f0883 [Parallel Iterators] Batching + Pipelining optimizations (#7931)
* batching + get_shard pipelining

* duplicate fix

* formatting

* adding performance benchmark

* minor changes

* turn batching off by default
2020-05-26 00:37:57 -07:00
Edward Oakes 860eb6f13a Update named actor API (#8559) 2020-05-24 20:08:03 -05:00
Eric Liang 9a83908c46 [rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Eric Liang aa7a58e92f [rllib] Support training intensity for dqn / apex (#8396) 2020-05-20 11:22:30 -07:00
Max Fitton 13231ba63b Rename redis-port to port and add default (#8406) 2020-05-18 13:25:34 -05:00
Eric Liang 9d012626e5 [rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Edward Oakes 2677b71003 Implement named actors using the GCS service (#8328) 2020-05-09 08:58:10 -05:00
Eric Liang 9f04a65922 [rllib] Add PPO+DQN two trainer multiagent workflow example (#8334) 2020-05-07 23:40:29 -07:00