703 Commits

Author SHA1 Message Date
Kai Fricke 9a413144b1 [tune] dynamic global checkpointing interval (#13736)
* Add scalability tests

* Move experiment checkpointing into a manager class

* Dynamic global checkpointing

* Actually write checkpoints

* Remove debug message

* Pass `force`

* Pre-review

* Revert scalability commits

* Revert scalability commits

* Apply suggestions from code review
2021-01-29 17:14:46 +01:00
Lena Kashtelyan c583113d66 [Ax] Align optimization mode and reported SEM with Ax (#13611)
* [Ax] Align optimization mode and reported SEM with Ax

Ensure that `mode` aligns with the mode set in Ax + report SEM as None rather than as 0.0 to make use of Ax noise inference

* Account for review

* Update ax.py

* Fix lint

* Fix tests, add additional checks

* Fix tests for python 3.6

Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-01-28 19:01:51 +01:00
Zhe Zhang 0e7343ec19 [docs] Fix MLflow / Tune example in documentation (#13740)
Minor fixes to make it runnable
2021-01-27 17:16:29 -08:00
Kai Fricke c5b645e3da [tune] add type hints to tune.run(), fix abstract methods of ProgressReporter (#13684) 2021-01-27 16:43:50 +01:00
Kai Fricke 2664a2a8f6 [tune] fix non-deterministic category sampling by switching back to np.random.choice (#13710)
* Enable zoopt tests again, but wait for next release

* Add test and preserve state in trial executor

* Add baseline check with integers

* [tune] fix non-deterministic category sampling, re-enable zoopt tests

* Remove random import

* Disable zoopt tests
2021-01-27 16:42:44 +01:00
Kai Fricke 17760e1510 [tune] update Optuna integration to 2.4.0 API (#13631)
Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-01-23 00:32:37 -08:00
Amog Kamsetty 01d74af89d [horovod] Horovod+Ray Pytorch Lightning Accelerator (#13458) 2021-01-22 16:30:10 -08:00
Kai Fricke 6c23bef2a7 [tune] Allow actor reuse for new trials (#13549)
* Allow actor reuse for new trials

* Fix tests and update conf when starting new trial

* Move magic config to `reset_trial`
2021-01-20 11:25:33 +01:00
Daan Klijn 800304acfb [tune] wandb - WandbLogger now also accepts wandb.data_types.Video (#13169) 2021-01-20 01:19:54 -08:00
Amog Kamsetty 20016c983f [Tune] MLflow Credentials (#13533) 2021-01-19 11:55:13 -08:00
Richard Liaw 7a2997ea8c [tune] support experiment checkpointing for grid search (#13357) 2021-01-18 19:24:36 -08:00
Kai Fricke dc42abb2f5 [tune] placement group support (#13370) 2021-01-18 11:58:57 -08:00
Amog Kamsetty 3f42e6bafe [Tune] Pin Transitive Dependencies (#13358) 2021-01-13 19:10:21 -08:00
Edward Oakes c6fc7124d1 [tune] Fix f-string in error message (#13423) 2021-01-13 18:34:21 -06:00
Kai Fricke 518427627b [tune] buffer trainable results (#13236)
* Working prototype

* Pass buffer length, fix tests

* Don't buffer per default

* Dispatch and process save in one go, added tests

* Fix tests

* Pass adaptive seconds to train_buffered, stop result processing after STOP decision

* Fix tests, add release test

* Update tests

* Added detailed logs for slow operations

* Update python/ray/tune/trial_runner.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Apply suggestions from code review

* Revert tests and go back to old tuning loop

* nit

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-12 18:52:47 +01:00
Amog Kamsetty 0452a3a435 [Tune] Rename MLFlow to MLflow (#13301) 2021-01-11 17:36:55 -08:00
Kai Fricke d4b0a9fadf [tune] convert search spaces: parse spec before flattening (#12785)
* Parse spec before flattening

* flatten after parse

* Test for ValueError if grid search is passed to search algorithms
2021-01-09 18:21:49 +01:00
Amog Kamsetty f68922d043 [Tune] Improve error message for Session Detection (#13255)
* Improve error message

* log once
2021-01-07 22:40:44 +01:00
Kai Fricke 97211a6170 [Tune] Fix tune serve integration example (#13233) 2021-01-06 17:02:04 +01:00
Amog Kamsetty bd19ed31e7 [Tune] Fix PBT Transformers Example (#13174) 2021-01-05 16:31:11 -08:00
Kai Fricke 96c2d3d2b5 [tune] better signature check for tune.sample_from (#13171)
* [tune] better signature check for `tune.sample_from`

* Update python/ray/tune/sample.py

Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>

Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2021-01-05 08:04:18 -08:00
Amog Kamsetty 15e86581bd [XGBoost] Update Documentation (#13017)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-04 17:21:04 -08:00
Amog Kamsetty 7120f3a6ab [Tune] Update URL to fix 403 not found error in PBT transformers test case (#13131) 2020-12-31 10:45:57 -05:00
Lavanya Shukla 350917958c [docs] fix wandb url (#13094) 2020-12-28 17:19:17 -08:00
Antoni Baum a4f2dd2138 [Tune] Add integer loguniform support (#12994)
* Add integer quantization and loguniform support

* Fix hyperopt qloguniform not being np.log'd first

* Add tests, __init__

* Try to fix tests, better exceptions

* Tweak docstrings

* Type checks in SearchSpaceTest

* Update docs

* Lint, tests

* Update doc/source/tune/api_docs/search_space.rst

Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>

Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2020-12-23 09:27:16 -08:00
Richard Liaw 038a50af52 [tune] skopt fix-extra-import (#12970)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-20 01:01:09 -08:00
Amog Kamsetty 5d3c9c8861 [Tune] MLflow Integration (#12840)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-19 00:40:02 -08:00
Kai Fricke 55ae567f7a [tune] Fix and enable SigOpt tests (#12877)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-18 01:33:12 -08:00
Kai Fricke 426f8a8d15 [tune] Fix tutorial training on GPU (#12914) 2020-12-18 01:31:40 -08:00
Farzan Taj 53378170e0 [tune] Change pickle to ray.cloudpickle -- support large models (#12958)
* Change pickle to ray.cloudpickle

* Change pickle import to ray.cloudpickle
2020-12-17 19:17:08 -08:00
Kai Fricke 3d72000826 [tune] Add points_to_evaluate to BasicVariantGenerator (#12916)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-17 19:16:03 -08:00
Kai Fricke ea1228074d [tune] enable points_to_eval for all search algorithms (#12790)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-15 11:51:53 -08:00
Kai Fricke 5f04ade6ef [tune] add more stoppers and stopper documentation (#12750)
* Add new stoppers & docs

* Add tests for maximum iteration stopper and trial plateau stopper

* Update python/ray/tune/stopper.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/tune/api_docs/stoppers.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/tune/api_docs/stoppers.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Apply suggestions from code review

* Apply suggestions from code review

* Update python/ray/tune/stopper.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-12 01:47:19 -08:00
Kai Fricke 905652cdd6 [tune] migrate xgboost callback api (#12745)
* Migrate to new-style xgboost callbacks

* Fix flaky progress reporter test

* Fix import error

* Take last value (not first)
2020-12-12 01:42:20 -08:00
Kai Fricke 42c70be073 [tune] Hyperopt: Directly accept category variables instead of indices (#12715)
* [tune] Hyperopt: Directly accept category variables instead of indices

* Fix interrupt test

* Update python/ray/tune/suggest/hyperopt.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Apply suggestions from code review

* Update python/ray/tune/suggest/hyperopt.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* lint

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-12 01:40:53 -08:00
Eric Squires 9f70293700 Remove debug extras from setup.py (#12751) 2020-12-10 16:23:11 -06:00
Richard Liaw 974570b4fb oops (#12728)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-09 13:38:10 -08:00
Kai Fricke df10b84113 [Release] release tests yamls for Tune & GPU (#12496) 2020-12-08 10:15:07 -08:00
Kai Fricke 1c0d10f67e [tune] Add xgboost_ray integration (#12572) 2020-12-04 13:59:20 -08:00
Kai Fricke 219c445648 [tune] verbosity refactor second attempt (#12571)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-04 13:56:26 -08:00
Marci f965537ae9 [tune] Callable accepted for register_env (#12618) 2020-12-04 12:21:25 -08:00
Richard Liaw 1ce5e0e99f [tune] Fix file descriptor leak by syncer (#12590) 2020-12-03 13:39:04 -08:00
Richard Liaw 7c58a85fed [tune] fix Tensorboard file descriptor leak (#12425) 2020-12-03 00:06:54 -08:00
Kaushik B 7422abddb4 [tune] trim kwargs in shim instantiation functions (#12544) 2020-12-02 12:07:00 -08:00
Richard Liaw da42bf29d0 [tune] horovod release test (#12495) 2020-12-02 12:04:54 -08:00
Richard Liaw a21523c709 [tune/core] serialization debugging utility (#12142)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-12-02 00:52:17 -08:00
Richard Liaw 4dc16730a7 [tune] with-params fix (#12522) 2020-12-01 16:47:03 -08:00
Richard Liaw 9ce7ad17fd [tune] remove some bottlenecks in trialrunner (#12476) 2020-11-30 14:54:25 -08:00
Richard Liaw 323941c745 [tune] fix pbt flaky test (#12418) 2020-11-25 16:58:37 -08:00
Kai Fricke b94bfdfa99 [tune] use default anonymous metric _metric if at least a mode is set (#12159)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-23 20:09:33 -08:00