1196 Commits

Author SHA1 Message Date
Michael Luo 4e9888ce2f [RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
Amog Kamsetty 8c0503ddd3 [Tune] Convert PBT DCGAN Example to Function API (#10246)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-25 22:34:19 -07:00
fyrestone 08adbb371f Cross language exception (#10023) 2020-08-26 10:46:05 +08:00
Eric Liang deea1861ab [rllib] Try fixing torch GPU and masking errors (#10168) 2020-08-25 18:34:19 -07:00
Matthew Strawbridge 7a5af7e744 Fix links to ddpg tuned examples (#9713) 2020-08-25 11:30:13 -07:00
krfricke 5a787a8253 [tune] added FAQ to docs (#10222)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-24 21:51:02 -07:00
Max Fitton 832f5cdccb [Dashboard] Memory View Group by Stack Trace and UI Overhaul (#10227) 2020-08-24 14:54:42 -05:00
Richard Liaw 6bd5458bef [tune] cleanup error messaging/diagnose_serialization helper (#10210) 2020-08-22 11:50:49 -07:00
Simon Mo 6b93ad11d0 [Doc] Add Architecture Doc for Ray Serve (#10204) 2020-08-20 11:40:47 -07:00
Sven Mika d14b501692 [RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115) 2020-08-20 17:05:57 +02:00
Amog Kamsetty 9ff687c093 [SGD][Docs] docs for training/ validation results (#10181) 2020-08-19 17:22:28 -07:00
Simon Mo a785106b47 [Doc] Remove experimental marker for asyncio API (#10202) 2020-08-19 16:52:50 -07:00
architkulkarni a3a9421787 added single quotes in pip install 'ray[rllib]' 2020-08-19 15:34:49 -07:00
Edward Oakes 888f0a2c60 [serve] Use ray.experimental.metrics (#10185) 2020-08-19 13:03:22 -05:00
Arya Irani f733d2648b [docs] fix typo in deployment.rst (#10074)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 00:05:18 -07:00
Richard Liaw 927a073226 [tune] Update node syncing documentation (#10126) 2020-08-17 18:08:27 -07:00
Sven Mika fe0bdb23ff [RLlib] Attention Net/Transformers docs improvement. 2020-08-17 13:07:17 -07:00
Ian Rodney a079f46c25 [autoscaler]/[docker] Cleanup YAMLs & Use RAY docker images (#10108) 2020-08-17 09:49:28 -07:00
krfricke 8f0f7371a0 [tune] Added Kubernetes syncer and sync client (#10097)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
Amog Kamsetty f87a4aa45d [Tune] Pbt Function API (#9958)
* adding function convnet example

* add unit test

* update test

* update example

* wip

* move error from experiment to tune

* wip

* Fix checkpoint deletion

* updating code

* adding smoke test

* updating pbt guide

* formatting

* fix build

* add best checkpoint analysis util

* update test

* add comments

* remove class api

* fix example

* add setup and teardown to tests

* formatting

* Update python/ray/tune/tests/test_trial_scheduler_pbt.py

Co-authored-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-14 17:52:30 -07:00
Sven Mika 66d204e078 [RLlib] Model documentation enhancements. (#10011) 2020-08-13 13:36:40 +02:00
krfricke 16486a8df3 [tune] Add OptunaSearcher wrapper around Optuna samplers (#10044)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-12 16:13:22 -07:00
Eric Liang 7e3e4cd321 [rllib] Execution plan API documentation (#10000)
* wip

* updte

* comments
2020-08-11 23:58:41 -07:00
Richard Liaw 5560272556 [cli] install nightly wheels via ray install-nightly (#10054) 2020-08-11 20:08:22 -07:00
Simon Mo f1ede1099f [Hotfix] Pin opencv-python-headless==4.3.0.36 (#10049) 2020-08-11 15:58:18 -07:00
yncxcw 32cd94b750 [Core] Do not convert gpu id to int (#9744)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Amog Kamsetty 4f8fef134e [Tune] Remove checkpoint_at_end from tune+serve docs (#10034) 2020-08-11 09:05:57 -07:00
PidgeyBE 6ad2fc4831 [autoscaler] Service and Ingress per worker pod (#9359) 2020-08-10 14:13:52 -05:00
Richard Liaw 328b450cde [tune] fix testing for serve-tune test breaking (#9993) 2020-08-07 23:05:18 -07:00
SangBin Cho 39088ab6f2 [Stats] Metrics Export User Interface Part 2 (Prometheus Service Discovery) (#9970)
* In progress.

* In Progress.

* Finish the working version.

* Write a documentation.

* Addressed code review.

* Fix lint error.

* Lint.

* Addressed code review. Make test less flaky.

* Use a random port for ray start.

* Modify doc.

* Make write atomic.
2020-08-07 21:59:24 -07:00
Barak Michener 8e76796fd0 ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
krfricke dee3322ab0 [tune] Ray Tune + Serve end-to-end integration example (#9908)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-07 15:32:49 -07:00
krfricke 0ef8224446 [tune] PBT replay utility scheduler (#9953)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-07 12:41:49 -07:00
Eric Liang 668f555755 [rllib] Clean up outdated docs #9915 2020-08-06 18:29:04 -07:00
SangBin Cho ec2f1a225e [Stats] Metrics Export User Interface Part 1 (#9913)
* Metrics export port expose done.

* Support exposing metrics port + metrics agent service discovery through ray.nodes()

* Formatting.

* Added a doc.

* Linting.

* Change the location of metrics agent port.

* Addressed code review.

* Addressed code review.
2020-08-06 16:16:29 -07:00
Max Fitton 538ad04e96 [Dashboard] Update ActorState in dashboard to support new actor states (#9855)
* Update ActorState in dashboard to support new actor states

* Update dashboard documentation for new states

* Add missing state to doc

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-05 10:35:18 -07:00
Amog Kamsetty 5af7d24f66 [Tune] Transformer blog example (#9789)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-04 22:05:01 -07:00
Ameer Haj Ali 65a2886b0a Docs For on Prem Cluster manager (#9873) 2020-08-04 11:31:09 -07:00
kisuke95 28b1f7710c [Core] Error info pubsub (Remove ray.errors API) (#9665) 2020-08-04 14:04:29 +08:00
Richard Liaw c6404e8cf6 [tune] Search alg checkpointing during training (#9803)
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
2020-08-03 15:07:31 -07:00
Richard Liaw db09f70315 Fix docs build (#9879) 2020-08-03 14:06:05 -07:00
krfricke c741d1cf9c [tune] stdout/stderr logging redirection (#9817)
* Add `log_to_file` parameter, pass to Trainable config, redirect stdout/stderr.

* Add logging handler to root ray logger

* Added test for `log_to_file` parameter

* Added logs, reuse test

* Revert debug change

* Update logdir on reset, flush streams after each train() step

* Remove magic keys from visible config

Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-03 11:18:34 -07:00
Hao Chen 6fb6bd3e61 Refine Java "Ray Core Walkthrough" doc (#9836) 2020-07-31 15:35:43 +08:00
mehrdadn a7b97b6f8a Add shellcheck support (#8574) 2020-07-30 18:39:28 -05:00
krfricke 619e44e54a [tune] Added WandbLogger (#9725)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-30 13:09:03 -07:00
Barak Michener 68f3fec744 *: Centralize requirements.txt and unify dependency versions (#9759)
* python_test: fix cython_examples in doc/ and tests/

* update setup.py to parse the bazel version string better

* all: centralize all python deps into stackable requirements files in python/

* format

* Move cython test into the proper package

* Add cross-reference dependency comments for requirements and setup.py

* re-enable version pinning on CI, fix formatting

* fix up torchvision version

* fix case in shell
2020-07-30 11:22:56 -07:00
Richard Liaw 0c3b9ebeef [tune/sgd] Document func_trainable and add checkpoint context (#9739)
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-30 09:46:37 -07:00
Hao Chen 260bc52254 Java doc: "Ray Core Walkthrough" page (#8595) 2020-07-30 11:13:38 +08:00
Bill Chambers 067c2752f8 [TUNE] Tune Docs re-organization (#9600)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-07-29 11:22:44 -07:00
Bill Chambers 2e9d748100 [Cluster Launcher] Re Org the cluster launcher pages. (#9687) 2020-07-27 13:47:06 -07:00