4326 Commits

Author SHA1 Message Date
simon-mo 89aa47012a Bump version key to 0.8.4 ray-0.8.4 2020-04-01 13:24:18 -07:00
mehrdadn 15dbd88c5d Python 3.8 compatibility (#7754) 2020-04-01 10:06:31 -07:00
Simon Mo 1ab98155eb [Release] Revert Enable GCS Server by Default (#7840) 2020-04-01 10:05:07 -07:00
SangBin Cho c23e56ce9a Metrics Export Service (#7809) 2020-03-30 23:28:32 -07:00
fangfengbin bfb9248532 fix gcs server resolver error (#7822)
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-03-30 22:57:55 -07:00
mehrdadn 8958728139 Windows bug fixes (#7740) 2020-03-30 20:39:23 -05:00
Simon Mo dc9b62e007 Deserialize Args in Event Loop Thread (#7806) 2020-03-30 18:28:13 -07:00
mehrdadn f86e623095 Fix & improve GitHub Actions CI builds (#7784) 2020-03-30 16:29:54 -07:00
Sven Mika e356e97eb2 [RLlib] Assert correct policy class being used in Worker. (#7769) 2020-03-30 14:03:29 -07:00
Richard Liaw fbf02fa7f7 [Hotfix] Lint for Documentation (#7817) 2020-03-30 11:49:05 -07:00
Richard Liaw 18327254b6 [docs] Fix readthedocs rendering (#7810) 2020-03-30 11:40:08 -07:00
Richard Liaw 86cff17e7e [tune/raysgd] Tune API for TorchTrainer + Fix State Restoration (#7547) 2020-03-30 12:58:49 -05:00
Edward Oakes 3a53ea60d9 [Serve] Push route table updates to HTTP proxy (#7774) 2020-03-30 09:53:05 -07:00
Tianyi Chen f889f938e5 [streaming] Use enum to define resource type. (#7813) 2020-03-31 00:03:49 +08:00
Philipp Moritz eb61036ba2 Revert "Pyarrow Segfault Regression Test (#7568)" (#7805)
This reverts commit 57599f075c.
2020-03-29 20:59:05 -07:00
ijrsvt 57599f075c Pyarrow Segfault Regression Test (#7568) 2020-03-29 16:15:24 -07:00
Simon Mo 353d7e107f [Serve] Improve Serialization (#7688) 2020-03-29 14:57:19 -07:00
mehrdadn fc23f79f82 Windows process issues (#7739) 2020-03-29 12:48:32 -07:00
fangfengbin 6ce8b63bb6 fix TestTaskLeaseRenewal test failure (#7765) 2020-03-29 11:18:47 +08:00
Edward Oakes d87563937e Revert "[Dashboard] Metrics Export Service. (#7728)" (#7789) 2020-03-28 19:27:34 -07:00
Eric Liang d6255c3395 Fix build breakage due to soft torch import (#7790) 2020-03-28 19:08:31 -07:00
Sven Mika e4bd5db4d8 [RLlib] Minimal ParamNoise PR. (#7772) 2020-03-28 16:16:30 -07:00
Eric Liang 5cebee68d6 [rllib] Add scaling guide to documentation, improve bandit docs (#7780)
* update

* reword

* update

* ms

* multi node sgd

* reorder

* improve bandit docs

* contrib

* update

* ref

* improve refs

* fix build

* add pillow dep

* add pil

* update pil

* pillow

* remove false
2020-03-27 22:05:43 -07:00
Maksim Smolin 7b27ce2b23 [RaySGD] Convert the head worker to a local model (#7746)
Why are these changes needed?

Running a worker on head (locally, not as a Ray actor) allows for easier handling of stateful stuff like logging and for easier debugging.
2020-03-27 20:19:15 -07:00
Richard Liaw 875309fc48 Revert wide docs (#7782) 2020-03-27 17:46:08 -07:00
Richard Liaw e10dc91821 Fix doc build (#7781) 2020-03-27 17:39:38 -07:00
Mitchell Stern 090a8474b0 [Dashboard] Update dependencies and add linting rules (#7779) 2020-03-27 16:53:49 -07:00
Carl Balmer 0cfb6488a7 changed get_agent_class to from get_trainable_cls (#7758) 2020-03-27 12:17:16 -07:00
Simon Mo 838c1e854f Add results from 0.8.3 release (#7745) 2020-03-27 11:14:15 -07:00
SangBin Cho 86e19959a5 [Dashboard] Tune dashboard bug fix (#7766)
* Figured out why Tune was unavailable.

* Minor fix.
2020-03-27 09:02:30 -07:00
Kai Yang 6a3503c494 Fix reusing the cached hash of nil ID (#7753) 2020-03-27 23:40:03 +08:00
SongGuyang c195dc8f88 Basic C++ worker implementation (#6125) 2020-03-27 23:01:08 +08:00
Sven Mika 93b5c38b7d [RLlib] Noisy layers in DQN throw different errors (issue #7635). (#7750)
* Rollback.

* Fix issue 7635.

* Fix issue 7635.

* LINT and bug fix.
2020-03-26 22:08:34 -07:00
Sven Mika 369a3417c4 [RLlib] Add tf-graph by default when doing Policy.export_model(). (#7759)
* Rollback.

* WIP.

* WIP.

* Fix.

* LINT.
2020-03-26 22:07:10 -07:00
SangBin Cho 7a0befb0a7 [Dashboard] Metrics Export Service. (#7728) 2020-03-26 14:03:00 -07:00
hhoke af3a5705ca --redis-address -> --address (#7760)
The exception tells the user to use --redis-address, but that flag is deprecated. This change tells the user to use the current --address instead.
2020-03-26 13:52:39 -07:00
Saurabh Gupta 6ddf84b019 Contextual Bandit algorithms (WIP) (#7642) 2020-03-26 13:41:16 -07:00
Cloud Han c1b05b720d calling register_custom_serializer requires ray to be initialized (#7752) 2020-03-26 10:24:06 -07:00
Sven Mika bcf963a53b [RLlib] Bug default policy overrides torch policy. (#7756)
* Rollback.

* Bug fix!
2020-03-26 10:03:20 -07:00
fangfengbin e196fcdbaf Add gcs_service_enabled function to avoid getting environment variable directly (#7742) 2020-03-26 22:02:53 +08:00
Richard Liaw ca6eabc9cb [tune] Fail Fast (#7528)
* pytest

* init cancel

* testing

* Update python/ray/tune/tests/test_tune_server.py

Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>

* change-test

* Apply suggestions from code review

* Apply suggestions from code review

* finished

* set_finished

* tune

* fix

Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-26 00:04:09 -07:00
hubcity 3d0a8662b3 #7246 - Fixing broken links (#7247)
* #7246 - Fixing broken links

* Apply suggestions from code review

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-25 21:46:13 -07:00
Eric Liang 23b6fdcda1 ray memory should collect statistics from all nodes (#7721) 2020-03-25 16:31:31 -07:00
Stephanie Wang 46404d8a0b [core] Pin lineage of plasma objects that are still in scope (#7690)
* Fix deadlock in DrainAndShutdown

* Revert "[core] Revert lineage pinning (#7499) (#7692)"

This reverts commit ba86a02b37.

* debug rllib

* debug rllib

* turn on all rllib tests again

* debug rllib

* Fix drain bug, check number of pending tasks

* revert rllib debug

* remove todo

* Trigger rllib tests

* revert rllib debug commit
2020-03-25 09:29:32 -07:00
Richard Liaw 82b792be33 [tune] IP Check, Flatten Results for TBX (#7705)
* support_flattened

* loggers

* Format logger changes

Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-25 09:18:03 +00:00
Maksim Smolin e95455b7d7 [RaySGD] Add tqdm logging to TorchTrainer (#7588)
* Update issue templates

* Init fp16

* fp16 and schedulers

* scheduler linking and fp16

* to fp16

* loss scaling and documentation

* more documentation

* add tests, refactor config

* moredocs

* more docs

* fix logo, add test mode, add fp16 flag

* fix tests

* fix scheduler

* fix apex

* improve safety

* fix tests

* fix tests

* remove pin memory default

* rm

* fix

* Update doc/examples/doc_code/raysgd_torch_signatures.py

* fix

* migrate changes from other PR

* ok thanks

* pass

* signatures

* lint

* Update python/ray/experimental/sgd/pytorch/utils.py

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* should address most comments

* comments

* fix this ci

* first_pass

* add overrides

* override

* fixing up operators

* format

* sgd

* constants

* rm

* revert

* Checkpoint the basics

* End of day checkpoint

* Checkpoint log-to-head implementation

* Checkpoint

* Add actor-based batch log reporting, currently segfaults

* Work around progress segfault

* Fix some stuff in quicktorch

* Make things more customizable

* Quality of life fixes

* More quality of life

* Move tqdm logic to training_operator

* Update examples

* Fix some minor bugs

* Fix merge

* Fix small things, add pbar to dcgan

* Run format.sh

* Fix missing epoch number for batch pbar

* Address PR comments

* Fix float is not subscriptable

* Add train_loss to pbar by default

* Isolate tqdm code into a handler system

* Format

* Remove the batch_logs_reporter from distributed runner as well

* Check if the train_loss is available before using it

* Enable tqdm in the dcgan example

* Fix a crash in no-handler trainers

* Fix

* Allow not calling set_reporters for tests

Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-03-24 23:43:56 -07:00
Richard Liaw 54a892bb84 [tune] Cancel Experiment via Client (#7719)
* init cancel

* testing

* Update python/ray/tune/tests/test_tune_server.py

Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>

* Apply suggestions from code review

* Apply suggestions from code review

* finished

* set_finished

Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-24 20:30:12 -07:00
Simon Mo a519b4f2a9 [Serve] Enhancement in HTTP Methods and Multi-route support (#7709) 2020-03-24 20:25:05 -07:00
Stephanie Wang a1cee6af7b Revert "New scheduler local node (#7441)" (#7732)
This reverts commit 6141fdab95.
2020-03-24 18:32:16 -07:00
Xianyang Liu cc0490b55b Several small fixes for function_manager (#7685) 2020-03-24 14:28:15 -07:00