6812 Commits

Author SHA1 Message Date
Simon Mo e08b5d0cae [Serve] Add a minimal cli (#5854)
* Add a minimal cli

* Integrate serve_cli with ray scripts
2019-10-28 09:51:31 -07:00
Richard Liaw 085a6713a0 [docs] Add documentation for Dynamic Custom Resources (#6000) 2019-10-27 17:58:04 -07:00
Philipp Moritz 80c01617a3 Optimize python task execution (#6024) 2019-10-27 00:43:34 -07:00
mehrdadn e706cb63cc Fix missing double quotes for spaces in paths (#6026) 2019-10-26 20:46:55 -07:00
Stephanie Wang eb41c945a1 Add gRPC endpoint to raylet to expose metrics (#6005) 2019-10-26 16:37:39 -07:00
Philipp Moritz 010270b3dc Cleanup left over shell scripts in build process (#6017) 2019-10-26 15:46:46 -07:00
Eric Liang a0dcb45dc3 [rllib] Fix APEX priorities returning zero all the time (#5980)
* fix

* move example tests to end

* level err

* guard against none

* no trace test

* ignore thumbs

* np

* fix multi node

* fix
2019-10-26 13:23:42 -07:00
Philipp Moritz 0bb922c29f Revert "Use plasma with batched CreateAndSeal implemented (#5864)" (#6022)
This reverts commit 875c84ed63.
2019-10-25 23:02:21 -07:00
Eric Liang a5523466a2 Enable memstore by default (#6003) 2019-10-25 21:59:12 -07:00
Simon Mo f1d2eb5247 Apply shallow-since and sha256 (#6019) 2019-10-25 19:48:04 -07:00
Edward Oakes d4055d70e3 Remove CoreWorkerTaskExecutionInterface (#6009) 2019-10-25 16:33:44 -07:00
Edward Oakes e6141a0b8b Remove UsePush logic from raylet (#6015) 2019-10-25 14:52:19 -07:00
Edward Oakes f8a6ed7832 Spawn processes in background sessions (#6008)
Allows us to properly handle KeyboardInterrupts in interactive python interpreters.
2019-10-25 13:01:35 -07:00
Edward Oakes 1ce521a7f3 Remove task context from python worker (#5987)
Removes duplicated state between the python and C++ workers. Also cleans up the serialization codepaths a bit.
2019-10-25 07:38:33 -07:00
Ujval Misra cf16b2f0c4 Add timesteps and remove ID from progress output (#5999) 2019-10-25 00:48:42 -07:00
Eric Liang 4edae7ea2b Speed up task submissions a bit (#5992) 2019-10-25 00:10:37 -07:00
Edward Oakes 6f27d881bd Fix core worker shutdown errors (#6004) 2019-10-24 22:29:05 -07:00
Edward Oakes 71a2f4c63d fix comment (#6006) 2019-10-24 18:07:49 -07:00
Edward Oakes 436dd936d2 Update profiling numbers (#5989) 2019-10-24 18:02:44 -07:00
Edward Oakes c73fdb7425 Ignore errors in ObjectID.__dealloc__ (#5997) 2019-10-24 16:48:47 -07:00
Edward Oakes c69e9aafdc Update release doc (#5988)
* Update release doc

* Add comment about get_contributors.py
2019-10-24 11:13:37 -07:00
Eric Liang 34fbc7fb4c [rllib] Fix leak of TensorFlow assign operations in DQN/DDPG 2019-10-23 00:28:15 -07:00
Philipp Moritz 09d05bb3fa Reduce actor submission python overhead (#5949) 2019-10-23 00:11:32 -07:00
Danyang Zhuo 875c84ed63 Use plasma with batched CreateAndSeal implemented (#5864) 2019-10-22 21:32:19 -07:00
Edward Oakes 02931e08f3 [core worker] Python core worker task execution (#5783)
Executes tasks via the event loop in the C++ core worker. Also properly handles signals (including KeyboardInterrupt), so ctrl-C in a python interactive shell works now (if connecting to an existing cluster).
2019-10-22 20:15:59 -07:00
Siyuan (Ryans) Zhuang 95241f6686 Fix the incorrect serialization behavior with pickle (#5960) 2019-10-22 18:08:36 -07:00
Philipp Moritz b6e7ed20ce Fix random numbers on linux wheel build (#5975) 2019-10-22 17:52:12 -07:00
Eric Liang f7bda0abad [rllib] Fix rnn shape with multi-dimensional data (#5939)
* fix shape

* add test

* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Richard Liaw 81dd0dfb0a [tune] fix conditional identifier (#5971)
* fix conditional identifier

* fix

* doc
2019-10-22 02:00:49 -07:00
Leo Sklyut 832b5ce1f6 [docs] fix code block display (#5967) 2019-10-22 00:45:38 -07:00
Richard Liaw 252a5d13ed [sgd/tune][minor] more tf ports (#5953) 2019-10-21 16:46:16 -07:00
Mitchell Stern 235dec8aa3 [Dashboard] Remove token authentication from dashboard (#5888) 2019-10-21 12:48:48 -07:00
Richard Liaw 26a724c5e6 [core] Support kwargs and positionals in Ray remote calls (#5606) 2019-10-20 22:40:54 -07:00
Edward Oakes fc56872012 Send active object IDs to the raylet (#5803)
* Send active object IDs to the raylet

* comment

* comments

* dedup

* signed int in config

* comments

* Remove object ID from monitor

* Fix test

* re-add check

* fix cast

* check if core worker

* Add comment

* Reservoir sampling

* Fix lint

* Pointer return

* tmp

* Fix merge

* Initialize object ids properly

* Fix lint
2019-10-20 22:05:28 -07:00
Zhuohan Li f286356e06 [docs] add pages about examples on training language models with fairseq (#5755)
* add pages about examples on training language models with fairseq and ray autoscaler

* better format

* update ray_train.sh

* Move EFS to the autoscaler file

* nits

* add comments to the code & use a new way to implement checkpoint hook

* small bug fix

* polish the doc

* fix formatting

* yaml

* update docs

* fix the bugs and add preprocess.sh

* fix lint

* Reduce batch size & fix lint

* shorttitle
2019-10-20 20:28:16 -07:00
Simon Mo 6b36ef1138 [Serve] Ensure strict traffic splitting (#5929)
* [Serve] Ensure strict traffic splitting

* Fix test
2019-10-20 20:18:14 -07:00
Stephanie Wang bc4a0de4da Fix multiple drivers for named actors and add test (#5956) 2019-10-20 16:04:21 -07:00
Richard Liaw 74852c80cb [docs] Improve more serialization Errors (#5658) 2019-10-20 14:06:00 -07:00
Richard Liaw 91acecc9f9 [tune][minor] gpu warning (#5948)
* gpu

* formaat

* defaults

* format_and_check

* better registration

* fix

* fix

* trial

* foramt

* tune
2019-10-19 17:09:48 -07:00
Philipp Moritz d23696de17 Introduce flag to use pickle for serialization (#5805) 2019-10-18 22:29:36 -07:00
Philipp Moritz 29eee7f970 Forward multiple ports for autoscaler (#5893) 2019-10-18 16:50:46 -07:00
Richard Liaw 48ba484640 [tune] Test TF2.0, TF1.14, TF1.12 Tensorboard support (#5931) 2019-10-18 13:50:42 -07:00
Stephanie Wang 697f765efc Refactor CoreWorker to remove TaskInterface (#5924)
* Remove TaskInterface

* Remove Status return value

* Remove CActorHandle, some return values, TaskSubmitter

* lint

* doc

* doc

* fix build

* lint

* Return Status, guarded by annotation, fail tasks for RECONSTRUCTING actors

* fix

* move annotation

* revert

* Fix core worker test

* nits
2019-10-18 00:03:57 -04:00
Stephanie Wang 3ac8592dcf Remove actor handle IDs (#5889)
* Remove actor handle ID from main ActorHandle constructor

* Set the actor caller ID when calling submit task instead of in the actor handle

* Remove ActorHandle::Fork, remove actor handle ID from protobuf

* Make inner actor handle const, remove new_actor_handles

* Move caller ID into the common task spec, start refactoring raylet

* Some fixes for forking actor handles

* Store ActorHandle state in CoreWorker, only expose actor ID to Python

* Remove some unused fields

* lint

* doc

* fix merge

* Remove ActorHandleID from python/cpp

* doc

* Fix core worker test

* Move actor table subscription to CoreWorker, reset actor handles on actor failure

* lint

* Remove GCS client from direct actor

* fix tests

* Fix

* Fix tests for raylet codepath

* Fix local mode

* Fix multithreaded test

* Fix AsyncSubscribe issue...

* doc

* fix serve

* Revert bazel
2019-10-17 12:36:34 -04:00
Stefan Otte d70abcfd70 Fix typo in examples/centralized_critic.py (#5943)
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Alexander Scammon 4d08d3c188 Add dependencies for dashboard to installation.rst (#5942)
Updating the docs to include pip installing `aiohttp` and `psutil`, both of which the dashboard requires.  Since the whole dashboard section is optional, I thought I'd just add it in the docs rather than make it an explicit requirement of the project.  Tell me if you'd prefer them as requirements in the `setup.py`, though.
2019-10-17 00:39:56 -07:00
Philipp Moritz 32b2907457 Update max resource label and give better error message (#5916) 2019-10-16 22:37:01 -07:00
Peter Schafhalter 6c11b534c8 [Autoscaler] Update AWS Deep Learning AMI to version 24.3 (#5932) 2019-10-16 16:50:54 -07:00
Richard Liaw d52a4983af Update TF documentation (#5918) 2019-10-16 01:31:27 -07:00
Richard Liaw 9f23620412 [tune] tf2.0 mnist example (#5898)
* tfmnistexample

* tfmnist

* add_to_ci

* format

* exampledownlaod

* fix
2019-10-15 22:25:01 -07:00