Recreate actors when local schedulers die. (#804)

* Reconstruct actor state when local schedulers fail.

* Simplify construction of arguments to pass into default_worker.py from local scheduler.

* Remove deprecated ray.actor.

* Simplify actor reconstruction method.

* Fix linting.

* Small fixes.
Robert Nishihara
2017-08-02 18:02:52 -07:00
committed by Philipp Moritz
parent 37282330c0
commit cb84972f6b
13 changed files with 441 additions and 79 deletions
+5 -2
@@ -183,7 +183,7 @@ def select_local_scheduler(driver_id, local_schedulers, num_gpus,
 def publish_actor_creation(actor_id, driver_id, local_scheduler_id,
-                           redis_client):
+                           reconstruct, redis_client):
     """Publish a notification that an actor should be created.
     This broadcast will be received by all of the local schedulers. The local
@@ -197,11 +197,14 @@ def publish_actor_creation(actor_id, driver_id, local_scheduler_id,
         driver_id: The ID of the driver responsible for the actor.
         local_scheduler_id: The ID of the local scheduler that is suposed to
             create the actor.
+        reconstruct: True if the actor should be created in "reconstruct" mode.
         redis_client: The client used to interact with Redis.
     """
+    reconstruct_bit = b"1" if reconstruct else b"0"
     # Really we should encode this message as a flatbuffer object. However,
     # we're having trouble getting that to work. It almost works, but in Python
     # 2.7, builder.CreateString fails on byte strings that contain characters
     # outside range(128).
     redis_client.publish("actor_notifications",
-                         actor_id + driver_id + local_scheduler_id)
+                         actor_id + driver_id + local_scheduler_id +
+                         reconstruct_bit)
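
Note on the message layout: the hunk above appends a single byte, b"1" or b"0", to the concatenated IDs, so a receiver can only split the payload if every ID has a known fixed length. The sketch below illustrates that round trip. It is not code from this commit: ID_SIZE = 20 is an assumption made for illustration (the diff does not state the ID length), and decode_actor_notification is a hypothetical helper name.

```python
# Assumption: every ID is a fixed-length byte string. 20 bytes is used here
# for illustration only; the diff does not state the ID length.
ID_SIZE = 20


def encode_actor_notification(actor_id, driver_id, local_scheduler_id,
                              reconstruct):
    """Build the payload with the same concatenation the patch uses."""
    for value in (actor_id, driver_id, local_scheduler_id):
        if len(value) != ID_SIZE:
            raise ValueError("expected a %d-byte ID" % ID_SIZE)
    reconstruct_bit = b"1" if reconstruct else b"0"
    return actor_id + driver_id + local_scheduler_id + reconstruct_bit


def decode_actor_notification(payload):
    """Hypothetical receiver-side helper: split the payload into fields."""
    if len(payload) != 3 * ID_SIZE + 1:
        raise ValueError("unexpected payload length: %d" % len(payload))
    actor_id = payload[:ID_SIZE]
    driver_id = payload[ID_SIZE:2 * ID_SIZE]
    local_scheduler_id = payload[2 * ID_SIZE:3 * ID_SIZE]
    # The trailing byte is the flag added by this commit.
    reconstruct = payload[3 * ID_SIZE:] == b"1"
    return actor_id, driver_id, local_scheduler_id, reconstruct
```

Because the flag sits at a fixed offset after the three IDs, bytes outside range(128) inside the IDs do not affect parsing, which sidesteps the flatbuffer string problem described in the code comment.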