[docs] Make walkthrough and starting Ray materials clear (#7099)

* make starting ray a separate page

* concept

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* more fics

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
This commit is contained in:
Richard Liaw
2020-02-11 23:17:30 -08:00
committed by GitHub
parent 305eaaabe9
commit fc9352c588
10 changed files with 215 additions and 29 deletions
+4
@@ -1,3 +1,5 @@
.. _ref-automatic-cluster:
Automatic Cluster Setup
=======================
@@ -36,6 +38,8 @@ Test that it works by running the following commands from your local machine:
.. tip:: For the AWS node configuration, you can set ``"ImageId: latest_dlami"`` to automatically use the newest `Deep Learning AMI <https://aws.amazon.com/machine-learning/amis/>`_ for your region. For example, ``head_node: {InstanceType: c5.xlarge, ImageId: latest_dlami}``.
.. note:: You may see a message like: ``bash: cannot set terminal process group (-1): Inappropriate ioctl for device bash: no job control in this shell`` This is a harmless error. If the cluster launcher fails, it is most likely due to some other factor.
GCP
~~~
+3 -1
@@ -1,3 +1,5 @@
.. _configuring-ray:
Configuring Ray
===============
@@ -5,7 +7,7 @@ This page discusses the various ways to configure Ray, both from the Python API
and from the command line. Take a look at the ``ray.init`` `documentation
<package-ref.html#ray.init>`__ for a complete overview of the configurations.
.. important:: For the multi-node setting, you must first run `ray start` on the command line before ``ray.init`` in Python. On a single machine, you can run ``ray.init()`` without `ray start`.
.. important:: For the multi-node setting, you must first run ``ray start`` on the command line to start the Ray cluster services on the machine before ``ray.init`` in Python to connect to the cluster services. On a single machine, you can run ``ray.init()`` without ``ray start``, which will both start the Ray cluster services and connect to them.
Cluster Resources
+1
@@ -235,6 +235,7 @@ Getting Involved
:maxdepth: -1
:caption: Ray Core
walkthrough.rst
using-ray.rst
configure.rst
cluster-index.rst
+19
@@ -13,6 +13,9 @@ You can install the latest stable version of Ray as follows.
pip install -U ray # also recommended: ray[debug]
.. _install-nightlies:
Latest Snapshots (Nightlies)
----------------------------
@@ -40,6 +43,22 @@ master branch). To install these wheels, run the following command:
.. _`MacOS Python 3.6`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.9.0.dev0-cp36-cp36m-macosx_10_13_intel.whl
.. _`MacOS Python 3.5`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.9.0.dev0-cp35-cp35m-macosx_10_13_intel.whl
Installing from a specific commit
---------------------------------
You can install the Ray wheels of any particular commit on ``master`` with the following template. You need to specify the commit hash, Ray version, Operating System, and Python version:
.. code-block::
pip install https://ray-wheels.s3-us-west-2.amazonaws.com/master/{COMMIT_HASH}/ray-{RAY_VERSION}-{PYTHON_VERSION}-{PYTHON_VERSION}m-{OS_VERSION}_intel.whl
For example, here are the Ray 0.9.0.dev0 wheels for Python 3.5, MacOS for commit ``a0ba4499ac645c9d3e82e68f3a281e48ad57f873``:
.. code-block::
pip install https://ray-wheels.s3-us-west-2.amazonaws.com/master/a0ba4499ac645c9d3e82e68f3a281e48ad57f873/ray-0.9.0.dev0-cp35-cp35m-macosx_10_13_intel.whl
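As a rough illustration, the template above can also be filled in programmatically. The sketch below is not part of Ray: the function name ``wheel_url`` and its parameter names are hypothetical, and it only reproduces the URL pattern shown above (including the fixed ``_intel`` suffix, so it matches the MacOS-style file name in the example).

```python
# Hypothetical helper, not part of Ray: fills in the wheel URL template above.
WHEEL_URL_TEMPLATE = (
    "https://ray-wheels.s3-us-west-2.amazonaws.com/master/"
    "{commit_hash}/ray-{ray_version}-{python_version}-"
    "{python_version}m-{os_version}_intel.whl"
)


def wheel_url(commit_hash, ray_version, python_version, os_version):
    """Return the wheel URL for one commit on ``master``."""
    return WHEEL_URL_TEMPLATE.format(
        commit_hash=commit_hash,
        ray_version=ray_version,
        python_version=python_version,
        os_version=os_version,
    )


# Same values as the example above: Ray 0.9.0.dev0, Python 3.5, MacOS.
url = wheel_url(
    commit_hash="a0ba4499ac645c9d3e82e68f3a281e48ad57f873",
    ray_version="0.9.0.dev0",
    python_version="cp35",
    os_version="macosx_10_13",
)
print("pip install " + url)
```

The printed command should match the ``pip install`` line in the example above.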
Installing Ray with Anaconda
----------------------------
+52 -13
@@ -3,8 +3,58 @@ Memory Management
This page describes how memory management works in Ray, and how you can set memory quotas to ensure memory-intensive applications run predictably and reliably.
Overview
--------
Summary
-------
You can set memory quotas to ensure your application runs predictably on any Ray cluster configuration. If you're not sure, you can start with a conservative default configuration like the following and see if any limits are hit.
For Ray initialization on a single node, consider setting the following fields:
.. code-block:: python
ray.init(
memory=2000 * 1024 * 1024,
object_store_memory=200 * 1024 * 1024,
driver_object_store_memory=100 * 1024 * 1024)
For Ray usage on a cluster, consider setting the following fields on both the command line and in your Python script:
.. tip:: 200 * 1024 * 1024 bytes is 200 MiB. Use double parentheses to evaluate math in Bash: ``$((200 * 1024 * 1024))``.
.. code-block:: bash
# On the head node
ray start --head --redis-port=6379 \
--object-store-memory=$((200 * 1024 * 1024)) \
--memory=$((200 * 1024 * 1024)) \
--num-cpus=1
# On the worker node
ray start --object-store-memory=$((200 * 1024 * 1024)) \
--memory=$((200 * 1024 * 1024)) \
--num-cpus=1 \
--address=$RAY_HEAD_ADDRESS:6379
.. code-block:: python
# In your Python script connecting to Ray:
ray.init(
address="auto", # or "<hostname>:<port>" if not using the default port
driver_object_store_memory=100 * 1024 * 1024
)
For any custom remote method or actor, you can set requirements as follows:
.. code-block:: python
@ray.remote(
memory=2000 * 1024 * 1024,
)
def memory_heavy_task():  # hypothetical task name
    pass
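The byte counts used in these quotas are easy to mistype, so one option is to compute them with a small helper. This is only a sketch: ``mib`` and ``single_node_quotas`` are hypothetical names, not Ray APIs, and the values simply mirror the single-node configuration shown above.

```python
MIB = 1024 * 1024  # number of bytes in one MiB


def mib(n):
    """Convert a number of MiB to bytes."""
    return n * MIB


# Same values as the single-node ``ray.init`` example above.
single_node_quotas = {
    "memory": mib(2000),
    "object_store_memory": mib(200),
    "driver_object_store_memory": mib(100),
}
print(single_node_quotas)
```

Assuming the ``ray.init`` arguments shown above, the dictionary could then be passed as ``ray.init(**single_node_quotas)``.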
Concept Overview
----------------
There are several ways that Ray applications use memory:
@@ -79,14 +129,3 @@ Object store shared memory
--------------------------
Object store memory is also used to map objects returned by ``ray.get`` calls in shared memory. While an object is mapped in this way (i.e., there is a Python reference to the object), it is pinned and cannot be evicted from the object store. However, Ray does not provide quota management for this kind of shared memory usage.
Summary
-------
You can set memory quotas to ensure your application runs predictably on any Ray cluster configuration. If you're not sure, you can start with a conservative default configuration like the following and see if any limits are hit:
.. code-block:: python
@ray.remote(
memory=2000 * 1024 * 1024,
object_store_memory=200 * 1024 * 1024)
+107
@@ -0,0 +1,107 @@
Starting Ray
============
This page covers how to start Ray on a single machine or on a cluster of machines.
.. contents:: :local:
Installation
------------
Install Ray with ``pip install -U ray``. For the latest wheels (a snapshot of the ``master`` branch), you can use the instructions at :ref:`install-nightlies`.
Starting Ray on a single machine
--------------------------------
You can start Ray by calling ``ray.init()`` in your Python script. This will start the local services that Ray uses to schedule remote tasks and actors and then connect to them. Note that you must initialize Ray before any tasks or actors are called (i.e., ``function.remote()`` will not work until ``ray.init()`` is called).
.. code-block:: python
import ray
ray.init()
To stop or restart Ray, use ``ray.shutdown()``.
.. code-block:: python
import ray
ray.init()
... # ray program
ray.shutdown()
To check if Ray is initialized, you can call ``ray.is_initialized()``:
.. code-block:: python
import ray
ray.init()
assert ray.is_initialized()
ray.shutdown()
assert not ray.is_initialized()
See the `Configuration <configure.html>`__ documentation for the various ways to configure Ray.
Using Ray on a cluster
----------------------
There are two steps needed to use Ray in a distributed setting:
1. You must first start the Ray cluster.
2. You need to add the ``address`` parameter to ``ray.init`` (like ``ray.init(address=...)``). This causes Ray to connect to the existing cluster instead of starting a new one on the local node.
If you have a Ray cluster specification (:ref:`ref-automatic-cluster`), you can launch a multi-node cluster with Ray initialized on each node with ``ray up``. **From your local machine/laptop**:
.. code-block:: bash
ray up cluster.yaml
You can monitor the Ray cluster status with ``ray monitor cluster.yaml`` and ssh into the head node with ``ray attach cluster.yaml``.
Your Python script **only** needs to execute on one machine in the cluster (usually the head node). To connect your program to the Ray cluster, add the following to your Python script:
.. code-block:: python
ray.init(address="auto")
.. note:: Without ``ray.init(address=...)``, your Ray program will only be parallelized across a single machine!
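If the same script needs to run both on a laptop and on a cluster, one possible approach is to choose the ``ray.init`` arguments at startup. The sketch below is an assumption-laden illustration: the environment variable ``MY_APP_ON_RAY_CLUSTER`` and the function ``ray_init_kwargs`` are hypothetical and are not defined by Ray.

```python
import os


def ray_init_kwargs(environ=None):
    """Pick keyword arguments for ``ray.init`` (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # MY_APP_ON_RAY_CLUSTER is a made-up variable for this sketch.
    if environ.get("MY_APP_ON_RAY_CLUSTER") == "1":
        # Connect to the existing cluster instead of starting a new one.
        return {"address": "auto"}
    # No address: ``ray.init()`` starts Ray on the local machine.
    return {}


cluster_kwargs = ray_init_kwargs({"MY_APP_ON_RAY_CLUSTER": "1"})
local_kwargs = ray_init_kwargs({})
print(cluster_kwargs, local_kwargs)
```

In a real script, the result would be passed on as ``ray.init(**ray_init_kwargs())``.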
Manual cluster setup
~~~~~~~~~~~~~~~~~~~~
You can also use the manual cluster setup (:ref:`ref-cluster-setup`) by running initialization commands on each node.
**On the head node**:
.. code-block:: bash
# If the ``--redis-port`` argument is omitted, Ray will choose a port at random.
$ ray start --head --redis-port=6379
The command will print out the address of the Redis server that was started (and some other address information).
**Then on all of the other nodes**, run the following. Make sure to replace ``<address>`` with the value printed by the command on the head node (it should look something like ``123.45.67.89:6379``).
.. code-block:: bash
$ ray start --address=<address>
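When many nodes have to be started by hand, it can help to generate these commands rather than retype them. The following sketch only builds the command strings shown above and does not run them; the function names are hypothetical, and only the ``--head``, ``--redis-port``, and ``--address`` flags from this page are used.

```python
import shlex


def head_start_command(redis_port=6379):
    """Command to run on the head node, as a list of arguments."""
    return ["ray", "start", "--head", "--redis-port={}".format(redis_port)]


def worker_start_command(address):
    """Command to run on every other node; ``address`` is the head address."""
    return ["ray", "start", "--address={}".format(address)]


def as_shell(command):
    """Join the arguments into a single shell-safe string."""
    return " ".join(shlex.quote(part) for part in command)


head_cmd = as_shell(head_start_command())
worker_cmd = as_shell(worker_start_command("123.45.67.89:6379"))
print(head_cmd)
print(worker_cmd)
```

The argument lists could also be handed to a remote-execution tool of your choice instead of being printed.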
Turning off parallelism
-----------------------
.. caution:: This feature is maintained solely to help with debugging, so it's possible you may encounter some issues. If you do, please `file an issue <https://github.com/ray-project/ray/issues>`_.
By default, Ray will parallelize its workload. However, if you need to debug your Ray program, it may be easier to run everything in a single process. You can force all Ray functions to run in a single process by enabling ``local_mode``:
.. code-block:: python
ray.init(local_mode=True)
Note that some behavior such as setting global process variables may not work as expected.
What's next?
------------
Check out our `Deployment section <cluster-index.html>`_ for more information on deploying Ray in different settings, including Kubernetes, YARN, and SLURM.
+3 -1
@@ -105,7 +105,7 @@ Launching a cloud cluster
If you already have a list of nodes, go to the `Local Cluster Setup`_ section.
Ray currently supports AWS and GCP. Below, we will launch nodes on AWS that will default to using the Deep Learning AMI. See the `cluster setup documentation <autoscaling.html>`_. Save the below cluster configuration (``tune-default.yaml``):
Ray currently supports AWS and GCP. Follow the instructions below to launch nodes on AWS (using the Deep Learning AMI). See the `cluster setup documentation <autoscaling.html>`_. Save the below cluster configuration (``tune-default.yaml``):
.. literalinclude:: ../../python/ray/tune/examples/tune-default.yaml
:language: yaml
@@ -119,6 +119,8 @@ Ray currently supports AWS and GCP. Below, we will launch nodes on AWS that will
``ray submit --start`` starts a cluster as specified by the given cluster configuration YAML file, uploads ``tune_script.py`` to the cluster, and runs ``python tune_script.py [args]``.
.. note:: You may see a message like: ``bash: cannot set terminal process group (-1): Inappropriate ioctl for device bash: no job control in this shell`` This is a harmless error. If the cluster launcher fails, it is most likely due to some other factor.
.. code-block:: bash
ray submit tune-default.yaml tune_script.py --start --args="--ray-address=localhost:6379"
+2
@@ -1,3 +1,5 @@
.. _ref-cluster-setup:
Manual Cluster Setup
====================
+2 -2
@@ -3,14 +3,14 @@ Using Ray
If you're brand new to Ray, we recommend starting with our `tutorials <https://github.com/ray-project/tutorial>`_.
Below, you'll find information ranging from beginner material (like our `walkthrough <walkthrough.html>`_) to `advanced usage <advanced.html>`_. There are also detailed instructions on how to work with Ray concepts such as Actors and managing GPUs.
Below, you'll find information ranging from how to `start Ray <starting-ray.html>`_ to `advanced usage <advanced.html>`_. There are also detailed instructions on how to work with Ray concepts such as Actors and managing GPUs.
Finally, we've also included some content on using core Ray APIs with `Tensorflow <using-ray-with-tensorflow.html>`_ and `PyTorch <using-ray-with-pytorch.html>`_.
.. toctree::
:maxdepth: -1
walkthrough.rst
starting-ray.rst
actors.rst
using-ray-with-gpus.rst
serialization.rst
+22 -12
@@ -1,14 +1,24 @@
Walkthrough
===========
Ray Core Walkthrough
====================
This walkthrough gives an overview of the core concepts of Ray:
1. Using remote functions (tasks) [``ray.remote``]
2. Fetching results (object IDs) [``ray.put``, ``ray.get``, ``ray.wait``]
3. Using remote classes (actors) [``ray.remote``]
1. Starting Ray
2. Using remote functions (tasks) [``ray.remote``]
3. Fetching results (object IDs) [``ray.put``, ``ray.get``, ``ray.wait``]
4. Using remote classes (actors) [``ray.remote``]
With Ray, your code will work on a single machine and can be easily scaled to a
large cluster. To run this walkthrough, install Ray with ``pip install -U ray``.
With Ray, your code will work on a single machine and can be easily scaled to a large cluster.
Installation
------------
To run this walkthrough, install Ray with ``pip install -U ray``. For the latest wheels (a snapshot of the ``master`` branch), follow the instructions at :ref:`install-nightlies`.
Starting Ray
------------
You can start Ray on a single machine by adding this to your Python script.
.. code-block:: python
@@ -18,11 +28,11 @@ large cluster. To run this walkthrough, install Ray with ``pip install -U ray``.
# ray.init(address=<cluster-address>) instead.
ray.init()
See the `Configuration <configure.html>`__ documentation for the various ways to
configure Ray. To start a multi-node Ray cluster, see the `cluster setup page
<using-ray-on-a-cluster.html>`__. You can stop ray by calling
``ray.shutdown()``. To check if Ray is initialized, you can call
``ray.is_initialized()``.
...
Ray will then be able to utilize all cores of your machine. Find out how to configure the number of cores Ray will use at :ref:`configuring-ray`.
To start a multi-node Ray cluster, see the `cluster setup page <using-ray-on-a-cluster.html>`__.
Remote functions (Tasks)
------------------------