[Doc] Remove trailing whitespaces (#13390)
@@ -9,10 +9,10 @@ the :ref:`Ray Cluster Launcher<ref-autoscaling>`. However, working with the oper
running Ray locally -- all interactions with your Ray cluster are mediated by Kubernetes.

The operator makes use of a `Kubernetes Custom Resource`_ called a *RayCluster*.
A RayCluster is specified by a configuration similar to the ``yaml`` files used by the Ray Cluster Launcher.
Internally, the operator uses Ray's autoscaler to manage your Ray cluster. However, the autoscaler runs in a
separate operator pod, rather than on the Ray head node. Applying multiple RayCluster custom resources in the operator's
namespace allows the operator to manage several Ray clusters.

The rest of this document explains step-by-step how to use the Ray Kubernetes Operator to launch a Ray cluster on your existing Kubernetes cluster.

@@ -24,9 +24,9 @@ The rest of this document explains step-by-step how to use the Ray Kubernetes Op
:bash:`kubectl version`.

.. note::
    The example commands in this document launch six Kubernetes pods, using a total of 6 CPU and 3.5Gi memory.
    If you are experimenting using a test Kubernetes environment such as `minikube`_, make sure to provision sufficient resources, e.g.
    :bash:`minikube start --cpus=6 --memory=\"4G\"`.
    Alternatively, reduce resource usage by editing the ``yaml`` files referenced in this document; for example, reduce ``minWorkers``
    in ``example_cluster.yaml`` and ``example_cluster2.yaml``.

@@ -47,9 +47,9 @@ First, we need to apply the `Kubernetes Custom Resource Definition`_ (CRD) defin

Picking a Kubernetes Namespace
-------------------------------
The rest of the Kubernetes resources we will use are `namespaced`_.
You can use an existing namespace for your Ray clusters or create a new one if you have permissions.
For this example, we will create a namespace called ``ray``.

.. code-block:: shell

@@ -57,7 +57,7 @@ For this example, we will create a namespace called ``ray``.

 namespace/ray created

Starting the Operator
----------------------

To launch the operator in our namespace, we execute the following command.
@@ -70,9 +70,9 @@ To launch the operator in our namespace, we execute the following command.
 role.rbac.authorization.k8s.io/ray-operator-role created
 rolebinding.rbac.authorization.k8s.io/ray-operator-rolebinding created
 pod/ray-operator-pod created

The output shows that we've launched a Pod named ``ray-operator-pod``. This is the pod that runs the operator process.
The ServiceAccount, Role, and RoleBinding we have created grant the operator pod the `permissions`_ it needs to manage Ray clusters.

Launching Ray Clusters
----------------------
@@ -89,7 +89,7 @@ Our RayCluster configuration specifies ``minWorkers:2`` in the second entry of `

.. note::

  For more details about RayCluster resources, we recommend taking a look at the annotated example ``example_cluster.yaml`` applied in the last command.

.. code-block:: shell

@@ -100,7 +100,7 @@ Our RayCluster configuration specifies ``minWorkers:2`` in the second entry of `
 example-cluster-ray-worker-78kp5 1/1 Running 0 64s
 ray-operator-pod 1/1 Running 0 2m33s

We see four pods: the operator, the Ray head node, and two Ray worker nodes.

Let's launch another cluster in the same namespace, this one specifying ``minWorkers:1``.

@@ -132,7 +132,7 @@ Monitoring
----------
Autoscaling logs are written to the operator pod's ``stdout`` and can be accessed with :code:`kubectl logs`.
Each line of output is prefixed by the name of the cluster followed by a colon.
The following command gets the last hundred lines of autoscaling logs for our second cluster.

.. code-block:: shell

@@ -172,7 +172,7 @@ and apply it again:
To force a restart with the same configuration, you can add an `annotation`_ to the RayCluster resource's ``metadata.labels`` field, e.g.

.. code-block:: yaml

 apiVersion: cluster.ray.io/v1
 kind: RayCluster
 metadata:
@@ -220,7 +220,7 @@ To finish clean-up, we delete the cluster ``example-cluster`` and then the opera
 $ kubectl -n ray delete raycluster example-cluster
 $ kubectl -n ray delete -f ray/python/ray/autoscaler/kubernetes/operator_configs/operator.yaml

If you like, you can delete the RayCluster custom resource definition.
(Using the operator again will then require reapplying the CRD.)

.. code-block:: shell

@@ -8,7 +8,7 @@ Deploying on Kubernetes
This document is mainly for advanced Kubernetes usage. The easiest way to run a Ray cluster on Kubernetes is by using the built-in Cluster Launcher. Please see the :ref:`Cluster Launcher documentation <ray-launch-k8s>` for details.

This document assumes that you have access to a Kubernetes cluster and have
``kubectl`` installed locally and configured to access the cluster. It will
first walk you through how to deploy a Ray cluster on your existing Kubernetes
@@ -156,7 +156,7 @@ and checking that they are restarted by Kubernetes:
  ray-worker-5c49b7cc57-jx2w2 1/1 Running 0 10s

.. _ray-k8s-run:

Running Ray Programs
--------------------

@@ -306,12 +306,12 @@ To use GPUs on Kubernetes, you will need to configure both your Kubernetes setup

For relevant documentation for GPU usage on different clouds, see instructions for `GKE`_, for `EKS`_, and for `AKS`_.

The `Ray Docker Hub <https://hub.docker.com/r/rayproject/>`_ hosts CUDA-based images packaged with Ray for use in Kubernetes pods.
For example, the image ``rayproject/ray-ml:nightly-gpu`` is ideal for running GPU-based ML workloads with the most recent nightly build of Ray.
Read :ref:`here<docker-images>` for further details on Ray images.

Using Nvidia GPUs requires specifying the relevant resource `limits` in the container fields of your Kubernetes configurations.
(Kubernetes `sets <https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#using-device-plugins>`_
the GPU request equal to the limit.) The configuration for a pod running a Ray GPU image and
using one Nvidia GPU looks like this:

@@ -338,11 +338,11 @@ GPU taints and tolerations
~~~~~~~~~~~~~~~~~~~~~~~~~~
.. note::

  Users using a managed Kubernetes service probably don't need to worry about this section.

The `Nvidia gpu plugin`_ for Kubernetes applies `taints`_ to GPU nodes; these taints prevent non-GPU pods from being scheduled on GPU nodes.
Managed Kubernetes services like GKE, EKS, and AKS automatically apply matching `tolerations`_
to pods requesting GPU resources. Tolerations are applied by means of Kubernetes's `ExtendedResourceToleration`_ `admission controller`_.
If this admission controller is not enabled for your Kubernetes cluster, you may need to manually add a GPU toleration to each of your GPU pod configurations. For example,

.. code-block:: yaml
@@ -369,7 +369,7 @@ Read about Kubernetes device plugins `here <https://kubernetes.io/docs/concepts/
about Kubernetes GPU plugins `here <https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus>`__,
and about Nvidia's GPU plugin for Kubernetes `here <https://github.com/NVIDIA/k8s-device-plugin>`__.

If you run into problems setting up GPUs for your Ray cluster on Kubernetes, please reach out to us at `<https://discuss.ray.io>`_.

Questions or Issues?
--------------------

@@ -243,7 +243,7 @@ Most users should pull a Docker image from the `Ray Docker Hub. <https://hub.doc
Image releases are `tagged` using the following format:

.. list-table::
   :widths: 25 50
   :header-rows: 1

@@ -255,13 +255,13 @@ Image releases are `tagged` using the following format:
     - A specific Ray release.
   * - nightly
     - The most recent Ray build (the most recent commit on GitHub ``master``)
   * - Git SHA
     - A specific nightly build (uses a SHA from the GitHub ``master``).

Each tag has `variants` that add or change functionality:

.. list-table::
   :widths: 16 40
   :header-rows: 1

@@ -314,7 +314,7 @@ Start out by launching the deployment container.
  docker run --shm-size=<shm-size> -t -i rayproject/ray

Replace ``<shm-size>`` with a limit appropriate for your system, for example
``512M`` or ``2G``. A good estimate for this is to use roughly 30% of your available memory (this is
what Ray uses internally for its Object Store). The ``-t`` and ``-i`` options here are required to support
interactive use of the container.
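
As a rough illustration of the 30% rule of thumb above, the following sketch derives a candidate ``--shm-size`` value from a machine's total memory. The helper name ``suggest_shm_size`` is hypothetical and not part of Ray or Docker, and the ``os.sysconf`` lookup assumes a Linux host.

```python
import os


def suggest_shm_size(total_bytes: int, fraction: float = 0.3) -> str:
    """Return a size string in whole mebibytes, e.g. "2457M".

    Hypothetical helper for illustration; not part of Ray or Docker.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    mebibytes = int(total_bytes * fraction) // (1024 * 1024)
    return f"{mebibytes}M"


if __name__ == "__main__":
    # Assumption: a Linux host, where these sysconf names are available.
    total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    print(f"docker run --shm-size={suggest_shm_size(total)} -t -i rayproject/ray")
```

Treat the result as a starting point only; a container that shares the host with other workloads may need a smaller value.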
@@ -20,7 +20,7 @@ to a cluster.
Quickstart
----------

To get started, first `install Ray <installation.html>`__, then use
``ray.util.multiprocessing.Pool`` in place of ``multiprocessing.Pool``.
This will start a local Ray cluster the first time you create a ``Pool`` and
distribute your tasks across it. See the `Run on a Cluster`_ section below for
@@ -216,7 +216,7 @@ because they are scheduled on a placement group with the STRICT_PACK strategy.
        # The child task is scheduled with the same placement group as its parent
        # although child.options(placement_group=pg).remote() wasn't called.
        ray.get(child.remote())

    ray.get(parent.options(placement_group=pg).remote())

To avoid this, specify ``options(placement_group=None)`` in the child task or actor remote call.
@@ -14,7 +14,7 @@ the process is stuck on).
.. code-block:: shell

    sudo gdb -batch -ex "thread apply all bt" -p <pid>

Note that you can find the pid of the raylet with ``pgrep raylet``.

Installation
@@ -19,7 +19,7 @@ This runs ``ray.init()`` with default options and exposes the client gRPC port a
From here, another Ray script can access that server from a networked machine with ``ray.util.connect()``:

.. code-block:: python

    import ray
    import ray.util

@@ -32,8 +32,8 @@ From here, another Ray script can access that server from a networked machine wi

    do_work.remote(2)
    #....

When the client disconnects, any object or actor references held by the server on behalf of the client are dropped, as if directly disconnecting from the cluster.


===================
@@ -266,7 +266,7 @@ objects are staying in local memory.
**kill actor**: A button to kill an actor in a cluster. It has the same effect as calling ``ray.kill`` on an actor handle.

**profile**: A button to run profiling. We currently support profiling for 10s,
30s and 60s. It requires passwordless ``sudo``. The result of profiling is a py-spy HTML output displaying how much CPU time the actor spent in various methods.


Memory
@@ -123,7 +123,7 @@ enter. This will result in the following output:
    Enter breakpoint index or press enter to refresh: 0
    > /Users/pcmoritz/tmp/stepping.py(14)<module>()
    -> result_ref = fact.remote(5)
    (Pdb)

You can jump into the call with the ``remote`` command in Ray's debugger.
Inside the function, print the value of `n` with ``p(n)``, resulting in
@@ -148,7 +148,7 @@ the following output:
    11 return n * ray.get(n_id)
    (Pdb) p(n)
    5
    (Pdb)

Now step into the next remote call again with
``remote`` and print `n`. You can now either continue recursing into
@@ -192,7 +192,7 @@ call site and use ``p(result)`` to print the result:
    -> result_ref = fact.remote(5)
    (Pdb) p(result)
    120
    (Pdb)


Post Mortem Debugging
@@ -22,7 +22,7 @@ Let's expose metrics through `ray start`.

    ray start --head --metrics-export-port=8080 # Assign metrics export port on a head node.

Now, you can scrape Ray's metrics using Prometheus.

First, download Prometheus. `Download Link <https://prometheus.io/download/>`_

@@ -119,7 +119,7 @@ This will allow Prometheus to dynamically find endpoints it should scrape (servi

Getting Started (Cluster Launcher)
----------------------------------
When you use a Ray cluster launcher, node IP addresses often change because the cluster is scaling up and down.
In this case, you can use Prometheus' `file based service discovery <https://prometheus.io/docs/guides/file-sd/#installing-configuring-and-running-prometheus>`_.
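
For orientation, Prometheus' file-based service discovery reads a JSON (or YAML) file holding a list of target groups, each with a ``targets`` list and optional ``labels``. The sketch below only illustrates that file format. Ray writes its own discovery file, and the helper name, addresses, and label used here are hypothetical.

```python
import json


def build_file_sd(addresses, labels=None):
    """Build the list-of-target-groups structure that Prometheus file_sd reads.

    Hypothetical helper: it mirrors the file format only, not Ray's own writer.
    """
    group = {"targets": sorted(addresses)}
    if labels:
        group["labels"] = dict(labels)
    return [group]


if __name__ == "__main__":
    # Example addresses are made up; real ones change as the cluster scales.
    config = build_file_sd(["10.0.0.2:8080", "10.0.0.1:8080"], {"job": "ray"})
    print(json.dumps(config, indent=2))
```

Because Prometheus re-reads the file when it changes, targets can come and go without restarting Prometheus.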

Prometheus Service Discovery Support
@@ -135,8 +135,8 @@ Ray periodically updates the addresses of all metrics agents in a cluster to thi
Now, modify a Prometheus config to scrape the file for service discovery.

.. code-block:: yaml

    # Prometheus config file

    # my global config
    global:
@@ -11,13 +11,13 @@ Setting up a dataset

A dataset can be constructed via any Python iterable, or a ``ParallelIterator``. Optionally, a batch size, download function, concurrency, and a transformation can also be specified.

When constructing a dataset, a download function can be specified. For example, if a dataset is initialized with a set of paths, a download function can be specified which converts those paths to ``(input, label)`` tuples. The download function can be executed in parallel via ``max_concurrency``. This may be useful if the backing datastore has rate limits, there is high overhead associated with a download, or downloading is computationally expensive. Downloaded data is stored as objects in the plasma store.

An additional, final transformation can be specified via ``Dataset::transform``. This function is guaranteed to take place on the same worker that training will take place on. It is good practice to do operations which produce large outputs, such as converting images to tensors, as transformations.

Finally, the batch size can be specified. The batch size is the number of data points used per training step per worker.

.. note:: Batch size should be specified via the dataset's constructor, **not** the ``config["batch_size"]`` passed into the Trainer constructor. In general, datasets are configured via their own constructor, not the Trainer config, wherever possible.

Using a dataset
---------------
@@ -33,11 +33,11 @@ To use a dataset, pass it in as an argument to ``trainer.train()``. A dataset pa
Sharding and Sampling
---------------------

.. note:: These details may change in the future.

Datasets use ParallelIterator actors for sharding. In order to handle datasets which do not shard evenly, and streaming datasets (which may not have a defined size), shards are represented as repeated sequences of data. As a result, ``num_steps`` should always be specified when training, and some data may be oversampled if the data cannot be evenly sharded.

If the dataset is of a known length (and can be evenly sharded), training for an epoch is equivalent to setting ``num_steps = len(data) / (num_workers * batch_size)``.
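
As a worked example of the formula above, the sketch below computes the number of steps in one epoch. The function name ``steps_per_epoch`` and the sample numbers are illustrative assumptions, not part of Ray's API.

```python
def steps_per_epoch(num_items: int, num_workers: int, batch_size: int) -> int:
    """num_steps = len(data) / (num_workers * batch_size), for evenly sharded data.

    Hypothetical helper for illustration only.
    """
    if num_workers <= 0 or batch_size <= 0:
        raise ValueError("num_workers and batch_size must be positive")
    per_step = num_workers * batch_size
    if num_items % per_step != 0:
        # Uneven shards mean some data is oversampled; pick num_steps explicitly.
        raise ValueError("data does not shard evenly; choose num_steps explicitly")
    return num_items // per_step


if __name__ == "__main__":
    # 10,000 items, 4 workers, 50 items per worker per step: 200 items per step.
    print(steps_per_epoch(10_000, 4, 50))  # prints 50
```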

Complete dataset example
------------------------

@@ -361,7 +361,7 @@ The following simple example will make the usage clear:
The `reconfigure` method is called when the class is created if `user_config`
is set. In particular, it's also called when new replicas are created in the
future, in case you decide to scale up your backend later. The
`reconfigure` method is also called each time `user_config` is updated via
:mod:`client.update_backend_config <ray.serve.api.Client.update_backend_config>`.

Dependency Management
@@ -33,7 +33,7 @@ Ray Serve can be used in two primary ways to deploy your models at scale:
Chat with Ray Serve users and developers on our `community Slack <https://forms.gle/9TSdDYUgxYs8SA9e8>`_ in the #serve channel and on our `forum <https://discuss.ray.io/>`_!

.. note::
  Starting with Ray version 1.2.0, Ray Serve backends take in a Starlette Request object instead of a Flask Request object.
  See the `migration guide <https://docs.google.com/document/d/1CG4y5WTTc4G_MRQGyjnb_eZ7GK3G9dUX6TNLKLnKRAc/edit?usp=sharing>`_ for details.

Ray Serve Quickstart
@@ -98,7 +98,7 @@ or head over to the :doc:`tutorials/index` to get started building your Ray Serv
For more, see the following blog posts about Ray Serve:

- `How to Scale Up Your FastAPI Application Using Ray Serve <https://medium.com/distributed-computing-with-ray/how-to-scale-up-your-fastapi-application-using-ray-serve-c9a7b69e786>`_ by Archit Kulkarni
- `Machine Learning Serving is Broken <https://medium.com/distributed-computing-with-ray/machine-learning-serving-is-broken-f59aff2d607f>`_ by Simon Mo
- `The Simplest Way to Serve your NLP Model in Production with Pure Python <https://medium.com/distributed-computing-with-ray/the-simplest-way-to-serve-your-nlp-model-in-production-with-pure-python-d42b6a97ad55>`_ by Edward Oakes and Bill Chambers

@@ -19,8 +19,8 @@ Backends
Backends define the implementation of your business logic or models that will handle requests when queries come in to :ref:`serve-endpoint`.
In order to support seamless scalability, backends can have many replicas, which are individual processes running in the Ray cluster to handle requests.
To define a backend, first you must define the "handler" or the business logic you'd like to respond with.
The handler should take as input a `Starlette Request object <https://www.starlette.io/requests/>`_ and return any JSON-serializable object as output. For a more customizable response type, the handler may return a
`Starlette Response object <https://www.starlette.io/responses/>`_.

A backend is defined using :mod:`client.create_backend <ray.serve.api.Client.create_backend>`, and the implementation can be defined as either a function or a class.
Use a function when your response is stateless and a class when you might need to maintain some state (like a model).
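
The difference between a function handler and a class handler can be sketched without a running cluster. In the snippet below, ``FakeRequest`` is a hypothetical stand-in that exposes only ``query_params`` (an attribute that Starlette ``Request`` objects also provide), the handler names are made up, and the class is assumed to be invoked through ``__call__``. Registering the handlers with ``client.create_backend`` is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a Starlette Request (query_params only)."""
    query_params: dict = field(default_factory=dict)


def echo(request):
    """Stateless function handler: returns a JSON-serializable dict."""
    return {"echo": request.query_params.get("msg", "")}


class Counter:
    """Class handler: keeps state between requests (in practice, e.g. a model)."""

    def __init__(self):
        self.count = 0

    def __call__(self, request):
        self.count += 1
        return {"count": self.count}


if __name__ == "__main__":
    print(echo(FakeRequest({"msg": "hello"})))
    counter = Counter()
    counter(FakeRequest())
    print(counter(FakeRequest()))
```

The function has nothing to remember between calls, while the class instance carries its counter across requests, which is the same reason a loaded model would live on a class.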
@@ -37,7 +37,7 @@ Now you can query your web server, for example by running the following in anoth
.. code-block:: bash

  curl "http://127.0.0.1:8000/generate?query=Hello%20friend%2C%20how"

The terminal should then print the generated text:

.. code-block:: bash
@@ -66,7 +66,7 @@ Here's how to run this example:

1. Run ``ray start --head`` to start a local Ray cluster in the background.

2. In the directory where the example files are saved, run ``python deploy_serve.py`` to deploy our Ray Serve endpoint.

.. note::
  Because we have omitted the keyword argument ``route`` in ``client.create_endpoint()``, our endpoint will not be exposed over HTTP by Ray Serve.
@@ -292,9 +292,9 @@ Cancelling tasks

      from ray.exceptions import TaskCancelledError

      try:
          ray.get(obj_ref)
      except TaskCancelledError:
          print("Object reference was cancelled.")

  .. group-tab:: Java
