diff --git a/doc/source/serve/advanced.rst b/doc/source/serve/advanced.rst
index 8b107b8b3..64177b7a7 100644
--- a/doc/source/serve/advanced.rst
+++ b/doc/source/serve/advanced.rst
@@ -16,7 +16,7 @@ the properties of a particular backend.
 Scaling Out
 ===========
 
-To scale out a backend to multiple workers, simplify configure the number of replicas.
+To scale out a backend to multiple workers, simply configure the number of replicas.
 
 .. code-block:: python
 
@@ -32,7 +32,7 @@ This will scale up or down the number of workers that can accept requests.
 Using Resources (CPUs, GPUs)
 ============================
 
-To assign hardware resource per worker, you can pass resource requirements to
+To assign hardware resources per worker, you can pass resource requirements to
 ``ray_actor_options``. To learn about options to pass in, take a look at
 :ref:`Resources with Actor<actor-resource-guide>` guide.
 
@@ -173,7 +173,7 @@ Session Affinity
 ----------------
 
-Splitting traffic randomly among backends for each request is is general and simple, but it can be an issue when you want to ensure that a given user or client is served by the same backend repeatedly.
-To address this, Serve offers a "shard key" can be specified for each request that will deterministically map to a backend.
+Splitting traffic randomly among backends for each request is general and simple, but it can be an issue when you want to ensure that a given user or client is served by the same backend repeatedly.
+To address this, a "shard key" can be specified for each request that will deterministically map to a backend.
 In practice, this should be something that uniquely identifies the entity that you want to consistently map, like a client ID or session ID.
 The shard key can either be specified via the X-SERVE-SHARD-KEY HTTP header or :mod:`handle.options(shard_key="key") <ray.serve.handle.RayServeHandle.options>`.
 
diff --git a/doc/source/serve/architecture.rst b/doc/source/serve/architecture.rst
index ac078fe63..ac98d5c68 100644
--- a/doc/source/serve/architecture.rst
+++ b/doc/source/serve/architecture.rst
@@ -52,7 +52,7 @@ FAQ
 How does Serve handle fault tolerance?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Application errors like exceptions in your model evaluation code is catched and
+Application errors like exceptions in your model evaluation code are caught and
 wrapped. A 500 status code will be returned with the traceback information. The
 worker replica will be able to continue to handle requests.
 
diff --git a/doc/source/serve/deployment.rst b/doc/source/serve/deployment.rst
index 8c0cf672e..da8068213 100644
--- a/doc/source/serve/deployment.rst
+++ b/doc/source/serve/deployment.rst
@@ -110,7 +110,6 @@ these two models.
 While this is a simple operation, you may want to see :ref:`serve-split-traffic` for more information.
 One thing you may want to consider as well is
 :ref:`session-affinity` which gives you the ability to ensure that queries from users/clients always get mapped to the same backend.
-versions.
 
 Now that we're up and running serving two models in production, let's query
 our results several times to see some results. You'll notice that we're now splitting
diff --git a/doc/source/serve/tutorials/batch.rst b/doc/source/serve/tutorials/batch.rst
index d8fc13a3d..1183a22c3 100644
--- a/doc/source/serve/tutorials/batch.rst
+++ b/doc/source/serve/tutorials/batch.rst
@@ -4,7 +4,7 @@ Batching Tutorial
 =================
 
 In this guide, we will deploy a simple vectorized adder that takes
-a batch of queries and add them at once. In particular, we show:
+a batch of queries and adds them at once. In particular, we show:
 
-- How to implement and deploy Ray Serve model that accepts batches.
+- How to implement and deploy a Ray Serve model that accepts batches.
 - How to configure the batch size.
@@ -60,7 +60,7 @@ the input value, convert them into an array, and use NumPy to add 1 to each elem
 
 Let's deploy it. Note that in the ``config`` section of ``create_backend``, we
 are specifying the maximum batch size via ``config={"max_batch_size": 4}``. This
-configuration option limits the maximum possible batch size send to the backend.
+configuration option limits the maximum possible batch size sent to the backend.
 
 .. note::
     Ray Serve performs *opportunistic batching*. When a worker is free to evaluate
diff --git a/doc/source/serve/tutorials/pytorch.rst b/doc/source/serve/tutorials/pytorch.rst
index 214637557..62c7b2f2a 100644
--- a/doc/source/serve/tutorials/pytorch.rst
+++ b/doc/source/serve/tutorials/pytorch.rst
@@ -12,7 +12,7 @@ In particular, we show:
 Please see the :doc:`../key-concepts` to learn more general information about Ray Serve.
 
 This tutorial requires Pytorch and Torchvision installed in your system. Ray Serve
-is framework agnostic and work with any version of PyTorch.
+is framework agnostic and works with any version of PyTorch.
 
 .. code-block:: bash
 
diff --git a/doc/source/serve/tutorials/tensorflow.rst b/doc/source/serve/tutorials/tensorflow.rst
index 73bc577dc..4ce9a2278 100644
--- a/doc/source/serve/tutorials/tensorflow.rst
+++ b/doc/source/serve/tutorials/tensorflow.rst
@@ -11,7 +11,7 @@ In particular, we show:
 
 Please see the :doc:`../key-concepts` to learn more general information about Ray Serve.
 
-Ray Serve is framework agnostic you can use any version of Tensorflow.
+Ray Serve is framework agnostic -- you can use any version of Tensorflow.
 However, for this tutorial, we use Tensorflow 2 and Keras. Please make sure you have
 Tensorflow 2 installed.
 
diff --git a/python/ray/serve/api.py b/python/ray/serve/api.py
index 6bf5140e3..f469e6b69 100644
--- a/python/ray/serve/api.py
+++ b/python/ray/serve/api.py
@@ -358,7 +358,7 @@ def start(detached: bool = False,
             this to "0.0.0.0". One HTTP server will be started on each node in
             the Ray cluster.
         http_port (int): Port for HTTP server. Defaults to 8000.
-        http_middleswares (list): A list of Starlette middlewares that will be
+        http_middlewares (list): A list of Starlette middlewares that will be
             applied to the HTTP servers in the cluster.
     """
     # Initialize ray if needed.
diff --git a/python/ray/serve/examples/doc/quickstart_class.py b/python/ray/serve/examples/doc/quickstart_class.py
index 890aed094..7a8c3d7ff 100644
--- a/python/ray/serve/examples/doc/quickstart_class.py
+++ b/python/ray/serve/examples/doc/quickstart_class.py
@@ -18,4 +18,4 @@ client.create_backend("counter", Counter)
 client.create_endpoint("counter", backend="counter", route="/counter")
 
 requests.get("http://127.0.0.1:8000/counter").json()
-# > {"current_counter": self.count}
+# > {"current_counter": 0}