[tune] Contributor Guide and Design Page (#4716)

* Move setup script out
* some changes
* Finished Contributor guide
* some comments to the design
* move
* Apply suggestions from code review

  Co-Authored-By: richardliaw <rliaw@berkeley.edu>
* sourcecode
* comments
2026-07-02 10:46:13 +08:00 · 2019-05-05 00:04:13 -07:00
parent d81e71e297
commit f2faf5ce75
7 changed files with 196 additions and 1 deletions
@@ -84,7 +84,9 @@ Ray comes with libraries that accelerate deep learning and reinforcement learnin
   tune-schedulers.rst
   tune-searchalg.rst
   tune-package-ref.rst
+   tune-design.rst
   tune-examples.rst
+   tune-contrib.rst

 .. toctree::
   :maxdepth: 1
@@ -4,7 +4,7 @@ RLlib Development
 Development Install
 -------------------

-You can develop RLlib locally without needing to compile Ray by using the `setup-rllib-dev.py <https://github.com/ray-project/ray/blob/master/python/ray/rllib/setup-rllib-dev.py>`__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed.)
+You can develop RLlib locally without needing to compile Ray by using the `setup-dev.py <https://github.com/ray-project/ray/blob/master/python/ray/setup-dev.py>`__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed.)

 API Stability
 -------------
@@ -0,0 +1,112 @@
+Contributing to Tune
+====================
+
+We welcome (and encourage!) all forms of contributions to Tune, including but not limited to:
+
+- Code reviewing of patches and PRs.
+- Pushing patches.
+- Documentation and examples.
+- Community participation in forums and issues.
+- Improvements to code readability and code comments.
+- Test cases to make the codebase more robust.
+- Tutorials, blog posts, talks that promote the project.
+
+
+Setting up a development environment
+------------------------------------
+
+If you have Ray installed via pip (``pip install -U ray``), you can develop Tune locally without needing to compile Ray.
+
+
+First, you will need your own `fork <https://help.github.com/en/articles/fork-a-repo>`__ to work on the code. Press the Fork button on the `ray project page <https://github.com/ray-project/ray/>`__.
+Then, clone the project to your machine and connect your repository to the upstream (main project) ray repository.
+
+.. code-block:: shell
+
+    git clone https://github.com/[your username]/ray.git [path to ray directory]
+    cd [path to ray directory]
+    git remote add upstream https://github.com/ray-project/ray.git
+
+
+Then, run the ``[path to ray directory]/python/ray/setup-dev.py`` script (`also available on Github <https://github.com/ray-project/ray/blob/master/python/ray/setup-dev.py>`__).
+This sets up links between the ``tune`` dir (among other directories) in your local repo and the one bundled with the ``ray`` package.
+
+When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed.)
+
+
+What can I work on?
+-------------------
+
+We use Github to track issues, feature requests, and bugs. Take a look at the
+ones labeled `"good first issue" <https://github.com/ray-project/ray/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22>`__ and `"help wanted" <https://github.com/ray-project/ray/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22>`__ for a place to start. Look for issues with "[tune]" in the title.
+
+.. note::
+
+  If raising a new issue or PR related to Tune, be sure to include "[tune]" in the beginning of the title.
+
+For project organization, Tune maintains a relatively up-to-date organization of
+issues on the `Tune Github Project Board <https://github.com/ray-project/ray/projects/4>`__.
+Here, you can track and identify how issues are organized.
+
+
+Submitting and Merging a Contribution
+-------------------------------------
+
+There are a few steps to merge a contribution.
+
+1. First rebase your development branch on the most recent version of master.
+
+   .. code:: bash
+
+     # Skip the next line if the upstream remote was already added during setup.
+     git remote add upstream https://github.com/ray-project/ray.git
+     git fetch upstream
+     git rebase upstream/master
+
+2. Make sure all existing tests `pass <tune-contrib.html#testing>`__.
+3. If introducing a new feature or patching a bug, be sure to add new test cases
+   in the relevant file in ``tune/tests/``.
+4. Document the code. Public functions need to be documented, and remember to provide a usage
+   example if applicable.
+5. Request code reviews from other contributors and address their comments. One fast way to get reviews is
+   to help review others' code so that they return the favor. You should aim to improve the code as much as
+   possible before the review. We highly value patches that can get in without extensive reviews.
+6. Reviewers will approve and merge the pull request; be sure to ping them if
+   the pull request is getting stale.
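
As a rough illustration of step 3, a new test case might look like the sketch below. The ``clamp`` helper and the ``ClampTest`` class are hypothetical names invented for this example; they are not part of Tune and do not reflect the layout of the actual files under ``tune/tests/``.

```python
import unittest


def clamp(value, lower, upper):
    """Hypothetical helper under test; stands in for the code you changed."""
    return max(lower, min(value, upper))


class ClampTest(unittest.TestCase):
    """Covers both the common case and the edge cases of the helper."""

    def testValueInsideRange(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def testValueOutsideRange(self):
        self.assertEqual(clamp(-1, 0, 10), 0)
        self.assertEqual(clamp(11, 0, 10), 10)
```

A test class written this way can be collected by ``pytest``, so it runs together with the existing tests.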
+
+
+Testing
+-------
+
+Even though we have hooks to run unit tests automatically for each pull request,
+we recommend that you run unit tests locally beforehand to reduce reviewers’
+burden and speed up the review process.
+
+
+.. code-block:: shell
+
+    pytest ray/python/ray/tune/tests/
+
+Docstrings should be written in `Google style <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`__ format.
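
For reference, a Google style docstring with a usage example might look like the following sketch. The function ``scale_learning_rate`` is a hypothetical example written for this guide and does not exist in Tune.

```python
def scale_learning_rate(base_lr, num_workers, max_lr=1.0):
    """Scales a learning rate linearly with the number of workers.

    Args:
        base_lr (float): Learning rate used for a single worker.
        num_workers (int): Number of parallel workers. Must be positive.
        max_lr (float): Upper bound on the returned learning rate.

    Returns:
        float: ``base_lr * num_workers``, capped at ``max_lr``.

    Raises:
        ValueError: If ``num_workers`` is not positive.

    Example:
        >>> scale_learning_rate(0.01, 4)
        0.04
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return min(base_lr * num_workers, max_lr)
```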
+
+We also have tests for code formatting and linting that need to pass before merge.
+Install ``yapf==0.23``, ``flake8``, and ``flake8-quotes``. You can run the following locally:
+
+.. code-block:: shell
+
+    ray/scripts/format.sh
+
+
+Becoming a Reviewer
+-------------------
+
+We identify reviewers from active contributors. Reviewers are individuals who
+not only actively contribute to the project but are also willing
+to participate in the code review of new contributions.
+A pull request to the project has to be reviewed by at least one reviewer in order to be merged.
+There is currently no formal process, but active contributors to Tune will be
+solicited by current reviewers.
+
+
+.. note::
+
+    These tips are based on the TVM `contributor guide <https://github.com/dmlc/tvm>`__.
@@ -0,0 +1,75 @@
+Tune Design Guide
+=================
+
+In this part of the documentation, we give an overview of the design and architecture
+of Tune.
+
+.. image:: images/tune-arch.png
+
+The blue boxes refer to internal components, and green boxes are public-facing.
+Please refer to the package reference for `user-facing APIs <tune-package-ref.html>`__.
+
+Main Components
+---------------
+
+Tune's main components consist of TrialRunner, Trial objects, TrialExecutor, SearchAlg, TrialScheduler, and Trainable.
+
+TrialRunner
+~~~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial_runner.py>`__]
+This is the main driver of the training loop. This component
+uses the TrialScheduler to prioritize and execute trials,
+queries the SearchAlgorithm for new
+configurations to evaluate, and handles the fault tolerance logic.
+
+**Fault Tolerance**: The TrialRunner executes checkpointing if ``checkpoint_freq``
+is set, along with automatic trial restarting in case of trial failures (if ``max_failures`` is set).
+For example, if a node is lost while a trial (specifically, the corresponding
+Trainable of the trial) is still executing on that node and checkpointing
+is enabled, the trial will then be reverted to a ``"PENDING"`` state and resumed
+from the last available checkpoint the next time it is run.
+The TrialRunner is also in charge of checkpointing the entire experiment execution state
+upon each loop iteration. This allows users to restart their experiment
+in case of machine failure.
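
The trial-level recovery described above can be pictured with the toy model below. It is only a sketch and not Tune's implementation: ``ToyTrial``, ``start``, ``step``, and ``handle_failure`` are hypothetical names, and the exact retry semantics of ``max_failures`` shown here are an assumption made for illustration.

```python
class ToyTrial:
    """Hypothetical stand-in for a trial; not Tune's Trial class."""

    def __init__(self, max_failures):
        self.status = "PENDING"
        self.iteration = 0       # progress of the live training process
        self.checkpoint = None   # last saved iteration, if any
        self.num_failures = 0
        self.max_failures = max_failures


def start(trial):
    """Starts or resumes a pending trial from its last checkpoint."""
    trial.iteration = trial.checkpoint or 0
    trial.status = "RUNNING"


def step(trial, checkpoint_freq):
    """Runs one iteration and checkpoints every checkpoint_freq iterations."""
    trial.iteration += 1
    if checkpoint_freq and trial.iteration % checkpoint_freq == 0:
        trial.checkpoint = trial.iteration


def handle_failure(trial):
    """Requeues the trial if retries remain (assumed semantics), else errors."""
    trial.num_failures += 1
    if trial.num_failures <= trial.max_failures:
        trial.status = "PENDING"
    else:
        trial.status = "ERRORED"


trial = ToyTrial(max_failures=1)
start(trial)
for _ in range(5):
    step(trial, checkpoint_freq=2)  # checkpoints at iterations 2 and 4
handle_failure(trial)               # e.g. the node running the trial is lost
print(trial.status)                 # PENDING
start(trial)
print(trial.iteration)              # 4: work after the last checkpoint is redone
```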
+
+Trial objects
+~~~~~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial.py>`__]
+This is an internal data structure that contains metadata about each training run. Each Trial
+object is mapped one-to-one with a Trainable object, but Trial objects are not themselves
+distributed/remote. Trial objects transition among
+the following states: ``"PENDING"``, ``"RUNNING"``, ``"PAUSED"``, ``"ERRORED"``, and
+``"TERMINATED"``.
+
+TrialExecutor
+~~~~~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial_executor.py>`__]
+The TrialExecutor is a component that interacts with the underlying execution framework.
+It also manages resources to ensure the cluster isn't overloaded. By default, the TrialExecutor uses Ray to execute trials.
+
+SearchAlg
+~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/tree/master/python/ray/tune/suggest>`__] The SearchAlgorithm is a user-provided object
+that is used for querying new hyperparameter configurations to evaluate.
+
+SearchAlgorithms will be notified every time a trial finishes
+executing one training step (of ``train()``), every time a trial
+errors, and every time a trial completes.
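
To make the notification flow concrete, here is a minimal sketch of a random search that receives the three kinds of notifications listed above. The class ``ToyRandomSearch`` and its method names are hypothetical and chosen for illustration; refer to the linked source code for the actual SearchAlgorithm interface.

```python
import random


class ToyRandomSearch:
    """Hypothetical search algorithm; method names are illustrative only."""

    def __init__(self, num_samples, seed=0):
        self._rng = random.Random(seed)
        self._remaining = num_samples
        self.results = {}       # trial_id -> latest reported loss
        self.errored = set()    # trial_ids that raised an error
        self.completed = set()  # trial_ids that finished

    def next_config(self):
        """Returns a new configuration, or None once the budget is spent."""
        if self._remaining == 0:
            return None
        self._remaining -= 1
        # Sample the learning rate log-uniformly between 1e-4 and 1e-1.
        return {"lr": 10 ** self._rng.uniform(-4, -1)}

    def on_trial_result(self, trial_id, result):
        """Called after each training step of a trial."""
        self.results[trial_id] = result["loss"]

    def on_trial_error(self, trial_id):
        """Called when a trial errors."""
        self.errored.add(trial_id)

    def on_trial_complete(self, trial_id):
        """Called when a trial completes."""
        self.completed.add(trial_id)


search = ToyRandomSearch(num_samples=2)
config = search.next_config()
search.on_trial_result("trial_0", {"loss": 0.5})
search.on_trial_complete("trial_0")
```

A model-based search algorithm would use the values collected in ``on_trial_result`` to bias later calls to ``next_config``; this sketch only records them.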
+
+TrialScheduler
+~~~~~~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/schedulers>`__] TrialSchedulers operate over a set of possible trials to run,
+prioritizing trial execution given available cluster resources.
+
+TrialSchedulers can kill or pause trials,
+and can also reorder/prioritize incoming trials.
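
The sketch below shows one way such decisions could be expressed. ``ToyThresholdScheduler``, its method names, and the ``priority`` field are hypothetical and are not Tune's scheduler API; a pause decision could be added in the same way as the stop decision.

```python
CONTINUE, STOP = "CONTINUE", "STOP"


class ToyThresholdScheduler:
    """Hypothetical scheduler: stops trials whose loss stays too high."""

    def __init__(self, grace_period, max_loss):
        self.grace_period = grace_period
        self.max_loss = max_loss

    def on_trial_result(self, trial_id, result):
        """Decides what happens to a trial after one reported result."""
        past_grace = result["iteration"] >= self.grace_period
        if past_grace and result["loss"] > self.max_loss:
            return STOP
        return CONTINUE

    def choose_trial_to_run(self, pending):
        """Picks the pending trial with the highest priority, if any."""
        if not pending:
            return None
        return max(pending, key=lambda trial: trial["priority"])


scheduler = ToyThresholdScheduler(grace_period=3, max_loss=1.0)
print(scheduler.on_trial_result("t0", {"iteration": 1, "loss": 5.0}))  # CONTINUE
print(scheduler.on_trial_result("t0", {"iteration": 3, "loss": 5.0}))  # STOP
```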
+
+Trainables
+~~~~~~~~~~
+[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trainable.py>`__]
+These are user-provided objects that are used for
+the training process. If a class is provided, it is expected to conform to the
+Trainable interface. If a function is provided, it is wrapped into a
+Trainable class, and the function itself is executed on a separate thread.
+
+Trainables will execute one step of ``train()`` before notifying the TrialRunner.
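
The function-wrapping idea can be sketched as follows. This is a simplified illustration and not Tune's actual wrapper: ``ToyFunctionRunner`` is a hypothetical class, and the reporter here is assumed to accept a single dict of results.

```python
import queue
import threading


class ToyFunctionRunner:
    """Hypothetical wrapper that exposes a function through train()."""

    def __init__(self, func, config):
        self._results = queue.Queue()
        # The user function runs on its own thread and reports via the queue.
        self._thread = threading.Thread(
            target=func, args=(config, self._results.put), daemon=True)
        self._started = False

    def train(self):
        """Blocks until the function reports its next result."""
        if not self._started:
            self._thread.start()
            self._started = True
        return self._results.get(timeout=10)


def my_func(config, reporter):
    """Example user function that reports three results."""
    for i in range(3):
        reporter({"iteration": i, "value": config["scale"] * i})


runner = ToyFunctionRunner(my_func, {"scale": 2})
print(runner.train())  # {'iteration': 0, 'value': 0}
print(runner.train())  # {'iteration': 1, 'value': 2}
```

In this sketch the function thread is free to run ahead of the caller because the queue is unbounded; each call to ``train()`` still returns exactly one reported result, which is the unit the caller sees per step.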
@@ -91,6 +91,12 @@ For the function you wish to tune, pass in a ``reporter`` object:

 Tune can be used anywhere Ray can, e.g. on your laptop with ``ray.init()`` embedded in a Python script, or in an `auto-scaling cluster <autoscaling.html>`__ for massive parallelism.

+Contribute to Tune
+------------------
+
+Take a look at our `Contributor Guide <tune-contrib.html>`__ for guidelines on contributing.
+
+
 Citing Tune
 -----------