[tune] Contributor Guide and Design Page (#4716)

* Move setup script out

* some changes

* Finished Contributor guide

* some comments to the design

* move

* Apply suggestions from code review

Co-Authored-By: richardliaw <rliaw@berkeley.edu>

* sourcecode

* comments
Richard Liaw
2019-05-05 00:04:13 -07:00
committed by Peter Schafhalter
parent d81e71e297
commit f2faf5ce75
7 changed files with 196 additions and 1 deletions
Binary file not shown.


+2
@@ -84,7 +84,9 @@ Ray comes with libraries that accelerate deep learning and reinforcement learnin
tune-schedulers.rst
tune-searchalg.rst
tune-package-ref.rst
tune-design.rst
tune-examples.rst
tune-contrib.rst
.. toctree::
:maxdepth: 1
+1 -1
@@ -4,7 +4,7 @@ RLlib Development
Development Install
-------------------
You can develop RLlib locally without needing to compile Ray by using the `setup-rllib-dev.py <https://github.com/ray-project/ray/blob/master/python/ray/rllib/setup-rllib-dev.py>`__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed.)
You can develop RLlib locally without needing to compile Ray by using the `setup-dev.py <https://github.com/ray-project/ray/blob/master/python/ray/setup-dev.py>`__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed.)
API Stability
-------------
+112
@@ -0,0 +1,112 @@
Contributing to Tune
====================
We welcome (and encourage!) all forms of contributions to Tune, including but not limited to:
- Code reviewing of patches and PRs.
- Pushing patches.
- Documentation and examples.
- Community participation in forums and issues.
- Code readability improvements and code comments.
- Test cases to make the codebase more robust.
- Tutorials, blog posts, talks that promote the project.
Setting up a development environment
------------------------------------
If you have Ray installed via pip (``pip install -U ray``), you can develop Tune locally without needing to compile Ray.
First, you will need your own `fork <https://help.github.com/en/articles/fork-a-repo>`__ to work on the code. Press the Fork button on the `ray project page <https://github.com/ray-project/ray/>`__.
Then, clone the project to your machine and connect your repository to the upstream (main project) ray repository.
.. code-block:: shell

    git clone https://github.com/[your username]/ray.git [path to ray directory]
    cd [path to ray directory]
    git remote add upstream https://github.com/ray-project/ray.git

Then, run the ``[path to ray directory]/python/ray/setup-dev.py`` script (`also available on GitHub <https://github.com/ray-project/ray/blob/master/python/ray/setup-dev.py>`__).
This sets up links between the ``tune`` dir (among other directories) in your local repo and the one bundled with the ``ray`` package.
When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master <https://github.com/ray-project/ray>`__ and have the latest `wheel <https://ray.readthedocs.io/en/latest/installation.html>`__ installed).
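One way to check that the links are in place is to ask Python where it imports a package from. The following is a minimal sketch, assuming the links created by the script are filesystem symlinks; ``describe_package`` is a hypothetical helper written for this guide, not part of Ray.

```python
import importlib.util
import os


def describe_package(name):
    """Return (path, real_path, is_symlink) for an importable package, or None.

    Hypothetical helper for illustration; it is not part of Ray.
    """
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package (e.g. ``ray``) is not installed.
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    path = list(spec.submodule_search_locations)[0]
    return path, os.path.realpath(path), os.path.islink(path)


if __name__ == "__main__":
    info = describe_package("ray.tune")
    if info is None:
        print("ray.tune is not importable in this environment")
    else:
        path, real_path, is_symlink = info
        print("installed location:", path)
        print("resolves to:", real_path)
        print("symlinked:", is_symlink)
```

If the setup worked as described, the resolved path should point into your local repository rather than into the installed ``ray`` package.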
What can I work on?
-------------------
We use GitHub to track issues, feature requests, and bugs. Take a look at the
ones labeled `"good first issue" <https://github.com/ray-project/ray/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22>`__ and `"help wanted" <https://github.com/ray-project/ray/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22>`__ for a place to start. Look for issues with "[tune]" in the title.
.. note::

    If raising a new issue or PR related to Tune, be sure to include "[tune]" at the beginning of the title.

Tune keeps its issues organized, and relatively up to date, on the
`Tune GitHub Project Board <https://github.com/ray-project/ray/projects/4>`__.
There, you can track issues and see how they are organized.
Submitting and Merging a Contribution
-------------------------------------
There are a few steps to merge a contribution.
1. First rebase your development branch on the most recent version of master.

   .. code:: bash

       # Skip the next line if the upstream remote was already added during setup.
       git remote add upstream https://github.com/ray-project/ray.git
       git fetch upstream
       git rebase upstream/master

2. Make sure all existing tests `pass <tune-contrib.html#testing>`__.
3. If introducing a new feature or patching a bug, be sure to add new test cases
in the relevant file in ``tune/tests/``.
4. Document the code. Public functions need to be documented, and remember to provide a usage
example if applicable.
5. Request code reviews from other contributors and address their comments. One fast way to get reviews is
to help review others' code so that they return the favor. You should aim to improve the code as much as
possible before the review. We highly value patches that can get in without extensive reviews.
6. Reviewers will approve and merge the pull request; be sure to ping them if
the pull request is getting stale.
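As an illustration of step 3, a new test case can be as small as the sketch below. Both ``scale_learning_rate`` and the test names are hypothetical stand-ins for the code you are changing, not existing Tune functions; ``pytest`` collects any function whose name starts with ``test_``.

```python
def scale_learning_rate(base_lr, num_workers):
    """Hypothetical helper standing in for the code under test."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return base_lr * num_workers


def test_scales_linearly():
    # Compare with a tolerance, since the values are floats.
    assert abs(scale_learning_rate(0.1, 4) - 0.4) < 1e-9


def test_rejects_zero_workers():
    # The helper is expected to refuse an invalid worker count.
    try:
        scale_learning_rate(0.1, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for num_workers=0")
```

Covering both the expected behavior and one failure case tends to make the intent of a patch easier to review.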
Testing
-------
Even though we have hooks to run unit tests automatically for each pull request,
we recommend running unit tests locally beforehand to reduce reviewers'
burden and speed up the review process.
.. code-block:: shell

    pytest ray/python/ray/tune/tests/

Docstrings should be written in `Google style <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`__ format.
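For reference, a Google-style docstring looks roughly like the sketch below. The function ``linear_warmup`` is a hypothetical example written for this guide and is not part of Tune; only the docstring layout matters here.

```python
def linear_warmup(step, warmup_steps, base_lr):
    """Compute a linearly warmed-up learning rate.

    Hypothetical helper used only to illustrate the docstring layout.

    Args:
        step (int): Current training step, starting at 0.
        warmup_steps (int): Number of steps over which to ramp up.
        base_lr (float): Learning rate reached once warmup ends.

    Returns:
        float: The learning rate to use at ``step``.

    Raises:
        ValueError: If ``warmup_steps`` is not positive.

    Example:
        >>> linear_warmup(5, 10, 0.5)
        0.25
    """
    if warmup_steps <= 0:
        raise ValueError("warmup_steps must be positive")
    return base_lr * min(1.0, step / warmup_steps)
```

The ``Example`` section doubles as the usage example requested for public functions.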
We also have tests for code formatting and linting that need to pass before merge.
Install ``yapf==0.23``, ``flake8``, and ``flake8-quotes``. You can then run the following locally:
.. code-block:: shell

    ray/scripts/format.sh

Becoming a Reviewer
-------------------
We identify reviewers from active contributors. Reviewers are individuals who
not only actively contribute to the project but are also willing
to participate in the code review of new contributions.
A pull request to the project has to be reviewed by at least one reviewer in order to be merged.
There is currently no formal process, but active contributors to Tune will be
solicited by current reviewers.
.. note::

    These tips are based on the TVM `contributor guide <https://github.com/dmlc/tvm>`__.

+75
@@ -0,0 +1,75 @@
Tune Design Guide
=================
In this part of the documentation, we give an overview of the design and architecture
of Tune.
.. image:: images/tune-arch.png
The blue boxes refer to internal components, and green boxes are public-facing.
Please refer to the package reference for `user-facing APIs <tune-package-ref.html>`__.
Main Components
---------------
Tune's main components consist of TrialRunner, Trial objects, TrialExecutor, SearchAlg, TrialScheduler, and Trainable.
TrialRunner
~~~~~~~~~~~
[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial_runner.py>`__]
This is the main driver of the training loop. This component
uses the TrialScheduler to prioritize and execute trials,
queries the SearchAlgorithm for new
configurations to evaluate, and handles the fault tolerance logic.
**Fault Tolerance**: The TrialRunner executes checkpointing if ``checkpoint_freq``
is set, along with automatic trial restarting in case of trial failures (if ``max_failures`` is set).
For example, if a node is lost while a trial (specifically, the corresponding
Trainable of the trial) is still executing on that node and checkpointing
is enabled, the trial is reverted to the ``"PENDING"`` state and resumes
from the last available checkpoint the next time it is run.
The TrialRunner is also in charge of checkpointing the entire experiment execution state
upon each loop iteration. This allows users to restart their experiment
in case of machine failure.
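The per-trial recovery behavior described above can be sketched with a toy model. This is a simplified illustration, not Tune's implementation: ``ToyTrial`` and ``step`` are hypothetical names, a checkpoint is reduced to a saved iteration number, and the rule that a trial is recoverable only while its failure count stays within ``max_failures`` is an assumption of the sketch.

```python
class ToyTrial:
    """Hypothetical stand-in for a trial's bookkeeping; not Tune's Trial class."""

    def __init__(self):
        self.status = "PENDING"
        self.iteration = 0
        self.last_checkpoint = None  # iteration stored by the latest checkpoint
        self.num_failures = 0


def step(trial, checkpoint_freq, max_failures, node_lost=False):
    """Run one training step, or handle a simulated node loss."""
    if trial.status in ("ERRORED", "TERMINATED"):
        return
    if trial.status == "PENDING":
        # A pending trial resumes from its last checkpoint, if it has one.
        trial.iteration = trial.last_checkpoint or 0
        trial.status = "RUNNING"
    if node_lost:
        trial.num_failures += 1
        # Assumption: recovery needs checkpointing and remaining retries.
        recoverable = checkpoint_freq > 0 and trial.num_failures <= max_failures
        trial.status = "PENDING" if recoverable else "ERRORED"
        return
    trial.iteration += 1
    if checkpoint_freq > 0 and trial.iteration % checkpoint_freq == 0:
        trial.last_checkpoint = trial.iteration


trial = ToyTrial()
for _ in range(5):
    step(trial, checkpoint_freq=2, max_failures=1)
# The node is lost at iteration 5; the latest checkpoint was taken at iteration 4.
step(trial, checkpoint_freq=2, max_failures=1, node_lost=True)
# The next run restores iteration 4 and then performs one more step.
step(trial, checkpoint_freq=2, max_failures=1)
```

In this model, work done after the last checkpoint is lost on failure, which is why a smaller ``checkpoint_freq`` trades extra checkpointing cost for less repeated work.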
Trial objects
~~~~~~~~~~~~~
[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial.py>`__]
This is an internal data structure that contains metadata about each training run. Each Trial
object is mapped one-to-one with a Trainable object, but Trial objects are not themselves
distributed/remote. Trial objects transition among
the following states: ``"PENDING"``, ``"RUNNING"``, ``"PAUSED"``, ``"ERRORED"``, and
``"TERMINATED"``.
TrialExecutor
~~~~~~~~~~~~~
[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trial_executor.py>`__]
The TrialExecutor is a component that interacts with the underlying execution framework.
It also manages resources to ensure the cluster isn't overloaded. By default, the TrialExecutor uses Ray to execute trials.
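The resource-management role can be illustrated with a small accounting sketch. ``ToyResourceManager`` and its methods are hypothetical and much simpler than the real TrialExecutor; the sketch only shows the idea of refusing to start a trial when the cluster lacks free resources.

```python
class ToyResourceManager:
    """Hypothetical resource bookkeeping; not Tune's TrialExecutor."""

    def __init__(self, total_cpus, total_gpus=0):
        self.available = {"cpu": total_cpus, "gpu": total_gpus}

    def has_resources(self, request):
        # A request fits only if every requested resource is available.
        return all(
            self.available.get(name, 0) >= amount
            for name, amount in request.items()
        )

    def start_trial(self, request):
        """Reserve resources for a trial; return False if they do not fit."""
        if not self.has_resources(request):
            return False
        for name, amount in request.items():
            self.available[name] = self.available.get(name, 0) - amount
        return True

    def stop_trial(self, request):
        """Release the resources that a finished trial was holding."""
        for name, amount in request.items():
            self.available[name] = self.available.get(name, 0) + amount


manager = ToyResourceManager(total_cpus=4, total_gpus=1)
started = manager.start_trial({"cpu": 2, "gpu": 1})   # fits
rejected = manager.start_trial({"cpu": 2, "gpu": 1})  # no GPU left
```

A trial that does not fit simply stays pending until another trial releases its resources.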
SearchAlg
~~~~~~~~~
[`source code <https://github.com/ray-project/ray/tree/master/python/ray/tune/suggest>`__] The SearchAlgorithm is a user-provided object
that is used for querying new hyperparameter configurations to evaluate.
SearchAlgorithms will be notified every time a trial finishes
executing one training step (of ``train()``), every time a trial
errors, and every time a trial completes.
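The three notifications described above can be pictured with a toy random search. The class and method names below are illustrative assumptions and should not be read as Tune's actual SearchAlgorithm interface; see the linked source code for the real one.

```python
import random


class ToyRandomSearch:
    """Hypothetical search algorithm that records the notifications it gets."""

    def __init__(self, space, num_samples, seed=0):
        self._space = space  # maps parameter name -> (low, high)
        self._remaining = num_samples
        self._rng = random.Random(seed)
        self.events = []  # (trial_id, kind) in the order received

    def next_config(self, trial_id):
        """Return a new configuration, or None once the budget is used up."""
        if self._remaining == 0:
            return None
        self._remaining -= 1
        return {
            name: self._rng.uniform(low, high)
            for name, (low, high) in self._space.items()
        }

    def on_trial_result(self, trial_id, result):
        # Called after each training step of a trial.
        self.events.append((trial_id, "result"))

    def on_trial_error(self, trial_id):
        # Called when a trial errors.
        self.events.append((trial_id, "error"))

    def on_trial_complete(self, trial_id, result):
        # Called when a trial completes.
        self.events.append((trial_id, "complete"))


search = ToyRandomSearch({"lr": (0.001, 0.1)}, num_samples=2)
config = search.next_config("trial_1")
search.on_trial_result("trial_1", {"loss": 0.7})
search.on_trial_complete("trial_1", {"loss": 0.5})
```

A model-based search algorithm would use these callbacks to update its model before proposing the next configuration; random search ignores them.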
TrialScheduler
~~~~~~~~~~~~~~
[`source code <https://github.com/ray-project/ray/tree/master/python/ray/tune/schedulers>`__] TrialSchedulers operate over a set of possible trials to run,
prioritizing trial execution given available cluster resources.
TrialSchedulers can kill or pause running trials,
and can also reorder/prioritize incoming trials.
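As a rough sketch of these two abilities, the toy scheduler below stops trials that fall too far behind the best score seen so far and picks which pending trial to run next. The decision constants, method names, and stopping rule are assumptions made for illustration, not Tune's scheduler API; pausing is left out to keep the sketch short.

```python
CONTINUE = "CONTINUE"
STOP = "STOP"


class ToyThresholdScheduler:
    """Hypothetical scheduler; higher scores are assumed to be better."""

    def __init__(self, grace_period, margin):
        self._grace_period = grace_period
        self._margin = margin
        self._best = float("-inf")

    def on_trial_result(self, trial_id, iteration, score):
        """Decide whether a trial keeps running after reporting a result."""
        self._best = max(self._best, score)
        past_grace = iteration >= self._grace_period
        if past_grace and score < self._best - self._margin:
            return STOP
        return CONTINUE

    def choose_trial_to_run(self, pending):
        """Pick the next trial from a dict of trial_id -> last score or None."""
        if not pending:
            return None

        def priority(trial_id):
            score = pending[trial_id]
            # Trials that have never run are tried first.
            return float("inf") if score is None else score

        return max(pending, key=priority)


scheduler = ToyThresholdScheduler(grace_period=2, margin=0.1)
first = scheduler.on_trial_result("a", iteration=1, score=0.5)
second = scheduler.on_trial_result("b", iteration=2, score=0.2)
```

The grace period keeps a slow-starting trial from being stopped on its very first result.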
Trainables
~~~~~~~~~~
[`source code <https://github.com/ray-project/ray/blob/master/python/ray/tune/trainable.py>`__]
These are user-provided objects that are used for
the training process. If a class is provided, it is expected to conform to the
Trainable interface. If a function is provided, it is wrapped into a
Trainable class, and the function itself is executed on a separate thread.
Trainables will execute one step of ``train()`` before notifying the TrialRunner.
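The difference between the two styles can be sketched as follows. This is a simplified model under stated assumptions, not Tune's code: ``ToyTrainable``, ``CounterTrainable``, and ``FunctionWrapper`` are hypothetical names, and the wrapper blocks the function after every report so that each ``train()`` call yields exactly one result.

```python
import queue
import threading


class ToyTrainable:
    """Class-style interface: each call to train() performs one step."""

    def train(self):
        raise NotImplementedError


class CounterTrainable(ToyTrainable):
    def __init__(self, config):
        self.total = 0
        self.increment = config["increment"]

    def train(self):
        self.total += self.increment
        return {"total": self.total}


class FunctionWrapper(ToyTrainable):
    """Runs a function taking (config, reporter) on a separate thread."""

    def __init__(self, func, config):
        self._results = queue.Queue()
        self._resume = threading.Semaphore(0)
        self._thread = threading.Thread(
            target=self._run, args=(func, config), daemon=True)
        self._started = False
        self._done = False

    def _run(self, func, config):
        try:
            func(config, self._report)
        finally:
            self._results.put(None)  # sentinel: the function has finished

    def _report(self, **metrics):
        self._results.put(metrics)
        self._resume.acquire()  # wait until train() is called again

    def train(self):
        if self._done:
            return None
        if not self._started:
            self._thread.start()
            self._started = True
        else:
            self._resume.release()  # let the function run to its next report
        result = self._results.get()
        if result is None:
            self._done = True
        return result


def my_func(config, reporter):
    for i in range(3):
        reporter(step=i, value=i * config["scale"])


wrapped = FunctionWrapper(my_func, {"scale": 2})
first_result = wrapped.train()  # runs my_func up to its first report
```

Wrapping the function this way lets the rest of the system treat both styles identically: it only ever calls ``train()`` and receives one result per call.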
+6
@@ -91,6 +91,12 @@ For the function you wish to tune, pass in a ``reporter`` object:
Tune can be used anywhere Ray can, e.g. on your laptop with ``ray.init()`` embedded in a Python script, or in an `auto-scaling cluster <autoscaling.html>`__ for massive parallelism.
Contribute to Tune
------------------
Take a look at our `Contributor Guide <tune-contrib.html>`__ for guidelines on contributing.
Citing Tune
-----------