diff --git a/doc/source/images/tune-arch.png b/doc/source/images/tune-arch.png
new file mode 100644
index 000000000..0f1751b8d
Binary files /dev/null and b/doc/source/images/tune-arch.png differ
diff --git a/doc/source/index.rst b/doc/source/index.rst
index d47dcbacc..48c0c0d0e 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -84,7 +84,9 @@ Ray comes with libraries that accelerate deep learning and reinforcement learnin
    tune-schedulers.rst
    tune-searchalg.rst
    tune-package-ref.rst
+   tune-design.rst
    tune-examples.rst
+   tune-contrib.rst
 
 .. toctree::
    :maxdepth: 1
diff --git a/doc/source/rllib-dev.rst b/doc/source/rllib-dev.rst
index 1445e043f..5425879fd 100644
--- a/doc/source/rllib-dev.rst
+++ b/doc/source/rllib-dev.rst
@@ -4,7 +4,7 @@ RLlib Development
 Development Install
 -------------------
 
-You can develop RLlib locally without needing to compile Ray by using the `setup-rllib-dev.py `__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master `__ and have the latest `wheel `__ installed.)
+You can develop RLlib locally without needing to compile Ray by using the `setup-dev.py `__ script. This sets up links between the ``rllib`` dir in your git repo and the one bundled with the ``ray`` package. When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master `__ and have the latest `wheel `__ installed.)
 
 API Stability
 -------------
diff --git a/doc/source/tune-contrib.rst b/doc/source/tune-contrib.rst
new file mode 100644
index 000000000..f945ee679
--- /dev/null
+++ b/doc/source/tune-contrib.rst
@@ -0,0 +1,112 @@
+Contributing to Tune
+====================
+
+We welcome (and encourage!) all forms of contributions to Tune, including but not limited to:
+
+- Code reviewing of patches and PRs.
+- Pushing patches.
+- Documentation and examples.
+- Community participation in forums and issues.
+- Code cleanups and code comments that improve readability.
+- Test cases to make the codebase more robust.
+- Tutorials, blog posts, and talks that promote the project.
+
+
+Setting up a development environment
+------------------------------------
+
+If you have Ray installed via pip (``pip install -U ray``), you can develop Tune locally without needing to compile Ray.
+
+
+First, you will need your own `fork <https://help.github.com/en/articles/fork-a-repo>`__ to work on the code. Press the Fork button on the `ray project page `__.
+Then, clone the project to your machine and connect your repository to the upstream (main project) ray repository.
+
+.. code-block:: shell
+
+    git clone https://github.com/[your username]/ray.git [path to ray directory]
+    cd [path to ray directory]
+    git remote add upstream https://github.com/ray-project/ray.git
+
+
+Then, run the ``[path to ray directory]/python/ray/setup-dev.py`` script `(also here on GitHub) `__.
+This sets up links between the ``tune`` dir (among other directories) in your local repo and the one bundled with the ``ray`` package.
+
+When using this script, make sure that your git branch is in sync with the installed Ray binaries (i.e., you are up-to-date on `master `__ and have the latest `wheel `__ installed.)
+
+
+What can I work on?
+-------------------
+
+We use GitHub to track issues, feature requests, and bugs. Take a look at the
+ones labeled `"good first issue" `__ and `"help wanted" `__ for a place to start. Look for issues with "[tune]" in the title.
+
+.. note::
+
+    If raising a new issue or PR related to Tune, be sure to include "[tune]" at the beginning of the title.
+
+For project organization, Tune maintains a relatively up-to-date organization of
+issues on the `Tune GitHub Project Board `__.
+Here, you can track and identify how issues are organized.
+
+
+Submitting and Merging a Contribution
+-------------------------------------
+
+There are a few steps to merging a contribution.
+
+1. First, rebase your development branch on the most recent version of master.
+
+   .. code:: bash
+
+     git remote add upstream https://github.com/ray-project/ray.git  # skip if the upstream remote already exists
+     git fetch upstream
+     git rebase upstream/master
+
+2. Make sure all existing tests `pass `__.
+3. If introducing a new feature or patching a bug, be sure to add new test cases
+   in the relevant file in ``tune/tests/``.
+4. Document the code. Public functions need to be documented, and remember to provide a usage
+   example if applicable.
+5. Request code reviews from other contributors and address their comments. One fast way to get reviews is
+   to help review others' code so that they return the favor. You should aim to improve the code as much as
+   possible before the review. We highly value patches that can get in without extensive reviews.
+6. Reviewers will approve and merge the pull request; be sure to ping them if
+   the pull request is getting stale.
+
+
+Testing
+-------
+
+Even though we have hooks to run unit tests automatically for each pull request,
+we recommend running unit tests locally beforehand to reduce the reviewers’
+burden and speed up the review process.
+
+
+.. code-block:: shell
+
+    pytest ray/python/ray/tune/tests/
+
+Documentation should be written in `Google style `__ format.
+
+We also have tests for code formatting and linting that need to pass before merge.
+Install ``yapf==0.23``, ``flake8``, and ``flake8-quotes``. You can run the following locally:
+
+.. code-block:: shell
+
+    ray/scripts/format.sh
+
+
+Becoming a Reviewer
+-------------------
+
+We identify reviewers from active contributors. Reviewers are individuals who
+not only actively contribute to the project but are also willing
+to participate in the code review of new contributions.
+A pull request to the project has to be reviewed by at least one reviewer in order to be merged.
+There is currently no formal process, but active contributors to Tune will be
+solicited by current reviewers.
+
+
+.. note::
+
+    These tips are based on the TVM `contributor guide `__.
diff --git a/doc/source/tune-design.rst b/doc/source/tune-design.rst
new file mode 100644
index 000000000..b8bf09a08
--- /dev/null
+++ b/doc/source/tune-design.rst
@@ -0,0 +1,75 @@
+Tune Design Guide
+=================
+
+In this part of the documentation, we give an overview of the design and architecture
+of Tune.
+
+.. image:: images/tune-arch.png
+
+The blue boxes refer to internal components, and the green boxes are public-facing.
+Please refer to the package reference for `user-facing APIs `__.
+
+Main Components
+---------------
+
+Tune's main components are the TrialRunner, Trial objects, TrialExecutor, SearchAlg, TrialScheduler, and Trainable.
+
+TrialRunner
+~~~~~~~~~~~
+[`source code `__]
+This is the main driver of the training loop. This component
+uses the TrialScheduler to prioritize and execute trials,
+queries the SearchAlgorithm for new
+configurations to evaluate, and handles the fault tolerance logic.
+
+**Fault Tolerance**: The TrialRunner executes checkpointing if ``checkpoint_freq``
+is set, along with automatic trial restarting in case of trial failures (if ``max_failures`` is set).
+For example, if a node is lost while a trial (specifically, the corresponding
+Trainable of the trial) is still executing on that node and checkpointing
+is enabled, the trial will then be reverted to a ``"PENDING"`` state and resumed
+from the last available checkpoint when it is run.
+The TrialRunner is also in charge of checkpointing the entire experiment execution state
+upon each loop iteration. This allows users to restart their experiment
+in case of machine failure.
+
+Trial objects
+~~~~~~~~~~~~~
+[`source code `__]
+This is an internal data structure that contains metadata about each training run.
+Each Trial object is mapped one-to-one with a Trainable object but is not itself
+distributed/remote. Trial objects transition among
+the following states: ``"PENDING"``, ``"RUNNING"``, ``"PAUSED"``, ``"ERRORED"``, and
+``"TERMINATED"``.
+
+TrialExecutor
+~~~~~~~~~~~~~
+[`source code `__]
+The TrialExecutor is a component that interacts with the underlying execution framework.
+It also manages resources to ensure the cluster isn't overloaded. By default, the TrialExecutor uses Ray to execute trials.
+
+SearchAlg
+~~~~~~~~~
+[`source code `__] The SearchAlgorithm is a user-provided object
+that is used for querying new hyperparameter configurations to evaluate.
+
+SearchAlgorithms will be notified every time a trial finishes
+executing one training step (of ``train()``), every time a trial
+errors, and every time a trial completes.
+
+TrialScheduler
+~~~~~~~~~~~~~~
+[`source code `__] TrialSchedulers operate over a set of possible trials to run,
+prioritizing trial execution given available cluster resources.
+
+TrialSchedulers can kill or pause trials,
+and can also reorder/prioritize incoming trials.
+
+Trainables
+~~~~~~~~~~
+[`source code `__]
+These are user-provided objects that are used for
+the training process. If a class is provided, it is expected to conform to the
+Trainable interface. If a function is provided, it is wrapped into a
+Trainable class, and the function itself is executed on a separate thread.
+
+Trainables will execute one step of ``train()`` before notifying the TrialRunner.
diff --git a/doc/source/tune.rst b/doc/source/tune.rst
index 7f6aafa2a..bfeb729e6 100644
--- a/doc/source/tune.rst
+++ b/doc/source/tune.rst
@@ -91,6 +91,12 @@ For the function you wish to tune, pass in a ``reporter`` object:
 
 Tune can be used anywhere Ray can, e.g. on your laptop with ``ray.init()`` embedded in a Python script, or in an `auto-scaling cluster `__ for massive parallelism.
 
+Contribute to Tune
+------------------
+
+Take a look at our `Contributor Guide `__ for guidelines on contributing.
+
+
 Citing Tune
 -----------
 
diff --git a/python/ray/rllib/setup-rllib-dev.py b/python/ray/setup-dev.py
similarity index 100%
rename from python/ray/rllib/setup-rllib-dev.py
rename to python/ray/setup-dev.py