[docs] Pictures for all the Examples (#5859)

* image
* plot resnet
* hyperparam
* fixup_pictures
* custom_direct
2026-07-02 22:47:18 +08:00 · 2019-10-14 14:18:52 -07:00
parent 8fd23c0c3f
commit 7f4141df4e
14 changed files with 24 additions and 803 deletions
@@ -3,18 +3,22 @@ Examples Overview

 .. customgalleryitem::
   :tooltip: Build a simple parameter server using Ray.
+   :figure: /images/param_actor.png
   :description: :doc:`/auto_examples/plot_parameter_server`

 .. customgalleryitem::
   :tooltip: Asynchronous Advantage Actor Critic agent using Ray.
+   :figure: /images/a3c.png
   :description: :doc:`/auto_examples/plot_example-a3c`

 .. customgalleryitem::
   :tooltip: Simple parallel asynchronous hyperparameter evaluation.
+   :figure: /images/hyperparameter.png
   :description: :doc:`/auto_examples/plot_hyperparameter`

 .. customgalleryitem::
   :tooltip: Parallelizing a policy gradient calculation on OpenAI Gym Pong.
+   :figure: /images/pong.png
   :description: :doc:`/auto_examples/plot_pong_example`

 .. customgalleryitem::
@@ -25,10 +29,6 @@ Examples Overview
   :tooltip: Implementing a simple news reader using Ray.
   :description: :doc:`/auto_examples/plot_newsreader`

-.. customgalleryitem::
-   :tooltip: Using Ray to train ResNet across multiple GPUs.
-   :description: :doc:`/auto_examples/plot_resnet`
-
 .. customgalleryitem::
   :tooltip: Implement a simple streaming application using Ray’s actors.
   :description: :doc:`/auto_examples/plot_streaming`
@@ -25,6 +25,10 @@ To run the application, first install **ray** and then some dependencies:
  pip install opencv-python-headless
  pip install scipy

+
+.. image:: ../images/a3c.png
+    :align: center
+
 You can run the code with

 .. code-block:: bash
@@ -9,6 +9,9 @@ This script will demonstrate how to use two important parts of the Ray API:
 using ``ray.remote`` to define remote functions and ``ray.wait`` to wait for
 their results to be ready.

+.. image:: ../images/hyperparameter.png
+    :align: center
+
 .. important:: For a production-grade implementation of distributed
    hyperparameter tuning, use `Tune`_, a scalable hyperparameter
    tuning library built using Ray's Actor API.
@@ -14,6 +14,10 @@ then be passed back to each Ray actor for more gradient calculation.
 This application is adapted, with minimal modifications, from
 Andrej Karpathy's `source code`_ (see the accompanying `blog post`_).

+.. image:: ../images/pong-arch.svg
+    :align: center
+
+
 To run the application, first install some dependencies.

 .. code-block:: bash
@@ -1,103 +0,0 @@
-ResNet
-======
-
-This code uses ResNet to do data parallel training
-across multiple GPUs using Ray. View the `code for this example`_.
-
-To run the example, you will need to install `TensorFlow`_ (at
-least version ``1.0.0``). Then you can run the example as follows.
-
-First download the CIFAR-10 or CIFAR-100 dataset.
-
-.. code-block:: bash
-
-  # Get the CIFAR-10 dataset.
-  curl -o cifar-10-binary.tar.gz https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
-  tar -xvf cifar-10-binary.tar.gz
-
-  # Get the CIFAR-100 dataset.
-  curl -o cifar-100-binary.tar.gz https://www.cs.toronto.edu/~kriz/cifar-100-binary.tar.gz
-  tar -xvf cifar-100-binary.tar.gz
-
-Then run the training script that matches the dataset you downloaded.
-
-.. code-block:: bash
-
-  # Train Resnet on CIFAR-10.
-  python ray/doc/examples/resnet/resnet_main.py \
-      --eval_dir=/tmp/resnet-model/eval \
-      --train_data_path=cifar-10-batches-bin/data_batch* \
-      --eval_data_path=cifar-10-batches-bin/test_batch.bin \
-      --dataset=cifar10 \
-      --num_gpus=1
-
-  # Train Resnet on CIFAR-100.
-  python ray/doc/examples/resnet/resnet_main.py \
-      --eval_dir=/tmp/resnet-model/eval \
-      --train_data_path=cifar-100-binary/train.bin \
-      --eval_data_path=cifar-100-binary/test.bin \
-      --dataset=cifar100 \
-      --num_gpus=1
-
-To run the training script on a cluster with multiple machines, you will need
-to also pass in the flag ``--address=<address>``, where
-``<address>`` is the address of the Redis server on the head node.
-
-The script will print out the IP address that the log files are stored on. In
-the single-node case, you can ignore this and run tensorboard on the current
-machine.
-
-.. code-block:: bash
-
-  python -m tensorflow.tensorboard --logdir=/tmp/resnet-model
-
-If you are running Ray on multiple nodes, you will need to go to the node at the
-IP address printed, and run the command.
-
-The core of the script is the actor definition.
-
-.. code-block:: python
-
-  @ray.remote(num_gpus=1)
-  class ResNetTrainActor(object):
-      def __init__(self, data, dataset, num_gpus):
-          # data is the preprocessed images and labels extracted from the dataset.
-          # Thus, every actor has its own copy of the data.
-          # Set the CUDA_VISIBLE_DEVICES environment variable in order to restrict
-          # which GPUs TensorFlow uses. Note that this only works if it is done before
-          # the call to tf.Session.
-          os.environ['CUDA_VISIBLE_DEVICES'] = ','.join([str(i) for i in ray.get_gpu_ids()])
-          with tf.Graph().as_default():
-              with tf.device('/gpu:0'):
-                  # We omit the code here that actually constructs the residual network
-                  # and initializes it. Uses the definition in the Tensorflow Resnet Example.
-
-      def compute_steps(self, weights):
-          # This method sets the weights in the network, runs some training steps,
-          # and returns the new weights. self.model.variables is a TensorFlowVariables
-          # class that we pass the train operation into.
-          self.model.variables.set_weights(weights)
-          for i in range(self.steps):
-              self.model.variables.sess.run(self.model.train_op)
-          return self.model.variables.get_weights()
-
-The main script first creates one actor for each GPU, or a single actor if
-``num_gpus`` is zero.
-
-.. code-block:: python
-
-  train_actors = [ResNetTrainActor.remote(train_data, dataset, num_gpus) for _ in range(num_gpus)]
-
-Then the main loop passes the same weights to every model, performs
-updates on each model, averages the updates, and puts the new weights in the
-object store.
-
-.. code-block:: python
-
-  while True:
-      all_weights = ray.get([actor.compute_steps.remote(weight_id) for actor in train_actors])
-      mean_weights = {k: sum([weights[k] for weights in all_weights]) / num_gpus for k in all_weights[0]}
-      weight_id = ray.put(mean_weights)
-
-.. _`TensorFlow`: https://www.tensorflow.org/install/
-.. _`code for this example`: https://github.com/ray-project/ray/tree/master/doc/examples/resnet
@@ -1,116 +0,0 @@
-"""CIFAR dataset input module, with the majority taken from
-https://github.com/tensorflow/models/tree/master/resnet.
-"""
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import tensorflow as tf
-
-
-def build_data(data_path, size, dataset):
-    """Creates the queue and preprocessing operations for the dataset.
-
-    Args:
-        data_path: Filename for cifar10 data.
-        size: The number of images in the dataset.
-        dataset: The dataset we are using.
-
-    Returns:
-        queue: A Tensorflow queue for extracting the images and labels.
-    """
-    image_size = 32
-    if dataset == "cifar10":
-        label_bytes = 1
-        label_offset = 0
-    elif dataset == "cifar100":
-        label_bytes = 1
-        label_offset = 1
-    depth = 3
-    image_bytes = image_size * image_size * depth
-    record_bytes = label_bytes + label_offset + image_bytes
-
-    def load_transform(value):
-        # Convert these examples to dense labels and processed images.
-        record = tf.reshape(tf.decode_raw(value, tf.uint8), [record_bytes])
-        label = tf.cast(
-            tf.slice(record, [label_offset], [label_bytes]), tf.int32)
-        # Convert from string to [depth * height * width] to
-        # [depth, height, width].
-        depth_major = tf.reshape(
-            tf.slice(record, [label_bytes], [image_bytes]),
-            [depth, image_size, image_size])
-        # Convert from [depth, height, width] to [height, width, depth].
-        image = tf.cast(tf.transpose(depth_major, [1, 2, 0]), tf.float32)
-        return (image, label)
-
-    # Read examples from files in the filename queue.
-    data_files = tf.gfile.Glob(data_path)
-    data = tf.data.FixedLengthRecordDataset(
-        data_files, record_bytes=record_bytes)
-    data = data.map(load_transform)
-    data = data.batch(size)
-    iterator = data.make_one_shot_iterator()
-    return iterator.get_next()
-
-
-def build_input(data, batch_size, dataset, train):
-    """Build CIFAR image and labels.
-
-    Args:
-        data_path: Filename for cifar10 data.
-        batch_size: Input batch size.
-        train: True if we are training and false if we are testing.
-
-    Returns:
-        images: Batches of images of size
-            [batch_size, image_size, image_size, 3].
-        labels: Batches of labels of size [batch_size, num_classes].
-
-    Raises:
-      ValueError: When the specified dataset is not supported.
-    """
-    image_size = 32
-    depth = 3
-    num_classes = 10 if dataset == "cifar10" else 100
-    images, labels = data
-    num_samples = images.shape[0] - images.shape[0] % batch_size
-    dataset = tf.data.Dataset.from_tensor_slices(
-        (images[:num_samples], labels[:num_samples]))
-
-    def map_train(image, label):
-        image = tf.image.resize_image_with_crop_or_pad(image, image_size + 4,
-                                                       image_size + 4)
-        image = tf.random_crop(image, [image_size, image_size, 3])
-        image = tf.image.random_flip_left_right(image)
-        image = tf.image.per_image_standardization(image)
-        return (image, label)
-
-    def map_test(image, label):
-        image = tf.image.resize_image_with_crop_or_pad(image, image_size,
-                                                       image_size)
-        image = tf.image.per_image_standardization(image)
-        return (image, label)
-
-    dataset = dataset.map(map_train if train else map_test)
-    dataset = dataset.batch(batch_size)
-    dataset = dataset.repeat()
-    if train:
-        dataset = dataset.shuffle(buffer_size=16 * batch_size)
-    images, labels = dataset.make_one_shot_iterator().get_next()
-    images = tf.reshape(images, [batch_size, image_size, image_size, depth])
-    labels = tf.reshape(labels, [batch_size, 1])
-    indices = tf.reshape(tf.range(0, batch_size, 1), [batch_size, 1])
-    labels = tf.sparse_to_dense(
-        tf.concat([indices, labels], 1), [batch_size, num_classes], 1.0, 0.0)
-
-    assert len(images.get_shape()) == 4
-    assert images.get_shape()[0] == batch_size
-    assert images.get_shape()[-1] == 3
-    assert len(labels.get_shape()) == 2
-    assert labels.get_shape()[0] == batch_size
-    assert labels.get_shape()[1] == num_classes
-    if not train:
-        tf.summary.image("images", images)
-    return images, labels
@@ -1,257 +0,0 @@
-"""ResNet training script, with some code from
-https://github.com/tensorflow/models/tree/master/resnet.
-"""
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import argparse
-import os
-import numpy as np
-import ray
-import tensorflow as tf
-
-import cifar_input
-import resnet_model
-
-# Tensorflow must be at least version 1.2.0 for the example to work.
-tf_major = int(tf.__version__.split(".")[0])
-tf_minor = int(tf.__version__.split(".")[1])
-if (tf_major < 1) or (tf_major == 1 and tf_minor < 2):
-    raise Exception("Your Tensorflow version is less than 1.2.0. Please "
-                    "update Tensorflow to the latest version.")
-
-parser = argparse.ArgumentParser(description="Run the ResNet example.")
-parser.add_argument(
-    "--dataset",
-    default="cifar10",
-    type=str,
-    help="Dataset to use: cifar10 or cifar100.")
-parser.add_argument(
-    "--train_data_path",
-    default="cifar-10-batches-bin/data_batch*",
-    type=str,
-    help="Data path for the training data.")
-parser.add_argument(
-    "--eval_data_path",
-    default="cifar-10-batches-bin/test_batch.bin",
-    type=str,
-    help="Data path for the testing data.")
-parser.add_argument(
-    "--eval_dir",
-    default="/tmp/resnet-model/eval",
-    type=str,
-    help="Data path for the tensorboard logs.")
-parser.add_argument(
-    "--eval_batch_count",
-    default=50,
-    type=int,
-    help="Number of batches to evaluate over.")
-parser.add_argument(
-    "--num_gpus",
-    default=0,
-    type=int,
-    help="Number of GPUs to use for training.")
-parser.add_argument(
-    "--redis-address",
-    default=None,
-    type=str,
-    help="The Redis address of the cluster.")
-
-FLAGS = parser.parse_args()
-
-# Determines if the actors require a gpu or not.
-use_gpu = 1 if int(FLAGS.num_gpus) > 0 else 0
-
-
-@ray.remote
-def get_data(path, size, dataset):
-    # Retrieves all preprocessed images and labels using a tensorflow queue.
-    # This only uses the cpu.
-    os.environ["CUDA_VISIBLE_DEVICES"] = ""
-    with tf.device("/cpu:0"):
-        dataset = cifar_input.build_data(path, size, dataset)
-        sess = tf.Session()
-        images, labels = sess.run(dataset)
-        sess.close()
-        return images, labels
-
-
-@ray.remote(num_gpus=use_gpu)
-class ResNetTrainActor(object):
-    def __init__(self, data, dataset, num_gpus):
-        if num_gpus > 0:
-            os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(
-                [str(i) for i in ray.get_gpu_ids()])
-        hps = resnet_model.HParams(
-            batch_size=128,
-            num_classes=100 if dataset == "cifar100" else 10,
-            min_lrn_rate=0.0001,
-            lrn_rate=0.1,
-            num_residual_units=5,
-            use_bottleneck=False,
-            weight_decay_rate=0.0002,
-            relu_leakiness=0.1,
-            optimizer="mom",
-            num_gpus=num_gpus)
-
-        # We seed each actor differently so that each actor operates on a
-        # different subset of data.
-        if num_gpus > 0:
-            tf.set_random_seed(ray.get_gpu_ids()[0] + 1)
-        else:
-            # Only a single actor in this case.
-            tf.set_random_seed(1)
-
-        with tf.device("/gpu:0" if num_gpus > 0 else "/cpu:0"):
-            # Build the model.
-            images, labels = cifar_input.build_input(data, hps.batch_size,
-                                                     dataset, True)
-            self.model = resnet_model.ResNet(hps, images, labels, "train")
-            self.model.build_graph()
-            config = tf.ConfigProto(allow_soft_placement=True)
-            config.gpu_options.allow_growth = True
-            sess = tf.Session(config=config)
-            self.model.variables.set_session(sess)
-            init = tf.global_variables_initializer()
-            sess.run(init)
-            self.steps = 10
-
-    def compute_steps(self, weights):
-        # This method sets the weights in the network, trains the network
-        # self.steps times, and returns the new weights.
-        self.model.variables.set_weights(weights)
-        for i in range(self.steps):
-            self.model.variables.sess.run(self.model.train_op)
-        return self.model.variables.get_weights()
-
-    def get_weights(self):
-        # Note that the driver cannot directly access fields of the class,
-        # so helper methods must be created.
-        return self.model.variables.get_weights()
-
-
-@ray.remote
-class ResNetTestActor(object):
-    def __init__(self, data, dataset, eval_batch_count, eval_dir):
-        os.environ["CUDA_VISIBLE_DEVICES"] = ""
-        hps = resnet_model.HParams(
-            batch_size=100,
-            num_classes=100 if dataset == "cifar100" else 10,
-            min_lrn_rate=0.0001,
-            lrn_rate=0.1,
-            num_residual_units=5,
-            use_bottleneck=False,
-            weight_decay_rate=0.0002,
-            relu_leakiness=0.1,
-            optimizer="mom",
-            num_gpus=0)
-        with tf.device("/cpu:0"):
-            # Builds the testing network.
-            images, labels = cifar_input.build_input(data, hps.batch_size,
-                                                     dataset, False)
-            self.model = resnet_model.ResNet(hps, images, labels, "eval")
-            self.model.build_graph()
-            config = tf.ConfigProto(allow_soft_placement=True)
-            config.gpu_options.allow_growth = True
-            sess = tf.Session(config=config)
-            self.model.variables.set_session(sess)
-            init = tf.global_variables_initializer()
-            sess.run(init)
-
-            # Initializing parameters for tensorboard.
-            self.best_precision = 0.0
-            self.eval_batch_count = eval_batch_count
-            self.summary_writer = tf.summary.FileWriter(eval_dir, sess.graph)
-        # The IP address where tensorboard logs will be on.
-        self.ip_addr = ray.services.get_node_ip_address()
-
-    def accuracy(self, weights, train_step):
-        # Sets the weights, computes the accuracy and other metrics
-        # over eval_batches, and outputs to tensorboard.
-        self.model.variables.set_weights(weights)
-        total_prediction, correct_prediction = 0, 0
-        model = self.model
-        sess = self.model.variables.sess
-        for _ in range(self.eval_batch_count):
-            summaries, loss, predictions, truth = sess.run(
-                [model.summaries, model.cost, model.predictions, model.labels])
-
-            truth = np.argmax(truth, axis=1)
-            predictions = np.argmax(predictions, axis=1)
-            correct_prediction += np.sum(truth == predictions)
-            total_prediction += predictions.shape[0]
-
-        precision = 1.0 * correct_prediction / total_prediction
-        self.best_precision = max(precision, self.best_precision)
-        precision_summ = tf.Summary()
-        precision_summ.value.add(tag="Precision", simple_value=precision)
-        self.summary_writer.add_summary(precision_summ, train_step)
-        best_precision_summ = tf.Summary()
-        best_precision_summ.value.add(
-            tag="Best Precision", simple_value=self.best_precision)
-        self.summary_writer.add_summary(best_precision_summ, train_step)
-        self.summary_writer.add_summary(summaries, train_step)
-        tf.logging.info("loss: %.3f, precision: %.3f, best precision: %.3f" %
-                        (loss, precision, self.best_precision))
-        self.summary_writer.flush()
-        return precision
-
-    def get_ip_addr(self):
-        # As above, a helper method must be created to access the field from
-        # the driver.
-        return self.ip_addr
-
-
-def train():
-    num_gpus = FLAGS.num_gpus
-    if FLAGS.redis_address is None:
-        ray.init(num_gpus=num_gpus)
-    else:
-        ray.init(redis_address=FLAGS.redis_address)
-    train_data = get_data.remote(FLAGS.train_data_path, 50000, FLAGS.dataset)
-    test_data = get_data.remote(FLAGS.eval_data_path, 10000, FLAGS.dataset)
-    # Creates an actor for each gpu, or one if only using the cpu. Each actor
-    # has access to the dataset.
-    if FLAGS.num_gpus > 0:
-        train_actors = [
-            ResNetTrainActor.remote(train_data, FLAGS.dataset, num_gpus)
-            for _ in range(num_gpus)
-        ]
-    else:
-        train_actors = [ResNetTrainActor.remote(train_data, FLAGS.dataset, 0)]
-    test_actor = ResNetTestActor.remote(test_data, FLAGS.dataset,
-                                        FLAGS.eval_batch_count, FLAGS.eval_dir)
-    print("The log files for tensorboard are stored at ip {}.".format(
-        ray.get(test_actor.get_ip_addr.remote())))
-    step = 0
-    weight_id = train_actors[0].get_weights.remote()
-    acc_id = test_actor.accuracy.remote(weight_id, step)
-    # Correction for dividing the weights by the number of gpus.
-    if num_gpus == 0:
-        num_gpus = 1
-    print("Starting training loop. Use Ctrl-C to exit.")
-    try:
-        while True:
-            all_weights = ray.get([
-                actor.compute_steps.remote(weight_id) for actor in train_actors
-            ])
-            mean_weights = {
-                k: (sum(weights[k] for weights in all_weights) / num_gpus)
-                for k in all_weights[0]
-            }
-            weight_id = ray.put(mean_weights)
-            step += 10
-            if step % 200 == 0:
-                # Retrieves the previously computed accuracy and launches a new
-                # testing task with the current weights every 200 steps.
-                acc = ray.get(acc_id)
-                acc_id = test_actor.accuracy.remote(weight_id, step)
-                print("Step {}: {:.6f}".format(step - 200, acc))
-    except KeyboardInterrupt:
-        pass
-
-
-if __name__ == "__main__":
-    train()
@@ -1,317 +0,0 @@
-"""ResNet model with most of the code taken from
-https://github.com/tensorflow/models/tree/master/resnet.
-
-Related papers:
-https://arxiv.org/pdf/1603.05027v2.pdf
-https://arxiv.org/pdf/1512.03385v1.pdf
-https://arxiv.org/pdf/1605.07146v1.pdf
-"""
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-from collections import namedtuple
-import numpy as np
-
-import tensorflow as tf
-from tensorflow.python.training import moving_averages
-
-import ray
-import ray.experimental.tf_utils
-
-HParams = namedtuple(
-    "HParams", "batch_size, num_classes, min_lrn_rate, lrn_rate, "
-    "num_residual_units, use_bottleneck, weight_decay_rate, "
-    "relu_leakiness, optimizer, num_gpus")
-
-
-class ResNet(object):
-    """ResNet model."""
-
-    def __init__(self, hps, images, labels, mode):
-        """ResNet constructor.
-
-        Args:
-            hps: Hyperparameters.
-            images: Batches of images of size [batch_size, image_size,
-                image_size, 3].
-            labels: Batches of labels of size [batch_size, num_classes].
-            mode: One of 'train' and 'eval'.
-        """
-        self.hps = hps
-        self._images = images
-        self.labels = labels
-        self.mode = mode
-
-        self._extra_train_ops = []
-
-    def build_graph(self):
-        """Build a whole graph for the model."""
-        self.global_step = tf.Variable(0, trainable=False)
-        self._build_model()
-        if self.mode == "train":
-            self._build_train_op()
-        else:
-            # Additional initialization for the test network.
-            self.variables = ray.experimental.tf_utils.TensorFlowVariables(
-                self.cost)
-            self.summaries = tf.summary.merge_all()
-
-    def _stride_arr(self, stride):
-        """Map a stride scalar to the stride array for tf.nn.conv2d."""
-        return [1, stride, stride, 1]
-
-    def _build_model(self):
-        """Build the core model within the graph."""
-
-        with tf.variable_scope("init"):
-            x = self._conv("init_conv", self._images, 3, 3, 16,
-                           self._stride_arr(1))
-
-        strides = [1, 2, 2]
-        activate_before_residual = [True, False, False]
-        if self.hps.use_bottleneck:
-            res_func = self._bottleneck_residual
-            filters = [16, 64, 128, 256]
-        else:
-            res_func = self._residual
-            filters = [16, 16, 32, 64]
-
-        with tf.variable_scope("unit_1_0"):
-            x = res_func(x, filters[0], filters[1], self._stride_arr(
-                strides[0]), activate_before_residual[0])
-        for i in range(1, self.hps.num_residual_units):
-            with tf.variable_scope("unit_1_%d" % i):
-                x = res_func(x, filters[1], filters[1], self._stride_arr(1),
-                             False)
-
-        with tf.variable_scope("unit_2_0"):
-            x = res_func(x, filters[1], filters[2], self._stride_arr(
-                strides[1]), activate_before_residual[1])
-        for i in range(1, self.hps.num_residual_units):
-            with tf.variable_scope("unit_2_%d" % i):
-                x = res_func(x, filters[2], filters[2], self._stride_arr(1),
-                             False)
-
-        with tf.variable_scope("unit_3_0"):
-            x = res_func(x, filters[2], filters[3], self._stride_arr(
-                strides[2]), activate_before_residual[2])
-        for i in range(1, self.hps.num_residual_units):
-            with tf.variable_scope("unit_3_%d" % i):
-                x = res_func(x, filters[3], filters[3], self._stride_arr(1),
-                             False)
-        with tf.variable_scope("unit_last"):
-            x = self._batch_norm("final_bn", x)
-            x = self._relu(x, self.hps.relu_leakiness)
-            x = self._global_avg_pool(x)
-
-        with tf.variable_scope("logit"):
-            logits = self._fully_connected(x, self.hps.num_classes)
-            self.predictions = tf.nn.softmax(logits)
-
-        with tf.variable_scope("costs"):
-            xent = tf.nn.softmax_cross_entropy_with_logits(
-                logits=logits, labels=self.labels)
-            self.cost = tf.reduce_mean(xent, name="xent")
-            self.cost += self._decay()
-
-            if self.mode == "eval":
-                tf.summary.scalar("cost", self.cost)
-
-    def _build_train_op(self):
-        """Build training specific ops for the graph."""
-        num_gpus = self.hps.num_gpus if self.hps.num_gpus != 0 else 1
-        # The learning rate schedule is dependent on the number of gpus.
-        boundaries = [int(20000 * i / np.sqrt(num_gpus)) for i in range(2, 5)]
-        values = [0.1, 0.01, 0.001, 0.0001]
-        self.lrn_rate = tf.train.piecewise_constant(self.global_step,
-                                                    boundaries, values)
-        tf.summary.scalar("learning rate", self.lrn_rate)
-
-        if self.hps.optimizer == "sgd":
-            optimizer = tf.train.GradientDescentOptimizer(self.lrn_rate)
-        elif self.hps.optimizer == "mom":
-            optimizer = tf.train.MomentumOptimizer(self.lrn_rate, 0.9)
-
-        apply_op = optimizer.minimize(self.cost, global_step=self.global_step)
-        train_ops = [apply_op] + self._extra_train_ops
-        self.train_op = tf.group(*train_ops)
-        self.variables = ray.experimental.tf_utils.TensorFlowVariables(
-            self.train_op)
-
-    def _batch_norm(self, name, x):
-        """Batch normalization."""
-        with tf.variable_scope(name):
-            params_shape = [x.get_shape()[-1]]
-
-            beta = tf.get_variable(
-                "beta",
-                params_shape,
-                tf.float32,
-                initializer=tf.constant_initializer(0.0, tf.float32))
-            gamma = tf.get_variable(
-                "gamma",
-                params_shape,
-                tf.float32,
-                initializer=tf.constant_initializer(1.0, tf.float32))
-
-            if self.mode == "train":
-                mean, variance = tf.nn.moments(x, [0, 1, 2], name="moments")
-
-                moving_mean = tf.get_variable(
-                    "moving_mean",
-                    params_shape,
-                    tf.float32,
-                    initializer=tf.constant_initializer(0.0, tf.float32),
-                    trainable=False)
-                moving_variance = tf.get_variable(
-                    "moving_variance",
-                    params_shape,
-                    tf.float32,
-                    initializer=tf.constant_initializer(1.0, tf.float32),
-                    trainable=False)
-
-                self._extra_train_ops.append(
-                    moving_averages.assign_moving_average(
-                        moving_mean, mean, 0.9))
-                self._extra_train_ops.append(
-                    moving_averages.assign_moving_average(
-                        moving_variance, variance, 0.9))
-            else:
-                mean = tf.get_variable(
-                    "moving_mean",
-                    params_shape,
-                    tf.float32,
-                    initializer=tf.constant_initializer(0.0, tf.float32),
-                    trainable=False)
-                variance = tf.get_variable(
-                    "moving_variance",
-                    params_shape,
-                    tf.float32,
-                    initializer=tf.constant_initializer(1.0, tf.float32),
-                    trainable=False)
-                tf.summary.histogram(mean.op.name, mean)
-                tf.summary.histogram(variance.op.name, variance)
-            # elipson used to be 1e-5. Maybe 0.001 solves NaN problem in deeper
-            # net.
-            y = tf.nn.batch_normalization(x, mean, variance, beta, gamma,
-                                          0.001)
-            y.set_shape(x.get_shape())
-            return y
-
-    def _residual(self,
-                  x,
-                  in_filter,
-                  out_filter,
-                  stride,
-                  activate_before_residual=False):
-        """Residual unit with 2 sub layers."""
-        if activate_before_residual:
-            with tf.variable_scope("shared_activation"):
-                x = self._batch_norm("init_bn", x)
-                x = self._relu(x, self.hps.relu_leakiness)
-                orig_x = x
-        else:
-            with tf.variable_scope("residual_only_activation"):
-                orig_x = x
-                x = self._batch_norm("init_bn", x)
-                x = self._relu(x, self.hps.relu_leakiness)
-
-        with tf.variable_scope("sub1"):
-            x = self._conv("conv1", x, 3, in_filter, out_filter, stride)
-
-        with tf.variable_scope("sub2"):
-            x = self._batch_norm("bn2", x)
-            x = self._relu(x, self.hps.relu_leakiness)
-            x = self._conv("conv2", x, 3, out_filter, out_filter, [1, 1, 1, 1])
-
-        with tf.variable_scope("sub_add"):
-            if in_filter != out_filter:
-                orig_x = tf.nn.avg_pool(orig_x, stride, stride, "VALID")
-                orig_x = tf.pad(
-                    orig_x,
-                    [[0, 0], [0, 0], [0, 0], [(out_filter - in_filter) // 2,
-                                              (out_filter - in_filter) // 2]])
-            x += orig_x
-
-        return x
-
-    def _bottleneck_residual(self,
-                             x,
-                             in_filter,
-                             out_filter,
-                             stride,
-                             activate_before_residual=False):
-        """Bottleneck residual unit with 3 sub layers."""
-        if activate_before_residual:
-            with tf.variable_scope("common_bn_relu"):
-                x = self._batch_norm("init_bn", x)
-                x = self._relu(x, self.hps.relu_leakiness)
-                orig_x = x
-        else:
-            with tf.variable_scope("residual_bn_relu"):
-                orig_x = x
-                x = self._batch_norm("init_bn", x)
-                x = self._relu(x, self.hps.relu_leakiness)
-
-        with tf.variable_scope("sub1"):
-            x = self._conv("conv1", x, 1, in_filter, out_filter / 4, stride)
-
-        with tf.variable_scope("sub2"):
-            x = self._batch_norm("bn2", x)
-            x = self._relu(x, self.hps.relu_leakiness)
-            x = self._conv("conv2", x, 3, out_filter / 4, out_filter / 4,
-                           [1, 1, 1, 1])
-
-        with tf.variable_scope("sub3"):
-            x = self._batch_norm("bn3", x)
-            x = self._relu(x, self.hps.relu_leakiness)
-            x = self._conv("conv3", x, 1, out_filter / 4, out_filter,
-                           [1, 1, 1, 1])
-
-        with tf.variable_scope("sub_add"):
-            if in_filter != out_filter:
-                orig_x = self._conv("project", orig_x, 1, in_filter,
-                                    out_filter, stride)
-            x += orig_x
-
-        return x
-
-    def _decay(self):
-        """L2 weight decay loss."""
-        costs = []
-        for var in tf.trainable_variables():
-            if var.op.name.find(r"DW") > 0:
-                costs.append(tf.nn.l2_loss(var))
-
-        return tf.multiply(self.hps.weight_decay_rate, tf.add_n(costs))
-
-    def _conv(self, name, x, filter_size, in_filters, out_filters, strides):
-        """Convolution."""
-        with tf.variable_scope(name):
-            n = filter_size * filter_size * out_filters
-            kernel = tf.get_variable(
-                "DW", [filter_size, filter_size, in_filters, out_filters],
-                tf.float32,
-                initializer=tf.random_normal_initializer(
-                    stddev=np.sqrt(2.0 / n)))
-            return tf.nn.conv2d(x, kernel, strides, padding="SAME")
-
-    def _relu(self, x, leakiness=0.0):
-        """Relu, with optional leaky support."""
-        return tf.where(tf.less(x, 0.0), leakiness * x, x, name="leaky_relu")
-
-    def _fully_connected(self, x, out_dim):
-        """FullyConnected layer for final output."""
-        x = tf.reshape(x, [self.hps.batch_size, -1])
-        w = tf.get_variable(
-            "DW", [x.get_shape()[1], out_dim],
-            initializer=tf.uniform_unit_scaling_initializer(factor=1.0))
-        b = tf.get_variable(
-            "biases", [out_dim], initializer=tf.constant_initializer())
-        return tf.nn.xw_plus_b(x, w, b)
-
-    def _global_avg_pool(self, x):
-        assert x.get_shape().ndims == 4
-        return tf.reduce_mean(x, [1, 2])
@@ -11,6 +11,8 @@ try:
 except NameError:
    FileNotFoundError = IOError

+# This is not a top level item in the directory, so we use `../` to refer
+# to images located at the top level.
 GALLERY_TEMPLATE = """
 .. raw:: html

@@ -18,7 +20,7 @@ GALLERY_TEMPLATE = """

 .. only:: html

-    .. figure:: {thumbnail}
+    .. figure:: ../{thumbnail}

        {description}

@@ -71,12 +73,13 @@ class CustomGalleryItemDirective(Directive):
        if "figure" in self.options:
            env = self.state.document.settings.env
            rel_figname, figname = env.relfn2path(self.options["figure"])
-            thumbnail = os.path.join("_static/thumbs/",
-                                     os.path.basename(figname))

-            os.makedirs("_static/thumbs", exist_ok=True)
+            thumb_dir = os.path.join(env.srcdir, "_static/thumbs/")
+            os.makedirs(thumb_dir, exist_ok=True)
+            image_path = os.path.join(thumb_dir, os.path.basename(figname))
+            sphinx_gallery.gen_rst.scale_image(figname, image_path, 400, 280)

-            sphinx_gallery.gen_rst.scale_image(figname, thumbnail, 400, 280)
+            thumbnail = os.path.relpath(image_path, env.srcdir)
        else:
            thumbnail = "/_static/img/thumbnails/default.png"

@@ -263,7 +263,6 @@ Getting Involved
   auto_examples/plot_newsreader.rst
   auto_examples/plot_hyperparameter.rst
   auto_examples/plot_pong_example.rst
-   auto_examples/plot_resnet.rst
   auto_examples/plot_streaming.rst
   auto_examples/plot_parameter_server.rst
   auto_examples/plot_example-a3c.rst
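
Note on the `CustomGalleryItemDirective` hunk above: the thumbnail is now written under the Sphinx source directory, and the path passed to the template is computed relative to that directory, with `../` prepended by `GALLERY_TEMPLATE`. The following is a minimal sketch of only that path arithmetic, not the directive itself. The helper name `thumbnail_paths` and the `TEMPLATE` string are hypothetical, `srcdir` stands in for `env.srcdir`, and the `sphinx_gallery.gen_rst.scale_image` call is left out so the snippet runs without Sphinx.

```python
import os
import tempfile


def thumbnail_paths(srcdir, figname):
    """Return (absolute thumbnail path, thumbnail path relative to srcdir).

    Hypothetical helper that mirrors the path handling in the hunk;
    the real directive also scales the image into image_path.
    """
    thumb_dir = os.path.join(srcdir, "_static/thumbs/")
    os.makedirs(thumb_dir, exist_ok=True)
    image_path = os.path.join(thumb_dir, os.path.basename(figname))
    # The template receives a path relative to the source directory.
    thumbnail = os.path.relpath(image_path, srcdir)
    return image_path, thumbnail


# Assumed stand-in for the single template line changed in the diff.
TEMPLATE = ".. figure:: ../{thumbnail}"

with tempfile.TemporaryDirectory() as srcdir:
    figname = os.path.join(srcdir, "images", "a3c.png")
    image_path, thumbnail = thumbnail_paths(srcdir, figname)
    directive_line = TEMPLATE.format(thumbnail=thumbnail)
    print(directive_line)
```

Because the thumbnail directory is anchored at the source directory rather than the current working directory, the result should not depend on where the build is invoked from.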