ray/doc/source
Eric Liang 5f430da180 [rllib] Provide internal access to episode state in compute_actions() and allow returning extra batches (#2559)
The goal of this PR is to allow custom policies to perform model-based rollouts. In the multi-agent setting, this requires access not only to the policies of other agents but also to their current observations.
For efficiency, you might also want to return the model-based trajectories as part of the rollout.

  * compute_actions() now takes a new keyword arg, episodes
  * Pull out the internal episode class into a top-level file
  * Add a function to return extra trajectories from an episode; these are appended to the sample batch
  * Add documentation
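
The changes above can be illustrated with a small self-contained sketch. It does not use RLlib: ToyEpisode, its last_obs and add_extra_batch members, ModelBasedPolicy, and the toy dynamics model are all hypothetical stand-ins, and only the method name compute_actions() and the episodes keyword come from this PR. The real signatures in RLlib may differ.

```python
from typing import Dict, List, Optional, Tuple


class ToyEpisode:
    """Hypothetical stand-in for the episode object handed to a policy."""

    def __init__(self, last_obs: Dict[str, float]) -> None:
        # Most recent observation of every agent, keyed by agent id (assumed layout).
        self.last_obs = last_obs
        # Extra trajectories that a sampler would append to the sample batch.
        self.extra_batches: List[Dict[str, list]] = []

    def add_extra_batch(self, batch: Dict[str, list]) -> None:
        self.extra_batches.append(batch)


class ModelBasedPolicy:
    """Hypothetical policy that reads other agents' observations from the episode."""

    def __init__(self, agent_id: str, horizon: int = 2) -> None:
        self.agent_id = agent_id
        self.horizon = horizon

    def compute_actions(
        self,
        obs_batch: List[float],
        state_batches: Optional[list] = None,
        episodes: Optional[List[ToyEpisode]] = None,
    ) -> Tuple[List[int], list, dict]:
        actions = [0] * len(obs_batch)
        if episodes is None:
            # Without episode state, fall back to a default action.
            return actions, [], {}
        for i, (obs, episode) in enumerate(zip(obs_batch, episodes)):
            # Current observations of the other agents in this episode.
            others = [o for aid, o in episode.last_obs.items() if aid != self.agent_id]
            action = 1 if sum(others) > obs else 0
            actions[i] = action
            # Toy model-based rollout: repeat the action, obs grows by `action` per step.
            imagined_obs = [obs + action * t for t in range(1, self.horizon + 1)]
            # Return the imagined trajectory alongside the real rollout.
            episode.add_extra_batch(
                {"obs": imagined_obs, "actions": [action] * self.horizon}
            )
        return actions, [], {}
```

In this sketch the caller passes one episode per entry of obs_batch, and whatever the policy adds via add_extra_batch() is what the sampler would append to the sample batch.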
2018-08-16 14:37:21 -07:00