Eric Liang
5f430da180
[rllib] Provide internal access to episode state in compute_actions() and allow returning extra batches (#2559)
The goal of this PR is to allow custom policies to perform model-based rollouts. In the multi-agent setting, this requires access to not only policies of other agents, but also their current observations.
Also, you might want to return the model-based trajectories as part of the rollout for efficiency.
compute_actions() now takes a new keyword argument, episodes.
Pull the internal episode class out into a top-level file.
Add a function to return extra trajectories from an episode; these are appended to the sample batch.
Document the changes.
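The idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the RLlib implementation: the `Episode` and `ModelBasedPolicy` classes, the method names `last_observation_for` and `add_extra_batch`, and the return shape of `compute_actions()` are all assumptions made for the example. Only the `episodes` keyword argument comes from the commit message.

```python
from typing import Any, Dict, List, Optional, Tuple


class Episode:
    """Hypothetical stand-in for the internal episode state (not the RLlib class)."""

    def __init__(self, last_obs: Dict[str, float]) -> None:
        self.last_obs = last_obs              # agent id -> latest observation
        self.extra_batches: List[dict] = []   # extra trajectories to append later

    def last_observation_for(self, agent_id: str) -> float:
        # Assumed accessor: lets one agent's policy read another agent's observation.
        return self.last_obs[agent_id]

    def add_extra_batch(self, batch: dict) -> None:
        # Assumed hook: queue a model-based trajectory for the sample batch.
        self.extra_batches.append(batch)


class ModelBasedPolicy:
    """Toy policy that inspects another agent's observation via the episode."""

    def __init__(self, other_agent_id: str) -> None:
        self.other_agent_id = other_agent_id

    def compute_actions(
        self,
        obs_batch: List[float],
        state_batches: List[Any],
        episodes: Optional[List[Episode]] = None,
    ) -> Tuple[List[int], List[Any], dict]:
        actions = []
        for i, obs in enumerate(obs_batch):
            action = 0  # default when no episode state is available
            if episodes is not None:
                episode = episodes[i]
                other_obs = episode.last_observation_for(self.other_agent_id)
                # Toy decision rule standing in for a real model-based rollout.
                action = int(other_obs > obs)
                # Return the imagined trajectory alongside the real rollout.
                episode.add_extra_batch({"obs": [obs], "actions": [action]})
            actions.append(action)
        return actions, [], {}


episode = Episode({"agent_0": 0.2, "agent_1": 0.9})
policy = ModelBasedPolicy(other_agent_id="agent_1")
actions, _, _ = policy.compute_actions([0.2], [], episodes=[episode])
```

Passing one episode per observation keeps the lookup a simple index into `episodes`. Whether the real sampler aligns the two lists this way is an assumption of this sketch.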
2018-08-16 14:37:21 -07:00