Commit Graph

19 Commits

Author SHA1 Message Date
Alok Singh c0e4c9d3d1 [rllib] Add magic methods for rollouts (#2024) 2018-05-16 22:59:46 -07:00
alexbao 68bec0f6fb [rllib] Queue lib for python 2.7 (#2057)
* Queue lib for python 2.7

* use six.moves.queue instead
2018-05-15 15:27:52 -07:00
Eric Liang 47bc4c3009 [rllib] Add DDPG documentation, rename DDPG2 <=> DDPG (#1946)
* updates

* updates

* updates

* updates

* updates

* updates

* Update rllib.rst

* Update policy-optimizers.rst
2018-04-30 00:18:15 -07:00
Roy Fox baf97e450b [rllib] arr[end] was excluded when end is not None (#1931)
Looks good, thanks!
2018-04-22 15:12:55 -07:00
Eric Liang 7ab890f4a1 [tune] [rllib] Automatically determine RLlib resources and add queueing mechanism for autoscaling (#1848) 2018-04-16 16:58:15 -07:00
alvkao58 15a668dd12 [RLLib] DDPG (#1685) 2018-04-11 15:08:39 -07:00
Eric Liang faaa123046 [rllib] Set num_cpu=None for workers in the default settings (#1793) 2018-03-29 16:33:40 -07:00
Eric Liang b41bdcefa0 [rllib] Update RLlib to work with new actor scheduling behavior (#1754)
* Mon Mar 19 21:23:01 PDT 2018

* Mon Mar 19 21:23:07 PDT 2018

* Mon Mar 19 21:30:49 PDT 2018

* Mon Mar 19 21:32:05 PDT 2018

* Mon Mar 19 21:35:43 PDT 2018

* fix cpu limits

* Mon Mar 19 22:25:07 PDT 2018
2018-03-20 19:29:52 -07:00
Eric Liang 882a649f0c [rllib] [docs] Cleanup RLlib API and make docs consistent with upcoming blog post (#1708)
* wip

* more work

* fix apex

* docs

* apex doc

* pool comment

* clean up

* make wrap stack pluggable

* Mon Mar 12 21:45:50 PDT 2018

* clean up comment

* table

* Mon Mar 12 22:51:57 PDT 2018

* Mon Mar 12 22:53:05 PDT 2018

* Mon Mar 12 22:55:03 PDT 2018

* Mon Mar 12 22:56:18 PDT 2018

* Mon Mar 12 22:59:54 PDT 2018

* Update apex_optimizer.py

* Update index.rst

* Update README.rst

* Update README.rst

* comments

* Wed Mar 14 19:01:02 PDT 2018
2018-03-15 15:57:31 -07:00
Eric Liang 076936a7f5 [rllib] Switch DQN to using deepmind wrappers (#1655)
* deepmind wrap

* use 80x80

* respect custom prep

* fix replay size

* fix check

* batch idx

* Wed Mar  7 11:00:39 PST 2018

* random starts and reward clipping

* Fri Mar  9 17:27:17 PST 2018

* Fri Mar  9 17:36:15 PST 2018

* Sat Mar 10 19:47:10 PST 2018

* Sat Mar 10 19:47:37 PST 2018

* Sat Mar 10 20:05:12 PST 2018

* Sat Mar 10 20:54:21 PST 2018

* Sat Mar 10 21:03:52 PST 2018
2018-03-11 21:14:38 -07:00
Eric Liang 75e825177f [rllib] Move Ape-X metrics behind a debug flag and remove some of them (#1656) 2018-03-08 00:48:49 -08:00
Eric Liang ecb811c26e [rllib] Ape-X implementation and DQN refactor to handle replay in policy optimizer (#1604)
* minimal apex checkin

* cleanup dqn options

* actor utils

* Sun Feb 25 17:39:54 PST 2018

* update

* compression refactor

* fix

* add test

* fix models

* Sun Feb 25 21:46:27 PST 2018

* Wed Feb 28 10:26:34 PST 2018

* Wed Feb 28 10:28:09 PST 2018

* Wed Feb 28 10:42:59 PST 2018

* refactor

* Wed Feb 28 11:17:19 PST 2018

* Wed Feb 28 11:42:08 PST 2018

* Wed Feb 28 11:42:13 PST 2018

* Wed Feb 28 11:59:02 PST 2018

* Wed Feb 28 11:59:58 PST 2018

* Wed Feb 28 12:00:08 PST 2018

* Wed Feb 28 12:02:19 PST 2018

* Wed Feb 28 13:44:31 PST 2018

* Wed Feb 28 17:01:20 PST 2018

* Sat Mar  3 14:55:59 PST 2018

* make optimizer construction explicit

* Sat Mar  3 18:23:08 PST 2018

* Sat Mar  3 18:24:28 PST 2018

* Sat Mar  3 18:49:28 PST 2018

* Sat Mar  3 18:50:42 PST 2018

* Sat Mar  3 18:56:10 PST 2018
2018-03-04 12:25:25 -08:00
Eric Liang 7e998db656 [rllib] Reduce concat memory usage, allow object store memory to be specified in init (#1529)
* c

* stop agents

* comment

* Sat Feb 10 02:33:30 PST 2018

* Sat Feb 10 02:33:39 PST 2018

* Update sample_batch.py

* Sun Feb 11 14:38:55 PST 2018

* add ppo config warn
2018-02-11 19:14:51 -08:00
Roy Fox 4b0ef5eb2c [rllib] Behavior Cloning (#1400)
* Behavior Cloning

* episode_reward_mean -> mean_loss

* removing vestigial code

* punctuation

* unnecessary

* Behavior Cloning

* Behavior Cloning

* Update __init__.py
2018-01-23 10:50:45 -08:00
Richard Liaw 3304099cc4 [rllib] Evaluators and Optimizers Refactoring (#1339) 2017-12-30 00:24:54 -08:00
Eric Liang 22c7c87e14 [rllib] [tune] Custom preprocessors and models, various fixes (#1372) 2017-12-28 13:19:04 -08:00
Richard Liaw 4bb5b6bd5b [rllib] A3C Configurations (#1370)
* initial introduction of a3c configs

* fix sample batch

* flake but need to check save

* save, restore

* fix

* pickles

* entropy

* fix

* moving ppo

* results

* jenkins
2017-12-24 12:25:13 -08:00
Eric Liang 47b1f02d3e [rllib] Pull out multi-gpu optimizer as a generic class (#1313) 2017-12-17 15:59:57 -08:00
Eric Liang 2d543b6e19 [rllib] Refactor DQN to use an Evaluator abstraction (#1276)
This introduces rllib.Evaluator and rllib.Optimizer classes. Optimizers encapsulate a particular distributed optimization strategy for RL. Evaluators encapsulate the model graph, and once implemented, any Optimizer may be "plugged in" to any algorithm that implements the Evaluator interface.
2017-12-06 17:51:57 -08:00
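
The Evaluator/Optimizer split described in the last entry above (#1276) can be illustrated with a small, self-contained sketch. This is not the RLlib code from that commit: the class names `ToyEvaluator`, `LocalOptimizer`, and `SyncAveragingOptimizer`, the method names, and the one-parameter linear model are all hypothetical, chosen only to show how an optimization strategy might be swapped without touching the object that owns the model.

```python
class ToyEvaluator:
    """Hypothetical evaluator: owns the 'model' (one weight w for y = w * x)
    and knows how to sample data and compute/apply gradients on it."""

    def __init__(self, data, w=0.0):
        self.data = list(data)  # list of (x, y) pairs
        self.w = w

    def sample(self):
        # A real evaluator would roll out a policy; here we return fixed data.
        return self.data

    def compute_gradients(self, batch):
        # Gradient of the mean squared error with respect to w.
        return sum(2.0 * (self.w * x - y) * x for x, y in batch) / len(batch)

    def apply_gradients(self, grad, lr):
        self.w -= lr * grad

    def get_weights(self):
        return self.w

    def set_weights(self, w):
        self.w = w


class LocalOptimizer:
    """Hypothetical strategy 1: sample and update on the local evaluator only."""

    def __init__(self, local, remotes=(), lr=0.05):
        self.local, self.lr = local, lr

    def step(self):
        grad = self.local.compute_gradients(self.local.sample())
        self.local.apply_gradients(grad, self.lr)


class SyncAveragingOptimizer:
    """Hypothetical strategy 2: broadcast weights to the remote evaluators,
    average their gradients, and apply the average to the local evaluator."""

    def __init__(self, local, remotes, lr=0.05):
        self.local, self.remotes, self.lr = local, list(remotes), lr

    def step(self):
        weights = self.local.get_weights()
        for ev in self.remotes:
            ev.set_weights(weights)
        grads = [ev.compute_gradients(ev.sample()) for ev in self.remotes]
        self.local.apply_gradients(sum(grads) / len(grads), self.lr)


def train(optimizer_cls, steps):
    """Run either optimizer against the same evaluator interface."""
    local = ToyEvaluator([(1.0, 3.0)])
    remotes = [ToyEvaluator([(1.0, 3.0), (2.0, 6.0)]), ToyEvaluator([(3.0, 9.0)])]
    optimizer = optimizer_cls(local, remotes)
    for _ in range(steps):
        optimizer.step()
    return local.get_weights()
```

Calling `train(LocalOptimizer, 200)` and `train(SyncAveragingOptimizer, 50)` both drive the weight toward 3, the slope of the toy data. Only the optimizer class changes between the two calls, which is the "plug in" property the commit message describes.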