wassname/ray

mirror of https://github.com/wassname/ray.git synced 2026-07-04 18:32:13 +08:00

Author	SHA1	Message	Date
butchcom	936bebef99	[rllib] Upgrade to OpenAI Gym 0.10.3 (#1601 )	2018-03-06 00:31:02 -08:00
Eric Liang	ecb811c26e	[rllib] Ape-X implementation and DQN refactor to handle replay in policy optimizer (#1604 ) * minimal apex checkin * cleanup dqn options * actor utils * Sun Feb 25 17:39:54 PST 2018 * update * compression refactor * fix * add test * fix models * Sun Feb 25 21:46:27 PST 2018 * Wed Feb 28 10:26:34 PST 2018 * Wed Feb 28 10:28:09 PST 2018 * Wed Feb 28 10:42:59 PST 2018 * refactor * Wed Feb 28 11:17:19 PST 2018 * Wed Feb 28 11:42:08 PST 2018 * Wed Feb 28 11:42:13 PST 2018 * Wed Feb 28 11:59:02 PST 2018 * Wed Feb 28 11:59:58 PST 2018 * Wed Feb 28 12:00:08 PST 2018 * Wed Feb 28 12:02:19 PST 2018 * Wed Feb 28 13:44:31 PST 2018 * Wed Feb 28 17:01:20 PST 2018 * Sat Mar 3 14:55:59 PST 2018 * make optimizer construction explicit * Sat Mar 3 18:23:08 PST 2018 * Sat Mar 3 18:24:28 PST 2018 * Sat Mar 3 18:49:28 PST 2018 * Sat Mar 3 18:50:42 PST 2018 * Sat Mar 3 18:56:10 PST 2018	2018-03-04 12:25:25 -08:00
Eric Liang	75293a0ba0	[rllib] Basic regression tests on CartPole (#1608 ) * Sun Feb 25 21:36:22 PST 2018 * Sun Feb 25 21:42:09 PST 2018 * Sun Feb 25 21:44:30 PST 2018 * fix lint * Wed Feb 28 12:41:49 PST 2018	2018-03-03 16:27:56 -08:00
Richard Liaw	b79597dc00	[rllib] PPO Thread Limit (#1568 )	2018-02-26 22:22:05 -08:00
Richard Liaw	c2ad800cbf	[rllib] Registry fix for DQN Replay Evaluators (#1593 )	2018-02-25 22:30:11 -08:00
Eric Liang	1b596f7d3b	[rllib] Rollout script needs to pipe in config and update states (#1566 ) * Mon Feb 19 15:20:09 PST 2018 * fix it actually	2018-02-20 12:04:41 -08:00
Richard Liaw	0f766ae24b	[rllib] Fix testGetFilters in A3C (#1557 )	2018-02-19 22:44:14 -08:00
Eric Liang	4a6cfee887	[rllib] add tuned example for pendulum (#1552 )	2018-02-18 00:46:42 -08:00
Robert Nishihara	61d8a17de0	[rllib] Change NotImplemented -> NotImplementedError. (#1535 )	2018-02-16 17:08:25 -08:00
Eric Liang	ca0f08d100	[tune] Recover experiments from last checkpoint (#1532 )	2018-02-12 14:01:19 -08:00
Eric Liang	7e998db656	[rllib] Reduce concat memory usage, allow object store memory to be specified in init (#1529 ) * c * stop agents * comment * Sat Feb 10 02:33:30 PST 2018 * Sat Feb 10 02:33:39 PST 2018 * Update sample_batch.py * Sun Feb 11 14:38:55 PST 2018 * add ppo config warn	2018-02-11 19:14:51 -08:00
eugenevinitsky	639df85fda	updated multiagent docs (#1523 ) * updated multiagent docs * Update rllib.rst * Update multiagent_mountaincar_env.py * Update multiagent_pendulum_env.py	2018-02-11 16:35:03 -08:00
alvkao58	81a4be8f65	[rllib] Added vanilla policy gradient (#1497 )	2018-02-10 13:54:51 -08:00
Eric Liang	0a9dbc84b5	Tue Feb 6 20:57:42 PST 2018 (#1521 ) The test failure was unrelated	2018-02-06 23:11:31 -08:00
the-sea	d0dd33e13c	not share registered objects between _Regitry objects (#1508 )	2018-02-03 15:03:52 -08:00
Eric Liang	b948405532	[tune] clean up population based training prototype (#1478 ) * patch up pbt * Sat Jan 27 01:00:03 PST 2018 * Sat Jan 27 01:04:14 PST 2018 * Sat Jan 27 01:04:21 PST 2018 * Sat Jan 27 01:15:15 PST 2018 * Sat Jan 27 01:15:42 PST 2018 * Sat Jan 27 01:16:14 PST 2018 * Sat Jan 27 01:38:42 PST 2018 * Sat Jan 27 01:39:21 PST 2018 * add pbt * Sat Jan 27 01:41:19 PST 2018 * Sat Jan 27 01:44:21 PST 2018 * Sat Jan 27 01:45:46 PST 2018 * Sat Jan 27 16:54:42 PST 2018 * Sat Jan 27 16:57:53 PST 2018 * clean up test * Sat Jan 27 18:01:15 PST 2018 * Sat Jan 27 18:02:54 PST 2018 * Sat Jan 27 18:11:18 PST 2018 * Sat Jan 27 18:11:55 PST 2018 * Sat Jan 27 18:14:09 PST 2018 * review * try out a ppo example * some tweaks to ppo example * add postprocess hook * Sun Jan 28 15:00:40 PST 2018 * clean up custom explore fn * Sun Jan 28 15:10:21 PST 2018 * Sun Jan 28 15:14:53 PST 2018 * Sun Jan 28 15:17:04 PST 2018 * Sun Jan 28 15:33:13 PST 2018 * Sun Jan 28 15:56:40 PST 2018 * Sun Jan 28 15:57:36 PST 2018 * Sun Jan 28 16:00:35 PST 2018 * Sun Jan 28 16:02:58 PST 2018 * Sun Jan 28 16:29:50 PST 2018 * Sun Jan 28 16:30:36 PST 2018 * Sun Jan 28 16:31:44 PST 2018 * improve tune doc * concepts * update humanoid * Fri Feb 2 18:03:33 PST 2018 * fix example * show error file	2018-02-02 23:03:12 -08:00
the-sea	a936468f99	[tune] using None as the parameter default value instead of mutable dict (#1501 ) * do not use dict as default parameter * Update trial.py	2018-02-02 21:47:51 -08:00
eugenevinitsky	369773d3e8	[rllib] minor bug fix to shared model, model wasnt actually shared due to new scope (#1503 )	2018-02-02 20:37:00 -08:00
Philipp Moritz	7550b628bf	fix indentation for ES (#1484 )	2018-01-31 17:22:38 -08:00
Eric Liang	35b1d6189b	[tune] save error msg, cleanup after object checkpoints	2018-01-29 18:48:45 -08:00
Kaahan	7aa979a024	[tune] Added Population Based Training (#1355 ) Adds a Population-Based Training (as described in https://arxiv.org/abs/1711.09846) scheduler to Ray.tune. Currently mutates hyperparameters according to either a user-defined list of possible values to mutate to (necessary if hyperparameters can only be certain values ex. sgd_batch_size), or by a factor of 0.8 or 1.2.	2018-01-25 21:38:37 -08:00
eugenevinitsky	0a01d3c71f	[rllib] Mountaincar fix (#1472 ) * Fix for gym version 0.9.5. * fixed bug in reshaper that was causing discrete spaces to fail	2018-01-25 13:58:35 -08:00
Robert Nishihara	f6c835e4b8	Fix for gym version 0.9.5. (#1471 )	2018-01-25 13:58:15 -08:00
Eric Liang	173f1d629a	[tune] Ray Tune API cleanup (#1454 ) Remove rllib dep: trainable is now a standalone abstract class that can be easily subclassed. Clean up hyperband: fix debug string and add an example. Remove YAML api / ScriptRunner: this was never really used. Move ray.init() out of run_experiments(): This provides greater flexibility and should be less confusing since there isn't an implicit init() done there. Note that this is a breaking API change for tune.	2018-01-24 16:55:17 -08:00
Eric Liang	1d2a28ab07	[rllib] test all combinations of {obs_space} x {action_space} (#1449 )	2018-01-24 11:03:43 -08:00
Roy Fox	4b0ef5eb2c	[rllib] Behavior Cloning (#1400 ) * Behavior Cloning * episode_reward_mean -> mean_loss * removing vestigial code * punctuation * unnecessary * Behavior Cloning * Behavior Cloning * Update __init__.py	2018-01-23 10:50:45 -08:00
Eric Liang	ee36effd8e	[rllib] Add n-step Q learning for DQN (#1439 ) * n-step * add sample adjustm * Oops * fix nstep * metric adjustment * Sat Jan 20 23:30:34 PST 2018 * Sun Jan 21 16:40:46 PST 2018 * Mon Jan 22 22:24:57 PST 2018	2018-01-23 10:31:19 -08:00
Eric Liang	424bd7f74d	[rllib] improve custom env docs (#1447 ) * env docs * add env * update env * Fri Jan 19 18:55:34 PST 2018	2018-01-19 21:36:18 -08:00
Eric Liang	e216766bbc	[rllib] Update docs with api and components overview figures (#1443 )	2018-01-19 10:08:45 -08:00
eugenevinitsky	37076a9ff8	Multiagent model using concatenated observations (#1416 ) * working multi action distribution and multiagent model * currently working but the splits arent done in the right place * added shared models * added categorical support and mountain car example * now compatible with generalized advantage estimation * working multiagent code with discrete and continuous example * moved reshaper to utils * code review changes made, ppo action placeholder moved to model catalog, all multiagent code moved out of fcnet * added examples in * added PEP8 compliance * examples are mostly pep8 compliant * removed all flake errors * added examples to jenkins tests * fixed custom options bug * added lines to let docker file find multiagent tests * shortened example run length * corrected nits * fixed flake errors	2018-01-18 19:51:31 -08:00
Peter Schafhalter	215d526e0d	Load evaluation configuration from checkpoint (#1392 )	2018-01-17 10:51:33 -08:00
Philipp Moritz	1290072764	[rllib] Expose PPO evaluator resource requirements (#1391 )	2018-01-11 11:09:01 -08:00
Eric Liang	c60ccbad46	[carla] [rllib] Add support for carla nav planner and scenarios from paper (#1382 ) * wip * Sat Dec 30 15:07:28 PST 2017 * log video * video doesn't work well * scenario integration * Sat Dec 30 17:30:22 PST 2017 * Sat Dec 30 17:31:05 PST 2017 * Sat Dec 30 17:31:32 PST 2017 * Sat Dec 30 17:32:16 PST 2017 * Sat Dec 30 17:34:11 PST 2017 * Sat Dec 30 17:34:50 PST 2017 * Sat Dec 30 17:35:34 PST 2017 * Sat Dec 30 17:38:49 PST 2017 * Sat Dec 30 17:40:39 PST 2017 * Sat Dec 30 17:43:00 PST 2017 * Sat Dec 30 17:43:04 PST 2017 * Sat Dec 30 17:45:56 PST 2017 * Sat Dec 30 17:46:26 PST 2017 * Sat Dec 30 17:47:02 PST 2017 * Sat Dec 30 17:51:53 PST 2017 * Sat Dec 30 17:52:54 PST 2017 * Sat Dec 30 17:56:43 PST 2017 * Sat Dec 30 18:27:07 PST 2017 * Sat Dec 30 18:27:52 PST 2017 * fix train * Sat Dec 30 18:41:51 PST 2017 * Sat Dec 30 18:54:11 PST 2017 * Sat Dec 30 18:56:22 PST 2017 * Sat Dec 30 19:05:04 PST 2017 * Sat Dec 30 19:05:23 PST 2017 * Sat Dec 30 19:11:53 PST 2017 * Sat Dec 30 19:14:31 PST 2017 * Sat Dec 30 19:16:20 PST 2017 * Sat Dec 30 19:18:05 PST 2017 * Sat Dec 30 19:18:45 PST 2017 * Sat Dec 30 19:22:44 PST 2017 * Sat Dec 30 19:24:41 PST 2017 * Sat Dec 30 19:26:57 PST 2017 * Sat Dec 30 19:40:37 PST 2017 * wip models * reward bonus * test prep * Sun Dec 31 18:45:25 PST 2017 * Sun Dec 31 18:58:28 PST 2017 * Sun Dec 31 18:59:34 PST 2017 * Sun Dec 31 19:03:33 PST 2017 * Sun Dec 31 19:05:05 PST 2017 * Sun Dec 31 19:09:25 PST 2017 * fix train * kill * add tuple preprocessor * Sun Dec 31 20:38:33 PST 2017 * Sun Dec 31 22:51:24 PST 2017 * Sun Dec 31 23:14:13 PST 2017 * Sun Dec 31 23:16:04 PST 2017 * Mon Jan 1 00:08:35 PST 2018 * Mon Jan 1 00:10:48 PST 2018 * Mon Jan 1 01:08:31 PST 2018 * Mon Jan 1 14:45:44 PST 2018 * Mon Jan 1 14:54:56 PST 2018 * Mon Jan 1 17:29:29 PST 2018 * switch to euclidean dists * Mon Jan 1 17:39:27 PST 2018 * Mon Jan 1 17:41:47 PST 2018 * Mon Jan 1 17:44:18 PST 2018 * Mon Jan 1 17:47:09 PST 2018 * Mon Jan 1 20:31:02 PST 2018 * Mon Jan 1 20:39:33 PST 2018 * Mon Jan 1 20:40:55 PST 2018 * Mon Jan 1 20:55:06 PST 2018 * Mon Jan 1 21:05:52 PST 2018 * fix env path * merge richards fix * fix hash * Mon Jan 1 22:04:00 PST 2018 * Mon Jan 1 22:25:29 PST 2018 * Mon Jan 1 22:30:42 PST 2018 * simplified reward function * add framestack * add env configs * simplify speed reward * Tue Jan 2 17:36:15 PST 2018 * Tue Jan 2 17:49:16 PST 2018 * Tue Jan 2 18:10:38 PST 2018 * add lane keeping simple mode * Tue Jan 2 20:25:26 PST 2018 * Tue Jan 2 20:30:30 PST 2018 * Tue Jan 2 20:33:26 PST 2018 * Tue Jan 2 20:41:42 PST 2018 * ppo lane keep * simplify discrete actions * Tue Jan 2 21:41:05 PST 2018 * Tue Jan 2 21:49:03 PST 2018 * Tue Jan 2 22:12:23 PST 2018 * Tue Jan 2 22:14:42 PST 2018 * Tue Jan 2 22:20:59 PST 2018 * Tue Jan 2 22:23:43 PST 2018 * Tue Jan 2 22:26:27 PST 2018 * Tue Jan 2 22:27:20 PST 2018 * Tue Jan 2 22:44:00 PST 2018 * Tue Jan 2 22:57:58 PST 2018 * Tue Jan 2 23:08:51 PST 2018 * Tue Jan 2 23:11:32 PST 2018 * update dqn reward * Thu Jan 4 12:29:40 PST 2018 * Thu Jan 4 12:30:26 PST 2018 * Update train_dqn.py * fix	2018-01-05 21:32:41 -08:00
Eric Liang	6e6674a824	[rllib] Split docs into user and development guide (#1377 ) * docs * Update README.rst * Sat Dec 30 15:23:49 PST 2017 * comments * Sun Dec 31 23:33:30 PST 2017 * Sun Dec 31 23:33:38 PST 2017 * Sun Dec 31 23:37:46 PST 2017 * Sun Dec 31 23:39:28 PST 2017 * Sun Dec 31 23:43:05 PST 2017 * Sun Dec 31 23:51:55 PST 2017 * Sun Dec 31 23:52:51 PST 2017	2018-01-01 11:10:44 -08:00
Richard Liaw	3304099cc4	[rllib] Evaluators and Optimizers Refactoring (#1339 )	2017-12-30 00:24:54 -08:00
Eric Liang	22c7c87e14	[rllib] [tune] Custom preprocessors and models, various fixes (#1372 )	2017-12-28 13:19:04 -08:00
Richard Liaw	4bb5b6bd5b	[rllib] A3C Configurations (#1370 ) * initial introduction of a3c configs * fix sample batch * flake but need to check save * save,resotre * fix * pickles * entropy * fix * moving ppo * results * jenkins	2017-12-24 12:25:13 -08:00
Richard Liaw	b217a5ef14	[rllib] Fix Pong-PPO tuned example Config (#1369 )	2017-12-23 01:36:33 -08:00
Cathy Wu	772527caa4	[rllib] Support 1-dimensional action spaces (PPO) (#1347 ) * Small fix for supporting custom preprocessors * PEP8 * Remove squeeze from actions	2017-12-19 14:17:06 -08:00
Eric Liang	6724f57b03	[Examples] Add Carla test env (#1343 ) * add carla example * add reward * set obs * Sun Dec 17 16:06:00 PST 2017 * add spec * fix measurement * add train script * resize to 80x80 * null * initial small training run * robustify env, clean up action space * clean up vars * switch to town2 which is faster * tunify train.py * add discrete mode * update * fix excessive brakinG * fix the weather * rename * redirect output and from future import * doc * update * fix rebase * allow dqn gpu growht * adjust dqn hyperparams * better ppo parameters	2017-12-19 12:57:58 -08:00
Eric Liang	47b1f02d3e	[rllib] Pull out multi-gpu optimizer as a generic class (#1313 )	2017-12-17 15:59:57 -08:00
Cathy Wu	53e736fe01	[rllib] Small fix for supporting custom preprocessors (#1334 ) * Small fix for supporting custom preprocessors * PEP8 * fix test	2017-12-17 04:37:29 -08:00
Eric Liang	fbf1806b8a	[tune] Clean up result logging: move out of /tmp, add timestamp (#1297 )	2017-12-15 14:19:08 -08:00
Richard Liaw	c5c83a4465	[rllib] PPO and A3C unification (#1253 )	2017-12-14 01:08:23 -08:00
Richard Liaw	cabbd27c56	[rllib] Support Nested Configuration Merging (#1268 )	2017-12-13 14:39:01 -08:00
Peter Schafhalter	20d6b74aa6	[rllib] Added evaluation script to RLLib (#1295 )	2017-12-11 11:59:44 -08:00
Zongheng Yang	7e4a28f933	[rllib] Add tuned_examples/pong-ppo.yaml (#1302 ) * Add tuned_examples/pong-ppo.yaml: 21 rew in ~3380sec * Header comments	2017-12-09 01:20:22 -08:00
Richard Liaw	2e0eb0e4c7	[rllib] Adding dependencies (#1298 )	2017-12-08 01:57:19 -08:00
Philipp Moritz	26125e1547	Fixing the jenkins tests (#1299 ) * trying to fix jenkins tests * comment out more tests * remove pytorch stuff * use non-monotonic clock (monotonic not supported on python 2.7) * whitespace	2017-12-07 17:03:58 -08:00
Eric Liang	35f7398666	[rllib] Update RLlib docs and README (#1288 ) Updates the rllib docs and README.	2017-12-06 18:17:51 -08:00

124 Commits