wassname/ray

mirror of https://github.com/wassname/ray.git synced 2026-06-29 17:37:07 +08:00

Author	SHA1	Message	Date
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
Eric Liang	955154a19d	Reduce Ray / RLlib startup messages (#5368)	2019-08-05 13:23:54 -07:00
Kristian Hartikainen	13fb9fe3db	[rllib] Feature/soft actor critic v2 (#5328) * Add base for Soft Actor-Critic * Pick changes from old SAC branch * Update sac.py * First implementation of sac model * Remove unnecessary SAC imports * Prune unnecessary noise and exploration code * Implement SAC model and use that in SAC policy * runs but doesn't learn * clear state * fix batch size * Add missing alpha grads and vars * -200 by 2k timesteps * doc * lazy squash * one file * ignore tfp * revert done	2019-08-01 23:37:36 -07:00
Eric Liang	20450a4e82	[rllib] Add rock paper scissors multi-agent example (#5336)	2019-08-01 13:03:59 -07:00
jichan3751	bd6dfc994f	[sgd] Replaced class Resources in sgd with `use_gpu` (#5252)	2019-08-01 01:03:10 -07:00
Jaroslaw Rzepecki	b3c8091a35	Fix Tuple spaces in rollout.py (#5332) Make sure that the initial action is also properly flattened.	2019-07-31 11:38:49 -07:00
Michael Luo	1337c98f02	[rllib] Importance Sampling and KL Loss for APPO (#5051)	2019-07-29 15:02:32 -07:00
Eric Liang	3bdd114282	[rllib] Better example rnn envs (#5300)	2019-07-28 14:07:18 -07:00
Eric Liang	a62c5f40f6	[rllib] Document ModelV2 and clean up the models/ directory (#5277)	2019-07-27 02:08:16 -07:00
Richard Liaw	5e15b36d6e	[tune] experiment_analysis split to Analysis (#5115)	2019-07-27 01:10:52 -07:00
Antoine Galataud	827618254a	[rllib] Configure learner queue timeout (#5270) * configure learner queue timeout * lint * use config * fix method args order, add unit test * fix wrong param name	2019-07-25 21:18:05 -07:00
Eric Liang	bf9199ad77	[rllib] ModelV2 support for pytorch (#5249)	2019-07-25 11:02:53 -07:00
Eric Liang	60f59639c1	[rllib] Port DDPG to the build_tf_policy pattern (#5242)	2019-07-24 13:55:55 -07:00
Eric Liang	690b374581	[rllib] Add Keras LSTM example with ModelV2 (#5258)	2019-07-24 13:09:41 -07:00
Eric Liang	97c43284a6	[rllib] Fix trainer state restore (#5257)	2019-07-23 21:18:58 -07:00
Eric Liang	f9043cc49a	[rllib] Remove experimental eager support	2019-07-21 12:27:17 -07:00
Eric Liang	d58b986858	[rllib] MultiCategorical shouldn't return array for kl or entropy (#5215) * wip * fix	2019-07-19 12:12:04 -07:00
Jones Wong	da7676c925	Removed the implicit sync barrier at the end of each training iteration (#5217) * removed sync barrier at the end of each training iteration * formatted * modify the comment according to current semantics * lint check * Update trainer.py	2019-07-18 22:59:52 -07:00
Eric Liang	28e5c5555d	[rllib] Move some inline defs to avoid deserialization errors (#5228) * fix bug * move metrics too	2019-07-18 21:01:16 -07:00
Jones Wong	0af07bd493	Enable seeding actors for reproducible experiments (#5197) * enable graph-level worker-specific seed * lint checked * revised according to eric's suggestions * revised accordingly and added a test case * formated * Update test_reproducibility.py * Update trainer.py * Update rollout_worker.py * Update run_rllib_tests.sh * Update worker_set.py	2019-07-17 23:31:34 -07:00
Jones Wong	81d297f87e	Remove redundant scaler of l2 reg (#5172) * remove redundant scaler of l2 reg * lint formatted * Update ddpg_policy.py	2019-07-17 15:11:27 -07:00
Jones Wong	ae03c42dd6	Fixed inconsistent action placeholder (#5213)	2019-07-17 10:55:14 -07:00
Sam Toyer	214f09d969	[rllib] Make RLLib handle zero-length observation arrays (#5208) * [rllib] Make _summarize handle zero-len arrays Fixes #5207 * [rllib] Make aligned_array() handle empty arrays * [rllib] Conform with old yapf	2019-07-16 22:37:57 -07:00
Eric Liang	4fa2a6006c	[rllib] Remove nested import (#5204) * remove nested import * Update metrics.py	2019-07-16 10:52:56 -07:00
Eric Liang	047f4ccd61	[rllib] Fix rollout.py with tuple action space (#5201) * fix it * update doc too * fix rollout	2019-07-16 10:52:35 -07:00
Jones Wong	5b13a7eb90	Keep parameter space noise consistent with action space noise (Fix 5173) (#5193) * make parameter space noise consistent with action space noise * modified according to lint check * indent	2019-07-14 12:20:35 -07:00
Richard Liaw	691c9733f9	[tune] Document trainable attributes and enable user-checkpoint… (#4868)	2019-07-10 18:51:11 -07:00
Eric Liang	5ab5017c67	[rllib] Fix impala stress test (#5101) * add copy * upgrade to tf 1.14 * update * reduce count to workaround https://github.com/ray-project/ray/issues/5125 * Update impala.py * placeholder * comments * update	2019-07-09 20:22:30 -07:00
Stefan Pantic	dfc94ce7bc	[rllib]Add entropy coeff decay (#5043)	2019-07-08 18:30:32 -07:00
Eric Liang	893744b3be	[rllib] Revert "use make template" which seems to break DQN/Atari (#5134) * Revert "use make template" This reverts commit 291e9e0031c6e315fe24e5b4973dea375fe73918. * debug vars	2019-07-07 19:51:26 -07:00
Eric Liang	932d6b2517	[rllib] Port IMPALA to ModelV2/build_tf_policy (#5130) * port vtrace * fix vf * fix vs * fix the example * wip ddpg * fix tests * fix tests * remove ddpg model * comments * set vf share layers True by default * typo * fix test	2019-07-07 15:06:41 -07:00
Eric Liang	445bcb29b0	[hotfix] fix backward compat with older yaml libraries	2019-07-06 20:41:28 -07:00
Eric Liang	c15ed3ac55	[rllib] Shuffle RNN sequences in PPO as well (#5129) * shuffle seq * fix test	2019-07-06 20:40:49 -07:00
Brandon Bertelsen	c04b69902c	Updates for #5072 (#5091)	2019-07-06 16:05:50 -07:00
Aleksei Petrenko	09bde397c9	Multiagent experiment resume (#5102) * Fixed problem with multiagent experiment resume * Applied format script * fix lint	2019-07-06 11:38:17 -07:00
Dušan Josipović	e9b88dcbed	[wingman -> tune] Add system performance tracking (#4924)	2019-07-06 00:57:35 -07:00
Eric Liang	34d054ff19	[rllib] ModelV2 API (#4926)	2019-07-03 15:59:47 -07:00
Kristian Hartikainen	9e0192bc0b	[tune] Change the log syncing behavior (#4450) * Change the log syncing behavior * fix up abstractions for syncer * Finished checkpoint syncing * Code * Set of changes to get things running * Fixes for log syncing * Fix parts * Lint and other fixes * fix some test * Remove extra parsing functionality * some test fixes * Fix up cloud syncing * Another thing to do * Fix up tests and local sync Changes LogSync into a mixin, and adds tests for different functionalities. * Fix up tests, start on local migration * fix distributed migrations * comments * formatting * Better checkpoint directory handling * fix tests * fix tests * fix click * comments * formatting comments * formatting and comments * sync function deprecations * syncfunction * Add documentation for Syncing and Uploading * nit * BaseSyncer as base for Mixin in edge case * more docs * clean up assertions * validate * nit * Update test_cluster.py * betterdoc * Update tune-usage.rst * cleanup * nit	2019-07-02 20:46:00 -07:00
Philipp Moritz	bbe3e5b4ed	[rllib] Give error if sample_async is used with pytorch for A3C (#5000) * give error if sample_async is used with pytorch * update * Update a3c.py	2019-06-25 22:06:35 -07:00
Eric Liang	aa5fc52e32	[rllib] Add QMIX mixer parameters to optimizer param list (#5014) * add mixer params * Update qmix_policy.py	2019-06-25 19:02:40 -07:00
Eric Liang	1d17125333	temp fix for build (#5006)	2019-06-20 18:07:44 -07:00
Eric Liang	fa1d4c9807	[rllib] Fix DDPG example (#4973)	2019-06-13 15:07:46 -07:00
Eric Liang	4f8e100fe0	fix (#4950)	2019-06-10 10:20:55 +08:00
Eric Liang	77689d1116	[rllib] Port remainder of algorithms to build_trainer() pattern (#4920)	2019-06-07 16:45:36 -07:00
Eric Liang	9e328fbe6f	[rllib] Add docs on how to use TF eager execution (#4927)	2019-06-07 16:42:37 -07:00
Eric Liang	7501ee51db	[rllib] Rename PolicyEvaluator => RolloutWorker (#4820)	2019-06-03 06:49:24 +08:00
Eric Liang	99eae05cf6	[tune] Disallow setting resources_per_trial when it is already configured (#4880) * disallow it * import fix * fix example * fix test * fix tests * Update mock.py * fix * make less convoluted * fix tests	2019-06-03 06:47:39 +08:00
Eric Liang	665d081fe9	[rllib] Rough port of DQN to build_tf_policy() pattern (#4823)	2019-06-02 14:14:31 +08:00
Eric Liang	9aa1cd613d	[rllib] Allow Torch policies access to full action input dict in extra_action_out_fn (#4894) * fix torch extra out * preserve setitem * fix docs	2019-06-01 16:58:49 +08:00
Eric Liang	1c073e92e4	[rllib] Fix documentation on custom policies (#4910) * wip * add docs * lint * todo sections * fix doc	2019-06-01 16:13:21 +08:00

496 Commits