wassname/ray

mirror of https://github.com/wassname/ray.git synced 2026-07-04 01:41:28 +08:00

Author	SHA1	Message	Date
Eric Liang	445bcb29b0	[hotfix] fix backward compat with older yaml libraries	2019-07-06 20:41:28 -07:00
Eric Liang	c15ed3ac55	[rllib] Shuffle RNN sequences in PPO as well (#5129) * shuffle seq * fix test	2019-07-06 20:40:49 -07:00
Brandon Bertelsen	c04b69902c	Updates for #5072 (#5091)	2019-07-06 16:05:50 -07:00
Aleksei Petrenko	09bde397c9	Multiagent experiment resume (#5102) * Fixed problem with multiagent experiment resume * Applied format script * fix lint	2019-07-06 11:38:17 -07:00
Dušan Josipović	e9b88dcbed	[wingman -> tune] Add system performance tracking (#4924)	2019-07-06 00:57:35 -07:00
Eric Liang	34d054ff19	[rllib] ModelV2 API (#4926)	2019-07-03 15:59:47 -07:00
Kristian Hartikainen	9e0192bc0b	[tune] Change the log syncing behavior (#4450) * Change the log syncing behavior * fix up abstractions for syncer * Finished checkpoint syncing * Code * Set of changes to get things running * Fixes for log syncing * Fix parts * Lint and other fixes * fix some test * Remove extra parsing functionality * some test fixes * Fix up cloud syncing * Another thing to do * Fix up tests and local sync Changes LogSync into a mixin, and adds tests for different functionalities. * Fix up tests, start on local migration * fix distributed migrations * comments * formatting * Better checkpoint directory handling * fix tests * fix tests * fix click * comments * formatting comments * formatting and comments * sync function deprecations * syncfunction * Add documentation for Syncing and Uploading * nit * BaseSyncer as base for Mixin in edge case * more docs * clean up assertions * validate * nit * Update test_cluster.py * betterdoc * Update tune-usage.rst * cleanup * nit	2019-07-02 20:46:00 -07:00
Philipp Moritz	bbe3e5b4ed	[rllib] Give error if sample_async is used with pytorch for A3C (#5000) * give error if sample_async is used with pytorch * update * Update a3c.py	2019-06-25 22:06:35 -07:00
Eric Liang	aa5fc52e32	[rllib] Add QMIX mixer parameters to optimizer param list (#5014) * add mixer params * Update qmix_policy.py	2019-06-25 19:02:40 -07:00
Eric Liang	1d17125333	temp fix for build (#5006)	2019-06-20 18:07:44 -07:00
Eric Liang	fa1d4c9807	[rllib] Fix DDPG example (#4973)	2019-06-13 15:07:46 -07:00
Eric Liang	4f8e100fe0	fix (#4950)	2019-06-10 10:20:55 +08:00
Eric Liang	77689d1116	[rllib] Port remainder of algorithms to build_trainer() pattern (#4920)	2019-06-07 16:45:36 -07:00
Eric Liang	9e328fbe6f	[rllib] Add docs on how to use TF eager execution (#4927)	2019-06-07 16:42:37 -07:00
Eric Liang	7501ee51db	[rllib] Rename PolicyEvaluator => RolloutWorker (#4820)	2019-06-03 06:49:24 +08:00
Eric Liang	99eae05cf6	[tune] Disallow setting resources_per_trial when it is already configured (#4880) * disallow it * import fix * fix example * fix test * fix tests * Update mock.py * fix * make less convoluted * fix tests	2019-06-03 06:47:39 +08:00
Eric Liang	665d081fe9	[rllib] Rough port of DQN to build_tf_policy() pattern (#4823)	2019-06-02 14:14:31 +08:00
Eric Liang	9aa1cd613d	[rllib] Allow Torch policies access to full action input dict in extra_action_out_fn (#4894) * fix torch extra out * preserve setitem * fix docs	2019-06-01 16:58:49 +08:00
Eric Liang	1c073e92e4	[rllib] Fix documentation on custom policies (#4910) * wip * add docs * lint * todo sections * fix doc	2019-06-01 16:13:21 +08:00
Eric Liang	3f4d37cd0e	[rllib] Fix Multidiscrete support (#4869)	2019-05-29 20:41:02 -07:00
Eric Liang	2dd0beb5bd	[rllib] Allow access to batches prior to postprocessing (#4871)	2019-05-29 18:17:14 -07:00
Philipp Moritz	64eb7b322c	Upgrade arrow to latest master (#4858)	2019-05-28 16:04:16 -07:00
Eric Liang	d7be5a5d36	[rllib] Fix error getting kl when simple_optimizer: True in multi-agent PPO	2019-05-27 17:24:45 -07:00
Eric Liang	a45c61e19b	[rllib] Update concepts docs and add "Building Policies in Torch/TensorFlow" section (#4821) * wip * fix index * fix bugs * todo * add imports * note on get ph * note on get ph * rename to building custom algs * add rnn state info	2019-05-27 14:17:32 -07:00
Eric Liang	7237ea70c4	[rllib] [RFC] Deprecate Python 2 / RLlib (#4832)	2019-05-25 10:45:26 -07:00
Eric Liang	02583a8598	[rllib] Rename PolicyGraph => Policy, move from evaluation/ to policy/ (#4819) This implements some of the renames proposed in #4813 We leave behind backwards-compatibility aliases for *PolicyGraph and SampleBatch.	2019-05-20 16:46:05 -07:00
Eric Liang	6cb5b90bd6	[rllib] [RFC] Dynamic definition of loss functions and modularization support (#4795) * dynamic graph * wip * clean up * fix * document trainer * wip * initialize the graph using a fake batch * clean up dynamic init * wip * spelling * use builder for ppo pol graph * add ppo graph * fix naming * order * docs * set class name correctly * add torch builder * add custom model support in builder * cleanup * remove underscores * fix py2 compat * Update dynamic_tf_policy_graph.py * Update tracking_dict.py * wip * rename * debug level * rename policy_graph -> policy in new classes * fix test * rename ppo tf policy * port appo too * forgot grads * default policy optimizer * make default config optional * add config to optimizer * use lr by default in optimizer * update * comments * remove optimizer * fix tuple actions support in dynamic tf graph	2019-05-18 00:23:11 -07:00
Eric Liang	3807fb505b	[rllib] TensorFlow 2 compatibility (#4802)	2019-05-16 22:12:07 -07:00
Eric Liang	7d5ef6d99c	[rllib] Support continuous action distributions in IMPALA/APPO (#4771)	2019-05-16 22:05:07 -07:00
Jones Wong	c5161a2c4d	[rllib] fix clip by value issue as TF upgraded (#4697) * fix clip_by_value issue * fix typo	2019-05-13 15:39:25 -07:00
Eric Liang	69352e3302	[rllib] Implement learn_on_batch() in torch policy graph	2019-05-12 21:29:58 -07:00
Eric Liang	351753aae5	[rllib] Remove dependency on TensorFlow (#4764) * remove hard tf dep * add test * comment fix * fix test	2019-05-10 20:36:18 -07:00
Jacob Beck	28496c8b50	[rllib] Qmix padding patch (#4735) * Qmix padding patch * Update qmix_policy_graph.py * lint errors * more linting * Update qmix_policy_graph.py	2019-05-08 14:07:29 -07:00
Eric Liang	71b2dec3b4	[rllib] Fix bounds of space returned by preprocessor.observation_space (#4736)	2019-05-05 18:25:38 -07:00
Richard Liaw	f2faf5ce75	[tune] Contributor Guide and Design Page (#4716) * Move setup script out * some changes * Finished Contributor guide * some comments to the design * move * Apply suggestions from code review Co-Authored-By: richardliaw <rliaw@berkeley.edu> * sourcecode * comments	2019-05-05 00:04:13 -07:00
Federico Fontana	78bb26286e	Replaced discontinued rnn_cell.BasicLSTMCell with rnn_cell.LSTMCell (#4703) * Fixed bug in Dirichlet (#4440) * Replaced deprecated rnn_cell.BasicLSTMCell with rnn_cell.LSTMCell	2019-05-02 13:19:27 -07:00
Sam Toyer	663e92ab3f	[rllib] TD3/DDPG improvements and MuJoCo benchmarks (#4694) * [rllib] Separate optimisers for DDPG actor & crit. * [rllib] Better names for DDPG variables & options Config changes: - noise_scale -> exploration_ou_noise_scale - exploration_theta -> exploration_ou_theta - exploration_sigma -> exploration_ou_sigma - act_noise -> exploration_gaussian_sigma - noise_clip -> target_noise_clip * [rllib] Make DDPG less class-y Used functions to replace three classes with only an __init__ method & a handful of unrelated attributes. * [rllib] Refactor DDPG noise * [rllib] Unify DDPG exploration annealing Added option "exploration_should_anneal" to enable linear annealing of exploration noise. By default this is off, for consistency with DDPG & TD3 papers. Also renamed "exploration_final_eps" to "exploration_final_scale" (that name seems to have been carried over from DQN, and doesn't really make sense here). Finally, tried to rename "eps" to "noise_scale" wherever possible.	2019-04-26 17:49:53 -07:00
Andrew	06c768823c	[rllib] train-eval loop implementation for rllib.Trainer class (#4647)	2019-04-21 12:08:04 -07:00
Vlad Firoiu	39a09fa457	Turn replay into a circular queue. (#4667)	2019-04-19 11:42:00 -07:00
Wang Qing	9d481cc2e6	[hotfix] Missing import breaks Travis builds	2019-04-18 23:12:44 -07:00
Eric Liang	5a562bbf12	[rllib] Fix num_gpus cast and raise error on large batch (#4652)	2019-04-18 15:23:29 -07:00
Eric Liang	6848dfd179	[rllib] Replace ray.get() with ray_get_and_free() to optimize memory usage (#4586)	2019-04-17 20:30:03 -04:00
Eric Liang	3fd9dea721	[rllib] Fix tune.run(Agent class) (#4630) * update * Update __init__.py	2019-04-15 09:12:23 -07:00
Vlad Firoiu	f600591468	Cast MultiCategorical num_outputs to int. (#4629)	2019-04-14 19:51:37 -07:00
Eric Liang	6e7680bf21	[rllib] Clean up concepts documentation and policy optimizer creation (#4592)	2019-04-12 21:03:26 -07:00
cfan	bb207a205b	[rllib] Support torch device and distributions. (#4553)	2019-04-12 11:39:14 -07:00
justinwyang	e88e706fcc	Enforce quoting style in Travis. (#4589)	2019-04-11 14:24:26 -07:00
Vlad Firoiu	74fd3d7e21	[rllib] Support prev_state/prev_action in rollout and fix multiagent (#4565) * Cleaner and more correct treatment of agent states in rollout.py * support lstm_use_prev_action_reward in rollout.py * Linter. * appease flake8 * Use _DUMMY_AGENT_ID instead of 0. * All agents have a policy_agent_mapping. Reset the mapping cache at the start of each episode. * Update rollout.py * Fix rollout.py for single-agent envs. * Use agent_id, not policy_id.	2019-04-10 00:01:25 -07:00
Eric Liang	4f46d3e9bf	[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554)	2019-04-09 00:36:49 -07:00
Jones Wong	da5a471485	[rllib] validate observation in NoPreprocessor (#4546)	2019-04-07 16:11:50 -07:00

465 Commits