9 Commits

Author SHA1 Message Date
Philipp Moritz 9bcaaaeaf5 Debugging for policy gradients (#681)
* configuration option for tensorflow debugger

* add model checkpointing

* fix linting

* make it possible to run without checkpointing

* fix

* loading from checkpoint and expose debugger through cli

* todo for filters

* Fix typo.
2017-06-18 17:58:41 -07:00
Eric Liang 4374ad1453 Policy gradient example: Support multi-GPU training (#584)
* add tf metrics

* comments

* fix network scopes

* add doc

* initial work

* try with 3 virtual cpus

* clean up metrics

* use format string

* fix trace level

* back to pong

* always run summary on cpu

* plot intermediate and final sgd stats

* add back a global step

* update

* add timeline

* use staging area and reuse weights properly

* stage at cpu

* whoops, stage only the batch

* clean up a bit

* fix py flake

* wip

* create an optimizer graph per device

* print timeline on 5th batch instead

* print examples per second

* log placement for training ops

* force placement on cpu:0

* try separating weights onto different gpus

* try using nccl

* add cpu fallback

* remove space from date

* check has gpu device

* fix flag config

* checkpoint

* wip

* update

* add some timing

* trace loading

* try cpu

* revert that

* remove expensive test

* lint

* cleanups

* clean up timers

* clean it up a bit

* fix code for non-scalar action spaces

* address some nits

* fix quotes

* efficient shuffling between sgd epochs
2017-06-13 06:03:25 +00:00
Philipp Moritz 679910496e fix policy gradients for mujoco domains (#589)
2017-05-24 18:39:37 -07:00
Eric Liang 06241daf61 Policy gradient example: record stats for tensorboard (#577)
* add tf metrics

* comments

* fix network scopes

* add doc

* use format string

* fix trace level

* plot intermediate and final sgd stats

* add back a global step
2017-05-21 14:51:24 -07:00
Robert Nishihara ec2534422b Remove register_class from API. (#550)
* Perform ray.register_class under the hood.

* Fix bug.

* Release worker lock when waiting for imports to arrive in get.

* Remove calls to register_class from examples and tests.

* Clear serialization state between tests.

* Fix bug and add test for multiple custom classes with same name.

* Fix failure test.

* Fix linting and cleanups to python code.

* Fixes to documentation.

* Implement recursion depth for recursively registering classes.

* Fix linting.

* Push warning to user if waiting for class for too long.

* Fix typos.

* Don't export FunctionToRun if pickling the function fails.

* Don't broadcast class definition when pickling class.
2017-05-16 18:38:52 -07:00
Robert Nishihara 3ebfd850e1 Make example applications pep8 compliant. (#553)
* Test examples for pep8 compliance.

* Make rl_pong example pep8 compliant.

* Make policy gradient example pep8 compliant.

* Make lbfgs example pep8 compliant.

* Make hyperopt example pep8 compliant.

* Make a3c example pep8 compliant.

* Make evolution strategies example pep8 compliant.

* Make resnet example pep8 compliant.

* Fix.
2017-05-16 14:12:18 -07:00
Robert Nishihara 9f91eb8c91 Change API for remote function declaration, actor instantiation, and actor method invocation. (#541)
* Direct substitution of @ray.remote -> @ray.task.

* Changes to make '@ray.task' work.

* Instantiate actors with Class.remote() instead of Class().

* Convert actor instantiation in tests and examples from Class() to Class.remote().

* Change actor method invocation from object.method() to object.method.remote().

* Update tests and examples to invoke actor methods with .remote().

* Fix bugs in jenkins tests.

* Fix example applications.

* Change @ray.task back to @ray.remote.

* Changes to make @ray.actor -> @ray.remote work.

* Direct substitution of @ray.actor -> @ray.remote.

* Fixes.

* Raise exception if @ray.actor decorator is used.

* Simplify ActorMethod class.
2017-05-14 00:01:20 -07:00
Philipp Moritz 4af0aa6258 Atari on pixels (#364)
* pong on pixels working (not cleaned up)

* make training compatible with all atari games

* cartpole runs

* Update documentation and usage for policy gradients.
2017-03-14 13:31:29 -07:00
Philipp Moritz 555dcf35a2 Add policy gradient example. (#344)
* add policy gradient example

* fix typos

* Minor changes plus some documentation.

* Minor fixes.
2017-03-07 23:42:44 -08:00