Files
ray/python
Eric Liang 6cb5b90bd6 [rllib] [RFC] Dynamic definition of loss functions and modularization support (#4795)
* dynamic graph

* wip

* clean up

* fix

* document trainer

* wip

* initialize the graph using a fake batch

* clean up dynamic init

* wip

* spelling

* use builder for ppo pol graph

* add ppo graph

* fix naming

* order

* docs

* set class name correctly

* add torch builder

* add custom model support in builder

* cleanup

* remove underscores

* fix py2 compat

* Update dynamic_tf_policy_graph.py

* Update tracking_dict.py

* wip

* rename

* debug level

* rename policy_graph -> policy in new classes

* fix test

* rename ppo tf policy

* port appo too

* forgot grads

* default policy optimizer

* make default config optional

* add config to optimizer

* use lr by default in optimizer

* update

* comments

* remove optimizer

* fix tuple actions support in dynamic tf graph
2019-05-18 00:23:11 -07:00
..
2018-06-20 10:43:44 -07:00
2019-04-02 22:17:33 -07:00