Commit Graph

  • 71243203a4 [rllib] Fix KeyError: 'kl' in multiagent ppo training Eric Liang 2019-01-09 19:33:07 -08:00
  • 6fc3fc4120 Cap task lease timeout (#3707) Hao Chen 2019-01-10 09:19:48 +08:00
  • edb7aaf7c7 [tune] Better Serialization for Server (#3708) Richard Liaw 2019-01-09 11:55:32 -08:00
  • 04f31db54d Actor dummy object garbage collection (#3593) Stephanie Wang 2019-01-09 10:37:11 -08:00
  • 3027dde303 Fix some storage problems of RayLog (#3595) Wenting Shen 2019-01-09 13:54:21 +08:00
  • d1e21b702e Change timeout from milliseconds to seconds in ray.wait. (#3706) Robert Nishihara 2019-01-08 21:32:08 -08:00
  • 59d861281e Bug fixing: Redis password should be used when reporting errors. (#3724) Si-Yuan 2019-01-09 13:23:55 +08:00
  • 6bbc667f93 Remove unused code path in services.py. (#3722) Robert Nishihara 2019-01-08 19:57:16 -08:00
  • 5945b92fd3 [sgd] Add checkpointing (#3638) Peter Schafhalter 2019-01-08 15:29:30 -08:00
  • 5e76d52868 Improve cluster.wait_for_nodes() API. (#3712) Robert Nishihara 2019-01-07 21:26:58 -08:00
  • 33319502b6 [tune] Add a callable check for converting to trainable (#3711) Richard Liaw 2019-01-07 16:18:29 -08:00
  • 5dadac148c Remove unused file. (#3695) Robert Nishihara 2019-01-07 12:45:48 -08:00
  • c9d70f0dda Remove num_local_schedulers argument from ray.worker._init. (#3704) Robert Nishihara 2019-01-07 12:44:49 -08:00
  • e78562b2e8 [rllib] Misc fixes: set lr for PG, better error message for LSTM/PPO, fix multi-agent/APEX (#3697) Eric Liang 2019-01-06 19:37:35 -08:00
  • df0733cafb Skip test_multiple_recursive (#3683) Hao Chen 2019-01-07 05:24:29 +08:00
  • 8934e37a78 [tune] Change log handling for Tune (#3661) Richard Liaw 2019-01-06 13:20:10 -08:00
  • 681e8cd3fd [autoscaler] Add an initial_workers option (#3530) mattearllongshot 2019-01-06 01:58:42 +00:00
  • 067976ad3d Push a warning to all users when large number of workers have been started. (#3645) Robert Nishihara 2019-01-05 13:27:32 -08:00
  • 692fdc6bc3 [Java] Allow actor handle to be serialized without forking (#3686) Wang Qing 2019-01-06 00:29:08 +08:00
  • 03fe760616 [rllib] Model self loss isn't included in all algorithms (#3679) Eric Liang 2019-01-04 22:30:35 -08:00
  • 960a943503 [tune] Fault Tolerance: handle lost checkpoints by restart (#3657) Richard Liaw 2019-01-04 22:05:27 -08:00
  • 7db1f3be2a [tune] resume=False by default but print a tip to set resume="prompt" + jenkins fix (#3681) Eric Liang 2019-01-04 17:23:19 -08:00
  • 747b117929 [tune] Tweak/allow nested pbt mutations (#3455) Kristian Hartikainen 2019-01-04 13:51:11 -08:00
  • cd80891ddb Try to figure out the memory limit in a docker container. (#3605) Robert Nishihara 2019-01-03 23:07:24 -08:00
  • 586a5c9ffa Limit default redis max memory to 10GB. (#3630) Robert Nishihara 2019-01-03 13:23:54 -08:00
  • 4b23a34c93 Fix multi-thread problem of function manager and Jenkins test (#3648) Yuhong Guo 2019-01-03 17:05:13 +08:00
  • ad2287ebe9 Fix new boost libs failure in cache-lib mode and add test to cover collect_dependent_libs.sh (#3627) Yuhong Guo 2019-01-03 15:51:11 +08:00
  • ca864faece [rllib] Documentation for I/O API and multi-agent support / cleanup (#3650) Eric Liang 2019-01-03 15:15:36 +08:00
  • 2177e2f410 [rllib] Agent: Allow unknown subkeys for custom_resources_per_worker (#3639) opherlieber 2019-01-03 08:19:59 +02:00
  • 47d36d7bd6 [rllib] Refactor pytorch custom model support (#3634) Eric Liang 2019-01-03 13:48:33 +08:00
  • b6bcd18d65 Split profile table among many keys in the GCS. (#3676) Robert Nishihara 2019-01-02 21:33:01 -08:00
  • 93e9d2b82c Improve backend log: env variable setting and format refine. (#3662) Yuhong Guo 2019-01-02 13:45:29 +08:00
  • b8a9e3f106 [rllib] Remove uses of sgd_stepsize => lr (#3667) Eric Liang 2019-01-01 12:01:27 +08:00
  • 93d54110f8 Prevent overriding faulthandler settings (#3668) Si-Yuan 2018-12-31 16:36:26 -08:00
  • c9b8ecca51 Add RayParams to refactor the parameters used by ray python. (#3558) Yuhong Guo 2018-12-29 22:04:27 +08:00
  • eb1e5fa2cf Fixing Python2 compatibility issues. Adding inline docs (#3656) Devin Petersohn 2018-12-28 22:53:28 -08:00
  • aad3c50e2d [tune] Cluster Fault Tolerance (#3309) Richard Liaw 2018-12-29 11:42:25 +08:00
  • 382b138fc7 fix code issues in object manager that are reported by scanning tool (#3649) Zhijun Fu 2018-12-29 06:38:59 +08:00
  • 3df1e1c471 Add missing lock in FreeObjects of object buffer pool (#3647) Zhijun Fu 2018-12-29 03:47:31 +08:00
  • c59b506c6e [Java] Support calling Ray APIs from multiple threads (#3646) Wang Qing 2018-12-28 17:44:31 +08:00
  • 0b682d043e Fix memory leak in PyRayletCient (#3640) Hao Chen 2018-12-28 09:39:02 +08:00
  • 62af2f25be Fix test_multiple_actor_reconstruction failure (#3641) Hao Chen 2018-12-28 05:57:52 +08:00
  • ac792d70c8 [rllib] Add starcraft multiagent env as example (#3542) Richard Liaw 2018-12-27 10:00:32 +08:00
  • b4f61dfd50 [rllib] Export policy model checkpoint (#3637) Tianming Xu 2018-12-27 07:43:06 +08:00
  • 6e2d7a9ba1 [tune] Support Configuration Merging (#3584) Richard Liaw 2018-12-26 03:07:11 -08:00
  • 4ce3818be5 Average aggregated gradients before put in plasma store (#3631) Stan Wang 2018-12-26 19:03:11 +08:00
  • 4cde971916 [Java] Print the log message slowly. (#3633) Wang Qing 2018-12-26 16:33:21 +08:00
  • 1b98fb8238 Fix Jenkins test failures and function descriptor bug. (#3569) Yuhong Guo 2018-12-26 15:31:44 +08:00
  • a971b73bbe [Java] Fix the issue when waiting an empty list or a null pointer (#3632) Wang Qing 2018-12-26 11:29:29 +08:00
  • f4011754d6 Fix: ServerConnection should be closed before being removed (#3626) Hao Chen 2018-12-26 03:01:53 +08:00
  • 5426234cd8 Update documentation to reflect 0.6.1 release. (#3622) Robert Nishihara 2018-12-24 11:10:04 -08:00
  • 1e8cdb5421 Update release documentation. (#3587) Robert Nishihara 2018-12-24 11:09:09 -08:00
  • 3d8f56409b Ensure numpy is at least 1.10.4 in setup.py (#2462) nam-cern 2018-12-24 20:01:25 +01:00
  • 9f63119a83 [rllib] Allow development without needing to compile Ray (#3623) Eric Liang 2018-12-24 18:08:23 +09:00
  • c13b2685f5 [modin] Append to path to avoid namespace collision on development branches (#3621) Devin Petersohn 2018-12-23 23:58:56 -08:00
  • a1995ff3b0 Resize logo in README. (#3619) Si-Yuan 2018-12-23 22:59:23 -08:00
  • 9b8d7573fe bump version from 0.6.0 to 0.6.1 (#3610) (tag: ray-0.6.1) Alexey Tumanov 2018-12-23 17:03:42 -08:00
  • bb7ca3bae7 Upgrade flatbuffers version to 1.10.0. (#3559) Robert Nishihara 2018-12-23 14:56:34 -08:00
  • ddd4c842f1 Initialize some variables in constructor instead of header file. (#3617) Robert Nishihara 2018-12-23 02:44:23 -08:00
  • bada42c334 object store notification mgr: fix using uninitialized variables (#3592) Alexey Tumanov 2018-12-22 19:51:22 -08:00
  • e578a38116 Fix TensorFlow and PyTorch compatibility (#3574) Philipp Moritz 2018-12-22 13:25:48 -08:00
  • deb26b954e [rllib] Export tensorflow model of policy graph (#3585) Tianming Xu 2018-12-22 16:35:25 +08:00
  • 8393df2516 Use BaseTest to instead of TestListener. (#3577) Wang Qing 2018-12-22 08:29:16 +08:00
  • ddc97864df [rllib] Add requested clarifications to test requirement of contrib docs (#3589) Eric Liang 2018-12-22 04:02:02 +09:00
  • 6b179cb8a7 change the order of allocation for io_service and gcs client in raylet main (#3597) Alexey Tumanov 2018-12-21 00:13:28 -08:00
  • e65b8f18f4 [java] change RayLog.core to org.slf4j.Logger (#3579) bibabolynn 2018-12-21 15:58:32 +08:00
  • e046a5c767 [tune] resources_per_trial from trial_resources (#3580) Richard Liaw 2018-12-20 19:00:47 -08:00
  • a174a46e02 Allowing multiple users to access the /tmp/ray file at the same time (#3591) Devin Petersohn 2018-12-20 18:46:54 -08:00
  • 34bab6291c Cleanup actor handle pickling code (#3560) Stephanie Wang 2018-12-20 16:37:21 -08:00
  • 6bb1103930 [rllib] Avoid sample wastage with bad PPO configurations (#3552) Eric Liang 2018-12-21 03:50:44 +09:00
  • ac48a58e4e [tune] Reduce scope of variant generator (#3583) Richard Liaw 2018-12-20 10:48:28 -08:00
  • 303883a3b6 [rllib] [rfc] add contrib module and guideline for merging (#3565) Eric Liang 2018-12-21 03:44:34 +09:00
  • cf0c4745f4 [rllib] support running older version tensorflow(version < 1.5.0) (#3571) adoda 2018-12-20 12:27:24 +08:00
  • a5309bec7c Make README render properly on PyPI. (#3578) Robert Nishihara 2018-12-19 18:41:09 -08:00
  • 132a23354e Fix pending callback not called when ServerConnection destructs (#3572) Hao Chen 2018-12-20 09:29:36 +08:00
  • ffa6ee3ec8 [rllib] streaming minibatching for IMPALA (#3402) Eric Liang 2018-12-19 02:23:29 -08:00
  • c4cba98c75 Remove deprecation warnings when running actor tests (#3563) Alexey Tumanov 2018-12-18 17:04:51 -08:00
  • fb33fa9097 Enable function_descriptor in backend to replace the function_id (#3028) Yuhong Guo 2018-12-19 07:53:59 +08:00
  • 3822b20319 [doc] update testing and dev instructions (#3562) Alexey Tumanov 2018-12-18 14:45:24 -08:00
  • 26ca40817e Convert UniqueID::nil() to a constructor (#3564) Stephanie Wang 2018-12-18 11:59:02 -08:00
  • 75ddf7cca4 Fix 2 small bugs (#3573) Yuhong Guo 2018-12-19 03:52:21 +08:00
  • db0dee573e [rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) (#3548) Eric Liang 2018-12-18 10:40:01 -08:00
  • bc4aa85ea3 fix link in doc (#3567) YifengHuang 2018-12-18 16:10:55 +08:00
  • 854b06854f remove auto-concat of rollouts in AsyncSampler (#3556) opherlieber 2018-12-17 23:54:52 +02:00
  • 3833ba4e4b Bump modin version to 0.2.5 (#3553) Devin Petersohn 2018-12-17 11:36:47 -08:00
  • 7767aba637 Note requirement cython==0.29.0 in installation instructions (#3555) Tianming Xu 2018-12-17 20:43:47 +08:00
  • 417c7f2d6f Update arrow and remove plasma_manager references. (#3545) Robert Nishihara 2018-12-16 02:36:02 -05:00
  • b3bf608608 Update arrow to reduce plasma IPCs. (#3497) Philipp Moritz 2018-12-14 20:49:37 -08:00
  • fcc37021b2 Throw exception for ray.get of an evicted actor object (#3490) Stephanie Wang 2018-12-14 11:41:27 -08:00
  • 7fd24e384b [java] Pass large args by reference (#3504) bibabolynn 2018-12-14 23:32:35 +08:00
  • de3fdeb5b5 [autoscaler] Fix Error Handling for botocore (#3534) Richard Liaw 2018-12-14 00:20:49 -08:00
  • 2a4685a08b Add a script to collect built thirdparty libs to avoid download and building again. (#3521) Yuhong Guo 2018-12-14 15:56:40 +08:00
  • a4abe6c0fe Add test to test raylet client connection when raylet crashes. (#3518) Yuhong Guo 2018-12-14 15:40:50 +08:00
  • e7b51cbd1b [xray] Implement Actor Reconstruction (#3332) Hao Chen 2018-12-14 13:28:58 +08:00
  • 2455de78ce save initial config instead of initial resource config (#3532) Alexey Tumanov 2018-12-13 20:39:42 -08:00
  • 84fae57ab5 Convert the raylet client (the code in local_scheduler_client.cc) to proper C++. (#3511) Si-Yuan 2018-12-13 13:39:10 -08:00
  • 5dcc333199 [sgd] Modify: add interface for model (#3458) Chunyang Wen 2018-12-13 13:23:25 +08:00
  • 0e00533ed4 Different approach to removing RayGetError (#3471) Eric Liang 2018-12-12 20:30:51 -08:00
  • 20c7fad4f4 Move actor table to primary redis context Eric Liang 2018-12-12 16:51:29 -08:00
  • 32473cf22e [rllib] Basic Offline Data IO API (#3473) Eric Liang 2018-12-12 13:57:48 -08:00