5317 Commits

Author SHA1 Message Date
Richard Liaw a96ddec358 [tune] Fix restoration for function API PBT (#9853) 2020-08-06 11:00:52 -07:00
Alex Wu ea1ac15da0 [Core] Gpu type detection (#9695)
* .

* .

* .

* .

* .

* .

* .

* .

* Test cases

* detection only

* .

* Done?

* .

* .

* Done

* added test case

* .

* .

* .

* .

* .

* .

* Update python/ray/ray_constants.py

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* .

* .

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-08-06 11:00:37 -07:00
Alan Guo f8f6f342f6 [autoscaler] Create worker_file_mounts config (#9762) 2020-08-06 11:00:30 -07:00
SangBin Cho 0a6847c5dc Bump up the version. 2020-07-31 10:09:54 -07:00
mehrdadn 78995d085f Fix macOS incompatibility in format.sh (#9832)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-31 09:25:55 -07:00
Kai Yang 006e034cdb fix lint for ReferenceCountingTest.java (#9837) 2020-07-31 17:00:00 +08:00
Hao Chen 6fb6bd3e61 Refine Java "Ray Core Walkthrough" doc (#9836) 2020-07-31 15:35:43 +08:00
fangfengbin 3900643948 Add actor states definitions & transition diagram doc (#9754) 2020-07-31 15:35:25 +08:00
bermaker 88e8714bcb Fix ray java worker metric test indentation (#9834) 2020-07-31 14:39:41 +08:00
Richard Liaw a47121476f [tune] Remove accidentally added files (#9835) 2020-07-30 21:47:27 -07:00
Kai Yang 02fd950252 [Java] Local and distributed ref counting in Java (#9371) 2020-07-31 11:49:31 +08:00
mehrdadn e2c0174ab2 Factor out some Bazel options into .bazelrc (#9804)
* Factor out --keep_going in Bazel --config=ci

* Remove Bazel --test_timeout=600 for Windows

* Use global --test_output for Bazel CI

Co-authored-by: Mehrdad <noreply@github.com>
2020-07-30 18:09:31 -07:00
mehrdadn a7b97b6f8a Add shellcheck support (#8574) 2020-07-30 18:39:28 -05:00
Eric Liang 73df3f7bd2 Clean up formatting of placement group resources (#9740) 2020-07-30 15:52:32 -07:00
SangBin Cho 940617d092 Make test failure large. (#9822) 2020-07-30 13:11:51 -07:00
krfricke 619e44e54a [tune] Added WandbLogger (#9725)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-30 13:09:03 -07:00
Barak Michener 68f3fec744 *: Centralize requirements.txt and unify dependency versions (#9759)
* python_test: fix cython_examples in doc/ and tests/

* update setup.py to parse the bazel version string better

* all: centralize all python deps into stackable requirements files in python/

* format

* Move cython test into the proper package

* Add cross-reference dependency comments for requirements and setup.py

* re-enable version pinning on CI, fix formatting

* fix up torchvision version

* fix case in shell
2020-07-30 11:22:56 -07:00
SangBin Cho e6d1e3afe2 Use pass by reference for const auto in for loop. (#9811) 2020-07-30 12:34:24 -05:00
Richard Liaw 0c3b9ebeef [tune/sgd] Document func_trainable and add checkpoint context (#9739)
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-30 09:46:37 -07:00
Sven Mika e540e425e4 [RLlib] rllib rollout test and bug fixes. (#9779) 2020-07-30 16:17:03 +02:00
Sven Mika f6bd12eb18 [RLlib] Add tensor-based tests for Schedules and fix some bugs related to using Schedules with tensor time input. (#9782) 2020-07-30 12:49:32 +02:00
Miguel Morales 372114b4ed Update sampler.py (#9805)
Minor fix for warning string
2020-07-29 22:58:35 -07:00
bermaker ccd6b90a42 Fix ray java worker metric registry indentation (#9780) 2020-07-30 13:20:24 +08:00
chaokunyang 6464bf55c6 [dist] Mvn deploy (#9777) 2020-07-30 11:48:31 +08:00
Kai Yang 9be5a2f0fc Fix GCS related tests (#9783) 2020-07-30 11:46:36 +08:00
Hao Chen 260bc52254 Java doc: "Ray Core Walkthrough" page (#8595) 2020-07-30 11:13:38 +08:00
chaokunyang 5aba53e9b2 [dist] Fix travis deploy for java dist (#9768) 2020-07-30 10:59:11 +08:00
SangBin Cho 826f14c824 [Stats] Fix harvestor threads + Fix flaky stats shutdown. (#9745) 2020-07-29 18:57:59 -05:00
mehrdadn 07022f3f11 Fix src/ray/core_worker/common.h deleted constructor (#9785)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-29 15:49:02 -07:00
Alex Wu 6e294dd90f [Core] Custom socket name (#9766)
* fix issues

* hot fixes

* test

* test

* socket name change only
2020-07-29 13:19:41 -07:00
Alex Wu e6696b2533 Fixed stderr logging (9765) 2020-07-29 13:19:04 -07:00
Alex Wu 72297dc46f [Core] Socket creation race condition bug fixes (#9764)
* fix issues

* hot fixes

* test

* test

* Always info log
2020-07-29 13:17:46 -07:00
Sven Mika b0b0463161 [RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678) 2020-07-29 21:15:09 +02:00
Bill Chambers 067c2752f8 [TUNE] Tune Docs re-organization (#9600)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-07-29 11:22:44 -07:00
SangBin Cho d1b37ca7e4 [GCS Actor Management] Fix flaky test_dead_actors. (#9715)
* Fix.

* Add logs.

* Add an unit test.
2020-07-29 10:54:18 -07:00
Tao Wang 2babad9906 [GCS]Use a separate thread in node failure detector to handle heartbeat (#9416)
* use a sole thread to handle heartbeat

* separate signal thread

* use work to avoid exiting when task is underway

* protect shared data structure to avoid deadlock

* add comments

* decrease io service num

* minor changes

* fix test

* per stephanie's comments

* use single io service instead of 1-size io service pool

* typo
2020-07-29 09:58:58 -07:00
Lingxuan Zuo 156067b423 [Stats] enable core worker stats (#9355) 2020-07-29 17:28:33 +08:00
fangfengbin a484947742 Fix leased worker leak bug if lease worker requests that are still waiting to be scheduled when GCS restarts (#9719) 2020-07-29 14:16:03 +08:00
Kai Yang 2cafc7cebe [Java] Fix MetricTest.java due to incomplete changes from #9703 (#9770) 2020-07-29 12:18:17 +08:00
Kai Yang bdc005a4d4 [Java] Use test groups to filter tests of different run modes (#9703) 2020-07-29 11:18:45 +08:00
Simon Mo 9fbfee2424 Pin pytest version (#9767) 2020-07-28 19:54:48 -07:00
mehrdadn fb5280f21b Fix some Windows CI issues (#9708)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-28 18:10:23 -07:00
SangBin Cho 423dc96cc4 Revert "[dist] swap mac/linux wheel build order (#9746)" and "Fix package and upload ray jar (#9742)" (#9758)
* Revert "[dist] swap mac/linux wheel build order (#9746)"

This reverts commit a9340565ff.

* Revert "Fix package and upload ray jar (#9742)"

This reverts commit c290c308fe.
2020-07-28 15:34:29 -07:00
Alex Wu 21af0ceb0c Register function race (#9346) 2020-07-28 13:51:34 -07:00
SangBin Cho c00742f103 [Release] Fix release tests (#9733) 2020-07-28 10:44:06 -07:00
SangBin Cho 7e3ba289dc [Stats] Basic Metrics Infrastructure (Metrics Agent + Prometheus Exporter) (#9607) 2020-07-28 10:28:01 -07:00
Alex Wu feb3751824 [New scheduler] First unit test for task manager (#9696)
* .

* .

* refactor WorkerInterface

* .

* Basic unit test structure complete?

* .

* bad git >:-(

* small clean up

* CR

* .

* .

* One more fixture

* One more fixture

* .

* .

* bazel-format

* .
2020-07-28 09:44:58 -07:00
Ian Rodney b1c2983c97 Run _with_interactive in Docker (#9747) 2020-07-28 08:57:04 -07:00
fangfengbin bd18e975c0 fix windows compile bug (#9741)
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-07-28 21:30:31 +08:00
bermaker 6e23aff723 [Metrics]Ray java worker metric registry (#9636)
* ray worker metrics gauge init

* ray java metric mapping

* add jni source files for gauge and tagkey

* mapping all metric classes to stats object

* check non-null for tags and name

* lint

* add symbol for native metric JNI

* extern c for symbol

* add tests for all metrics

* Update Metric.java

use metricNativePointer instead.

* unify metric native stuff to one class

* fix jni file

* add comments for metric transform function in jni utils

* move metric function to native metric file

* remove unused disconnect jni

* Add a metric registry for java metrics

* Restore install-bazel.sh

* Add some comments for metric registry

* Fix thread safe problem of metrics

* Fix metric tests and remove sleep code from tests

* Fix comments of metrics

Co-authored-by: lingxuan.zlx <skyzlxuan@gmail.com>
2020-07-28 21:29:33 +08:00