92 Commits

Author SHA1 Message Date
Eric Liang 03a5b90ed6 Revert "Revert "Increase the number of unique bits for actors to avoi… (#12990) 2020-12-21 15:16:42 -08:00
Eric Liang 5d987f5988 Revert "Increase the number of unique bits for actors to avoid handle collisions (#12894)" (#12988)
This reverts commit 3e492a79ec.
2020-12-18 23:51:44 -08:00
Eric Liang 3e492a79ec Increase the number of unique bits for actors to avoid handle collisions (#12894) 2020-12-18 15:59:03 -08:00
Kaushik B 7422abddb4 [tune] trim kwargs in shim instantiation functions (#12544) 2020-12-02 12:07:00 -08:00
Barak Michener 6412dfaf38 [ray_client] actors v0 (#12388) 2020-12-01 13:12:08 -08:00
SangBin Cho f56d7c1a76 [Logging] Remove per worker job log file / support worker log rotation (#11927)
* In progress.

* MVP done.

* In Progress.

* Remove unnecessary code.

* Fix some issues.

* Fix test failures.

* Addressed code review + fix object spilling test failure.
2020-11-16 11:29:43 -08:00
fyrestone 269e1f0b98 Fix push_error_to_driver_through_redis (#10848)
Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-09-17 10:50:44 -07:00
Alex Wu 6f479d4697 [hotfix] CPU Detection (#10821) 2020-09-16 21:02:52 -07:00
Richard Liaw e3fd5eceec [minor] fix warning about docker cpus (#10768) 2020-09-14 09:08:34 -07:00
Ian Rodney 5bc2ba38fd [docker] Detect CPUs in container correctly (#10507)
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-13 23:40:48 -07:00
Alex Wu d6a9f0e2e4 [Core] Accelerator type API (#10561) 2020-09-06 20:58:40 -07:00
Alex Wu a699f6a4d8 [Core] Fix override memory and object_store_memory in decorator (#10563) 2020-09-06 20:56:48 -07:00
fyrestone e9b046306a [Dashboard] Dashboard basic modules (#10303)
* Improve reporter module

* Add test_node_physical_stats to test_reporter.py

* Add test_class_method_route_table to test_dashboard.py

* Add stats_collector module for dashboard

* Subscribe actor table data

* Add log module for dashboard

* Only enable test module in some test cases

* CI run all dashboard tests

* Reduce test timeout to 10s

* Use fstring

* Remove unused code

* Remove blank line

* Fix dashboard tests

* Fix asyncio.create_task not available in py36; Fix lint

* Add format_web_url to ray.test_utils

* Update dashboard/modules/reporter/reporter_head.py

Co-authored-by: Max Fitton <mfitton@berkeley.edu>

* Add DictChangeItem type for Dict change

* Refine logger.exception

* Refine GET /api/launch_profiling

* Remove disable_test_module fixture

* Fix test_basic may fail

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
Ian Rodney d6f2b0d933 [docker] Run profiling without sudo (#10388)
* fix profiling for docker

* small fixes

* use name

* do not import pwd on windows
2020-08-28 21:25:10 -07:00
SangBin Cho 92664249e8 Partially Use f string (#10218)
* flynt. trial 1.

* Trial 1.

* Addressed code review.
2020-08-20 18:21:16 -07:00
yncxcw 32cd94b750 [Core] Do not convert gpu id to int (#9744)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Alex Wu 12d75784a4 [Core] test_advanced_3.py::test_logging_to_driver (round 2) (#9916) 2020-08-05 15:04:36 -05:00
kisuke95 28b1f7710c [Core] Error info pubsub (Remove ray.errors API) (#9665) 2020-08-04 14:04:29 +08:00
Clark Zinzow 9f969260e8 [core] Fix Ray service startup when logging redirection is disabled. (#9547) 2020-07-23 11:26:24 -05:00
Hao Chen d49dadf891 Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
Alex Wu 46962f5db1 [Core] Log monitor multidriver (#8953) 2020-06-25 11:05:53 -07:00
Alex Wu c152730e4a [Core] Log output from different jobs to different drivers. (#8885)
* .

* .

* Correct now

* No interactivity errors

* format

* Filtering

* lint

* .

* No more filtering

* Removed interactivity

* .

* .

* .

* .

* .

* .

* Redirection works

* formatting

* something broken?

* .

* Works

* formatting

* redirect output

* formatting

* formatting

* Fix file descriptor leakage

* format

* .

* .

* .

* .

* .

* Refactor

* .

* Only run on job switch

* .

* cleanup

* .

* ...

* Review

* .

* .

* .

* .

* whoops

* .

* Should fix bug

* .

* .

* addressed comments

* formatting

* formatting

* Fix typo

* .

* .

* .

* .

Co-authored-by: Ubuntu <ubuntu@ip-172-31-14-33.us-west-2.compute.internal>
2020-06-23 18:45:32 -07:00
mehrdadn 8958728139 Windows bug fixes (#7740) 2020-03-30 20:39:23 -05:00
mehrdadn fc23f79f82 Windows process issues (#7739) 2020-03-29 12:48:32 -07:00
Robert Nishihara ee8c9ff732 Remove six and cloudpickle from setup.py. (#7694) 2020-03-23 11:42:05 -07:00
mehrdadn a0700e2f86 Change /tmp to platform-specific temporary directory (#7529) 2020-03-16 18:10:14 -07:00
mehrdadn 4d42664b2a Use prctl(PR_SET_PDEATHSIG) on Linux instead of reaper (#7150) 2020-03-03 11:45:42 -06:00
Alex Wu 0d3687a10d No warning for docker memory > system memory (#7151) 2020-02-13 15:21:44 -08:00
Alex Wu 72c31e3e19 Ray nodes should respect docker limits (#7039) 2020-02-10 11:08:38 -08:00
ijrsvt 0826f95e1c Including psutil & setproctitle (#7031) 2020-02-05 14:16:58 -08:00
Richard Liaw 52c33b53f7 [minor][core] fix gpu ids for SLURM (#7014)
* fix gpu ids

* fix
2020-02-02 16:09:22 -08:00
Edward Oakes 92525f35d1 Remove raylet client from Python worker (#6018) 2020-01-31 18:23:01 -08:00
Ziyad Edher c480d1d1e4 Treat static methods as class methods instead of instance methods in actors (#6756)
* Treat static methods as class methods rather than instance methods

* Add tests for static methods in actors

* Revert formatting changes

* Readd future imports

* Restructure static method check

* Documentation enhancements

* Fix linting issues
2020-01-15 19:38:41 -06:00
Sven 60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Lixin Wei 859dbad155 Fix estimate_available_memory() in utils.py (#6302) 2020-01-08 15:22:47 +08:00
Ujval Misra 5b40408678 [tune] Remove py2.7-specific code (#6665)
* Remove backwards compatibility py2.7 code.

* Use exist_ok=True in ray

* nit

* nit

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-03 01:03:13 -08:00
Yuhao Yang 3db8faab0d [tune] fix log dir race condition (#6420) 2019-12-10 21:00:19 -08:00
Eric Liang 4edae7ea2b Speed up task submissions a bit (#5992) 2019-10-25 00:10:37 -07:00
Edward Oakes 07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Edward Oakes dfd2a45f69 Simplify symlinking and don't print warnings (#5615) 2019-09-02 13:31:10 -07:00
Edward Oakes 0cc0abf857 Create session_latest symlink for Ray sessions (#5580) 2019-08-31 22:14:54 -07:00
Eric Liang e2e30ca507 Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
Qing Wang f2293243cc [ID Refactor] Shorten the length of JobID to 4 bytes (#5110)
* WIP

* Fix

* Add jobid test

* Fix

* Add python part

* Fix

* Fix tes

* Remove TODOs

* Fix C++ tests

* Lint

* Fix

* Fix exporting functions in multiple ray.init

* Fix java test

* Fix lint

* Fix linting

* Address comments.

* Fix

* Address and fix linting

* Refine and fix

* Fix

* address

* Address comments.

* Fix linting

* Fix

* Address

* Address comments.

* Address

* Address

* Fix

* Fix

* Fix

* Fix lint

* Fix

* Fix linting

* Address comments.

* Fix linting

* Address comments.

* Fix linting

* address comments.

* Fix
2019-07-11 14:25:16 +08:00
Eric Liang 5aec750107 Add warning/error if object store memory exceeds available memory (#4893)
* exclude

* format

* add warning

* hatch

* reduce mem usage

* reduce object store mem

* set obj mem
2019-07-08 21:37:08 -07:00
Qing Wang 62e4b591e3 [ID Refactor] Rename DriverID to JobID (#5004)
* WIP

WIP

WIP

Rename Driver -> Job

Fix compilation

Fix

Rename in Java

In py

WIP

Fix

WIP

Fix

Fix test

Fix

Fix C++ linting

Fix

* Update java/runtime/src/main/java/org/ray/runtime/config/RayConfig.java

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/core_worker/core_worker.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Fix

* Fix CI

* Fix cpp linting

* Fix py lint

* Fix

* Address comments and fix

* Address comments

* Address

* Fix import_threading
2019-06-28 00:44:51 +08:00
Hao Chen 0131353d42 [gRPC] Migrate gcs data structures to protobuf (#5024) 2019-06-25 14:31:19 -07:00
Yuhong Guo 1a39fee9c6 Refactor ID Serial 1: Separate ObjectID and TaskID from UniqueID (#4776)
* Enable BaseId.

* Change TaskID and make python test pass

* Remove unnecessary functions and fix test failure and change TaskID to
16 bytes.

* Java code change draft

* Refine

* Lint

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ObjectId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comment

* Lint

* Fix SINGLE_PROCESS

* Fix comments

* Refine code

* Refine test

* Resolve conflict
2019-05-22 14:46:30 +08:00
Si-Yuan bd00735fe8 Fix tempfile issues (#4605) 2019-05-05 16:06:15 -07:00
Yuhong Guo c2349cf12d Remove local/global_scheduler from code and doc. (#4549) 2019-04-03 17:05:09 -07:00
Yuhong Guo 41b81af11b Downgrade six to 1.0.0 (#4180) 2019-02-27 13:05:25 -08:00