fyrestone
269e1f0b98
Fix push_error_to_driver_through_redis (#10848)
...
Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-09-17 10:50:44 -07:00
Alex Wu
6f479d4697
[hotfix] CPU Detection (#10821)
2020-09-16 21:02:52 -07:00
Richard Liaw
e3fd5eceec
[minor] fix warning about docker cpus (#10768)
2020-09-14 09:08:34 -07:00
Ian Rodney
5bc2ba38fd
[docker] Detect CPUs in container correctly (#10507)
...
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-13 23:40:48 -07:00
Alex Wu
d6a9f0e2e4
[Core] Accelerator type API (#10561)
2020-09-06 20:58:40 -07:00
Alex Wu
a699f6a4d8
[Core] Fix override memory and object_store_memory in decorator (#10563)
2020-09-06 20:56:48 -07:00
fyrestone
e9b046306a
[Dashboard] Dashboard basic modules (#10303)
...
* Improve reporter module
* Add test_node_physical_stats to test_reporter.py
* Add test_class_method_route_table to test_dashboard.py
* Add stats_collector module for dashboard
* Subscribe actor table data
* Add log module for dashboard
* Only enable test module in some test cases
* CI run all dashboard tests
* Reduce test timeout to 10s
* Use fstring
* Remove unused code
* Remove blank line
* Fix dashboard tests
* Fix asyncio.create_task not available in py36; Fix lint
* Add format_web_url to ray.test_utils
* Update dashboard/modules/reporter/reporter_head.py
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
* Add DictChangeItem type for Dict change
* Refine logger.exception
* Refine GET /api/launch_profiling
* Remove disable_test_module fixture
* Fix test_basic may fail
Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
Ian Rodney
d6f2b0d933
[docker] Run profiling without sudo (#10388)
...
* fix profiling for docker
* small fixes
* use name
* do not import pwd on windows
2020-08-28 21:25:10 -07:00
SangBin Cho
92664249e8
Partially Use f string (#10218)
...
* flynt. trial 1.
* Trial 1.
* Addressed code review.
2020-08-20 18:21:16 -07:00
yncxcw
32cd94b750
[Core] Do not convert gpu id to int (#9744)
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Alex Wu
12d75784a4
[Core] test_advanced_3.py::test_logging_to_driver (round 2) (#9916)
2020-08-05 15:04:36 -05:00
kisuke95
28b1f7710c
[Core] Error info pubsub (Remove ray.errors API) (#9665)
2020-08-04 14:04:29 +08:00
Clark Zinzow
9f969260e8
[core] Fix Ray service startup when logging redirection is disabled. (#9547)
2020-07-23 11:26:24 -05:00
Hao Chen
d49dadf891
Change Python's ObjectID to ObjectRef (#9353)
2020-07-10 17:49:04 +08:00
Alex Wu
46962f5db1
[Core] Log monitor multidriver (#8953)
2020-06-25 11:05:53 -07:00
Alex Wu
c152730e4a
[Core] Log output from different jobs to different drivers. (#8885)
...
* .
* .
* Correct now
* No interactivity errors
* format
* Filtering
* lint
* .
* No more filtering
* Removed interactivity
* .
* .
* .
* .
* .
* .
* Redirection works
* formatting
* something broken?
* .
* Works
* formatting
* redirect output
* formatting
* formatting
* Fix file descriptor leakage
* format
* .
* .
* .
* .
* .
* Refactor
* .
* Only run on job switch
* .
* cleanup
* .
* ...
* Review
* .
* .
* .
* .
* whoops
* .
* Should fix bug
* .
* .
* addressed comments
* formatting
* formatting
* Fix typo
* .
* .
* .
* .
Co-authored-by: Ubuntu <ubuntu@ip-172-31-14-33.us-west-2.compute.internal>
2020-06-23 18:45:32 -07:00
mehrdadn
8958728139
Windows bug fixes (#7740)
2020-03-30 20:39:23 -05:00
mehrdadn
fc23f79f82
Windows process issues (#7739)
2020-03-29 12:48:32 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. (#7694)
2020-03-23 11:42:05 -07:00
mehrdadn
a0700e2f86
Change /tmp to platform-specific temporary directory (#7529)
2020-03-16 18:10:14 -07:00
mehrdadn
4d42664b2a
Use prctl(PR_SET_PDEATHSIG) on Linux instead of reaper (#7150)
2020-03-03 11:45:42 -06:00
Alex Wu
0d3687a10d
No warning for docker memory > system memory (#7151)
2020-02-13 15:21:44 -08:00
Alex Wu
72c31e3e19
Ray nodes should respect docker limits (#7039)
2020-02-10 11:08:38 -08:00
ijrsvt
0826f95e1c
Including psutil & setproctitle (#7031)
2020-02-05 14:16:58 -08:00
Richard Liaw
52c33b53f7
[minor][core] fix gpu ids for SLURM (#7014)
...
* fix gpu ids
* fix
2020-02-02 16:09:22 -08:00
Edward Oakes
92525f35d1
Remove raylet client from Python worker (#6018)
2020-01-31 18:23:01 -08:00
Ziyad Edher
c480d1d1e4
Treat static methods as class methods instead of instance methods in actors (#6756)
...
* Treat static methods as class methods rather than instance methods
* Add tests for static methods in actors
* Revert formatting changes
* Readd future imports
* Restructure static method check
* Documentation enhancements
* Fix linting issues
2020-01-15 19:38:41 -06:00
Sven
60d4d5e1aa
Remove future imports (#6724)
...
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Lixin Wei
859dbad155
Fix estimate_available_memory() in utils.py (#6302)
2020-01-08 15:22:47 +08:00
Ujval Misra
5b40408678
[tune] Remove py2.7-specific code (#6665)
...
* Remove backwards compatibility py2.7 code.
* Use exist_ok=True in ray
* nit
* nit
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-03 01:03:13 -08:00
Yuhao Yang
3db8faab0d
[tune] fix log dir race condition (#6420)
2019-12-10 21:00:19 -08:00
Eric Liang
4edae7ea2b
Speed up task submissions a bit (#5992)
2019-10-25 00:10:37 -07:00
Edward Oakes
07c4c6367a
[core worker] Python core worker object interface (#5272)
2019-09-12 23:07:46 -07:00
Edward Oakes
dfd2a45f69
Simplify symlinking and don't print warnings (#5615)
2019-09-02 13:31:10 -07:00
Edward Oakes
0cc0abf857
Create session_latest symlink for Ray sessions (#5580)
2019-08-31 22:14:54 -07:00
Eric Liang
e2e30ca507
Ray, Tune, and RLlib support for memory, object_store_memory options (#5226)
2019-08-21 23:01:10 -07:00
Qing Wang
f2293243cc
[ID Refactor] Shorten the length of JobID to 4 bytes (#5110)
...
* WIP
* Fix
* Add jobid test
* Fix
* Add python part
* Fix
* Fix test
* Remove TODOs
* Fix C++ tests
* Lint
* Fix
* Fix exporting functions in multiple ray.init
* Fix java test
* Fix lint
* Fix linting
* Address comments.
* Fix
* Address and fix linting
* Refine and fix
* Fix
* address
* Address comments.
* Fix linting
* Fix
* Address
* Address comments.
* Address
* Address
* Fix
* Fix
* Fix
* Fix lint
* Fix
* Fix linting
* Address comments.
* Fix linting
* Address comments.
* Fix linting
* address comments.
* Fix
2019-07-11 14:25:16 +08:00
Eric Liang
5aec750107
Add warning/error if object store memory exceeds available memory (#4893)
...
* exclude
* format
* add warning
* hatch
* reduce mem usage
* reduce object store mem
* set obj mem
2019-07-08 21:37:08 -07:00
Qing Wang
62e4b591e3
[ID Refactor] Rename DriverID to JobID (#5004)
...
* WIP
WIP
WIP
Rename Driver -> Job
Fix compilation
Fix
Rename in Java
In py
WIP
Fix
WIP
Fix
Fix test
Fix
Fix C++ linting
Fix
* Update java/runtime/src/main/java/org/ray/runtime/config/RayConfig.java
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* Update src/ray/core_worker/core_worker.cc
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* Address comments
* Fix
* Fix CI
* Fix cpp linting
* Fix py lint
* Fix
* Address comments and fix
* Address comments
* Address
* Fix import_threading
2019-06-28 00:44:51 +08:00
Hao Chen
0131353d42
[gRPC] Migrate gcs data structures to protobuf (#5024)
2019-06-25 14:31:19 -07:00
Yuhong Guo
1a39fee9c6
Refactor ID Serial 1: Separate ObjectID and TaskID from UniqueID (#4776)
...
* Enable BaseId.
* Change TaskID and make python test pass
* Remove unnecessary functions and fix test failure and change TaskID to 16 bytes.
* Java code change draft
* Refine
* Lint
* Update java/api/src/main/java/org/ray/api/id/TaskId.java
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* Update java/api/src/main/java/org/ray/api/id/BaseId.java
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* Update java/api/src/main/java/org/ray/api/id/BaseId.java
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* Update java/api/src/main/java/org/ray/api/id/ObjectId.java
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* Address comment
* Lint
* Fix SINGLE_PROCESS
* Fix comments
* Refine code
* Refine test
* Resolve conflict
2019-05-22 14:46:30 +08:00
Si-Yuan
bd00735fe8
Fix tempfile issues (#4605)
2019-05-05 16:06:15 -07:00
Yuhong Guo
c2349cf12d
Remove local/global_scheduler from code and doc. (#4549)
2019-04-03 17:05:09 -07:00
Yuhong Guo
41b81af11b
Downgrade six to 1.0.0 (#4180)
2019-02-27 13:05:25 -08:00
Si-Yuan
21472b890a
Integrate "tempfile_service" into "ray.node.Node" (#3953)
2019-02-12 17:34:04 -08:00
Robert Nishihara
ef527f84ab
Stream logs to driver by default. (#3892)
...
* Stream logs to driver by default.
* Fix from rebase
* Redirect raylet output independently of worker output.
* Fix.
* Create redis client with services.create_redis_client.
* Suppress Redis connection error at exit.
* Remove thread_safe_client from redis.
* Shutdown driver threads in ray.shutdown().
* Add warning for too many log messages.
* Only stop threads if worker is connected.
* Only stop threads if they exist.
* Remove unnecessary try/excepts.
* Fix
* Only add new logging handler once.
* Increase timeout.
* Fix tempfile test.
* Fix logging in cluster_utils.
* Revert "Increase timeout."
This reverts commit b3846b89040bcd8e583b2e18cb513cb040e71d95.
* Retry longer when connecting to plasma store from node manager and object manager.
* Close pubsub channels to avoid leaking file descriptors.
* Limit log monitor open files to 200.
* Increase plasma connect retries.
* Add comment.
2019-02-07 19:53:50 -08:00
Si-Yuan
9295ab8f60
Various Python code cleanups. (#3837)
2019-02-03 10:16:24 -08:00
Richard Liaw
d128636bab
Ray Logging Configuration (#3691)
...
* fix logging for autoscaler
* module logging
* try this for logging
* yapf
* fix
* Initial logging setup
* memory
* ok
* remove basicconfig
* catch
* remove package logging
* print
* fix
* try_fix
* fix 1
* revert rllib
* logging level
* flake8
* fix
* fix
* Remove vestigial TODO
2019-01-30 21:01:12 -08:00
Si-Yuan
48139cf861
Migrate Python C extension to Cython (#3541)
2019-01-24 09:17:14 -08:00
Yuhong Guo
d2cf8561f2
Refactor code about ray.ObjectID. (#3674)
...
* Refactor code about ray.ObjectID.
* remove from_random and use nil_id instead of constructor
* remove id() in hash
* Lint and fix
* Change driver id to ObjectID
* Replace binary_to_hex(ObjectID.id()) to ObjectID.hex()
2019-01-13 01:47:29 -08:00