wassname/ray - ray - Gitea: Git with a cup of tea

mirror of https://github.com/wassname/ray.git synced 2026-07-06 03:25:12 +08:00

Author	SHA1	Message	Date
Robert Nishihara	78e4b021ab	Functions for flushing done tasks and evicted objects. (#2033 )	2018-05-18 01:59:58 -07:00
Melih Elibol	bea97b425b	Fix python linting (#2076 )	2018-05-16 15:04:31 -07:00
Robert Nishihara	8fbb88485b	Create RemoteFunction class, remove FunctionProperties, simplify worker Python code. (#2052 ) * Cleaning up worker and actor code. Create remote function class. Remove FunctionProperties object. * Remove register_actor_signatures function. * Small cleanups. * Fix linting. * Support @ray.method syntax for actor methods. * Fix pickling bug. * Fix linting. * Shorten testBlockingTasks. * Small fixes. * Call get_global_worker().	2018-05-14 14:35:23 -07:00
Alok Singh	cdf94c18a4	Clean up syntax for supported Python versions. (#1963 ) * Use set/dict literal syntax Ran code through [pyupgrade](https://github.com/asottile/pyupgrade). This is supported in every Python version 2.7+. * Drop unnecessary string format specification No need to specify 0,1.. if paramters are passed in order. * Revert "Drop unnecessary string format specification" This reverts commit efa5ec85d30ff69f34e5ed93e31343fea7647bcb. * Undo changes to cloudpickle Drop use of set literal until cloudpickle uses it. * Reformat code with YAPF We need to set up a git pre-push hook to automatically run this stuff.	2018-05-03 07:45:11 -07:00
Robert Nishihara	7792032ee3	Fix UI issue for non-json-serializable task arguments. (#1892 ) * Fix UI issue for non-json-serializable task arguments. * Simplify approach.	2018-04-15 13:54:42 -07:00
Philipp Moritz	74162d1492	Lint Python files with Yapf (#1872 )	2018-04-11 10:11:35 -07:00
Robert Nishihara	7c9e291b4b	In the UI, display task breakdowns by default. (#1857 )	2018-04-09 13:24:38 -07:00
Robert Nishihara	5bde5e75e7	Implement unsafe method for flushing entire object table and task table. (#1824 ) * Implement unsafe method for flushing entire object table and task table. * Add test. * Fix test.	2018-04-04 18:29:24 -07:00
Robert Nishihara	8d52fe931b	Add experimental feature for flushing event logs and logfiles. (#1659 ) * Add experimental feature for flushing event logs and logfiles. * Add documentation.	2018-03-27 11:57:52 -07:00
Robert Nishihara	2922e1c388	Add API for getting total cluster resources. (#1736 ) * Add API for getting total cluster resources. * Add test.	2018-03-20 15:57:00 -07:00
Robert Nishihara	96913be939	Treat actor creation like a regular task. (#1668 ) * Treat actor creation like a regular task. * Small cleanups. * Change semantics of actor resource handling. * Bug fix. * Minor linting * Bug fix * Fix jenkins test. * Fix actor tests * Some cleanups * Bug fix * Fix bug. * Remove cached actor tasks when a driver is removed. * Add more info to taskspec in global state API. * Fix cyclic import bug in tune. * Fix * Fix linting. * Fix linting. * Don't schedule any tasks (especially actor creaiton tasks) on local schedulers with 0 CPUs. * Bug fix. * Add test for 0 CPU case * Fix linting * Address comments. * Fix typos and add comment. * Add assertion and fix test.	2018-03-16 11:18:07 -07:00
William Paul	f2b6a7b58d	Polished TensorFlowVariables code and documentation (#566 )	2018-02-12 15:38:58 -08:00
Alexey Tumanov	f1303291b4	Ray scheduler spillback plumbing + mechanism (#1362 ) * spillback mechanism and plumbing : adding spillback counter + timestamp * linting fix * documentation * Fix argument name.	2018-01-23 20:18:12 -08:00
Eric Liang	a2b190e65b	Fix occasional task timeline failure to get task ids (#1442 )	2018-01-21 12:04:44 -08:00
Melih Elibol	24b93b1123	fixes default type for product of empty shape. (#1341 )	2017-12-18 17:41:44 -08:00
Stephanie Wang	12fdb3f53a	Convert actor dummy objects to task execution edges. (#1281 ) * Define execution dependencies flatbuffer and add to Redis commands * Convert TaskSpec to TaskExecutionSpec * Add execution dependencies to Python bindings * Submitting actor tasks uses execution dependency API instead of dummy argument * Fix dependency getters and some cleanup for fetching missing dependencies * C++ convention * Make TaskExecutionSpec a C++ class * Convert local scheduler to use TaskExecutionSpec class * Convert some pointers to references * Finish conversion to TaskExecutionSpec class * fix * Fix * Fix memory errors? * Cast flatbuffers GetSize to size_t * Fixes * add more retries in global scheduler unit test * fix linting and cast fbb.GetSize to size_t * Style and doc * Fix linting and simplify from_flatbuf.	2017-12-14 20:47:54 -08:00
Robert Nishihara	c21e189371	Allow scheduling with arbitrary user-defined resource labels. (#1236 ) * Enable scheduling with custom resource labels. * Fix. * Minor fixes and ref counting fix. * Linting * Use .data() instead of .c_str(). * Fix linting. * Fix ResourcesTest.testGPUIDs test by waiting for workers to start up. * Sleep in test so that all tasks are submitted before any completes.	2017-12-01 11:41:40 -08:00
Robert Nishihara	2865128df0	Remove counter from run_function_on_all_workers. Also remove utilitie… (#1260 ) * Remove counter from run_function_on_all_workers. Also remove utilities for copying directories across machines. * Fix linting.	2017-11-26 18:29:10 -08:00
Philipp Moritz	e798a652bc	Change TaskSpec to allow multiple object IDs per argument. (#1204 ) * Implement object ID bags * linting * fix tests * fix linting * fix comments	2017-11-10 16:33:34 -08:00
Stephanie Wang	07f0532b9b	Local scheduler filters out dead clients during reconstruction (#1182 ) * Object table lookup returns vector of DBClientID instead of address strings * Add node IP address to DBClient notification * DB client cache stores entire DB client, convert addresses to std::string * get cached db client returns the client * Expose a call to initialize the redis cache * Local scheduler filters out dead clients during reconstruction * Remove node ip address from dbclient, use aux_address for plasma managers * Get entire db client entry when not found in cache * Fix common tests * Fix address in tests * Push error to driver if driver task did the put * Address Robert's comments and cleanup * Remove unused Redis command * Fix db test	2017-11-10 11:29:24 -08:00
Zongheng Yang	5a50e80b63	Make Monitor remove dead Redis entries from exiting drivers. (#994 ) * WIP: removing OL, OI, TT on client exit; no saving yet. * ray_redis_module.cc: update header comment. * Cleanup: just the removal. * Reformat via yapf: use pep8 style instead of google. * Checkpoint addressing comments (partially) * Add 'b' marker before strings (py3 compat) * Add MonitorTest. * Use `isort` to sort imports. * Remove some loggings * Fix flake8 noqa marker runtest.py * Try to separate tests out to monitor_test.py * Rework cleanup algorithm: correct logic * Extend tests to cover multi-shard cases * Add some small comments and formatting changes.	2017-09-26 00:11:38 -07:00
Eric Liang	d8aa826e63	[webui] Scalability fixes for the task timeline and visualizations (#935 ) * fixes * comments * fix test * Update ui.py * upd * Fix linting.	2017-09-10 15:47:44 -07:00
Robert Nishihara	f3c1248d98	Clone catapult and generate html files during installation. (#956 ) * Clone catapult and generate static html during setup. * Include UI files in installation. * Fix directory to clone catapult to and fix linting. * Use absolute path. * Make sure we find a sufficiently new version of python2 when building wheels. * Copy the trace_viewer_full.html file to the local directory if it is not present. * Make sure wheels fail to build if UI is not included.	2017-09-10 13:41:16 -07:00
Eric Liang	953878364e	[webui] Print out timeline link for full-screen trace viewing (#936 ) * up * update	2017-09-06 01:41:21 -07:00
Eric Liang	a2814567e1	[webui] Quick fix to timeline on task failure (#930 ) * foo * update * Move _add_missing_timestamps to task_profiles function.	2017-09-04 22:58:19 -07:00
Eric Liang	63d8d11714	[webui] Checkboxes should go to the left of their labels (#932 )	2017-09-04 17:05:13 -07:00
Robert Nishihara	8ed03b1cf0	Make task timeline work with ipywidgets==7.0.0, change slider default values. (#925 ) * Make task timeline work with ipywidgets==7.0.0. * Change initial UI slider values from 70-100 to 0-100.	2017-09-03 23:15:46 -07:00
Wapaul1	4db45c9c54	Improved layout of controls for Web UI (#876 ) * Improved layout of controls * Added explicit labels and some comments * Fix linting errors	2017-08-28 14:43:34 -07:00
Robert Nishihara	d43a435c68	Don't redirect worker output to log files if redirect_output=False. (#873 ) * Don't redirect worker output to log files if redirect_output=False. * Fix, handle case where RedirectOutput key is not in Redis.	2017-08-27 14:27:44 -07:00
Robert Nishihara	ca53e9ae7b	Fix bugs in task timeline visualization. (#836 ) * Fix bugs in task timeline visualization. * Some cleanups. * Remove print statements.	2017-08-13 23:39:37 -07:00
alanamarzoev	bfe473fa8c	Embedded task trace with object dependencies. (#818 ) * Embedded timeline * Yeah * Fixed arrows not showing up. * Fixed arrows not showing up, and added check boxes for the kinds of dependencies that should be included in the trace. * first * Fixes * Fixed typo in comments, added more comments. fixed linting. * Added more comments. * Formatting. * fixes * Fixed state.py linting. * Fixed ui.py linting errors. * Fixed linting errors. * Renamed task dependencies and included instructions for viewing arrows. * Fixed according to PR comments. * Fixed bug. * Undid changes to metadata blocks. * Fixes according to comments. * Fixed linting. * Fixed linting. * NOQA keyword added to link line.	2017-08-09 23:00:14 -07:00
Alexey Tumanov	fc885bd918	Adding basic support for a user-interpretable resource label (#761 ) * adding support for the user-interpretable label(UIR) * more plumbing for num_uirs further upstream; set to infty when specified on cmd line * pass default num_uirs for actors; update GlobalStateAPI * support num_uirs in ray.init() * local scheduler resource accounting: support num_uirs; prep for vectorized resource accounting * global scheduler test updated * Fix bug introduced by rebase. * Rename UIR -> CustomResource and add test. * Small changes and use constexpr instead of macros. * Linting and some renaming. * Reorder some code. * Remove cpus_in_use and fix bug. * Add another test and make a small change. * Rephrase documentation about feature stability.	2017-08-08 02:53:59 -07:00
alanamarzoev	64eaaaebf0	Show timeline button. (#809 )	2017-08-03 20:11:50 -07:00
alanamarzoev	99badc7ae4	UI functions in separate file. (#801 ) * UI file. * Fixed linting. * Change UI instructions slightly.	2017-08-02 19:32:18 -07:00
Robert Nishihara	8c8258de20	Move worker methods into Worker class and expose more TaskSpec fields to Python. (#796 ) * Move worker methods inside worker class. Move some helper methods from actor.py into utils.py and state.py. * Add more methods exposing task spec fields to Python. * Fix linting. * Fix error. * Remove unused code in default worker.	2017-08-01 17:16:57 -07:00
Philipp Moritz	c3b39b4d86	Pull Plasma from Apache Arrow and remove Plasma store from Ray. (#692 ) * Rebase Ray on top of Plasma in Apache Arrow * add thirdparty building scripts * use rebased arrow * fix * fix build * fix python visibility * comment out C tests for now * fix multithreading * fix * reduce logging * fix plasma manager multithreading * make sure old and new object IDs can coexist peacefully * more rebasing * update * fixes * fix * install pyarrow * install cython * fix * install newer cmake * fix * rebase on top of latest arrow * getting runtest.py run locally (needed to comment out a test for that to work) * work on plasma tests * more fixes * fix local scheduler tests * fix global scheduler test * more fixes * fix python 3 bytes vs string * fix manager tests valgrind * fix documentation building * fix linting * fix c++ linting * fix linting * add tests back in * Install without sudo. * Set PKG_CONFIG_PATH in build.sh so that Ray can find plasma. * Install pkg-config * Link -lpthread, note that find_package(Threads) doesn't seem to work reliably. * Comment in testGPUIDs in runtest.py. * Set PKG_CONFIG_PATH when building pyarrow. * Pull apache/arrow and not pcmoritz/arrow. * Fix installation in docker image. * adapt to changes of the plasma api * Fix installation of pyarrow module. * Fix linting. * Use correct python executable to build pyarrow.	2017-07-31 21:04:15 -07:00
Robert Nishihara	c394a65ffc	Wait longer when getting redis shards to initialize global state API. (#786 )	2017-07-31 17:56:11 -07:00
alanamarzoev	2b3190ad13	Chrome trace timeline with sliders. (#731 ) * Trace timeline with sliders. * Trace. * Switched ujson to json. * Fixed tests. * linting fixes * Fixed bug. * Cleaned up code. * Fixes according to comments. * removed checkpoints. * Undid accidental delete. * Fixed linting error. * Added documentation to notebook. * Undid accidental deletes. * Add comments and small formatting fixes. * Small fix.	2017-07-17 19:59:49 -07:00
Robert Nishihara	e0867c8845	Switch Python indentation from 2 spaces to 4 spaces. (#726 ) * 4 space indentation for actor.py. * 4 space indentation for worker.py. * 4 space indentation for more files. * 4 space indentation for some test files. * Check indentation in Travis. * 4 space indentation for some rl files. * Fix failure test. * Fix multi_node_test. * 4 space indentation for more files. * 4 space indentation for remaining files. * Fixes.	2017-07-13 21:53:57 +00:00
alanamarzoev	8464d77c76	Change event logs to store one Redis ZSET per worker. (#705 ) * Changing to zset * Fixed bug. * Fixed another bug. * Modified task_profiles. * Removed extra file. * Modified task_profiles test. * WIP * WIP * Undid changes * Updated * WIP * Made changes according to comments. * Removed unneeded print. * Removed ujson usage. * failing test * tests passing * Fixed linting errors and modified style. * Fixed bug. * Fixed linting * Fixed according to comments. * Redis crashing? * Fixed linting * Fixed linting	2017-07-09 01:42:29 +02:00
alanamarzoev	2b11a7bca2	Add task ID and object ID search boxes to web UI. (#704 ) * Task search box. * Cleaned up. * Small reformatting. * Add object table search box.	2017-07-01 17:48:23 -04:00
alanamarzoev	716469160e	Enable dumping profiling information to timeline format viewable by chrome tracing. (#703 ) * Chrome tracing timeline. * Modified decode statement. * Some cleanups and add test. * Remove example. * Fix test.	2017-06-30 12:14:11 -04:00
alanamarzoev	e16df6da9a	Updated task_profiles function to avoid future repetitive parsing. (#691 ) * Updated task_profiles function to avoid future repetitive parsing. * Fix indentation. * Fixed according to comments. * Included updated test for task_profiles function. * Simplify test. * Fix indentation. * Fix.	2017-06-22 19:21:18 -07:00
alanamarzoev	4d5ac9dad5	Include object size and hash in the table returned by the object_table function in the GlobalStateAPI. (#665 ) * added log_table function and a test * fixed log_files and added task_profiles * fixed formatting * fixed linting errors * fixes * removed file * more fixes * hopefully fixed * Small changes. * Fix linting. * Fix bug in log monitor. * Small changes. * Fix bug in travis. * Including data_size and hash in the ResultTableReply. * Included data_size and hash info in object_table. * Fixed bugs in ray_redis_module.cc. * Removing commented out code. * Fixes * Freed hash and data_size strings after using, and checked if they're null along with task_id and is_put. * Changed it so that data_size is set correctly. * Removed iostream import. * Included a check to ensure that the Redis string to long long conversion was successful. * Included separate data_size and hash null checks. * Fixed bug. * Made linting changes. * Another linting error. * Slight simplication.	2017-06-16 23:17:11 -07:00
alanamarzoev	cc4990b543	Task profiles function and test (#647 ) Expose some task profiling information through global state API.	2017-06-13 17:53:34 -07:00
alanamarzoev	f0339f3386	Expose log files through global state API. (#641 ) * added log_table function and a test * fixed log_files and added task_profiles * fixed formatting * fixed linting errors * fixes * removed file * more fixes * hopefully fixed * Small changes. * Fix linting. * Fix bug in log monitor. * Small changes. * Fix bug in travis.	2017-06-08 00:08:10 -07:00
Stephanie Wang	ee08c8274b	Shard Redis. (#539 ) * Implement sharding in the Ray core * Single node Python modifications to do sharding * Do the sharding in redis.cc * Pipe num_redis_shards through start_ray.py and worker.py. * Use multiple redis shards in multinode tests. * first steps for sharding ray.global_state * Fix problem in multinode docker test. * fix runtest.py * fix some tests * fix redis shard startup * fix redis sharding * fix * fix bug introduced by the map-iterator being consumed * fix sharding bug * shard event table * update number of Redis clients to be 64K * Fix object table tests by flushing shards in between unit tests * Fix local scheduler tests * Documentation * Register shard locations in the primary shard * Add plasma unit tests back to build * lint * lint and fix build * Fix * Address Robert's comments * Refactor start_ray_processes to start Redis shard * lint * Fix global scheduler python tests * Fix redis module test * Fix plasma test * Fix component failure test * Fix local scheduler test * Fix runtest.py * Fix global scheduler test for python3 * Fix task_table_test_and_update bug, from actor task table submission race * Fix jenkins tests. * Retry Redis shard connections * Fix test cases * Convert database clients to DBClient struct * Fix race condition when subscribing to db client table * Remove unused lines, add APITest for sharded Ray * Fix * Fix memory leak * Suppress ReconstructionTests output * Suppress output for APITestSharded * Reissue task table add/update commands if initial command does not publish to any subscribers. * fix * Fix linting. * fix tests * fix linting * fix python test * fix linting	2017-05-18 17:40:41 -07:00
Philipp Moritz	28f0882387	Expose function table to python global control state API (#542 ) * expose function table to python global control state API * fix * fix linting * add test for function table	2017-05-16 20:06:13 -07:00
Robert Nishihara	ec2534422b	Remove register_class from API. (#550 ) * Perform ray.register_class under the hood. * Fix bug. * Release worker lock when waiting for imports to arrive in get. * Remove calls to register_class from examples and tests. * Clear serialization state between tests. * Fix bug and add test for multiple custom classes with same name. * Fix failure test. * Fix linting and cleanups to python code. * Fixes to documentation. * Implement recursion depth for recursively registering classes. * Fix linting. * Push warning to user if waiting for class for too long. * Fix typos. * Don't export FunctionToRun if pickling the function fails. * Don't broadcast class definition when pickling class.	2017-05-16 18:38:52 -07:00
Robert Nishihara	0ac125e9b2	Clean up when a driver disconnects. (#462 ) * Clean up state when drivers exit. * Remove unnecessary field in ActorMapEntry struct. * Have monitor release GPU resources in Redis when driver exits. * Enable multiple drivers in multi-node tests and test driver cleanup. * Make redis GPU allocation a redis transaction and small cleanups. * Fix multi-node test. * Small cleanups. * Make global scheduler take node_ip_address so it appears in the right place in the client table. * Cleanups. * Fix linting and cleanups in local scheduler. * Fix removed_driver_test. * Fix bug related to vector -> list. * Fix linting. * Cleanup. * Fix multi node tests. * Fix jenkins tests. * Add another multi node test with many drivers. * Fix linting. * Make the actor creation notification a flatbuffer message. * Revert "Make the actor creation notification a flatbuffer message." This reverts commit af99099c8084dbf9177fb4e34c0c9b1a12c78f39. * Add comment explaining flatbuffer problems.	2017-04-24 18:10:21 -07:00

1 2

62 Commits