wassname/ray

mirror of https://github.com/wassname/ray.git synced 2026-06-28 03:18:59 +08:00

Author	SHA1	Message	Date
Peter Schafhalter	68b11c8251	[DataFrame] Speed up dtypes (#2118 ) * Don't recreate _block_partitions in _correct_dtypes Further dtypes performance optimizations Fix bugs Redesign speedup Address feedback * Remove _correct_column_dtypes	2018-05-23 16:35:17 -07:00
Kunal Gosar	4584193308	[DataFrame] Refactor GroupBy Methods and Implement Reindex (#2101 ) * fix 1D blocks case * Add groupby test code * begin writing tests * Fix bug on groupby(axis=1, ...) * implement reindex * fix index misalignment after groupby * fix df.apply bug * fix groupby.apply * fix agg funcs * finish groupby tests * finish test suite for groupby * fixing lint * undo new line * fix python2 index copy bug * Concat Series into ray.df * fixing python2 issues * resolving all python 2 tests * handle multiindex on apply * resolve comments * updating docstring * fix lint * fix lint again * address comments	2018-05-22 16:34:07 -07:00
Devin Petersohn	317c9450e7	[DataFrame] Test bugfixes (#2111 )	2018-05-21 23:01:19 -07:00
Peter Schafhalter	f1fc373de7	[DataFrame] Update initializations of IndexMetadata which use outdated APIs (#2103 ) * Update calls which use outdated APIs * Fix lengths of IndexMetadata	2018-05-21 09:19:41 -10:00
Alok Singh	f795173b51	Use flake8-comprehensions (#1976 ) * Add flake8 to Travis * Add flake8-comprehensions [flake8 plugin](https://github.com/adamchainz/flake8-comprehensions) that checks for useless constructions. * Use generators instead of lists where appropriate A lot of the builtins can take in generators instead of lists. This commit applies `flake8-comprehensions` to find them. * Fix lint error * Fix some string formatting The rest can be fixed in another PR * Fix compound literals syntax This should probably be merged after #1963. * dict() -> {} * Use dict literal syntax dict(...) -> {...} * Rewrite nested dicts * Fix hanging indent * Add missing import * Add missing quote * fmt * Add missing whitespace * rm duplicate pip install This is already installed in another file. * Fix indent * move `merge_dicts` into utils * Bring up to date with `master` * Add automatic syntax upgrade * rm pyupgrade In case users want to still use it on their own, the upgrade-syn.sh script was left in the `.travis` dir.	2018-05-20 16:15:06 -07:00
Peter Schafhalter	9e46de9830	[DataFrame] Update _inherit_docstrings (#2085 ) * Update _inherit_docstrings * Add groupby __init__	2018-05-18 17:50:41 -10:00
Simon Mo	0b07602c89	[DataFrame] Refactor __delitem__ (#2080 ) * Implement the bug fix * Fix flake8	2018-05-18 08:58:20 -10:00
Kunal Gosar	afbb260ca4	[DataFrame] Improve performance of iteration methods (#2026 ) * fix iterrows * make iteration methods performant * resolving comments * remove indexing from iterator * switch to iterator syntax	2018-05-17 11:45:21 -10:00
Peter Schafhalter	ae17ebd032	[DataFrame] Implement to_csv (#2014 ) * Add map, reduce, merge_dtypes bug fixes Unify dtypes on DataFrame creation Formatting and comments Cache dtypes Fix bug in _merge_dtypes Fix bug Changed caching logic Fix dtypes issue in read_csv Invalidate dtypes cache when inserting column Simplify unifying dtypes and improve caching Fix typo Better caching of dtypes Fix merge conflicts Implemented some to_csv functions Support read_csv from buffers Expose date_range, NaT, Timedelta from pandas Add testing utils Redirect imports to Pandas Fix imports Fix read_csv when index_col is specified Update imports from Pandas Fix bugs Use util API Fix nasty bug Add missing import Don't distribute reading of compressed files Add test utilities for Pandas tests Add test for to_csv Add warnings Fix rebase artifacts * Fix rebase artifacts * Fix bugs in read_csv indexing * Fix bugs in read_csv * Fix bug for IndexMetadata with _length 1 Remove testing imports * Rebase artifacts and formatting * Start to_csv without CSV formatter * Fix bug in _map_partitions * Initial implementation for improved to_csv * Fix bug with insert * Bugfixes * Remove CSV Formatter * Formatting * Fix python2 bug * Fix additional python2 issue	2018-05-17 11:35:17 -10:00
Kunal Gosar	7549209aea	[DataFrame] Allows DataFrame constructor to take in another DataFrame (#2072 ) * allow constructor to take in other DataFrame * rename _data to _frame_data	2018-05-16 16:17:20 -10:00
Simon Mo	825c227c2b	[DataFrame] Refactor indexers and implement setitem (#2020 ) * Reset to_pandas change; update current * Fix pd_df bug	2018-05-15 12:27:28 -07:00
Robert Nishihara	8fbb88485b	Create RemoteFunction class, remove FunctionProperties, simplify worker Python code. (#2052 ) * Cleaning up worker and actor code. Create remote function class. Remove FunctionProperties object. * Remove register_actor_signatures function. * Small cleanups. * Fix linting. * Support @ray.method syntax for actor methods. * Fix pickling bug. * Fix linting. * Shorten testBlockingTasks. * Small fixes. * Call get_global_worker().	2018-05-14 14:35:23 -07:00
Devin Petersohn	89e2eef3f3	[DataFrame] Fixing bugs in groupby (#2031 )	2018-05-10 11:44:19 -07:00
Kunal Gosar	b79912ec74	[DataFrame] Fixes dropna subset bug (#2018 ) * fix dropna * resolve comment	2018-05-10 08:25:24 -07:00
Devin Petersohn	72a3a6cb02	[DataFrame] Implement where (#1989 )	2018-05-09 14:05:52 -07:00
Kunal Gosar	d2c193ed2c	[DataFrame] Add direct pandas imports for MVP (#1960 ) * Add direct pandas imports for MVP * rebase artifact	2018-05-08 19:19:32 -07:00
Devin Petersohn	b1e32ca6c2	Fixing ascii error for Python2 (#2009 )	2018-05-07 11:56:40 -07:00
Peter Veerman	1f82a46473	[DataFrame] Implements df.update (#1997 ) * Working on fixing update * Fixing update implementation * Adding test * Addressing comments	2018-05-07 08:55:40 -07:00
Peter Veerman	12da021717	[DataFrame] Implements df.as_matrix (#2001 ) * Implement df.as_matrix * Addressing comments * Addressing comments	2018-05-06 23:36:39 -07:00
Rohan Singh	1848745223	[DataFrame] Implement quantile (#1992 ) * added quantile method * updated init for datetime signatures * updated documentation for _map_partitions return type * removed extraneous print call * updated for simplicity * fixed dtyping issues and error raising * updated datetime dtype checking * Fixing quantile implementation * Fix minor bug * Fixing diff	2018-05-06 18:25:13 -07:00
Devin Petersohn	ad1afeb268	[DataFrame] Implement sort_values and sort_index (#1977 )	2018-05-06 09:53:29 -07:00
Rohan Singh	9f28529e2c	[DataFrame] Implement rank (#1991 ) * rank method completed * added sanity checks * flake8 * updated sanity checks * flake8 * updated sanity checks and style * updated dtype logic * Fixing test	2018-05-06 09:32:33 -07:00
Hari Subbaraj	857458c37c	[DataFrame] Implemented prod, product, added test suite (#1994 ) * implemented prod/product, modified declaration for sum, added pandas test suite * fixed tests * removed test_analytics file * implemented nunique, skew * fixed requested changes * added nunique, skew * fixed tests in request * added newline back * fixed newlines hopefully * fixed flake8 issues * more flake8 issues * fixed test for prod	2018-05-05 21:25:42 -07:00
Jae Min Kim	36fd64800b	[DataFrame] Implemented __setitem__, select_dtypes, and astype (#1941 ) * added reindex, __setitem__, select_dtypes, and astype functionality * readded tests for astype and select_dtypes * fixed index issue with reindex * lint spacing * removing current reindex implementation for future pr * wrong testing func * errors now raised in the workers, but suppressing them can be an issue * updated code for select_dtypes * Update test_dataframe.py	2018-05-05 20:27:29 -07:00
Rohan Singh	8509a51291	[DataFrame] Implement diff (#1996 ) * added diff method * sanity checks * flake8 * updated sanity checks' * rebase and style updates * updated diff tests and cleaned up code * updated tests for periods * flake8	2018-05-05 00:44:52 -07:00
Hari Subbaraj	a58629f53f	[DataFrame] Implemented nunique, skew (#1995 ) * implemented prod/product, modified declaration for sum, added pandas test suite * fixed tests * removed test_analytics file * implemented nunique, skew * fixed changes for nunique, skew * fixed nunique test * added axis=1 test to skew * flake8 issues * more flake8 issues * resolve some flake issues	2018-05-04 12:22:10 -07:00
Kunal Gosar	4030356b51	[DataFrame] Implements filter and dropna (#1959 ) * implement filter * begin implementation of dropna * implement dropna * docs and tests * resolving comments * resolving merge * add error checking to dropna * fix update inplace call * Implement multiple axis for dropna (#13) * Implement multiple axis for dropna * Add multiple axis dropna test * Fix using dummy_frame in dropna * Clean up dropna multiple axis tests * remove unnecessary axis modification * Clean up dropna tests * resolve comments * fix lint	2018-05-04 12:21:16 -07:00
Peter Veerman	22d4950fae	[DataFrame] Implements df.pipe (#1999 ) * Add empty df test * Fix flake8 issues * rebase with master * reset master tests * Implement df.pipe * fix tests * Use test_pipe as a pytest.fixture * Add newline at EOF	2018-05-04 10:16:05 -07:00
Omkar Salpekar	a1d7bb31a4	[DataFrame] Apply() for Lists and Dicts (#1973 ) * working for non-string functions and not lists of functions * works with functions as strings now as well * fixed linting errors * throwing a warning if the input is a dictionary * added dict of lists functionality * fix minor indexing errors and lint * removed some commented out code * some comments and thoughts for apply * cleaned up code a little bit and added todos * improved performance * error checking and code cleanup and comments * small change * improved list performance a lot * agg calls apply for lists * addressing comments on the PR * col_metadata change * updated tests to expect TypeError where appropriate	2018-05-04 10:05:00 -07:00
Alok Singh	cdf94c18a4	Clean up syntax for supported Python versions. (#1963 ) * Use set/dict literal syntax Ran code through [pyupgrade](https://github.com/asottile/pyupgrade). This is supported in every Python version 2.7+. * Drop unnecessary string format specification No need to specify 0,1.. if parameters are passed in order. * Revert "Drop unnecessary string format specification" This reverts commit efa5ec85d30ff69f34e5ed93e31343fea7647bcb. * Undo changes to cloudpickle Drop use of set literal until cloudpickle uses it. * Reformat code with YAPF We need to set up a git pre-push hook to automatically run this stuff.	2018-05-03 07:45:11 -07:00
Kunal Gosar	d85ee0bc04	[DataFrame] Implements mode, to_datetime, and get_dummies (#1956 ) * implement mode and fix getitem * mode broken on misaligned partitions * fully implement mode * implement to_datetime * implement get_dummies * implement tests * fix __getitem__ * fix python2 compatibility * fix getitem bug * resolving comments * Adding documentation * resolving comment * resolve name change * speeding up getitem * complete rebase	2018-05-02 23:21:00 -07:00
Peter Schafhalter	d67b786291	[DataFrame] Fix dtypes (#1930 ) * Add map, reduce, merge_dtypes bug fixes Unify dtypes on DataFrame creation Formatting and comments Cache dtypes Fix bug in _merge_dtypes Fix bug Changed caching logic Fix dtypes issue in read_csv Invalidate dtypes cache when inserting column Simplify unifying dtypes and improve caching Fix typo Better caching of dtypes Fix merge conflicts * Correct dtypes on initialization	2018-05-02 23:04:19 -07:00
Devin Petersohn	4badc04bb2	[DataFrame] Add layer of abstraction to allow OID instantiation (#1984 )	2018-05-02 22:29:52 -04:00
Patrick Yang	5589426484	[DataFrame] Fix blocking issue on _IndexMetadata passing (#1965 ) * metadata passing fixes * fix flake8 * fix test failures * overhaul indexmetadata * variable name change * optimization for building coord df * addressing comments * subtle bug fixes	2018-05-01 23:27:49 -07:00
Devin Petersohn	7c1d569a49	[DataFrame] Implement df.merge (#1964 ) * Begin merge implementation * Some cleanup * Continue cleanup * Allowing merge on index * Copy dataframes to clear plasma read-only error * Make some notes, WIP * Cleaned up code a bit, still need more error checking * Adding error checking and addressing comments * Addressing comment * Adding test * Addressing rebase artifact * Fixing indexing bug * Some minor cleanup	2018-05-01 21:40:53 -04:00
Omkar Salpekar	1231aa0582	[DataFrame] Sample implement (#1954 ) * implemented sample - need to test * sample fully working * added sanity check tests * added some comments to clarify the _deploy_func call * some more clarifying comments * added explanatory comments * minor change in weights_sum for sample	2018-04-30 10:42:28 -07:00
Devin Petersohn	0c477fbbca	[DataFrame] Implement Inter-DataFrame operations (#1937 )	2018-04-30 06:42:07 -07:00
Devin Petersohn	1d1df7bbec	[DataFrame] Fully implement append, concat and join (#1932 )	2018-04-23 17:09:57 -07:00
Kunal Gosar	29c36f2bce	[DataFrame] Fix for __getitem__ string indexing (#1939 ) * edge case fixes for __getitem__ * Enable None indexing	2018-04-23 13:13:14 -07:00
Kunal Gosar	7c9f39241e	[DataFrame] Implementing write methods (#1918 ) * Add in write methods and functionality * infer highest available pickle version * Fix import rebase artifact * formatting changes to test * fix lint	2018-04-22 21:25:33 -07:00
Devin Petersohn	8f59546ef2	[DataFrame] Implementing API correct groupby with aggregation methods (#1914 )	2018-04-21 17:28:16 -07:00
adgirish	3c48783a16	[DataFrame] Adding read methods and tests (#1712 ) * Adding read methods and tests * Referencing internal partition method so constructors are more canonical with Pandas * Fixing to reference from_pandas in utils * Cleaning up unused imports * rerunning tests * fixing flake8 * resolving errors * Added sql and sas test * updating * Temporarily phasing out read_csv code for wrapper while diagnosing, added io tests to travis * Adding travis * restoring distributed read csv * resolving rebases * lint * Sampling out HD test * adding dep * fix pathing * Flagging out tests * resolving read_method issues * fix build issue * move additional dependencies to extras * fixing lint * removing IO dependencies * updated requirements doc	2018-04-20 18:33:08 -07:00
Omkar Salpekar	0728d4719b	[DataFrame] Eval fix (#1903 ) * eval now works without assignment - helper function a bit hacky * removed df.copy() from eval_helper * one test still failing for query * all eval tests passing now * added check to eval arge verification * added tests to travis * added optimization and some comments * added pd.eval and passes all tests * added ray dataframe back to test file * optimizations and code cleanup for eval * changed position of pandas import in __init__ * fixed linting errors * fixing eval in __init__.py * fixed travis file - removed extra tests * removed test directory from linting exclude for travis	2018-04-18 08:48:32 -07:00
Devin Petersohn	3c817ad908	Add slice functionality (#1832 )	2018-04-16 08:50:56 -07:00
Patrick Yang	f505f0642f	[DataFrame] Pass read_csv kwargs to _infer_column (#1894 ) * pass kwargs to _infer_column * adding small test for non-comma delim * fix lint	2018-04-16 08:47:30 -07:00
Peter Schafhalter	1d605e8f8a	[DataFrame] Inherit documentation from Pandas (#1727 ) * Added _inherit_docstrings * DataFrame documentation inherits from Pandas * Fix formatting * Replace hasattr and document properties * Fix rebase * Override documentation for groupby * Override documentation for series * Don't overwrite property docstrings * Fix property __doc__ for python2	2018-04-12 20:30:19 -07:00
Omkar Salpekar	a3ddde398c	[DataFrame] Fixed repr, info, and memory_usage (#1874 ) * working with dataframes with too many rows and columns * repr works for jupyter notebooks now * added comments and test file * added repr test file to .travis.yml * added back ray.dataframe as pd to test file * fixed pandas importing issues in test file * getting the front and back of df more efficiently * only keeping dataframe tests in travis * fixing numpy array for row and col lengths issue * doesn't add dimensions if df is small enough * implemented memory_usage() * completed memory_usage - still failing 2 tests * only failing one test for memory_usage * all repr and dataframes tests passing now * fixing error related to python2 in info() * fixing python2 errors * fixed linting errors * using _arithmetic_helper in memory_usage() * fixed last lint error * removed testing-specific code * adding back travis test * removing extra tests from travis * re-added concat test * fixes with new indexing scheme * code cleanup * fully working with new indexing scheme * added tests for info and memory_usage * removed test file	2018-04-11 08:07:07 -07:00
Devin Petersohn	806b2c844e	Fix getattr compat (#1871 )	2018-04-10 21:28:59 -07:00
Patrick Yang	521b549e4a	[DataFrame] Encapsulate index and lengths into separate class (#1849 ) * baseline impl for index_df.py * added skeleton for index_df.py * initial impl index_df * separate out partition and non-partition impls * add len function * drop returns index_df slice of dropped indices * housecleaning * Integrate index overhaul * Rename index df to index metadata * Fix flake8 issues * Addressing issues * fix import issue * Added metadata passing to constructor	2018-04-10 14:30:20 -07:00
Peter Schafhalter	405b05d58a	[DataFrame] Implemented __getattr__ (#1753 ) * __getattr__ accesses columns * Added test	2018-04-10 10:19:33 -07:00

81 Commits