catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-06-28 00:58:26 +08:00

Author	SHA1	Message	Date
Conner Fromknecht	99efa7a9f3	Fixed catalyst tests except example tests	2017-06-19 14:43:10 -07:00
Scott Sanderson	61feedbd16	TST: Add test for missing values in relabel.	2017-06-07 18:21:13 -04:00
Scott Sanderson	3b8a6b543e	BUG: Fix NoneType comparisons in PY3.	2017-06-07 18:21:03 -04:00
Scott Sanderson	5b9d5fecfb	ENH: Add `relabel` method to string classifiers. - Adds a `map` method to `LabelArray` that maps a unary function over the categories of a LabelArray, shrinking the underlying codes if possible. - Adds a new `.relabel` method to string-dtype classifiers that maps a unary function over the unique elements of the underlying LabelArray. This is useful for things like cleaning noisy label data.	2017-06-07 13:14:12 -04:00
Joe Jevnik	d07f133579	STY: remove unused imports and method, clean up docs	2016-10-28 15:04:18 -04:00
Joe Jevnik	af3e1016a0	TST: add tests for postprocess and to_workspace_value	2016-10-28 15:04:18 -04:00
Scott Sanderson	8b1136d9d5	ENH: Validate missing_values at term construction. Finds bugs in several bad tests that were constructing invalid terms.	2016-05-10 19:43:56 -04:00
Scott Sanderson	2431aaefb5	BUG: Fix bad error message for element_of. It referred to the wrong method name (`is_element`).	2016-05-10 16:57:59 -04:00
Scott Sanderson	e0aeda4c3e	BUG: Fix bytes/unicode issues in py3.	2016-05-05 01:46:35 -04:00
Scott Sanderson	b78501e54a	BUG: Fix broken isnull() on string classifiers. Adds a special case in NullFilter to handle LabelArrays correctly.	2016-05-04 17:26:27 -04:00
Scott Sanderson	5a1ed7b1d3	ENH: Make element_of work for ints too.	2016-05-04 16:31:58 -04:00
Scott Sanderson	c40bbfae03	TEST: More tests for string predicates.	2016-05-04 15:54:50 -04:00
Scott Sanderson	5f190395ad	ENH: Add support for strings in Pipeline. - Adds a new class, ``LabelArray``, which is a subclass of np.ndarray. LabelArray is conceptually similar to pandas.Categorical, in that it stores data with many duplicate values as indices into an array of unique values. For string data with many duplicates (e.g. time-series of tickers or industry classifications), this provides multiple orders of magnitude of improvement when doing string operations, especially string comparison/matching operations. - Adds a new generic object "specialization" for `AdjustedArrayWindow`, and a corresponding ObjectOverwrite adjustment. - Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``. This method is called on the final result of any pipeline expression after screen filtering has occurred. The default implementation of ``postprocess`` is identity, but Classifier overrides it to coerce string columns into pandas.Categoricals before presenting them to the user.	2016-05-04 15:50:52 -04:00
Scott Sanderson	9a04621781	ENH: Add __eq__ and __ne__ to Classifier.	2016-03-28 15:46:28 -04:00
Scott Sanderson	758d6c74fc	ENH: Add isnull and notnull for classifiers.	2016-03-25 15:11:18 -04:00
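
Commit 5f190395ad describes ``LabelArray`` as storing heavily duplicated strings as indices into an array of unique values. The following is a minimal plain-Python sketch of that idea, not the actual ``LabelArray`` implementation (which subclasses np.ndarray). The function names `encode` and `startswith` are hypothetical and chosen only for illustration.

```python
def encode(values):
    """Return (categories, codes) with categories[codes[i]] == values[i]."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    codes = [index[v] for v in values]
    return categories, codes


def startswith(categories, codes, prefix):
    """Element-wise prefix test computed via the unique categories."""
    # Evaluate the string predicate once per unique category ...
    hits = [cat.startswith(prefix) for cat in categories]
    # ... then answer for every element with an integer lookup.
    return [hits[c] for c in codes]


tickers = ["AAPL", "MSFT", "AAPL", "GOOG", "MSFT", "AAPL"]
categories, codes = encode(tickers)
mask = startswith(categories, codes, "A")
```

The string work scales with the number of unique values rather than the number of elements, which is a plausible source of the speedup the commit message reports for data with many duplicates.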

15 Commits
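
Commit 5b9d5fecfb describes mapping a unary function over the categories of a LabelArray and shrinking the codes when several categories map to the same result. A rough sketch of that behaviour on plain lists is shown here, assuming categories and codes are stored separately. `map_categories` is a hypothetical helper, not the library's `map` or `relabel` method, and it ignores missing-value handling.

```python
def map_categories(categories, codes, func):
    """Apply func to each unique category and merge duplicates in the result."""
    new_categories = []
    position = {}   # new label -> its index in new_categories
    remap = []      # old code -> new code
    for cat in categories:
        new = func(cat)
        if new not in position:
            position[new] = len(new_categories)
            new_categories.append(new)
        remap.append(position[new])
    # Rewrite every element's code through the old -> new table.
    new_codes = [remap[c] for c in codes]
    return new_categories, new_codes


# Noisy labels: two spellings of the same ticker.
categories = ["AAPL", "aapl ", "MSFT"]
codes = [0, 1, 2, 1, 0]
clean_categories, clean_codes = map_categories(
    categories, codes, lambda s: s.strip().upper()
)
```

Here `func` is called once per category, and the two spellings of the same ticker collapse into a single code.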