catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-06-28 14:47:08 +08:00

Author	SHA1	Message	Date
Conner Fromknecht	99efa7a9f3	Fixed catalyst tests except example tests	2017-06-19 14:43:10 -07:00
Ana Ruelas	092951470a	DOC: Fix invalid sphinx sections	2017-06-05 15:52:57 -04:00
Scott Sanderson	22df0a9cb9	MAINT/STY: Upgrade flake8 and fix new failures.	2017-05-15 11:45:04 -04:00
Joe Jevnik	0123bb8a97	ENH: prune the graph based on the initial workspace	2016-10-28 15:04:18 -04:00
Scott Sanderson	a8b67d352e	MAINT: Refactor in prep for downsampled terms. - Split out extra_rows handling into an `ExecutionPlan` subclass. `ExecutionPlan` now requires the dates and calendar against which a set of terms will be computed, and now defers to a term's `compute_extra_rows` method when deciding how many extra rows are required to compute for that term. This will allow downsampled terms to request enough extra rows to guarantee that we can maintain consistent calculation dates. As a consequence of the above, `TermGraph` now only deals with logical dependencies, not with metadata surrounding extra row calculations. This means that TermGraph can be used to generate dependency visualizations in interactive contexts where we don't yet have a calendar or start/end dates. - Refactored test_{filter,factor,classifier} to use check_terms instead of run_graph. This makes it easier to make changes to TermGraph, since the testing interface is now to simply provide a dict of terms. - Refactored BasePipelineTestCase to use fixtures to create an asset finder. This fixes a potential leak of the test's asset db, which was not being explicitly cleaned up. - Refactored test_technical to use BasePipelineTestCase. - Added a new special term, `InputDates()`, which can be used to request date labels for inputs. Like `AssetExists`, `InputDates` is provided in the initial workspace by default. - Added a default (failing) `_compute` method to `AssetExists` which provides a more useful error than AttributeError.	2016-08-17 16:52:09 -04:00
Jean Bredeche	e6af4e4f1b	ENH: made `exchange` a required parameter to `Asset` and its subclasses This required updating a lot of tests.	2016-08-02 23:21:39 -04:00
Scott Sanderson	49bb8264dc	ENH: Finish adding groupby to rank/top/bottom. - Added test coverage for grouped and masked top/bottom. - Added test coverage for grouped rank on datetime factors. - Fixed an issue where grouped rank would fail on datetime inputs because unary-negative isn't defined for datetimes. We now instead directly invoke a function from rank.pyx that does the normalizations as needed. - Fixed an issue where GroupedRowTransform assumed that it produced the same dtype as its input. This isn't true for rank() of a datetime-dtype factor. GroupedRowTransform now takes a required dtype parameter. - Similarly, fixed an issue where GroupedRowTransform assumed that its missing_value was the same as its parent's, which isn't true for rank() of a datetime-dtype factor. GroupedRowTransform now takes a required dtype parameter. - Fixed an issue where Factor.demean() and Factor.zscore() weren't properly cached because their static_identity included a closure that was dynamically generated on each invocation. They both now always use a function defined at module scope.	2016-07-26 02:57:35 -04:00
Joe Jevnik	958d455a7a	ENH: Support default params for terms	2016-07-12 18:49:24 -04:00
dmichalowicz	393f82e81e	ENH: Add single-column input/output capabilities to pipeline terms	2016-06-23 10:24:09 -04:00
Joe Jevnik	5925107052	TST: fix doctests to actually run	2016-06-21 15:07:03 -04:00
dmichalowicz	86486803b6	BUG: custom factor outputs naming collisions	2016-05-25 15:41:16 -04:00
Scott Sanderson	65de1215e0	Merge pull request #1204 from quantopian/tell-me-what-my-choices-were Tell me what my choices were	2016-05-19 18:52:04 -04:00
dmichalowicz	1ec0bced6d	ENH: Add builtin factors for correlation and regression	2016-05-18 15:11:12 -04:00
Scott Sanderson	4a513360b6	ENH: Include choices in no-output-found errormsg.	2016-05-17 17:51:24 -04:00
Joe Jevnik	784d5f4a16	Merge pull request #1199 from quantopian/boybands-factor BollingerBands factor	2016-05-13 15:35:10 -04:00
Joe Jevnik	78db90a858	STY: flake8	2016-05-12 17:01:17 -04:00
Joe Jevnik	f494d6f0d1	BUG: Fix check that pipeline argument is hashable. Adds test coverage for the case where it is not hashable.	2016-05-11 21:37:12 -04:00
Scott Sanderson	8b1136d9d5	ENH: Validate missing_values at term construction. Finds bugs in several bad tests that were constructing invalid terms.	2016-05-10 19:43:56 -04:00
Scott Sanderson	4d42cddae4	ENH: Fail fast on outputs in CustomClassifier. We don't support multiple outputs for CustomClassifier because we use LabelArrays for string classifiers.	2016-05-04 15:54:50 -04:00
Scott Sanderson	5f190395ad	ENH: Add support for strings in Pipeline. - Adds a new class, ``LabelArray``, which is a subclass of np.ndarray. LabelArray is conceptually similar to pandas.Categorical, in that it stores data with many duplicate values as indices into an array of unique values. For string data with many duplicates (e.g. time-series of tickers or industry classifications), this provides multiple orders of magnitude of improvement when doing string operations, especially string comparison/matching operations. - Adds a new generic object "specialization" for `AdjustedArrayWindow`, and a corresponding ObjectOverwrite adjustment. - Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``. This method is called on the final result of any pipeline expression after screen filtering has occurred. The default implementation of ``postprocess`` is identity, but Classifier overrides it to coerce string columns into pandas.Categoricals before presenting them to the user.	2016-05-04 15:50:52 -04:00
dmichalowicz	d9bfcaabde	ENH: Support multiple outputs for custom factors	2016-04-21 10:57:29 -04:00
Scott Sanderson	3c53b4944b	TEST: Test not calling super()._validate.	2016-03-19 19:09:16 -04:00
Scott Sanderson	53d3b0855b	ENH: Add support for Classifiers. Classifiers are computations that represent grouping keys. They can be used in conjunction with normalization functions like ``zscore`` or ``demean`` to perform normalizations over subsets of a dataset. Notable changes: - Added ``demean()`` and ``zscore()`` methods to ``Factor``. - Added classifier versions of ``Latest`` and ``CustomTermMixin``. The .latest attribute of int64 dataset columns now produces a classifier by default. - Added ``Everything``, a classifier that maps all data to the same value. - Added ``zipline.lib.normalize``, which implements a naive, pure-Python grouped normalize function. This will likely be moved to Cython in a subsequent PR.	2016-03-19 17:04:28 -04:00
Scott Sanderson	535d05e714	MAINT: Remove notion of "atomic" pipeline terms. Replace it by distinguishing between "Loadable" and "Computable". This is useful because it's now possible to write computable terms that don't require any inputs (e.g. an `Always` filter or an `Everything` classifier).	2016-03-08 13:49:45 -05:00
Scott Sanderson	d889f8b08b	BUG: Don't use deprecated attribute of exception.	2016-02-16 13:43:25 -05:00
Scott Sanderson	0115cdc46c	MAINT: Fail fast on unsupported dtypes.	2016-02-12 21:23:47 -05:00
Scott Sanderson	09be7acaa8	TEST: Test forwarding of missing_value.	2016-02-12 21:23:47 -05:00
Scott Sanderson	c105735574	DEV: Add support for specifying missing_value. Consequently, enable support for `int`-dtyped Factors and BoundColumns.	2016-02-12 21:23:47 -05:00
Scott Sanderson	a96dd70634	MAINT: Rename ConstantLoader to PrecomputedLoader.	2016-02-12 21:21:19 -05:00
Scott Sanderson	0c15f50231	TEST: Add dedicated testing dataset.	2016-02-12 21:20:18 -05:00
Scott Sanderson	28fdecc98b	ENH: Make .latest return a Filter on bool columns.	2016-02-12 21:20:18 -05:00
Scott Sanderson	5f49fa22cb	MAINT: Upgrade numpy and fix warnings. Mostly fixes ambiguous calls to numpy.full, and uses explicitly-united NaT values.	2016-02-11 18:46:39 -05:00
Joe Jevnik	68cf236944	TST: Add test case for adding columns in subclass	2015-12-29 10:12:39 -05:00
llllllllll	32baac4e4b	ENH: Make datasets have subclass relationships	2015-12-22 12:25:30 -05:00
Scott Sanderson	2235a53581	ENH: Add EWMA and `DollarVolume` factors.	2015-12-11 22:13:27 -05:00
Scott Sanderson	8220d1ee86	ENH: Adds support for different typed adjusted arrays and adds an EarningsCalendar loader. - Moves most of AdjustedArray back into Python. The window iterator is the only part that's performance-intensive. - Adds a bootleg templating system for creating specialized versions of AdjustedArrayWindow for each concrete type we care about. - Adds support for differently dtyped terms in pipeline. This allows us to use datetime64s which are needed in the EarningsCalendar. - Adds EarningsCalendar dataset for the next and previous earnings announcements in pipeline. - Adds in memory loader for EarningsCalendar. - Adds blaze loader for EarningsCalendar.	2015-12-08 20:24:06 -05:00
Richard Frank	2dabda6b76	MAINT: Reworked Term atomicity	2015-10-12 16:11:19 -04:00
Richard Frank	e880fa3e34	PERF: Batch load atomic terms by dataset Added CompositeTerm and now we dispatch more generally on atomic	2015-10-12 10:48:28 -04:00
Scott Sanderson	f82a01841b	MAINT: Rename ALL the things. zipline.modelling.* -> zipline.pipeline.* zipline.data.ffc.loaders -> zipline.pipeline.loaders tests/modelling -> tests/pipeline	2015-10-01 18:03:53 -04:00

39 Commits