catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-06-30 08:25:24 +08:00

Author	SHA1	Message	Date
Scott Sanderson	b0a93d57a0	DOC: Clarify expect_bounded docstring.	2016-09-02 13:33:55 -04:00
Scott Sanderson	8b2446aec6	ENH: Don't allow length=1 regressions/correlations. They're not meaningful, and they cause warnings from numpy. Implemented in terms of a new preprocessor, `expect_bounded`, which takes a tuple of `upper_bound` and `lower_bound`.	2016-09-02 12:49:09 -04:00
Scott Sanderson	115f055c83	MAINT: Clean up downsampling boilerplate. Consolidate docs and mixin applications into one place.	2016-08-17 16:52:09 -04:00
Joe Jevnik	bfa3e6f153	TST: 32b compat doctests	2016-06-21 15:07:03 -04:00
Joe Jevnik	efd7bf72c3	TST: py3 compat doctests	2016-06-21 15:07:03 -04:00
Joe Jevnik	5925107052	TST: fix doctests to actually run	2016-06-21 15:07:03 -04:00
Scott Sanderson	3395b33f1e	BUG: Fix multiple bugs in PanelDailyBarReader. - Return a value from `verify_all_indices_unique` so that `panel` isn't unconditionally `None` in `PanelDailyBarReader`. - Fix a bug where we always set the volume of every asset to `1e9`. - Add minimal suite of tests for get_spot_value, which catch both of the above. NOTE: There are still several issues with `PanelDailyBarReader`. The docstring for `get_spot_value` claims that it will return -1 on days where an asset didn't trade, which isn't the case. It also claims that it will raise `NoDataOnDate` when a request is made outside the panel range, but it just raises a KeyError. We also still have no coverage for `load_raw_arrays`, so it's likely that there are more bugs lurking.	2016-05-06 10:59:14 -04:00
Jean Bredeche	a068eb374a	Merge pull request #1182 from quantopian/no-more-dups DEV: Ensure there are no duplicates in the data passed into TradingAlgorithm.run	2016-05-06 09:55:23 -04:00
Scott Sanderson	bd0f138081	TEST/MAINT: Refactor unique axis verification. Break it into a standalone function that handles any pandas type.	2016-05-05 14:20:47 -04:00
Scott Sanderson	5f190395ad	ENH: Add support for strings in Pipeline. - Adds a new class, ``LabelArray``, which is a subclass of np.ndarray. LabelArray is conceptually similar to pandas.Categorical, in that it stores data with many duplicate values as indices into an array of unique values. For string data with many duplicates (e.g. time-series of tickers or industry classifications), this provides multiple orders of magnitude of improvement when doing string operations, especially string comparison/matching operations. - Adds a new generic object "specialization" for `AdjustedArrayWindow`, and a corresponding ObjectOverwrite adjustment. - Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``. This method is called on the final result of any pipeline expression after screen filtering has occurred. The default implementation of ``postprocess`` is identity, but Classifier overrides it to coerce string columns into pandas.Categoricals before presenting them to the user.	2016-05-04 15:50:52 -04:00
Joe Jevnik	59c8e371a2	ENH: Updates the cli, data bundles and extensions. Adds the data bundle concept which makes it easy for users to register loading functions to build out minute and daily data along with an assets db and adjustments db. By default we have provided a `quandl` bundle which pulls from the public domain WIKI dataset. Users may register new bundles by decorating an ingest function with `zipline.data.bundles.register(<name>)`. This also provides a `yahoo_equities` function for creating an ingestion function that will load a static set of assets from yahoo. The cli is now structured as a couple of subcommands and has been changed to `python -m zipline`. The old behavior of `run_algo.py` has been moved to the `run` subcommand. This is almost entirely the same except that it now takes the name of the data bundle to use, defaulting to `quandl`. The next subcommand is `ingest` which takes the name of a data bundle to ingest. This will run the loading machinery and write the data to a specified location that `run` can find. There is also a `clean` subcommand which deletes the data that was written with `ingest`. Extensions have also been added to zipline. This is an experimental feature where users can provide an extra set of python files to run at the start of the process. These can be used to configure aspects of zipline. Right now the only thing that is supported in an extension file is the registration of a new data bundle.	2016-05-03 18:38:24 -04:00
Andrew Liang	5809ae17f1	DEV: Better error message for sid= in get_open_orders. Let the user know to use asset= instead.	2016-04-26 12:23:57 -04:00
Joe Jevnik	bc0b117dc9	MAINT: make the data loading apis more consistent. Changes BcolzDailyBarWriter to not be an abc; data is passed as an iterator of (sid, dataframe) pairs to the write method. Changes the AssetsDBWriter to be a single class which accepts an engine at construction time and has a `write` method for writing dataframes for the various tables. We no longer support writing the various other data types; callers should coerce their data into a dataframe themselves. See zipline.assets.synthetic for some helpers to do this. Adds many new fixtures and updates some existing fixtures to use the new ones: WithDefaultDateBounds: A fixture that provides the suite a START_DATE and END_DATE. This is meant to make it easy for other fixtures to synchronize their date ranges without depending on each other in strange ways. For example, WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should both have data for the same dates, so they may depend on WithDefaultDates without forcing a dependency between them. WithTmpDir, WithInstanceTmpDir: Provides the suite or individual test case a temporary directory. WithBcolzDailyBarReader: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzDailyBarWriter.write. WithBcolzDailyBarReaderFromCSVs: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from a collection of CSV files and then converted into the bcolz data through BcolzDailyBarWriter.write_csvs. WithBcolzMinuteBarReader: Provides the suite a BcolzMinuteBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzMinuteBarWriter.write. WithAdjustmentReader: Provides the suite a SQLiteAdjustmentReader which reads from an in memory sqlite database. The data will be read from dataframes and then converted into sqlite with SQLiteAdjustmentWriter.write. WithDataPortal: Provides each test case a DataPortal object with data from temporary resources.	2016-04-15 23:46:10 -04:00
Scott Sanderson	1f237d43a3	MAINT: Make preprocessor factories closures.	2016-03-25 15:11:18 -04:00
Scott Sanderson	1245552340	DEV: Add expect_dimensions preprocessor.	2016-03-25 15:11:18 -04:00
Scott Sanderson	b85eb36da8	TEST: Add test for demean example.	2016-03-25 15:11:18 -04:00
Joe Jevnik	a3dbf7590e	TST: doctest failure	2016-01-13 16:36:20 -05:00
Joe Jevnik	6280614a69	DOC: whatsnew	2016-01-13 16:36:20 -05:00
Joe Jevnik	5351b60a4c	ENH: adds optionally for preprocessors	2016-01-13 15:26:37 -05:00
Joe Jevnik	5a235bdaef	ENH: allows users to specify the cutoff time for data query in blaze loaders This allows people to set their cutoff time to the time they will actually execute 'before_trading_start'. Currently this is just passed to the constructor of the loader; however, I would like to make this managed by the algorithm simulation runner. This would help keep all of the loaders in sync and lock 'before_trading_start's execution to the time the data is queried for.	2016-01-13 15:26:13 -05:00
Scott Sanderson	b6175de5f1	DOC/TEST: Add doctest and docs for coerce kwargs.	2016-01-12 17:51:13 -05:00
Scott Sanderson	43b6344d5f	ENH: Add ``coerce`` preprocessor.	2016-01-12 17:36:36 -05:00
Scott Sanderson	4469fbef76	MAINT: Just the value if dtype doesn't have a name.	2015-12-10 17:34:38 -05:00
llllllllll	48536add73	TST: fix doctests	2015-12-09 11:22:13 -05:00
Scott Sanderson	8220d1ee86	ENH: Adds support for different typed adjusted arrays and adds an EarningsCalendar loader. - Moves most of AdjustedArray back into Python. The window iterator is the only part that's performance-intensive. - Adds a bootleg templating system for creating specialized versions of AdjustedArrayWindow for each concrete type we care about. - Adds support for differently dtyped terms in pipeline. This allows us to use datetime64s which are needed in the EarningsCalendar. - Adds EarningsCalendar dataset for the next and previous earnings announcements in pipeline. - Adds in memory loader for EarningsCalendar. - Adds blaze loader for EarningsCalendar.	2015-12-08 20:24:06 -05:00
llllllllll	4238391f6f	DOC: docstring cleanup	2015-10-19 16:35:03 -04:00
llllllllll	3fb91e4d39	MAINT: cleanup doctests	2015-10-19 16:35:03 -04:00
llllllllll	e9ec709453	MAINT: expect_value doctest and lambda over toolz	2015-10-19 16:35:03 -04:00
llllllllll	22ffe3fe49	ENH: adds expect_element preprocessor	2015-10-19 16:35:03 -04:00
Stewart Douglas	4e2039c9b0	ENH: Coerce user input with API method decorator. Previously we have capitalized input strings at different levels in our code: in the user-facing API methods and in the asset finder. This commit moves input string capitalization exclusively to the API method to which the string was supplied. Specifically, the string is capitalized by a preprocess API method decorator. The preprocess decorator passes the input string to the newly defined ensure_upper_case() method, which raises a TypeError if the argument supplied is not a string. ensure_upper_case() is defined in a new file, zipline/utils/input_validation.py. The existing expect_types() method is also moved there. Various tests in tests/test_assets.py are modified to account for the fact that the asset finder method lookup_symbol() no longer capitalizes its supplied argument.	2015-10-08 15:41:33 -04:00

30 Commits
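
Several commits in this log add argument-validating preprocessors, and commit 8b2446aec6 uses one to reject length=1 regressions/correlations. The general idea can be sketched in plain Python as below. This is a minimal illustration, not the code from the repository: `check_bounded` and `trailing_correlation` are hypothetical names, and the `(lower, upper)` ordering and inclusive bounds are assumptions made for the example.

```python
import functools
import inspect


def check_bounded(**bounds):
    """Hypothetical stand-in for a bounds-checking preprocessor.

    Each keyword maps an argument name to an inclusive (lower, upper)
    pair; None means "unbounded" on that side. (Assumed convention.)
    """
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Resolve positional and keyword arguments to parameter names.
            call = sig.bind(*args, **kwargs)
            call.apply_defaults()
            for name, (lower, upper) in bounds.items():
                value = call.arguments[name]
                if lower is not None and value < lower:
                    raise ValueError(
                        "%s() expected %s >= %r, got %r"
                        % (func.__name__, name, lower, value)
                    )
                if upper is not None and value > upper:
                    raise ValueError(
                        "%s() expected %s <= %r, got %r"
                        % (func.__name__, name, upper, value)
                    )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@check_bounded(length=(2, None))  # a single observation has no correlation
def trailing_correlation(xs, ys, length):
    """Pearson correlation of the last `length` points of xs and ys."""
    xs, ys = xs[-length:], ys[-length:]
    mean_x = sum(xs) / length
    mean_y = sum(ys) / length
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

With this sketch, `trailing_correlation([1, 2, 3], [2, 4, 6], length=3)` is computed normally, while passing `length=1` raises a `ValueError` before the function body runs, so the degenerate computation never takes place.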