catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-07-03 01:58:14 +08:00

Author	SHA1	Message	Date
Scott Sanderson	10ab8dc875	MAINT: Remove double import.	2016-08-17 16:52:09 -04:00
Eddie Hebert	51eda06323	MAINT: Add equity to naming of bar data classes. In preparation for adding futures, add equity to the names of both the classes and methods for writing bcolz data. Futures data will use a different minutes per day with a separate reader. This change will allow both equity and futures fixtures to be side by side. Also, break out the method which generates the dataframes and trading days member into fixtures (`EquityMinuteBarData` and `EquityDailyBarData`) on which the `*BarReader` fixture depends. This fixture is separated out to enable reader/writers in different formats to use the same data setup. (There is internal code which needs to write minute and daily bar data in a database format.)	2016-06-30 08:21:42 -04:00
dmichalowicz	393f82e81e	ENH: Add single-column input/output capabilities to pipeline terms	2016-06-23 10:24:09 -04:00
Richard Frank	3d7f63f8c1	MAINT: Removed unused ExceptionSource. No longer used since our lazy data access changes.	2016-06-15 10:43:20 -04:00
Scott Sanderson	bc302beec9	MAINT: Rework event datasets. - Refactored EventsLoader and BlazeEventsLoader to not require a subclass per dataset. Instead, you now pass a map from columns to event fields directly to the EventsLoader constructor. - Removed a large number of Quantopian-specific datasets and associated tests. - Rewrote the core logic of EventsLoader and BlazeEventsLoader to share index calculations across multiple requested columns. - Fixed a bug where event fields were incorrectly forward-filled when null values were present in an event.	2016-06-10 19:22:27 -04:00
Scott Sanderson	5f190395ad	ENH: Add support for strings in Pipeline. - Adds a new class, ``LabelArray``, which is a subclass of np.ndarray. LabelArray is conceptually similar to pandas.Categorical, in that it stores data with many duplicate values as indices into an array of unique values. For string data with many duplicates (e.g. time-series of tickers or industry classifications), this provides multiple orders of magnitude of improvement when doing string operations, especially string comparison/matching operations. - Adds a new generic object "specialization" for `AdjustedArrayWindow`, and a corresponding ObjectOverwrite adjustment. - Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``. This method is called on the final result of any pipeline expression after screen filtering has occurred. The default implementation of ``postprocess`` is identity, but Classifier overrides it to coerce string columns into pandas.Categoricals before presenting them to the user.	2016-05-04 15:50:52 -04:00
Joe Jevnik	59c8e371a2	ENH: Updates the cli, data bundles and extensions. Adds the data bundle concept which makes it easy for users to register loading functions to build out minute and daily data along with an assets db and adjustments db. By default we have provided a `quandl` bundle which pulls from the public domain WIKI dataset. Users may register new bundles by decorating an ingest function with `zipline.data.bundles.register(<name>)`. This also provides a `yahoo_equities` function for creating an ingestion function that will load a static set of assets from yahoo. The cli is now structured as a couple of subcommands and has been changed to `python -m zipline`. The old behavior of `run_algo.py` has been moved to the `run` subcommand. This is almost entirely the same except that it now takes the name of the data bundle to use, defaulting to `quandl`. The next subcommand is `ingest` which takes the name of a data bundle to ingest. This will run the loading machinery and write the data to a specified location that `run` can find. There is also a `clean` subcommand which deletes the data that was written with `ingest`. Extensions have also been added to zipline. This is an experimental feature where users can provide an extra set of python files to run at the start of the process. These can be used to configure aspects of zipline. Right now the only thing that is supported in an extension file is the registration of a new data bundle.	2016-05-03 18:38:24 -04:00
Joe Jevnik	bc0b117dc9	MAINT: make the data loading apis more consistent. Changes BcolzDailyBarWriter to not be an abc; data is passed as an iterator of (sid, dataframe) pairs to the write method. Changes the AssetsDBWriter to be a single class which accepts an engine at construction time and has a `write` method for writing dataframes for the various tables. We no longer support writing the various other data types; callers should coerce their data into a dataframe themselves. See zipline.assets.synthetic for some helpers to do this. Adds many new fixtures and updates some existing fixtures to use the new ones. WithDefaultDateBounds: A fixture that provides the suite a START_DATE and END_DATE. This is meant to make it easy for other fixtures to synchronize their date ranges without depending on each other in strange ways. For example, WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should both have data for the same dates, so they may both depend on WithDefaultDates without forcing a dependency between them. WithTmpDir, WithInstanceTmpDir: Provides the suite or individual test case a temporary directory. WithBcolzDailyBarReader: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzDailyBarWriter.write. WithBcolzDailyBarReaderFromCSVs: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from a collection of CSV files and then converted into the bcolz data through BcolzDailyBarWriter.write_csvs. WithBcolzMinuteBarReader: Provides the suite a BcolzMinuteBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzMinuteBarWriter.write. WithAdjustmentReader: Provides the suite a SQLiteAdjustmentReader which reads from an in-memory sqlite database. The data will be read from dataframes and then converted into sqlite with SQLiteAdjustmentWriter.write. WithDataPortal: Provides each test case a DataPortal object with data from temporary resources.	2016-04-15 23:46:10 -04:00
Eddie Hebert	16fd6681a6	ENH: Rewrite of Zipline to use lazy access pattern. More documentation to follow in release notes. Based on lazy-mainline branch, see for more details. Also-By: Jean Bredeche <jean@quantopian.com> Also-By: Andrew Liang <aliang@quantopian.com> Also-By: Abhijeet Kalyan <akalyan@quantopian.com>	2016-04-04 16:12:58 -04:00
Maya Tydykov	6b3560ade8	MAINT: remove redundant function and move to utils	2016-03-29 13:12:51 -04:00
Scott Sanderson	16c5aecba6	DEV: Add utility for permuting rows in an array. Useful for testing rank-order functions on arrays.	2016-03-25 15:11:18 -04:00
Joe Jevnik	721dd36116	TST: move test_utils and adds test fixture classes. Renames zipline.utils.test_utils to zipline.testing. Adds zipline.testing.fixtures.ZiplineTestCase to manage setup and teardown and adds mixins to define fixtures like an asset finder or trading calendar.	2016-03-10 15:39:52 -05:00

12 Commits
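
Commit 5f190395ad describes storing string data with many duplicates as indices into an array of unique values, so that string comparisons are done once per unique value. That idea can be sketched in plain NumPy as follows. This is only an illustration of the technique, not zipline's `LabelArray`: the helper names `encode_labels` and `equals_label` and the sample tickers are hypothetical.

```python
import numpy as np


def encode_labels(values):
    """Split an array of strings into (categories, codes).

    `categories` holds each distinct string once (sorted), and `codes`
    holds, for every input element, its index into `categories`, so that
    categories[codes] reproduces the input.
    """
    categories, codes = np.unique(np.asarray(values), return_inverse=True)
    return categories, codes


def equals_label(categories, codes, label):
    """Boolean mask of positions whose label equals `label`.

    The string comparison runs once per unique category; the per-element
    work is an integer comparison on `codes`.
    """
    matches = np.flatnonzero(categories == label)
    if matches.size == 0:
        # The label never occurs, so nothing matches.
        return np.zeros(codes.shape, dtype=bool)
    return codes == matches[0]


# Hypothetical sample data: a short series of tickers with duplicates.
tickers = np.array(["AAPL", "MSFT", "AAPL", "IBM", "AAPL", "MSFT"])
categories, codes = encode_labels(tickers)
mask = equals_label(categories, codes, "AAPL")
```

Here `categories` has three entries while `codes` has six, and the match against "AAPL" compares three strings rather than six. The saving grows with the ratio of total elements to unique values.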