catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-07-01 09:59:46 +08:00

Author	SHA1	Message	Date
Eddie Hebert	f4891b0a08	TST: Key trading calendar fixture with Asset types. Instead of using strings of 'equities' and 'futures', use the Asset subclasses to key the trading calendar fixtures.	2016-08-08 03:49:48 -04:00
Eddie Hebert	dd2c7db22d	TST: Use sum for volume on daily data resample. Change the mock minute data to no longer use an increasing arange, so that a day's worth of minute data can be summed and fit inside of a uint32. This change was required because of working on new test data that looked like [0, 100, 200, 0, ] which was resulting in a daily rollup of 0 data, when the coverage needed a non-0 value. Also, factor out the resampling function, with an eye on making it easier to convert from minute bars to daily bars during ingest/load processes.	2016-08-05 14:24:14 -04:00
Eddie Hebert	e934c6aeaf	TST: Make room for multiple calendars in tests. When adding fixtures for futures data, there will be a need for multiple calendars in the fixture ecosystem. For example, a test that includes both equities and futures would need an overall calendar which encompasses both equities and futures; however, the test data for equities should still be limited to the bounds set by the NYSE calendar. Make the fixtures that set up trading calendars and values derived from the trading calendar (e.g. trading sessions) accept an iterable of calendars which need to be created, then populate those values into a dict keyed by the calendar name. Change `WithNYSETradingDays` to include sessions in the name, since we are moving to session as the name for the 'day' unit. Provide `trading_days`, which is really "NYSE trading sessions", on `WithTradingSessions` for backwards compatibility.	2016-08-05 12:17:27 -04:00
Jean Bredeche	9ae725b940	ENH: update `register_calendar` API to take a specific name	2016-08-02 23:12:07 -04:00
Joe Jevnik	4265a13edf	Revert "Merge pull request #1354 from quantopian/revert-1302-point-in-time-asset-db" This reverts commit `3b633011c6`, reversing changes made to `70ac5323de`.	2016-08-02 14:25:10 -04:00
Joe Jevnik	814a2be7b7	Revert "Point in time asset db"	2016-07-27 23:29:08 -04:00
Joe Jevnik	bc10447b9e	TST: add assert_equal dispatch for other ndframe objects	2016-07-26 13:34:58 -04:00
Jean Bredeche	5a0f840917	Clean up daily bar reader/writer to take advantage of new trading calendar. The reader is backwards-compatible with the previous format. In USEquityLoader, use dailyreader's trading_calendar. This is backwards compatible and will fall back to the NYSE calendar if the reader doesn’t have a calendar specified.	2016-07-15 15:13:57 -04:00
Jean Bredeche	295cfa3846	Fix some mistakes from a previous merge. No tests failed, which was worrisome. Will file issues to take a look later.	2016-07-14 15:40:36 -04:00
Jean Bredeche	e22108b7ef	Merge pull request #1312 from quantopian/24-5-backtesting Re-implemented the calendar API.	2016-07-14 10:05:18 -04:00
Joe Jevnik	958d455a7a	ENH: Support default params for terms	2016-07-12 18:49:24 -04:00
Jean Bredeche	6fb4923cc7	Re-implemented the Calendar API. Instead of having separate ExchangeCalendar and TradingSchedule objects, we now just have TradingCalendar. The TradingCalendar keeps track of each session (defined as a contiguous set of minutes between an open and a close). It's also responsible for handling the grouping logic of any given minute to its containing session, or the next/previous session if it's not a market minute for the given calendar.	2016-07-12 13:13:50 -04:00
Eddie Hebert	51eda06323	MAINT: Add equity to naming of bar data classes. In preparation for adding futures, add equity to the names of both the classes and methods for writing bcolz data. Futures data will use a different number of minutes per day with a separate reader. This change will allow both equity and futures fixtures to be side by side. Also, break out the method which generates the dataframes and trading days member into fixtures (`EquityMinuteBarData` and `EquityDailyBarData`) on which the `*BarReader` fixture depends. This fixture is separated out to enable reader/writers in different formats to use the same data setup. (There is internal code which needs to write minute and daily bar data in a database format.)	2016-06-30 08:21:42 -04:00
dmichalowicz	393f82e81e	ENH: Add single-column input/output capabilities to pipeline terms	2016-06-23 10:24:09 -04:00
Richard Frank	69b6cff964	Merge pull request #1289 from quantopian/wildcard wildcard object and doctests	2016-06-22 18:09:57 -04:00
Joe Jevnik	d608e0af4f	Merge pull request #1276 from quantopian/blaze-loader-checkpoints ENH: add ffill checkpointing to blaze core loader	2016-06-21 16:48:08 -04:00
Joe Jevnik	5925107052	TST: fix doctests to actually run	2016-06-21 15:07:03 -04:00
Joe Jevnik	c0d08f9c0d	TST: Adds wildcard object for assert_equal	2016-06-20 14:20:18 -04:00
Eddie Hebert	9f02f147b0	Merge pull request #1283 from quantopian/custom-paths-for-fixtures TST: Allow customization of various fixture paths.	2016-06-20 10:08:27 -04:00
Joe Jevnik	cb67ee425e	TST: coverage	2016-06-17 17:59:56 -04:00
Eddie Hebert	d6793e7a71	TST: Allow customization of various fixture paths. Support testing configurations which need control over the full path to the asset, adjustment, and equity bcolz directories. This is required by some of our internal testing, which exercises servers that coordinate these files via a date slug in the full path. Also, by allowing customization of the full path, it is now possible to have the AssetFinder and AdjustmentReader sqlite databases be written to disk, which is also required for our server testing setup.	2016-06-17 16:13:31 -04:00
Richard Frank	3d7f63f8c1	MAINT: Removed unused ExceptionSource. No longer used since our lazy data access changes.	2016-06-15 10:43:20 -04:00
Scott Sanderson	bc302beec9	MAINT: Rework event datasets. - Refactored EventsLoader and BlazeEventsLoader to not require a subclass per dataset. Instead, you now pass a map from columns to event fields directly to the EventsLoader constructor. - Removed a large number of Quantopian-specific datasets and associated tests. - Rewrote the core logic of EventsLoader and BlazeEventsLoader to share index calculations across multiple requested columns. - Fixed a bug where event fields were incorrectly forward-filled when null values were present in an event.	2016-06-10 19:22:27 -04:00
Andrew Daniels	02a91ec4ab	MAINT: Removes the set_first_trading_day method of DataPortal. Since the first trading day is now passed directly to the DataPortal on init, there's no need for a method that does this. Moves all the additional logic/assignments into the init. Also corrects an issue where we would never create certain attributes if self._first_trading_day was None. Adds the ability to specify the first trading day for a data portal in a test case when using the WithDataPortal fixture.	2016-06-08 13:34:23 -04:00
jfkirk	d437a5d675	MAINT: Rebase fixes	2016-06-08 13:34:23 -04:00
jfkirk	2a8f69fc01	MAINT: DataPortal env -> asset_finder	2016-06-08 13:34:22 -04:00
jfkirk	d9fc514fa8	TST: Adds TradingSchedule test fixture	2016-06-08 13:34:20 -04:00
jfkirk	26742dda67	MAINT: Removes obsolete tradingcalendar module	2016-06-08 13:34:19 -04:00
jfkirk	241abda2a5	STY: Flake8	2016-06-08 13:34:19 -04:00
jfkirk	4b7390ac81	WIP: Refactors tests to use TradingSchedule	2016-06-08 13:34:19 -04:00
jfkirk	c8304e8601	ENH: Adds ExchangeCalendar, TradingSchedule, and implementations Conflicts: tests/data/test_minute_bars.py tests/data/test_us_equity_pricing.py tests/finance/test_slippage.py tests/pipeline/test_engine.py tests/pipeline/test_us_equity_pricing_loader.py tests/serialization_cases.py tests/test_algorithm.py tests/test_assets.py tests/test_bar_data.py tests/test_benchmark.py tests/test_exception_handling.py tests/test_fetcher.py tests/test_finance.py tests/test_history.py tests/test_perf_tracking.py tests/test_security_list.py tests/utils/test_events.py zipline/algorithm.py zipline/data/data_portal.py zipline/data/us_equity_loader.py zipline/errors.py zipline/finance/trading.py zipline/testing/core.py zipline/utils/events.py	2016-06-08 13:34:18 -04:00
Andrew Daniels	71f12ec272	MAINT: Adds first_trading_day arg to DataPortal. Instead of inferring it from the minute/daily writer, we now require the first trading day to be passed explicitly, so the creator of the DataPortal controls what is used as the first trading day.	2016-06-02 13:16:43 -04:00
Eddie Hebert	2f80e94203	TST: Enable sourcing daily data from minute data. Allow `WithBcolzDailyBarData` to opt-in to reading data defined by `WithBcolzMinuteBarData`, so that the daily and minute test data for the same asset and dts correlate between the two readers. The correlation is relevant for history tests which blend daily and minute data. Also, make the test data for the split and merger assets in the minute suite align at the thousands place if the adjustments are applied correctly, by starting the prices with a base of 4000 and then halving the start value each day.	2016-06-02 12:28:53 -04:00
Scott Sanderson	c03bbbc928	BUG: Delete attrs before firing callbacks. Prevents failures to remove sqlite files when cleaning up temporary directories.	2016-05-25 14:17:57 -04:00
Joe Jevnik	784d5f4a16	Merge pull request #1199 from quantopian/boybands-factor BollingerBands factor	2016-05-13 15:35:10 -04:00
Joe Jevnik	a345e6f3f5	TST: Clean up metaclass usage in fixtures	2016-05-12 17:00:51 -04:00
Joe Jevnik	9b76731143	ENH: adds with_metaclasses and tests for metautils	2016-05-12 15:58:19 -04:00
Maya Tydykov	6b60e447a0	MAINT: incorporate string support STY: remove unused imports MAINT: change dtype to object for compatibility with python3 MAINT: rename pipeline columns and constants for clarity MAINT: rename column	2016-05-12 10:50:31 -04:00
Scott Sanderson	8de45540f2	ENH: NaN semantics for LabelArray missing values.	2016-05-04 15:54:50 -04:00
Scott Sanderson	bb6f908036	TEST: Add test for categorical postprocessing.	2016-05-04 15:54:50 -04:00
Scott Sanderson	5f190395ad	ENH: Add support for strings in Pipeline. - Adds a new class, ``LabelArray``, which is a subclass of np.ndarray. LabelArray is conceptually similar to pandas.Categorical, in that it stores data with many duplicate values as indices into an array of unique values. For string data with many duplicates (e.g. time-series of tickers or industry classifications), this provides multiple orders of magnitude of improvement when doing string operations, especially string comparison/matching operations. - Adds a new generic object "specialization" for `AdjustedArrayWindow`, and a corresponding ObjectOverwrite adjustment. - Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``. This method is called on the final result of any pipeline expression after screen filtering has occurred. The default implementation of ``postprocess`` is identity, but Classifier overrides it to coerce string columns into pandas.Categoricals before presenting them to the user.	2016-05-04 15:50:52 -04:00
Joe Jevnik	59c8e371a2	ENH: Updates the cli, data bundles and extensions. Adds the data bundle concept which makes it easy for users to register loading functions to build out minute and daily data along with an assets db and adjustments db. By default we have provided a `quandl` bundle which pulls from the public domain WIKI dataset. Users may register new bundles by decorating an ingest function with `zipline.data.bundles.register(<name>)`. This also provides a `yahoo_equities` function for creating an ingestion function that will load a static set of assets from yahoo. The cli is now structured as a couple of subcommands and has been changed to `python -m zipline`. The old behavior of `run_algo.py` has been moved to the `run` subcommand. This is almost entirely the same except that it now takes the name of the data bundle to use, defaulting to `quandl`. The next subcommand is `ingest` which takes the name of a data bundle to ingest. This will run the loading machinery and write the data to a specified location that `run` can find. There is also a `clean` subcommand which deletes the data that was written with `ingest`. Extensions have also been added to zipline. This is an experimental feature where users can provide an extra set of python files to run at the start of the process. These can be used to configure aspects of zipline. Right now the only thing that is supported in an extension file is the registration of a new data bundle.	2016-05-03 18:38:24 -04:00
Joe Jevnik	efac476976	ENH: make BcolzMinuteBarWriter.write take iterable. Updates the BcolzMinuteBarWriter.write api to allow users to pass their data as a stream instead of requiring that they loop over their data externally. This matches the API presented by BcolzDailyBarWriter.	2016-04-29 16:14:48 -04:00
Maya Tydykov	0191d9d903	MAINT: move filtering for null date rows back to dataframe TST: test both next and prev event frame loading and use EventsLoader. BUG: remove extra arg MAINT: call list on zip for compatibility with python 3	2016-04-25 16:11:12 -04:00
Maya Tydykov	e41c99d077	MAINT: add an event date col field to each loader MAINT: add event date col field and filter rows where this field is null TST: modify tests to filter nulls in event date col MAINT: calculate value repeats by vectorized computation on separate start and end dates. MAINT: pass DatetimeIndex instead of list of strings	2016-04-25 11:42:08 -04:00
Eddie Hebert	a13e336ef5	Merge pull request #1157 from quantopian/use-carray-instead-of-read-all-on-small-size PERF: Improve read time for smaller num of assets.	2016-04-21 22:25:01 -04:00
Eddie Hebert	66d05aa563	PERF: Improve read time for smaller num of assets. The BcolzDailyBarReader was optimized for the pipeline case of reading all assets at once. Now that the reader is also used to support daily history, the case of reading data for a small number of assets is more common, particularly in algorithms that use the history API and have a high rotation of assets (e.g. an algorithm which uses pipeline to set the active universe). Remove the bottleneck in reading a small number of assets by conditionally reading the slice for each asset from the carray, instead of reading the data for all equities and then indexing into that full array. Above a certain number of assets, it is still better to read all the data at once. On the Quantopian dataset, which holds data for about 20000 equities over the last 10 years (where not all equities trade over the full range), stored in 118 blosc blp files per column, the tipping point where the 'read all' mode wins out is between 3000 and 4000 assets. That number was tested by trying to exercise a worst case scenario where the equities were spread out evenly across the blp files, by stepping along a sorted list of assets that were alive over a query range which spanned 70 trading days. ``` size = 3000 sids = [assets[i] for i in range(0, len(assets), len(assets) // size)][:size] ``` Also, add a parameter to the WithBcolzDailyBarReader fixture which allows the test to specify what the threshold count for reading all data should be, so that test_us_equity_pricing can be forced into either mode to make sure that both branches in logic are covered by all test cases. On a local dev machine this patch improves the read time of `load_raw_array` for one asset from 100 ms to 96.5 µs (roughly a 10^3 improvement). Reading only one asset per call is an observed common case when populating the non-cached values in USEquityHistoryLoader.	2016-04-21 20:43:52 -04:00
Richard Frank	8c92f2d241	TST: What if we don't gc... Looks like we removed ref cycles elsewhere, so windows builds are passing without this.	2016-04-21 18:41:57 -04:00
Jean Bredeche	9d1e15ddde	BUG: Fetcher wasn't working properly in `before_trading_start`. We were trying to use the previous day in before_trading_start because we were looking for the previous market minute, then normalizing it. That's no longer the case, as we want to use today's date for fetcher lookups in before_trading_start. Also refactored a bit how dataportal determines if a query should be routed to the fetcher data structures.	2016-04-21 15:09:14 -04:00
Maya Tydykov	1531568899	ENH: add custom dataset for estimize MAINT: alphabetize constants MAINT: remove obsolete column TST: refactor tests to use common code MAINT: remove unneeded fields from dataset MAINT: remove obsolete earnings estimates columns and refactor	2016-04-19 11:29:03 -04:00
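
Commit 6fb4923cc7 above describes TradingCalendar as mapping any given minute to its containing session, or to the next/previous session when the minute is not a market minute. A minimal sketch of that grouping idea follows. It assumes sessions are sorted, non-overlapping (open, close) pairs of integer minutes with inclusive bounds. The names `SESSIONS`, `containing_or_next_session`, `containing_or_previous_session`, and `is_market_minute` are hypothetical, and this is not zipline's actual implementation.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sessions: (open, close) in minutes since an arbitrary epoch.
# Assumed sorted and non-overlapping, with both bounds inclusive.
SESSIONS = [(570, 960), (2010, 2400), (3450, 3840)]
OPENS = [open_ for open_, _ in SESSIONS]
CLOSES = [close for _, close in SESSIONS]


def containing_or_next_session(minute):
    """Index of the session containing `minute`, else the next session."""
    # First session whose close is at or after `minute`.
    idx = bisect_left(CLOSES, minute)
    if idx == len(SESSIONS):
        raise ValueError("minute is after the last session")
    return idx


def containing_or_previous_session(minute):
    """Index of the session containing `minute`, else the previous session."""
    # Last session whose open is at or before `minute`.
    idx = bisect_right(OPENS, minute) - 1
    if idx < 0:
        raise ValueError("minute is before the first session")
    return idx


def is_market_minute(minute):
    """True if `minute` falls inside some session (bounds inclusive)."""
    idx = bisect_left(CLOSES, minute)
    return idx < len(SESSIONS) and OPENS[idx] <= minute
```

Because the opens and closes are kept in sorted lists, each lookup is a binary search rather than a scan over all sessions.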

67 Commits