catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-06-28 19:47:13 +08:00

Author	SHA1	Message	Date
Maya Tydykov	11d666daaa	TST: add test for 13d filings dataset MAINT: add 13d filings to factors init MAINT: rename constant MAINT: add event_date_col field	2016-04-28 11:59:49 -04:00
Maya Tydykov	e726cc94c9	ENH: add 13d filings dataset to pipeline	2016-04-28 11:53:45 -04:00
Andrew Liang	d69b960c49	BUG: Don't save empty positions when user accesses a non-existent position. Previously, whenever we tried to access a missing value on the Positions dict, we returned a default Position and saved it to the dict. Instead, just return the Position.	2016-04-26 13:28:35 -04:00
Andrew Liang	5809ae17f1	DEV: Better error message for sid= in get_open_orders. Let the user know to use asset= instead.	2016-04-26 12:23:57 -04:00
Jean Bredeche	c404c60d68	BUG: don't allow ordering in before_trading_start	2016-04-26 10:56:36 -04:00
Maya Tydykov	b7765fe0d3	Merge pull request #1153 from quantopian/filter-nulls-in-expected-cols Filter nulls in expected cols	2016-04-25 16:32:45 -04:00
Maya Tydykov	0191d9d903	MAINT: move filtering for null date rows back to dataframe TST: test both next and prev event frame loading and use EventsLoader. BUG: remove extra arg MAINT: call list on zip for compatibility with python 3	2016-04-25 16:11:12 -04:00
Maya Tydykov	390295481c	TST: add test for blaze loader with null data in date col MAINT: fix blaze query	2016-04-25 11:42:10 -04:00
Maya Tydykov	e41c99d077	MAINT: add an event date col field to each loader MAINT: add event date col field and filter rows where this field is null TST: modify tests to filter nulls in event date col MAINT: calculate value repeats by vectorized computation on separate start and end dates. MAINT: pass DatetimeIndex instead of list of strings	2016-04-25 11:42:08 -04:00
Maya Tydykov	f8aa7c2ef4	TST: add test for case when null in expected column	2016-04-25 11:42:06 -04:00
Jean Bredeche	02ded435f6	DEV: Don't log an error if we can't find a matching asset/field/day triple in fetcher data	2016-04-25 09:47:18 -04:00
Eddie Hebert	66d05aa563	PERF: Improve read time for smaller num of assets. The BcolzDailyBarReader was optimized for the pipeline case of reading all assets at once. Now that the reader is also used to support daily history, the case of reading data for a small number of assets is more common, particularly in algorithms that use the history API and have a high rotation of assets (e.g. an algorithm which uses pipeline to set the active universe). Remove the bottleneck in reading a small number of assets by conditionally reading the slice for each asset from the carray, instead of reading the data for all equities and then indexing into that full array. Above a certain number of assets, it is still better to read all the data at once. On the Quantopian dataset, which holds data for about 20000 equities over the last 10 years (where not all equities trade over the full range), stored in 118 blosc blp files per column, the tipping point where the 'read all' mode wins out is between 3000 and 4000 assets. That number was tested by trying to exercise a worst case scenario where the equities were spread out evenly across the blp files, by stepping along a sorted list of assets that were alive over a query range which spanned 70 trading days. ``` size = 3000 sids = [assets[i] for i in range(0, len(assets), len(assets) // size)][:size] ``` Also, add a parameter to the WithBcolzDailyBarReader fixture which allows the test to specify what the threshold count for reading all data should be, so that test_us_equity_pricing can be forced into either mode to make sure that both branches in logic are covered by all test cases. On a local dev machine this patch improves the read time of `load_raw_array` for one asset from 100 ms to 96.5 µs (roughly a 10^3 improvement). Reading only one asset per call is an observed common case when populating the non-cached values in USEquityHistoryLoader.	2016-04-21 20:43:52 -04:00
Maya Tydykov	e5ccd814e8	Merge pull request #1143 from quantopian/add-final-val-col-to-estimates ENH: add actual value column to estimates dataset.	2016-04-21 16:23:55 -04:00
Jean Bredeche	9d1e15ddde	BUG: Fetcher wasn't working properly in `before_trading_start`. We were trying to use the previous day in before_trading_start because we were looking for the previous market minute, then normalizing it. That's no longer the case, as we want to use today's date for fetcher lookups in before_trading_start. Also refactored a bit how dataportal determines if a query should be routed to the fetcher data structures.	2016-04-21 15:09:14 -04:00
Jean Bredeche	6423a2cfbd	Merge branch 'master' into check-keyword-args	2016-04-21 12:31:45 -04:00
Maya Tydykov	bd58140b97	ENH: add actual value column to estimates dataset.	2016-04-21 11:45:00 -04:00
Jean Bredeche	c323506f40	BUG: we were improperly checking iterable kwargs in BarData	2016-04-21 11:06:46 -04:00
dmichalowicz	d9bfcaabde	ENH: Support multiple outputs for custom factors	2016-04-21 10:57:29 -04:00
Maya Tydykov	1531568899	ENH: add custom dataset for estimize MAINT: alphabetize constants MAINT: remove obsolete column TST: refactor tests to use common code MAINT: remove unneeded fields from dataset MAINT: remove obsolete earnings estimates columns and refactor	2016-04-19 11:29:03 -04:00
Andrew Liang	8aac0ab19f	BUG: Week rule plus time rule doesn't work. The next trigger for the week rule gets recalculated every time the rule is triggered.	2016-04-18 17:05:43 -04:00
Joe Jevnik	bc0b117dc9	MAINT: make the data loading apis more consistent. Changes BcolzDailyBarWriter to not be an abc; data is passed as an iterator of (sid, dataframe) pairs to the write method. Changes the AssetsDBWriter to be a single class which accepts an engine at construction time and has a `write` method for writing dataframes for the various tables. We no longer support writing the various other data types; callers should coerce their data into a dataframe themselves. See zipline.assets.synthetic for some helpers to do this. Adds many new fixtures and updates some existing fixtures to use the new ones: WithDefaultDateBounds: A fixture that provides the suite a START_DATE and END_DATE. This is meant to make it easy for other fixtures to synchronize their date ranges without depending on each other in strange ways. For example, WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should both have data for the same dates, so they may both depend on WithDefaultDateBounds without forcing a dependency between them. WithTmpDir, WithInstanceTmpDir: Provides the suite or individual test case a temporary directory. WithBcolzDailyBarReader: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzDailyBarWriter.write. WithBcolzDailyBarReaderFromCSVs: Provides the suite a BcolzDailyBarReader which reads from bcolz data written to a temporary directory. The data will be read from a collection of CSV files and then converted into the bcolz data through BcolzDailyBarWriter.write_csvs. WithBcolzMinuteBarReader: Provides the suite a BcolzMinuteBarReader which reads from bcolz data written to a temporary directory. The data will be read from dataframes and then converted to bcolz files with BcolzMinuteBarWriter.write. WithAdjustmentReader: Provides the suite a SQLiteAdjustmentReader which reads from an in memory sqlite database. The data will be read from dataframes and then converted into sqlite with SQLiteAdjustmentWriter.write. WithDataPortal: Provides each test case a DataPortal object with data from temporary resources.	2016-04-15 23:46:10 -04:00
Eddie Hebert	5f9d0a148d	BUG: Prevent out of order history arrays. Fix a bug where if history were called with assets `[1, 2]` and then subsequently, `[2, 1]`, the loader would return the cached array in the order for `[1, 2]`. Instead cache an AdjustedArray for each asset, then when a history window is requested, check if each asset has a sufficient cache, and if not then read values for the assets which are missing or need to be refreshed. An added benefit of this change is that if a subsequent call to history has a smaller number of assets than the previous, no new data needs to be read from disk. e.g. a call with assets `[1, 2, 3]` and then `[1, 2]` would use the cached values for `1` and `2` from the first call. Conversely, if the second call has more assets, then only the data for the new assets needs to be retrieved. e.g. a history with `[1, 2]`, then `[1, 2, 3]` would only need (assuming `1` and `2` have not expired) to retrieve data for `3`. Unfortunately, the benefit here is not great because `load_raw_arrays` is optimized for reading many assets, and pulls the entire daily bar dataset into memory. This change makes tuning `load_raw_arrays` for faster reads when only a few assets are requested (e.g. by slicing from the carray for each asset, instead of pulling all data into a numpy array) more beneficial than it would have been previously.	2016-04-15 22:44:00 -04:00
Andrew Liang	6d6cd58c3b	BUG: Recalculate trigger for week rule if we miss the first one. If we start the simulation on a day such that we miss that week's trigger (the first for the sim), recalculate the trigger for next week.	2016-04-15 15:09:08 -04:00
Andrew Liang	1ee3c5f049	BUG: week_end rule with offset=0 skips every other week	2016-04-15 10:17:18 -04:00
Eddie Hebert	76e14eda2f	ENH: Add expiring cache. Add a cache interface which supports expirable entries with a changeable backend for the cache into which they are entered. The default cache is a `dict` but could be swapped for `cachetools.LRUCache` or any other cache which supports `__getitem__`, `__setitem__`, and `__delitem__`. So that consumers can change the use of `CachedObjects` stored in a cache from: ``` self._cache = {} ... try: obj = self._cache[key] try: return obj.unwrap(dt) except Expired: pass except KeyError: pass ... self._cache[key] = CachedObject(value, new_expiration) ``` to: ``` self._cache = ExpiringCache(LRUCache(maxsize=6)) ... try: return self._cache.get(key, dt) except KeyError: # Get fresh value ... self._cache.set(key, value, new_expiration) ```	2016-04-14 16:10:32 -04:00
Jean Bredeche	63bd7589b7	BUG: support passing an empty list to `data` methods. Our type checking code was a bit too aggressive.	2016-04-14 11:11:08 -04:00
Andrew Liang	8dc3ed73ab	FIX: Check types of args passed to api methods on data	2016-04-13 09:47:07 -04:00
Jean Bredeche	fac5905c10	Merge pull request #1114 from quantopian/handle-data-optional ENH: make handle_data optional	2016-04-13 09:31:41 -04:00
Richard Frank	70befd490b	MAINT: Don't store data portal everywhere Removed lots of data portal references that participated in ref cycles and prevented deterministic cleanup of dbs.	2016-04-12 19:33:22 -04:00
Richard Frank	8b610a2ab7	TST: Cleaned up test references to adjustments db. If we don't clean them up, then Windows can't delete the temp dir with the db.	2016-04-12 19:33:22 -04:00
Richard Frank	5254b273b2	PERF: Reimplemented remember_last with a weak_lru_cache which won't leak instances whose methods have been decorated (specifically DataPortal instances) MAINT: Not using functools32 anymore	2016-04-12 19:33:21 -04:00
Richard Frank	32a400a9fb	BUG: Fixing bitness issues on 32-bit systems by being explicit with sizes	2016-04-12 17:07:50 -04:00
Eddie Hebert	8313c8c36c	Merge pull request #1125 from quantopian/enforce-sorted-on-minute-bars BUG: Enforce sorted order on minutes to delete.	2016-04-12 15:18:30 -04:00
Eddie Hebert	d27f85e16b	BUG: Enforce sorted order on minutes to delete. The intervals are returned as a set, so order is not guaranteed, which becomes exposed when reading windows which span multiple years. The deletion of values from the regular sized minute array assumes that intervals can be reversed to delete the array from the back.	2016-04-12 14:16:10 -04:00
Jean Bredeche	bd5e2b183d	BUG: Properly log partially filled sell orders.	2016-04-12 13:57:50 -04:00
Jean Bredeche	f6902f0368	BUG: bar_data.history too limiting on iterable types In before_trading_start, history needs to call DataPortal.get_adjustments, and that method wasn’t correctly checking for iterables.	2016-04-11 14:02:27 -04:00
Scott Sanderson	4449f289c2	TEST: Test that the mask is what we expect.	2016-04-07 17:29:47 -04:00
dmichalowicz	8db59b387b	TST: Overhaul test case	2016-04-07 17:29:47 -04:00
dmichalowicz	5bae74adda	ENH: Allow passing a mask when creating a factor	2016-04-07 17:29:47 -04:00
Eddie Hebert	0a3a2f3653	BUG: Ensure matched input length to minute writer. When the dts and length of cols are mismatched the writer behaves in unintended ways. e.g. in a case where a consumer passed dts which had minutes with no trades removed, but regular (market minute for day) sized arrays for the data with `0`'s on minutes without trades, the non trade minutes from cols are written to slots in the output where a trade is intended. Protect against this misuse by checking that all lengths are equal when using the `write_cols` method. Make a separate `_write_cols` method for use by both `write_cols` and `write`, since the `write` method which takes a DataFrame has the matched input length enforced by the DataFrame.	2016-04-07 13:53:59 -04:00
Jean Bredeche	4203c54417	ENH: make handle_data optional	2016-04-07 09:50:09 -04:00
Andrew Liang	a8491879ce	FIX: time_rules should trigger only at dt specified Previously, time_rules triggered when the dt specified has passed	2016-04-05 17:51:10 -04:00
Jean Bredeche	dc01c45dc4	DEV: Apply adjustments for portfolio and account in BTS completely copied from https://github.com/quantopian/zipline/pull/1104/ All credit goes to Andrew Liang (@lianga888)	2016-04-05 11:37:34 -04:00
Eddie Hebert	16fd6681a6	ENH: Rewrite of Zipline to use lazy access pattern More documentation to follow in release notes. Based on lazy-mainline branch, see for more details. Also-By: Jean Bredeche <jean@quantopian.com> Also-By: Andrew Liang <aliang@quantopian.com> Also-By: Abhijeet Kalyan <akalyan@quantopian.com>	2016-04-04 16:12:58 -04:00
Eddie Hebert	be08a77d76	BUG: Prevent writing int max instead of nan. np.array.astype cannot be relied upon to convert nans to 0. Fix by calling nan_to_num on the float arrays before converting to uint32.	2016-03-30 14:35:06 -04:00
Maya Tydykov	e8185a1512	MAINT: reorganize - move testing mixin to fixtures BUG: correctly create asset finder MAINT: rename fixture STY: fixes for flake8 STY: add space around assignment MAINT: add var back to constructor MAINT: remove unused import MAINT: compare var with None directly MAINT: fix merge errors	2016-03-29 13:15:16 -04:00
Maya Tydykov	06dd6e958d	TST: refactor tests to use fixtures MAINT: use np.array MAINT: return cols rather than modifying attribute	2016-03-29 13:12:50 -04:00
Maya Tydykov	8a28e82d32	ENH: add dividends to pipeline MAINT: remove record date - not needed. MAINT: restructure dividends dataset. MAINT: restructure dividends factors. WIP: update dividends tests. MAINT: correct the way to get the 'next' event frame.	2016-03-29 13:12:50 -04:00
Scott Sanderson	9a04621781	ENH: Add __eq__ and __ne__ to Classifier.	2016-03-28 15:46:28 -04:00
Scott Sanderson	0ebb72fe0d	TEST: Explicitly use int64 everywhere. Otherwise these tests will fail on 32-bit systems.	2016-03-28 12:21:58 -04:00
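
The expiring cache described in commit 76e14eda2f can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the names `ExpiringCache`, `CachedObject`, `Expired`, `unwrap`, `get`, and `set` come from the commit message, while the bodies, the plain-`dict` default, and the choice to treat an entry as expired only when `dt` is strictly later than its expiration are assumptions.

```python
class Expired(Exception):
    """Raised when a cached value is requested past its expiration."""


class CachedObject:
    """Pairs a value with the last dt at which it is still valid."""

    def __init__(self, value, expires):
        self._value = value
        self._expires = expires

    def unwrap(self, dt):
        # Assumption: the entry is still valid at exactly `expires`.
        if dt > self._expires:
            raise Expired(self._expires)
        return self._value


class ExpiringCache:
    """Wraps any mapping-like backend and hides expiry from callers."""

    def __init__(self, cache=None):
        # Any object supporting item get/set/delete works as a backend.
        self._cache = cache if cache is not None else {}

    def get(self, key, dt):
        # A missing key raises KeyError from the backend directly.
        try:
            return self._cache[key].unwrap(dt)
        except Expired:
            # Drop the stale entry and report it as a plain miss.
            del self._cache[key]
            raise KeyError(key)

    def set(self, key, value, expiration_dt):
        self._cache[key] = CachedObject(value, expiration_dt)
```

With this shape, a consumer only handles `KeyError`, whether the key was never stored or has expired, which is the simplification the commit message describes.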

937 Commits