catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-07-05 07:20:07 +08:00

Author	SHA1	Message	Date
Eddie Hebert	8878d0ddd5	STY: Wrap/reformat lines over 80 chars. Newer versions of flake8 detect these lines, though the flake8 version zipline currently uses does not.	2017-02-08 00:47:33 -05:00
Scott Sanderson	d82bf7a1e3	BUG: Fix bad error handling in history loader. Fixes a bug where we'd fail to raise an error if the start/end of a history window call aren't in the loader's calendar. We started dropping this error after a previous change swapped out calls to `index.get_loc` with calls to `index.searchsorted` to avoid creating hash tables in pandas.	2017-01-30 13:23:47 -05:00
Eddie Hebert	b90091e494	BUG: Fix end session metadata for minute bar writer. When opening with a new `end_session`, i.e. opening for append, write the new end session to the metadata. Fixes an issue where the calendar on minute bar readers did not include the recently appended day, causing reads on the last values to fail. Accordingly, update the append test to read a value, instead of checking table length.	2017-01-22 15:14:05 -05:00
Eddie Hebert	df87dfb227	ENH: Add sorted to sid list when truncating. For repeatable order of truncates between invocations.	2017-01-17 17:25:28 -05:00
Eddie Hebert	1d75143f54	ENH: Add a method to open existing minute bar directory. Remove need for a consumer that is editing an existing minute bars directory to reread the values which should not change from the metadata. Add a test to the append on new day and truncate, which would be the common usage of this method.	2017-01-17 17:25:27 -05:00
Eddie Hebert	1e51dbec0a	STY: Use def statements instead of lambda assignment. (#1639 ) From pep-0008: ``` Always use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier. Yes: def f(x): return 2*x No: f = lambda x: 2*x The first form means that the name of the resulting function object is specifically 'f' instead of the generic '<lambda>'. This is more useful for tracebacks and string representations in general. The use of the assignment statement eliminates the sole benefit a lambda expression can offer over an explicit def statement (i.e. that it can be embedded inside a larger expression) ```	2017-01-06 13:39:07 -05:00
Kathryn Glowinski	dd560ba5f0	BUG: Datetimes should be converted in utc. (#1635 ) * BUG: Datetimes should be converted in utc. * DOC: Making note of UTC req. and moving comment.	2017-01-05 14:13:23 -05:00
Eddie Hebert	e913519734	ENH: Add a reader writer pair for HDF5 minute bar updates. This format is intended for storing data for all sids of an asset type, e.g. equities or futures for a session. bcolz is not used, which avoids the overhead of creating the directories and files for each asset (around 8000 for active equities); that overhead is unnecessary since the update is meant to be read at once, instead of supporting the random access pattern needed by the simulation. This patch only adds the reader/writer pair, with the management of finding the paths to delta files and the application of the updates to the bcolz write left to internal loader code. Also, the update reader interface is intentionally constrained to the data for an entire session to allow for an implementation that allows for mid-session updates.	2017-01-04 12:09:10 -05:00
Kathryn Glowinski	df6cb62925	Adjustments to Component Dfs (#1620 ) * ENH: SQLiteAdjustmentReader can return DF versions of tables.	2016-12-27 13:44:17 -05:00
Scott Sanderson	bd84338aca	Merge pull request #1599 from quantopian/memory-savings Memory savings	2016-11-28 10:52:23 -05:00
Andrew Daniels	adc9900860	MAINT: Improve minute writer handling of non-trading minutes (#1602 ) Previously, if input to the BcolzMinuteBarWriter had the first bar on a non-trading minute, the next trading session would be considered the "first day" in the input. Now, we consider the previous trading session the "first day". The intention is to correctly associate minutes after official trading hours on half days with the session that closed early, not the following session (a future improvement here would be to not accept minutes outside trading hours).	2016-11-26 21:12:12 -05:00
Scott Sanderson	b091b40604	PERF: Use searchsorted instead of get_loc. On pandas < 0.18, `get_loc` triggers allocation of a large hash table, so we don't want to call get_loc on minutely `DatetimeIndex`es.	2016-11-22 14:26:58 -05:00
Andrew Daniels	d65af6c706	BUG: Ensure minute OHLC values can safely be converted to uint32 (#1598 ) Otherwise, we either raise an exception or filter out all unsafe values. This addresses an issue where the BcolzMinuteBarWriter would scale up values to convert to uint32, but the resulting values were too large, and would be mangled. Based on the approach we take in the BcolzDailyBarWriter.	2016-11-22 14:11:43 -05:00
Eddie Hebert	5624e0f391	BUG: Fix minute bar last traded after half day. When the following conditions occur, - a `nan` occurred after a half day (e.g. on the Monday after Thanksgiving, where the Friday would be a half day), - data was written to the span between the early close and where the market close would have been if it were not an early close session, - a `nan` also occurred on the last minute of the early market session, the existing implementation would incorrectly return a `nan` when requesting a forward filled price. The steps that caused this error were: 1. Request for `'price'` on the market open of the day after the early close. 2. `nan` is found for that minute. 3. `get_last_traded_dt` is called, and finds a volume that occurs after the early close, e.g. `18:47` when the market close was `18:00`. 4. The minute position for `18:47` is used when calling `find_position_of_minute`; since that value is after the `market_close`, the minute is set to the position of `18:00` due to the delta logic in 5. Since there is also no data at `18:00`, a `nan` is returned, even though there were valid minutes earlier in the session, e.g. a non-zero volume at `16:47` should have been used, but was not. Fix by checking the current minute against the minute close when searching for the last traded minute. If the minute is greater than the market close for the corresponding day, continue the search until the minute position is within the trading session. This could also be fixed by enforcing that only zeros can be written between an early close and the minute where the close would have been, but this fix allows the reader to work with existing data.	2016-11-15 15:09:19 -05:00
Eddie Hebert	48324cf791	Merge pull request #1592 from quantopian/remove-duplicate-get-rolls MAINT: Remove duplicate get_rolls in reader.	2016-11-14 15:43:27 -05:00
Eddie Hebert	00ebae7729	MAINT: Remove duplicate get_rolls in reader. The rolls are already calculated and assigned to `rolls_by_asset` earlier in the `load_raw_arrays` method, so remove the duplication. The change should not affect results.	2016-11-11 11:09:02 -05:00
Eddie Hebert	08fb7333d6	PERF: Speed up retrieval of HistoryLoader calendar. The use of `slice_indexer` on all market minutes was taking about 110ms on my development machine. This change to getting the start and end indices changes the entire `_calendar` method to take 10ms on the same machine. Noticed while creating a `HistoryLoader` in a notebook context.	2016-11-11 10:54:07 -05:00
Eddie Hebert	57d35f6aac	BUG: Fix bad attribute lookup on session continuous future reader. Use `roll_style` not `roll`. Also, add test case to cover using the session bar reader `get_value`, by adding a test which uses `close`, since only `contract` was being exercised, which does not exercise the session daily bar reader.	2016-11-08 15:48:28 -05:00
Eddie Hebert	f7fdc56777	Merge pull request #1583 from quantopian/allow-sliding-window-to-reset ENH: Allow arbitrary history queries.	2016-11-07 22:31:13 -05:00
Eddie Hebert	6ff1d55504	ENH: Allow arbitrary history queries. In preparation for using `DataPortal` in notebooks, remove restriction on the `HistoryLoader` to dates that are monotonically increasing. Notebook usage of the `DataPortal` is more useful when the end of the history window can be arbitrary dates without having to restart the notebook kernel. Due to the implementation of the prefetch and caching logic, the end date of history calls could previously only increase, e.g. `2016-11-01`, `2016-11-02`, `2016-11-03`. This pattern was sufficient for backtesting and live simulations, since the current time of the algorithm only ever increases. With this change, which resets the underlying sliding window when the last fetched idx is greater than the Now calls to history in the same process with end dates such as `2016-11-01`, `2016-10-31`, `2015-11-02` should work.	2016-11-07 16:40:51 -05:00
Andrew Daniels	f94a161c7a	BUG: Allows 'contract' in get_spot_value with daily frequency (#1582 ) Also removes duplicate check in test_current_contract.	2016-11-07 16:28:48 -05:00
Eddie Hebert	676fb9cb89	Merge pull request #1580 from quantopian/research-compatible-history-loader ENH: Allow configurable history prefetch length.	2016-11-04 14:07:33 -04:00
Eddie Hebert	a3df1e3cef	ENH: Allow configurable history prefetch length. To support using a `DataPortal` and `HistoryLoader` in a notebook, allow the prefetch length to be configurable, so that it can be set to 0. Unlike backtesting where the prefetch is useful for repeated history windows viewed from datetimes which are monotonically increasing by a small amount, the notebook usage of history windows needs only to retrieve the exact data needed for the window specified. This patch also fixes some boundary conditions related to rolls and adjustments which were uncovered by querying for the adjustments with an end date near the end of the window.	2016-11-04 13:30:30 -04:00
Andrew Daniels	a0e36d492d	PERF: Use ctable.resize to speed up BcolzMinuteBarWriter.truncate (#1578 ) This is significantly faster than the previous approach of writing a new ctable with a slice of the existing table.	2016-11-04 10:31:41 -04:00
Scott Sanderson	563a8b34f3	STY: Put 0 at the end. (#1569 )	2016-10-28 15:14:22 -04:00
Scott Sanderson	e89410dc30	MAINT: Consolidate data_portal names. Rename _get_daily_window_for_sids to _get_daily_window_data. Rename _get_minute_window_for_assets to _get_minute_window_data. Rename _get_daily_data to get_daily_spot_value.	2016-10-28 14:35:05 -04:00
Scott Sanderson	ac74a9dff5	Merge pull request #1561 from quantopian/micro-optimizations-2 Micro optimizations 2	2016-10-28 10:36:33 -04:00
Eddie Hebert	e1bafe1ecc	BUG: Use proxy for settlement on future adjustments. Instead of using the difference between the session close of the front contract before the roll and the open of the back contract on the beginning of the roll, use the close of both at the end of the session before the roll. The closes of the session prior to the roll are used in lieu of settlement data.	2016-10-27 12:40:59 -04:00
Scott Sanderson	48c725b5ea	PERF: Call concatenate directly instead of hstack. Avoids a couple function calls in a hot path.	2016-10-26 23:49:48 -04:00
Scott Sanderson	0cbc2ca388	PERF: Don't round until after we hstack.	2016-10-26 23:30:12 -04:00
Scott Sanderson	1e889987eb	MAINT/PERF: Remove redundant method call. `_get_minute_window_data` was just forwarding its input to a method with the same signature.	2016-10-26 23:28:34 -04:00
Scott Sanderson	d18080553b	PERF: Pull out loop-invariant code. This shaves off 20 out of 160 seconds for an algorithm that makes a large number of large universe, short window_length `history()` calls.	2016-10-26 23:27:33 -04:00
Scott Sanderson	16e3cb50cc	PERF: Use vectorized assignment into dataframe. This is a dramatic speedup (~25% in local benchmarks) for history calls with a large number of assets and a short window length.	2016-10-26 21:10:40 -04:00
Scott Sanderson	57a0822b60	BUG: Return NaT instead of None in daily reader.	2016-10-26 17:32:27 -04:00
Scott Sanderson	52b71af848	PERF: Vectorize assignments in get_history_window.	2016-10-26 17:32:27 -04:00
Scott Sanderson	fc153999e2	PERF: Remove attribute access in inner loop.	2016-10-26 17:32:27 -04:00
Eddie Hebert	9294e39ea0	MAINT: Add more info to history calendar KeyError. There have been cases where the requested start or end date is not in the history calendar. Add the beginning and end of the calendar to the KeyError to give more detail to figure out root cause.	2016-10-26 14:41:37 -04:00
Eddie Hebert	642e404982	Merge pull request #1556 from quantopian/volume-based-rolls ENH: Volume based rolls for futures.	2016-10-25 15:21:41 -04:00
Eddie Hebert	473c8fddba	ENH: Volume based rolls for futures. Add roll style which takes the volume of the contracts into account. If the volume moves from the front to the back before the auto close date, the roll is put at that session. Also, factors out some of the common logic shared with calendar based rolls.	2016-10-25 14:08:21 -04:00
Eddie Hebert	a823cceabc	MAINT: Return nan from daily bcolz get_value. Match the behavior of the minute bar reader, now that the session and minute bar readers share a common interface. isnull is slightly slower than checking against -1; however, in cases where we check against illiquid trades in a tight loop, volume is checked which is not using nan. The change here should be marginal with regards to performance.	2016-10-25 11:25:09 -04:00
Eddie Hebert	18096f750a	BUG: Fix session from minute reader's last traded. The last traded dt provided from the session bar reader which resamples from minutes should provide a dt that is a session label, not one that is at the minute frequency.	2016-10-24 13:58:58 -04:00
Eddie Hebert	202b557c48	MAINT: Prevent hiding of KeyError in adjustments. If a KeyError occurred in the adjustment logic, the exception would be swallowed by the try block, which was intended to just check whether or not there was an adjustment reader adjusted. Discovered when some logic in a futures adjustment reader was failing because of a mismatch of minute and session labels, which resulted in no adjustments during windows when there should have been.	2016-10-24 11:33:00 -04:00
Eddie Hebert	e82fef41dd	PERF: Speedup minute to session sampling. The minute to session sampling reading was creating two DataFrame objects, the first to hold the minute data, and then a second returned by the `DataFrame.groupby` to sample down to sessions. Instead use the arrays returned by the minute readers `load_raw_arrays` and implement sampling logic which takes advantage of the fact that the minutes being passed start with the first minute of the first session and end with the last minute of the last session. On my machine this takes the tests in `test/test_continuous_futures` from ~4.0 to about ~0.1 seconds.	2016-10-24 09:59:22 -04:00
Eddie Hebert	ce37ea64a9	ENH: Add adjusted history for continuous futures. Add `.adj('mul')` and `.adj('add')` methods on ContinuousFuture, which when used with `history`, will calculate and apply adjustments so that the values are adjusted to account for discounts and premiums during rolls. Example usage in an algo: ``` from zipline.api import continuous_future def initialize(context): context.cl_add = continuous_future('CL', offset=0, roll='calendar').adj('add') context.cl_mul = continuous_future('CL', offset=0, roll='calendar').adj('mul') context.cl = continuous_future('CL', offset=0, roll='calendar') schedule_function(print_history) def print_history(context, data): frame = data.history([context.cl, context.cl_add, context.cl_mul], ['price', 'sid'], 20, '1d') print 'unadjusted' print frame.loc[:, :, context.cl] print 'adjusted add' print frame.loc[:, :, context.cl_add] print 'adjusted mul' print frame.loc[:, :, context.cl_mul] ```	2016-10-21 10:18:12 -04:00
Eddie Hebert	ea749b081f	MAINT: Remove unused parameter. Was left in as an artifact of development branch.	2016-10-17 17:04:10 -04:00
Eddie Hebert	3d7d2c139b	MAINT: Begin making a common adjustment interface. Start making the equity adjustments calculations for the history loader conform to the same method signature as `load_adjustments` provided by `SQLiteAdjustmentReader`, so that an `AdjustmentReader` interface can begin to take form. This prepares for creating a `DispatchAdjustmentReader` which will route adjustment calculations for equities to the `HistoryCompatibleUSEquityAdjustmentReader` and continuous futures to a not yet implemented adjustment reader. All of these readers will share the `load_adjustments` method.	2016-10-17 16:29:33 -04:00
Eddie Hebert	34d4e4b974	MAINT: Perspective offset for load adjustments. Add a perspective offset to `AdjustedArrayWindow` and `AdjustedArray`, so that `HistoryLoader` does not need to twiddle with offsets to support viewing the data from the bar after the end of the window (which is the case when a '1d' history window is retrieved in minute mode, as explained in the docstring for `HistoryLoader.history`). Presently, this simplifies the logic in `HistoryLoader._get_adjustments_in_range` and other incoming AdjustmentReaders (e.g. the roll based adjustment reader for continuous futures). This patch should also make it easier for history and pipeline to converge on a single `load_adjustments` method.	2016-10-17 14:23:39 -04:00
Eddie Hebert	2f16c08dcd	ENH: Add history for continuous futures. Enable unadjusted history for continuous futures. The history array is filled by the values for the underlying contracts, where the contract used changes based on rolls. e.g., if a `1d` history window was over the range `2016-01-20` -> `2016-02-29` with contracts with a suffix of `F16` that rolls at the beginning of the session on `2016-01-26`, `G16` on `2016-02-26`, and `H16` on `2016-03-26`. The `2016-01-20` -> `2016-01-25` portion would use the values for `F16`, the `2016-01-26` -> `2016-02-25` portion would use `G16` and the `2016-02-26` -> `2016-02-29` portion would use `H16`. Using the same contracts as above, a `1m` history window over the range (using a timezone of US/Eastern) `2016-01-25 4:00PM` -> `2016-01-25 7:00PM` would fill the `4:00PM` -> `6:00PM` portion with data for `F16` and the `6:01PM` -> `7:00PM` portion with data for `G16`, since the beginning of the `2016-01-26` session is `2016-01-25 6:01PM`. Supports `1d` and `1m`. Also adds the `sid` field to `history` to assist in showing the active contract at each dt in the window.	2016-10-16 22:40:08 -04:00
Eddie Hebert	c25b3d93f4	ENH: Add current chain for continuous futures. Add `chain` field to current, as well as supporting methods in DataPortal and OrderedContracts. Enables the following example: ``` from zipline.api import continuous_future def initialize(context): context.primary_cl = continuous_future('CL', offset=0, roll='calendar') schedule_function(print_current_chain) def print_current_chain(context, data): chain = data.current_chain(context.primary_cl) print 'datetime={0}'.format(get_datetime()) print 'primary={0}'.format(chain[0]) print 'secondary={0}'.format(chain[1]) print 'tertiary={0}'.format(chain[2]) ``` ``` datetime=2015-12-23 14:31:00+00:00 primary=Future(1058201602 [CLG16]) secondary=Future(1058201603 [CLH16]) tertiary=Future(1058201604 [CLJ16]) ``` Also: - make return types of OrderedContracts methods compatible across architectures. (Noticed while adding `active_chain` method.) - Add year suffix to future contract names in test data.	2016-10-11 16:16:16 -04:00
Eddie Hebert	fea7d899cd	Merge pull request #1529 from quantopian/current-contract ENH: Add continuous future current contract.	2016-10-07 23:39:01 -04:00
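
Commits b091b40604 and d82bf7a1e3 above describe a pitfall: after `index.get_loc` was swapped for `index.searchsorted`, a start/end date missing from the calendar no longer raised an error. The following is a minimal sketch of why that can happen, not the loader's actual code. It assumes a plain sorted list of dates and uses the standard library's `bisect.bisect_left` as a stand-in for `searchsorted`; the helper name `checked_position` and the sample calendar are hypothetical.

```python
from bisect import bisect_left
from datetime import date


def checked_position(calendar, dt):
    """Return the position of ``dt`` in the sorted ``calendar``.

    Hypothetical helper: ``bisect_left`` stands in for ``searchsorted``.
    Both return an insertion point even when ``dt`` is absent, so
    membership has to be checked explicitly to keep the KeyError that a
    ``get_loc``-style lookup would have raised.
    """
    pos = bisect_left(calendar, dt)
    if pos == len(calendar) or calendar[pos] != dt:
        raise KeyError(dt)
    return pos


# Illustrative calendar with a weekend gap (2016-11-05 and 2016-11-06).
calendar = [date(2016, 11, 3), date(2016, 11, 4), date(2016, 11, 7)]
```

With this calendar, a bare `bisect_left(calendar, date(2016, 11, 5))` quietly returns an insertion point for the missing Saturday, while `checked_position` raises `KeyError` for it and still returns the correct position for dates that are present.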


331 Commits