Combine the equity and future readers into asset dispatch readers, so
that simulations that use both asset types can access data for each.
This patch enables `history` for future assets in algorithms; however,
it does not add coverage for future assets to `test_data_portal` or
`test_history`. Those tests will follow; this patch is submitted
separately because it shows that wrapping the readers in the asset
dispatch reader does not break existing equity strategies.
Add `AssetDispatchSessionBarReader` and corresponding minute and session
bar versions of that reader.
This reader routes requests to the appropriate reader based on the asset
type of the requested sids.
`load_raw_arrays` in the dispatch reader batches the sids by asset type
and then interleaves the results in the out arrays, so that the array
data corresponds to the sids in the order in which they are passed to
the method, as expected of `load_raw_arrays`.
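The batching and interleaving described above can be sketched as follows. This is a minimal illustration, not the patch's implementation: `ListReader` and `DispatchReader` are hypothetical stand-ins, plain lists stand in for NumPy arrays, and the `load_raw_arrays(fields, start, end, sids)` signature is an assumption made for the example.

```python
from collections import defaultdict


class ListReader:
    """Hypothetical stand-in for a per-asset-type bar reader."""

    def __init__(self, data):
        # data maps sid -> {field: list of values, one per bar}
        self._data = data

    def load_raw_arrays(self, fields, start, end, sids):
        # One result per field; each result holds one column per sid,
        # in the order the sids were passed.
        return [
            [self._data[sid][field][start:end] for sid in sids]
            for field in fields
        ]


class DispatchReader:
    """Routes each sid to the reader registered for its asset type."""

    def __init__(self, asset_types, readers):
        self._asset_types = asset_types  # sid -> asset type label
        self._readers = readers          # asset type label -> reader

    def load_raw_arrays(self, fields, start, end, sids):
        # Batch sids by asset type, remembering each sid's output position.
        batches = defaultdict(list)
        for position, sid in enumerate(sids):
            batches[self._asset_types[sid]].append((position, sid))

        out = [[None] * len(sids) for _ in fields]
        for asset_type, members in batches.items():
            batch_sids = [sid for _, sid in members]
            results = self._readers[asset_type].load_raw_arrays(
                fields, start, end, batch_sids
            )
            # Interleave: write each batch column back to its original slot.
            for field_ix, columns in enumerate(results):
                for (position, _), column in zip(members, columns):
                    out[field_ix][position] = column
        return out


equities = ListReader({1: {"close": [10, 11, 12]}, 3: {"close": [30, 31, 32]}})
futures = ListReader({2: {"close": [20, 21, 22]}})
reader = DispatchReader(
    {1: "equity", 2: "future", 3: "equity"},
    {"equity": equities, "future": futures},
)
# Columns come back in the requested sid order: 3, 2, 1.
result = reader.load_raw_arrays(["close"], 0, 2, [3, 2, 1])
```

Each backing reader is called once per batch, and the caller still sees columns in the order it asked for.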
The dispatch reader is intended for use by the data portal when using
both futures and equities. The dispatch reader will also be passed to
the `HistoryLoader`s contained within the data portal, where the
batched `load_raw_arrays` will be used.
Also, BUG:
- Fix the return of `MinuteResampleSessionBarReader.load_raw_arrays` to
match all other readers.
- Use the input dt for the `MinuteResampleSessionBarReader.load_raw_arrays`
as a session label, instead of a minute dt, since it is a session bar
reader.
(Both of these bugs were discovered when using the resample reader for
future data in the dispatch tests.)
Working towards history results which contain mixed asset types, add
a reader which makes `load_raw_arrays` return results indexed on the
session/minute ranges of the given `trading_calendar`
instead of the calendar of the backing reader.
This reader will be used to make Equity readers align with Future
readers. It is intended for use as part of another reader (which will
dispatch queries based on asset type and then recombine results) which
will be passed to the `[Minute|Session]HistoryLoader`s in the data portal.
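As a rough sketch of the reindexing idea, assuming one column of values per sid and a fill of NaN for sessions the backing reader does not have (the fill value is an assumption, not something the patch states). `reindex_to_calendar` and the session labels are hypothetical names used only for illustration.

```python
def reindex_to_calendar(values, source_sessions, target_sessions,
                        fill_value=float("nan")):
    """Align one column of values onto a different set of sessions."""
    by_session = dict(zip(source_sessions, values))
    return [by_session.get(session, fill_value) for session in target_sessions]


# Hypothetical calendars: the target has a session the backing reader lacks.
equity_sessions = ["2016-01-04", "2016-01-05"]
wider_sessions = ["2016-01-03", "2016-01-04", "2016-01-05"]
aligned = reindex_to_calendar([10.0, 11.0], equity_sessions, wider_sessions)
```

The output has one entry per session of the target calendar, so results from readers with different backing calendars share the same shape.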
The daily/session bar reader's `spot_price` took the same parameters and
returned the same kind of output as the minute bar reader's `get_value`.
Standardize on one method to make a common interface, which may be
formally factored out in a later patch, to help enable writing reader
implementations or mixins which can be agnostic to the bar frequency.
We were mistakenly using the minute_per_day field.
We now expose from the metadata object the version from which the
metadata was read. This allows a new test that verifies the version is
read correctly.
The new TradingCalendar method is called `minute_index_to_session_labels`.
It takes a DatetimeIndex of in-order market minutes and returns a
DatetimeIndex of the corresponding sessions.
The new method is approximately 100x faster than mapping
`minute_to_session_label` over a large DatetimeIndex.
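The message does not describe how the new method works. One plausible approach, sketched here in plain Python, is a sorted search of the in-order minutes against session close times. The function name, the `closes`/`labels` arguments, and the example times are all assumptions for illustration.

```python
from bisect import bisect_left
from datetime import date, datetime


def minutes_to_session_labels(minutes, closes, labels):
    """Map in-order market minutes to session labels.

    closes: sorted session close times; labels: the matching session labels.
    A minute belongs to the first session whose close is >= that minute.
    """
    out = []
    lo = 0
    for minute in minutes:
        # Minutes are sorted, so each search resumes from the previous hit.
        lo = bisect_left(closes, minute, lo)
        out.append(labels[lo])
    return out


closes = [datetime(2016, 1, 4, 21, 0), datetime(2016, 1, 5, 21, 0)]
labels = [date(2016, 1, 4), date(2016, 1, 5)]
minutes = [
    datetime(2016, 1, 4, 14, 31),
    datetime(2016, 1, 4, 21, 0),
    datetime(2016, 1, 5, 14, 31),
]
sessions = minutes_to_session_labels(minutes, closes, labels)
```

The sketch assumes every input is a valid market minute; a minute after the last close would raise an `IndexError`.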
In the data portal, remove methods that make a distinction between
future and equity asset type. Instead rely on the pricing reader
dispatching.
This supports incoming work which will upsample equity history arrays to
the larger future calendar.
Also, remove perf tracker tests which were using an equity
reader/writer, to be added back in later.
* First pass.
* Improvements and fixes
- Update usages of BcolzMinuteBarWriter
- Updates with rebuilt example data
- Expose calendar from BcolzMinuteBarMetadata instead of calendar_name
- Keep market_opens and market_closes in metadata for compatibility
* Store start_session and end_session in minute bcolz metadata
- start_session replaces first_trading_day
- Add end_session to limit to correct days
* For last_available_dt, get last close from calendar to maintain tz
* Bumps version and handles earlier versions on read
* Rebuilt example data on python 3
* Indicate metadata fields that are deprecated
Implement a `SessionBarReader` which uses a minute bar reader as a
backing source, resampling the minute bars into the box around the
corresponding session data.
Also, add future/CME test cases to resample suite.
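For reference, the usual way to roll one session's minute bars up into a single session bar looks roughly like the sketch below. `resample_session` is a hypothetical helper, bars are plain dicts, and missing or NaN minutes are ignored, which a real reader would need to handle.

```python
def resample_session(minute_bars):
    """Collapse one session's minute bars (in time order) into one OHLCV bar."""
    if not minute_bars:
        return None
    return {
        "open": minute_bars[0]["open"],                   # first minute's open
        "high": max(bar["high"] for bar in minute_bars),  # highest high
        "low": min(bar["low"] for bar in minute_bars),    # lowest low
        "close": minute_bars[-1]["close"],                # last minute's close
        "volume": sum(bar["volume"] for bar in minute_bars),
    }


bars = [
    {"open": 1.0, "high": 3.0, "low": 1.0, "close": 2.0, "volume": 10},
    {"open": 2.0, "high": 5.0, "low": 0.5, "close": 4.0, "volume": 20},
]
session_bar = resample_session(bars)
```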
Adds a new ``downsample`` method to all computable terms. Computable
terms (Filters, Factors, and Classifiers) can be downsampled to yearly,
quarterly, monthly, or weekly frequency.
The result of ``term.downsample`` is a new term of the same
family (Filter/Factor/Classifier) as ``term``. The downsampled term
computes by delegating to the original term, repeatedly calling its
``compute`` method with length-1 date ranges.
Downsampled terms take advantage of a new ``compute_extra_rows`` Term
method, which allows terms to dynamically request that extra
rows of themselves be computed based on the dates for which they're
being computed. This ensures, for example, that a monthly-downsampled
term always computes at the start of a month, even when a
naively-calculated pipeline window would end in the middle of the month.
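To illustrate the month-start guarantee, the sketch below computes how many extra rows a monthly-downsampled term would need so that its first computed date is the first calendar date of a month. `extra_rows_for_month_start` and its arguments are hypothetical and are not the signature of `compute_extra_rows`.

```python
from datetime import date


def extra_rows_for_month_start(all_dates, start_date, base_extra_rows):
    """Extra rows needed so computation begins on the first date of a month.

    all_dates: sorted calendar dates; base_extra_rows: the rows that a
    naive window calculation would already request.
    """
    start_ix = all_dates.index(start_date)
    ix = max(start_ix - base_extra_rows, 0)
    # Walk back while the previous calendar date is in the same month.
    while ix > 0 and (
        (all_dates[ix - 1].year, all_dates[ix - 1].month)
        == (all_dates[ix].year, all_dates[ix].month)
    ):
        ix -= 1
    return start_ix - ix


calendar = [
    date(2016, 1, 28), date(2016, 1, 29),
    date(2016, 2, 1), date(2016, 2, 2), date(2016, 2, 3),
    date(2016, 2, 4), date(2016, 2, 5),
]
# A naive window would start at Feb 3; the adjusted one starts at Feb 1.
rows = extra_rows_for_month_start(calendar, date(2016, 2, 4), 1)
```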
- Split out extra_rows handling into an `ExecutionPlan` subclass.
`ExecutionPlan` now requires the dates and calendar against which a
set of terms will be computed, and now defers to a term's
`compute_extra_rows` method when deciding how many extra rows are
required to compute for that term. This will allow downsampled terms
to request enough extra rows to guarantee that we can maintain consistent
calculation dates.
As a consequence of the above, `TermGraph` now only deals with logical
dependencies, not with metadata surrounding extra row calculations.
This means that TermGraph can be used to generate dependency
visualizations in interactive contexts where we don't yet have a
calendar or start/end dates.
- Refactored test_{filter,factor,classifier} to use check_terms instead
of run_graph. This makes it easier to make changes to TermGraph,
since the testing interface is now to simply provide a dict of terms.
- Refactored BasePipelineTestCase to use fixtures to create an asset
finder. This fixes a potential leak of the test's asset db, which was
not being explicitly cleaned up.
- Refactored test_technical to use BasePipelineTestCase.
- Added a new special term, `InputDates()`, which can be used to request
date labels for inputs. Like `AssetExists`, `InputDates` is provided
in the initial workspace by default.
- Added a default (failing) `_compute` method to `AssetExists` which
provides a more useful error than AttributeError.
* MAINT: Use TradingCalendar objects for bundles
Instead of trading days, opens, and closes, register now takes a
TradingCalendar object, along with a start_session and end_session. The
ingest function is now also passed these values.
* Accept calendar name in addition to the actual object
* Updates bundles documentation for changes
* Fix typo in docs
* Use class formatting
* Force start_session and end_session within the bounds of the calendar
* Use UTC timestamps in test_core
* Document Trading Calendar API in appendix.rst
Also, move `DailyHistoryAggregator` to `resample` module, so that tools
for converting from minute to session bars are collocated.
This patch is in preparation for adding a daily bar reader which
resamples minute data, which will be located in the `resample` module
and share the test cases and expected results in `test_resample`.
* BUG: Fixes asset writer to select the latest asset to hold a sid
When constructing the asset_info dataframe, we were previously taking
the first symbol/sid pair to include, when we should be taking the most
recent.
* Ensure groups are sorted by increasing end_date
* Updates test_lookup_symbol_change_ticker to also cover asset_name
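The fix amounts to picking, for each sid, the row with the greatest `end_date` rather than the first row seen. A minimal sketch of that selection, using plain dicts in place of the asset_info dataframe (`latest_row_per_sid` and the row fields are assumptions for illustration):

```python
from datetime import date


def latest_row_per_sid(rows):
    """Keep, for each sid, the row with the most recent end_date."""
    latest = {}
    # Sorting by increasing end_date means later rows overwrite earlier ones.
    for row in sorted(rows, key=lambda r: r["end_date"]):
        latest[row["sid"]] = row
    return latest


rows = [
    {"sid": 1, "symbol": "NEW", "end_date": date(2016, 1, 1)},
    {"sid": 1, "symbol": "OLD", "end_date": date(2014, 1, 1)},
    {"sid": 2, "symbol": "XYZ", "end_date": date(2015, 1, 1)},
]
latest = latest_row_per_sid(rows)
```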
- Don't create unnecessary extra data (requires passing fastd_period=1
to TA-Lib or else it fills the FastK with NaNs even though it must
have already computed them...)
- Use random_sample instead of random_integers so that we're not
dependent on integer arithmetic.
- Pass array_decimal to assert_equal so that we do almost equal checking
on results.
Use the future asset equity pricing reader, instead of reading directly
from the bcolz table. Required since the format for writing the future
data now uses the minute bar reader/writer pair.
Add test cases to `test_data_portal` asserting both equity and future
`get_spot_value` results.
Also, add direct coverage of last_traded_dt in the `test_data_portal`
module.
Prepares for adding test coverage of `get_last_traded_dt` for `Future` assets.
Change the mock minute data to no longer use an increasing arange, so
that a day's worth of minute data can be summed and fit inside of a
uint32.
This change was required because of working on new test data that looked
like [0, 100, 200, 0, ] which was resulting in a daily rollup of 0 data,
when the coverage needed a non-0 value.
Also, factor out the resampling function, with an eye toward making it
easier to convert from minute bars to daily bars during ingest/load
processes.
When adding fixtures for futures data, there will be a need for multiple
calendars in the fixture ecosystem. e.g. a test that includes both
equities and futures would need an overall calendar which encompasses
both equities and futures; however, the test data for equities should
still be limited to the bounds set by the NYSE calendar.
Make the fixtures that set up trading calendars and values derived from
the trading calendar (e.g. trading sessions) accept an iterable of
calendars which need to be created, then populate those values into a
dict keyed by the calendar name.
Change `WithNYSETradingDays` to include sessions in the name,
since we are moving to session as the name for the 'day' unit.
Provide `trading_days` which is really "NYSE trading sessions" on
`WithTradingSessions` for backwards compatibility.