This commit removes the ability to reference a shared TradingEnvironment through the zipline.finance.trading module. In its place, the classes that require a TradingEnvironment, or its child AssetFinder, contain their own references to those objects.
This commit also adds serialization utilities that allow for the pickling/unpickling of objects without unintentionally serializing their TradingEnvironments or AssetFinders.
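As a rough illustration of the serialization idea, the sketch below uses Python's standard `__getstate__`/`__setstate__` hooks to drop an environment reference before pickling. `FakeEnvironment`, `EnvHolder`, and the `env` attribute are hypothetical names chosen for the example; they are not the utilities added by this commit.

```python
import pickle


class FakeEnvironment(object):
    """Hypothetical stand-in for a heavyweight TradingEnvironment."""

    def __init__(self, name):
        self.name = name


class EnvHolder(object):
    """Hypothetical object that keeps its own reference to an environment."""

    def __init__(self, env, capital):
        self.env = env
        self.capital = capital

    def __getstate__(self):
        # Copy the instance dict and drop the environment so it is not pickled.
        state = self.__dict__.copy()
        state.pop('env', None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # The environment must be re-attached by the caller after loading.
        self.env = None


holder = EnvHolder(FakeEnvironment('test'), capital=1000.0)
restored = pickle.loads(pickle.dumps(holder))
restored.env = holder.env  # re-attach the live environment explicitly
```

Under this approach the unpickled object comes back with `env` set to None, so forgetting to re-attach an environment is easy to detect.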
Remove pieces that are no longer used now that the simple transforms are
wrappers around history via the SIDData object.
Move window length related pieces into batch_transform, since the rest
of the utils module is no longer used.
Overhaul the core HistoryContainer logic to be more robust to changing
universes.
Major Changes
-------------
* Remove `return_frame` cache. The original purpose of using
return_frames was to avoid having to create new DataFrames on each
iteration of handle_data, but we ended up having to copy the return
frames anyway because user code could mutate the frames in place.
Removing the return_frames reduces unnecessary copying, and reduces
the logic of `get_history` to just forward-filling and concatenating
two DataFrames.
* Use a `MultiIndex`ed DataFrame to represent
`last_known_prior_values`. This makes lookups faster and greatly
simplifies the logic of adding and dropping sids.
* HistoryContainer no longer attempts to determine its universe based on
the contents of its internal buffers. The TradingAlgorithm
controlling the container is now responsible for explicitly calling
`add_sids` or `drop_sids` when securities enter or leave the
algorithm's universe. These methods, along with the internal
`_realign` method, provide a clean interface for changing the universe
of securities managed by the container.
* Refactor index mutation logic in `RollingPanel` into a
`MutableIndexRollingPanel` subclass. Maintenance of the old behavior
is regrettably necessary to support `BatchTransform`.
* Refactor shared logic from `roll` and `get_history` into a single
`aggregate_ohlcv_panel` method that's responsible for collapsing an
OHLCV buffer into a frame.
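To make the last bullet more concrete, here is a minimal pure-Python sketch of collapsing a buffer of OHLCV bars into a single bar. The function name `collapse_ohlcv` and the dict-based bar layout are assumptions for illustration, and the conventional aggregation rules used here (first open, highest high, lowest low, last close, summed volume) are an assumption about what `aggregate_ohlcv_panel` does, not a copy of it.

```python
def collapse_ohlcv(bars):
    """Collapse a time-ordered list of OHLCV bars into one bar.

    Each bar is a dict with 'open', 'high', 'low', 'close', 'volume' keys.
    """
    if not bars:
        raise ValueError('need at least one bar')
    return {
        'open': bars[0]['open'],                   # first open
        'high': max(b['high'] for b in bars),      # highest high
        'low': min(b['low'] for b in bars),        # lowest low
        'close': bars[-1]['close'],                # last close
        'volume': sum(b['volume'] for b in bars),  # total volume
    }


minute_bars = [
    {'open': 10.0, 'high': 10.5, 'low': 9.9, 'close': 10.2, 'volume': 100},
    {'open': 10.2, 'high': 10.8, 'low': 10.1, 'close': 10.7, 'volume': 250},
    {'open': 10.7, 'high': 10.9, 'low': 10.4, 'close': 10.5, 'volume': 150},
]
daily_bar = collapse_ohlcv(minute_bars)
```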
Adding a copy of the Event's dt field as datetime via the
`alias_dt` generator, so that the API was forgiving and allowed
both datetime and dt on a SIDData object, was creating noticeable
overhead, even on no-op algorithms.
Instead of incurring the cost of copying the datetime value and
assigning it to the Event object on every event that is passed
through the system, add a `datetime` property to SIDData which acts
as an alias for `dt`.
Support for `data['foo'].datetime` may eventually be removed,
and can be considered deprecated.
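The property-based alias can be sketched as follows. `SIDDataSketch` is a hypothetical, stripped-down stand-in for SIDData; only the `dt`/`datetime` relationship is taken from the description above.

```python
import datetime


class SIDDataSketch(object):
    """Hypothetical, stripped-down stand-in for SIDData."""

    def __init__(self, dt, price):
        self.dt = dt
        self.price = price

    @property
    def datetime(self):
        # Read-only alias: nothing is copied, the value is looked up on access.
        return self.dt


bar = SIDDataSketch(datetime.datetime(2014, 1, 2, 14, 31), 10.5)
same_object = bar.datetime is bar.dt
```

Because the alias lives on the class, no extra attribute is assigned per event; the cost is paid only when `datetime` is actually read.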
Use the six module to import functions and types that are
consistent between Python 2 and 3, so that one code base can
support both versions.
- Use integer_types instead of int and long.
- Use string_types instead of basestring.
- Account for iteritems, itervalues, iterkeys.
- Use six.moves for filter, zip, and reduce.
- Use compatible bytes for md5 hasher.
- Account for xrange and range.
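A short sketch of what these substitutions look like in practice, assuming the six package is installed. The `describe` helper and the sample values are made up for the example and do not come from the code base.

```python
import hashlib

from six import integer_types, iteritems, string_types
from six.moves import filter, range, reduce, zip


def describe(value):
    """Classify a value using the version-neutral type tuples."""
    if isinstance(value, integer_types):
        return 'integer'
    if isinstance(value, string_types):
        return 'string'
    return 'other'


counts = {'AAPL': 3, 'MSFT': 5}

# iteritems(d) replaces d.iteritems() and works on both versions.
total = reduce(lambda acc, item: acc + item[1], iteritems(counts), 0)

# filter, range, and zip from six.moves behave lazily on both versions.
evens = list(filter(lambda i: i % 2 == 0, range(6)))
pairs = list(zip(range(3), 'abc'))

# md5 requires bytes, so text is encoded explicitly before hashing.
digest = hashlib.md5(u'zipline'.encode('utf-8')).hexdigest()
```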
`for s in data` and methods like `for s in data.keys()` were not producing
the same list of active sids.
Make the other iteration methods match `__iter__` by using the `__contains__`
method to check whether or not the sid is active.
For uses of data outside of the algoscript context that need access
to all data fields, use `data._data`.
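A minimal sketch of the idea, assuming a container that tracks an explicit set of active sids. `BarDataSketch` and its `active_sids` argument are hypothetical; only the pattern of routing every iteration method through `__contains__` is taken from the description above.

```python
class BarDataSketch(object):
    """Hypothetical container whose iteration respects an 'active' filter."""

    def __init__(self, data, active_sids):
        self._data = data                # every sid ever seen
        self._active = set(active_sids)  # sids currently in the universe

    def __contains__(self, sid):
        return sid in self._data and sid in self._active

    def __iter__(self):
        # `sid in self` goes through __contains__, so inactive sids are skipped.
        return (sid for sid in self._data if sid in self)

    def keys(self):
        return list(self)

    def values(self):
        return [self._data[sid] for sid in self]

    def items(self):
        return [(sid, self._data[sid]) for sid in self]


data = BarDataSketch({1: 'bar1', 2: 'bar2', 3: 'bar3'}, active_sids=[1, 3])
active = sorted(data)           # only sids 1 and 3
everything = sorted(data._data)  # unfiltered access for non-algo callers
```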
The underlying RollingPanel in batch_transform was always accumulating
all values to ever appear in data.
However, at any given algo time the desired return value covers only
the currently active sids.
Instead, mask down to the sids that are passed in as the data parameter.
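The masking step can be sketched in plain Python as below. The dict-of-lists buffer and the `mask_window` helper are assumptions made for the example; the real code works on the RollingPanel inside batch_transform.

```python
def mask_window(window, active_sids):
    """Return only the entries of `window` whose sid is currently active.

    `window` maps sid -> list of values accumulated over the window.
    """
    active = set(active_sids)
    return {sid: values for sid, values in window.items() if sid in active}


# The buffer has accumulated three sids, but only two are in `data` now.
accumulated = {1: [10.0, 10.1], 2: [20.0, 20.3], 3: [30.0, 29.8]}
current_data = {1: 'bar1', 3: 'bar3'}  # hypothetical `data` argument

# Iterating a dict yields its keys, so `current_data` supplies the sids.
masked = mask_window(accumulated, current_data)
```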
So that with minute data, 2.5 orders of magnitude of data can
be cut, allowing for longer window_lengths, when the daily
values are what is desired for a signal.
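The "2.5 orders of magnitude" figure can be checked with a quick calculation. The 390-minute session length (a 6.5-hour trading day) and the 20-day window are assumptions for the example and are not stated above.

```python
import math

MINUTES_PER_SESSION = 390  # assumption: 6.5-hour trading session

# Orders of magnitude saved by keeping one daily value per session
# instead of one value per minute.
reduction = math.log10(MINUTES_PER_SESSION)

# Rows needed for a hypothetical 20-day window at each frequency.
window_days = 20
minute_rows = window_days * MINUTES_PER_SESSION
daily_rows = window_days
```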
Expect the same shape of data for the supplemental data, to make
preparing and working with the supplemental data consistent with
what is passed to the algorithm.