Pandas 0.18 doesn't like having null-ish values in categoricals. Fixing
this properly requires re-thinking the semantics for missing_value on
pipeline terms, so we're punting on that until after we've upgraded to
0.18.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
LabelArray is conceptually similar to pandas.Categorical, in that it
stores data with many duplicate values as indices into an array of
unique values. For string data with many duplicates (e.g. time-series
of tickers or industry classifications), this provides multiple
orders of magnitude of improvement when doing string operations,
especially string comparison/matching operations.
- Adds a new generic object "specialization" for ``AdjustedArrayWindow``,
and a corresponding ``ObjectOverwrite`` adjustment.
- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
This method is called on the final result of any pipeline expression
after screen filtering has occurred. The default implementation of
``postprocess`` is identity, but Classifier overrides it to coerce
string columns into pandas.Categoricals before presenting them to the
user.
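
The idea behind ``LabelArray`` (storing integer codes that index into an array
of unique values) can be illustrated with plain NumPy. This is a minimal sketch
of the general technique only, not the ``LabelArray`` implementation: the
ticker data and the ``equals`` and ``startswith`` helpers are hypothetical
names made up for the example.

```python
import numpy as np

# Hypothetical column of tickers with many duplicate values.
tickers = np.array(["AAPL", "MSFT", "AAPL", "IBM", "MSFT", "AAPL"])

# Dictionary-encode the column: `categories` holds the sorted unique
# strings, and `codes` holds, for each element, its index in `categories`.
categories, codes = np.unique(tickers, return_inverse=True)


def equals(categories, codes, value):
    """Elementwise `column == value`, computed on the integer codes."""
    matches = np.flatnonzero(categories == value)
    if matches.size == 0:
        # The value is not among the unique values, so nothing matches.
        return np.zeros(codes.shape, dtype=bool)
    # One string lookup, then a cheap integer comparison per element.
    return codes == matches[0]


def startswith(categories, codes, prefix):
    """Elementwise `column.startswith(prefix)`."""
    # Evaluate the string predicate once per unique value...
    per_category = np.array([c.startswith(prefix) for c in categories], dtype=bool)
    # ...then broadcast the result to every element through its code.
    return per_category[codes]


aapl_mask = equals(categories, codes, "AAPL")
m_mask = startswith(categories, codes, "M")
```

The string work scales with the number of unique values rather than the number
of elements, which is where the speedup for highly duplicated data comes from.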
Renames zipline.utils.test_utils to zipline.testing.
Adds zipline.testing.fixtures.ZiplineTestCase to manage setup and
teardown, and adds mixins that define fixtures such as an asset finder
or trading calendar.