- Adds a `map` method to `LabelArray` that maps a unary function over
the categories of a LabelArray, shrinking the underyling codes if
possible.
- Adds a new `.relabel` method to string-dtype classifiers that maps a
unary function over the unique elements of the underlying LabelArray.
This is useful for things like cleaning noisy label data.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
LabelArray is conceptually similar to pandas.Categorical, in that it
stores data with many duplicate values as indices into an array of
unique values. For string data with many duplicates (e.g. time-series
of tickers or or industry classifications), this provides multiple
orders of magnitude of improvement when doing string operations,
especially string comparison/matching operations.
- Adds a new generic object "specialization" for `AdjustedArrayWindow`,
and a corresponding ObjectOverwrite adjustment.
- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
This method is called on the final result of any pipeline expression
after screen filtering has occurred. The default implementation of
``postprocess`` is identity, but Classifier overrides it to coerce
string columns into pandas.Categoricals before presenting them to the
user.