- Added test coverage for grouped and masked top/bottom.
- Added test coverage for grouped rank on datetime factors.
- Fixed an issue where grouped rank would fail on datetime inputs
because unary negation isn't defined for datetimes. We now directly
invoke a function from rank.pyx that does the normalizations as
needed.
- Fixed an issue where GroupedRowTransform assumed that it produced the
same dtype as its input. This isn't true for rank() of a
datetime-dtype factor. GroupedRowTransform now takes a required dtype
parameter.
- Similarly, fixed an issue where GroupedRowTransform assumed that its
missing_value was the same as its parent's, which isn't true for
rank() of a datetime-dtype factor. GroupedRowTransform now takes a
required missing_value parameter.
- Fixed an issue where Factor.demean() and Factor.zscore() weren't
properly cached because their static_identity included a closure that
was dynamically generated on each invocation. They both now always
use a function defined at module scope.
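
To illustrate the caching fix above, here is a minimal sketch of why a closure created on each call defeats an identity-based cache. The names ``make_identity``, ``demean_with_closure``, ``demean_row`` and ``demean_with_module_function`` are hypothetical stand-ins, not zipline's actual code; the only assumption is that a term's identity includes the transform function itself.

```python
# Hypothetical stand-in for a term's static identity: a hashable tuple of
# the pieces that define the computation. This is NOT zipline's real code.
def make_identity(name, transform):
    return (name, transform)


def demean_with_closure():
    # A brand-new function object is created on every call, so two
    # identities built this way never compare equal.
    def demean(row):
        mean = sum(row) / len(row)
        return [x - mean for x in row]
    return make_identity("demean", demean)


def demean_row(row):
    """Module-scope transform: a single function object reused by all callers."""
    mean = sum(row) / len(row)
    return [x - mean for x in row]


def demean_with_module_function():
    return make_identity("demean", demean_row)


cache = {}

# Closure-based identity: the second lookup misses the cache.
cache[demean_with_closure()] = "computed result"
closure_hit = demean_with_closure() in cache

# Module-scope identity: the second lookup hits the cache.
cache[demean_with_module_function()] = "computed result"
module_hit = demean_with_module_function() in cache

print(closure_hit, module_hit)
```

Python functions compare and hash by object identity, so only the module-scope version yields equal identities across invocations.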
Classifiers are computations that represent grouping keys. They can be
used in conjunction with normalization functions like ``zscore`` or
``demean`` to perform normalizations over subsets of a dataset.
Notable changes:
- Added ``demean()`` and ``zscore()`` methods to ``Factor``.
- Added classifier versions of ``Latest`` and ``CustomTermMixin``.
The ``.latest`` attribute of int64 dataset columns now produces a
classifier by default.
- Added ``Everything``, a classifier that maps all data to the same
value.
- Added ``zipline.lib.normalize``, which implements a naive, pure-Python
grouped normalize function. This will likely be moved to Cython in a
subsequent PR.
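
As a rough illustration of what a naive, pure-Python grouped normalization might look like, the sketch below applies a function to each group of values that share a classifier label. It is not the code in ``zipline.lib.normalize``; the name ``naive_grouped_normalize``, its signature, and the use of the population standard deviation in ``zscore`` are assumptions made for this example. A constant label list plays the role of ``Everything``.

```python
from collections import defaultdict
from statistics import mean, pstdev


def demean(values):
    m = mean(values)
    return [v - m for v in values]


def zscore(values):
    # Assumption: population standard deviation. A group whose values are
    # all equal would divide by zero here; this sketch does not handle it.
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def naive_grouped_normalize(values, labels, func):
    """Apply func to each group of values sharing a label, preserving order."""
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[label].append(i)

    out = [None] * len(values)
    for indices in groups.values():
        normalized = func([values[i] for i in indices])
        for i, v in zip(indices, normalized):
            out[i] = v
    return out


values = [1.0, 3.0, 10.0, 30.0]
sectors = [0, 0, 1, 1]            # hypothetical classifier output
everything = [0] * len(values)    # every value maps to the same group

by_sector = naive_grouped_normalize(values, sectors, demean)
overall = naive_grouped_normalize(values, everything, demean)

print(by_sector)  # [-1.0, 1.0, -10.0, 10.0]
print(overall)    # [-10.0, -8.0, -1.0, 19.0]
```

Grouping by a single constant label reduces to an ungrouped normalization, which is presumably why a classifier like ``Everything`` is useful as a default grouping key.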