catalyst

wassname/catalyst

mirror of https://github.com/wassname/catalyst.git synced 2026-07-01 03:57:42 +08:00

Commit Graph


Author

SHA1

Message

Date

Scott Sanderson

49bb8264dc

ENH: Finish adding groupby to rank/top/bottom.

- Added test coverage for grouped and masked top/bottom.

- Added test coverage for grouped rank on datetime factors.

- Fixed an issue where grouped rank would fail on datetime inputs
  because unary-negative isn't defined for datetimes.  We now instead
  directly invoke a function from rank.pyx that does the normalizations
  as needed.

- Fixed an issue where GroupedRowTransform assumed that it produced the
  same dtype as its input.  This isn't true for rank() of a
  datetime-dtype factor.  GroupedRowTransform now takes a required dtype
  parameter.

- Similarly, fixed an issue where GroupedRowTransform assumed that its
  missing_value was the same as its parent's, which isn't true for
  rank() of a datetime-dtype factor.  GroupedRowTransform now takes a
  required missing_value parameter.

- Fixed an issue where Factor.demean() and Factor.zscore() weren't
  properly cached because their static_identity included a closure that
  was dynamically generated on each invocation.  They both now always
  use a function defined at module scope.
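The caching problem in the last item can be illustrated with a minimal sketch. It assumes only that a term's identity is hashed and compared like an ordinary Python tuple; the tuple layout, the helper names, and the ``cache`` dict are hypothetical and are not zipline's actual ``static_identity`` implementation.

```python
def make_demean_closure():
    # A brand-new function object is created on every call.
    def demean_closure(row):
        mean = sum(row) / len(row)
        return [x - mean for x in row]
    return demean_closure


def demean(row):
    """Module-scope function: every reference is the same object."""
    mean = sum(row) / len(row)
    return [x - mean for x in row]


def identity_with_closure(factor_name):
    # Hypothetical stand-in for an identity that embeds a fresh closure.
    return ("GroupedRowTransform", make_demean_closure(), factor_name)


def identity_with_module_function(factor_name):
    # Same layout, but embedding the module-scope function.
    return ("GroupedRowTransform", demean, factor_name)


cache = {}

# Functions hash and compare by identity, so two separately created
# closures never match, even though they behave identically.
cache[identity_with_closure("returns")] = "result A"
closure_hit = identity_with_closure("returns") in cache

# The module-scope function is a single object, so the identities match.
cache[identity_with_module_function("returns")] = "result B"
module_hit = identity_with_module_function("returns") in cache

print(closure_hit, module_hit)
```

Under these assumptions the closure-based identity misses the cache on every lookup, while the module-scope version hits it.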

2016-07-26 02:57:35 -04:00
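The datetime rank fix in commit 49bb8264dc can be sketched as follows. This is not the function in rank.pyx; ``rank_descending`` is a hypothetical helper that assumes the input has no missing values (NaT) and breaks ties by position. It shows one way to rank in descending order without applying unary minus to datetimes, and why the output dtype differs from the input dtype.

```python
import numpy as np


def rank_descending(values):
    """Ordinal descending rank (1 = largest). Assumes no NaT/NaN inputs."""
    values = np.asarray(values)
    if values.dtype.kind == "M":
        # Reinterpret datetime64 as its underlying int64 representation,
        # for which negation is defined.
        keys = values.view("int64")
    else:
        keys = values
    order = np.argsort(-keys, kind="stable")
    # Ranks are floats regardless of the input dtype.
    ranks = np.empty(len(values), dtype="float64")
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


dates = np.array(
    ["2016-01-03", "2016-01-01", "2016-01-02"], dtype="datetime64[ns]"
)
ranks = rank_descending(dates)
print(ranks, ranks.dtype)
```

The result is a float64 array even though the input is datetime64, which is why a transform cannot assume that its output dtype or missing value matches its input's.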

Scott Sanderson

53d3b0855b

ENH: Add support for Classifiers.

Classifiers are computations that represent grouping keys. They can be
used in conjunction with normalization functions like ``zscore`` or
``demean`` to perform normalizations over subsets of a dataset.
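As a rough illustration of normalizing over subsets, the sketch below z-scores one row of values within groups given by integer labels. ``grouped_zscore`` and the sample data are hypothetical and are not part of zipline; the sketch assumes a population standard deviation and that no group is constant (which would divide by zero).

```python
import numpy as np


def grouped_zscore(values, labels):
    """Z-score `values` separately within each distinct label."""
    values = np.asarray(values, dtype="float64")
    labels = np.asarray(labels)
    out = np.empty_like(values)
    for label in np.unique(labels):
        in_group = labels == label
        group = values[in_group]
        # Population standard deviation (ddof=0) is an assumption here.
        out[in_group] = (group - group.mean()) / group.std()
    return out


values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
sectors = [0, 0, 0, 1, 1, 1]  # hypothetical grouping keys
scores = grouped_zscore(values, sectors)
print(scores)
```

Both groups get the same scores here because each value is measured relative to its own group's mean and spread, not the whole row's.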

Notable changes:

- Added ``demean()`` and ``zscore()`` methods to ``Factor``.

- Added classifier versions of ``Latest`` and ``CustomTermMixin``.
  The .latest attribute of int64 dataset columns now produces a
  classifier by default.

- Added ``Everything``, a classifier that maps all data to the same
  value.

- Added ``zipline.lib.normalize``, which implements a naive, pure-Python
  grouped normalize function.  This will likely be moved to Cython in a
  subsequent PR.
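A naive, pure-Python grouped normalize in the spirit of the list above might look like the sketch below. The function name, signature, and sample data are hypothetical and do not reflect the actual ``zipline.lib.normalize`` API. The second call passes one shared label for every entry, mimicking what a classifier like ``Everything`` provides.

```python
def naive_grouped_normalize(row, labels, normalize):
    """Apply `normalize` separately to each group of entries in one row."""
    # Collect the positions belonging to each label.
    groups = {}
    for position, label in enumerate(labels):
        groups.setdefault(label, []).append(position)

    out = [None] * len(row)
    for positions in groups.values():
        normalized = normalize([row[p] for p in positions])
        # Write each group's results back to their original positions.
        for p, value in zip(positions, normalized):
            out[p] = value
    return out


def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


row = [1.0, 3.0, 10.0, 30.0]
by_sector = naive_grouped_normalize(row, [0, 0, 1, 1], demean)
everything = naive_grouped_normalize(row, [0, 0, 0, 0], demean)
print(by_sector)
print(everything)
```

With a single shared label, the grouped normalization reduces to an ordinary normalization over the whole row.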

2016-03-19 17:04:28 -04:00

2 Commits