Commit Graph

9 Commits

Author SHA1 Message Date
dmichalowicz 1ec0bced6d ENH: Add builtin factors for correlation and regression 2016-05-18 15:11:12 -04:00
Scott Sanderson 8de45540f2 ENH: NaN semantics for LabelArray missing values. 2016-05-04 15:54:50 -04:00
Scott Sanderson 5cd7d79818 MAINT: Restore support for bytes/unicode AdjustedArrays. 2016-05-04 15:54:50 -04:00
Scott Sanderson 5f190395ad ENH: Add support for strings in Pipeline.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
  LabelArray is conceptually similar to pandas.Categorical, in that it
  stores data with many duplicate values as indices into an array of
  unique values.  For string data with many duplicates (e.g. time-series
  of tickers or or industry classifications), this provides multiple
  orders of magnitude of improvement when doing string operations,
  especially string comparison/matching operations.

- Adds a new generic object "specialization" for `AdjustedArrayWindow`,
  and a corresponding ObjectOverwrite adjustment.

- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
  This method is called on the final result of any pipeline expression
  after screen filtering has occurred. The default implementation of
  ``postprocess`` is identity, but Classifier overrides it to coerce
  string columns into pandas.Categoricals before presenting them to the
  user.
2016-05-04 15:50:52 -04:00
Scott Sanderson 0115cdc46c MAINT: Fail fast on unsupported dtypes. 2016-02-12 21:23:47 -05:00
Scott Sanderson c105735574 DEV: Add support for specifying missing_value.
Consequently, enable support for `int`-dtyped Factors and BoundColumns.
2016-02-12 21:23:47 -05:00
Richard Frank 18db1904bc BUG: Need to format message, not ValueError instance 2016-01-06 16:02:18 -05:00
Richard Frank 1499051df7 BUG: TypeError message had only str of numpy.dtype class
We want to use the dtype of the data that was passed in.
2016-01-06 15:29:58 -05:00
Scott Sanderson 8220d1ee86 ENH: Adds support for different typed adjusted arrays and adds an
EarningsCalendar loader.

- Moves most of AdjustedArray back into Python. The window iterator is
  the only part that's performance-intensive.

- Adds a bootleg templating system for creating specialized versions of
  AdjustedArrayWindow for each concrete type we care about.

- Adds support for differently dtyped terms in pipeline. This allows us
  to use datetime64s which are needed in the EarningsCalendar.

- Adds EarningsCalendar dataset for the next and previous earnings
  announcements in pipeline.

- Adds in memory loader for EarningsCalendar.

- Adds blaze loader for EarningsCalendar.
2015-12-08 20:24:06 -05:00