270 Commits

Author SHA1 Message Date
Scott Sanderson bddb453272 BUG: F.window_safe implies f.demean().window_safe. 2016-09-22 12:41:50 -04:00
Scott Sanderson 918de6ad26 MAINT: Use explicit floats in np.full. 2016-09-20 17:12:08 -04:00
Scott Sanderson 46cf54b180 MAINT: Remove outdated compat code. 2016-09-20 17:12:07 -04:00
Scott Sanderson 6aeba11176 STY: Fix flake8 failures. 2016-09-20 17:12:07 -04:00
Scott Sanderson 493e18252d MAINT: Temporarily ignore pandas warnings in categoricals.
Pandas 0.18 doesn't like having null-ish values in categoricals.  Fixing
this properly requires re-thinking the semantics for missing_value on
pipeline terms, so we're punting on that until after we've upgraded to
0.18.
2016-09-20 17:12:07 -04:00
Scott Sanderson a9c02935c6 Revert "MAINT: Remove support for custom string Column missing values."
This reverts commit 1b1e842e2339d6d0ee40cdfe34dcd27b4e4a7c0c.
2016-09-20 17:12:07 -04:00
Scott Sanderson ed365dc5fe MAINT: Remove support for custom string Column missing values.
Pandas 0.18 deprecated passing "null-ish" values to pd.categorical.  The
expectation, instead, is that you use categorical's native support for
missing data, which means the user will always get NaN's for missing
entries of the categorical.

A follow-up to this change should probably drop support for custom
missing values entirely and use LabelArray/categorical for integer
data.
2016-09-20 17:12:07 -04:00
Scott Sanderson da8ed8919e MAINT: Pandas compat for rolling_*. 2016-09-20 17:12:07 -04:00
Scott Sanderson 9aa866e434 MAINT: Use sort_values() instead of sort().
pd.DataFrame.sort() is deprecated.
2016-09-20 17:12:07 -04:00
Scott Sanderson ef88dfdad2 MAINT: Use dataframe.iteritems instead of iterkv.
iterkv is deprecated.
2016-09-20 17:12:07 -04:00
Scott Sanderson 259f10a2d9 MAINT: Pass float to np.full explicitly. 2016-09-20 17:12:07 -04:00
Scott Sanderson 434d7c69d3 TEST/MAINT: Silence no_checkpoints warning. 2016-09-20 17:12:07 -04:00
Scott Sanderson 0d8e99956e MAINT: Fix numpy deprecation warnings. 2016-09-20 16:24:55 -04:00
Scott Sanderson be30c0072d MAINT: Explicitly use float64 in test. 2016-09-20 16:24:54 -04:00
Scott Sanderson d2f0632101 MAINT: Don't use .loc with integers. 2016-09-20 16:24:54 -04:00
Scott Sanderson 12101c55c8 STY: Don't assign variables that won't be created. 2016-09-02 12:53:01 -04:00
Scott Sanderson 8b2446aec6 ENH: Don't allow length=1 regressions/correlations.
They're not meaningful, and they cause warnings from numpy.

Implemented in terms of a new preprocessor, `expect_bounded`, which
takes a tuple of `upper_bound` and `lower_bound`.
2016-09-02 12:49:09 -04:00
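A minimal sketch of what a bounds-checking preprocessor along these lines could look like. This is not zipline's implementation: the decorator name `expect_bounded_sketch`, its keyword-argument form, the `(lower, upper)` ordering, and the `rolling_correlation` stand-in are all assumptions made for illustration.

```python
import functools
import inspect


def expect_bounded_sketch(**bounds):
    """Hypothetical decorator: check named arguments against bounds.

    Each keyword maps an argument name to a (lower, upper) tuple.
    Either bound may be None to leave that side unchecked.
    """
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Resolve positional and keyword arguments to parameter names.
            bound_args = signature.bind(*args, **kwargs)
            bound_args.apply_defaults()
            for name, (lower, upper) in bounds.items():
                value = bound_args.arguments[name]
                if lower is not None and value < lower:
                    raise ValueError(
                        f"{func.__name__}() expected {name} >= {lower}, "
                        f"got {value}"
                    )
                if upper is not None and value > upper:
                    raise ValueError(
                        f"{func.__name__}() expected {name} <= {upper}, "
                        f"got {value}"
                    )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@expect_bounded_sketch(correlation_length=(2, None))
def rolling_correlation(correlation_length):
    # Stand-in for a term constructor; it just echoes the validated length.
    return correlation_length
```

With this sketch, `rolling_correlation(1)` raises `ValueError` before any computation runs, while lengths of 2 or more pass through unchanged.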
Eddie Hebert a3c1f4ce36 MAINT: Standardize reader get value methods.
The daily/session bar reader's `spot_price` took the same parameters and
returned the same kind of output as the minute bar reader's `get_value`.

Standardize on one method to make a common interface, which may be
formally factored out in a later patch. This helps enable writing reader
implementations or mixins that are agnostic to the bar frequency.
2016-08-24 12:46:36 -04:00
Scott Sanderson bdc72ec4c0 BUG: Fix broken graph visualizations. 2016-08-18 11:07:17 -04:00
Scott Sanderson c53ef150ad BUG: Force iterator for py3. 2016-08-17 16:52:09 -04:00
Scott Sanderson a66731b9f3 BUG/TEST: Fix test assertion in py3. 2016-08-17 16:52:09 -04:00
Scott Sanderson 115f055c83 MAINT: Clean up downsampling boilerplate.
Consolidate docs and mixin applications into one place.
2016-08-17 16:52:09 -04:00
Scott Sanderson d917a64d45 ENH: Add non-windowed downsampling. 2016-08-17 16:52:09 -04:00
Scott Sanderson 5f686173f1 STY: Flake8 cleanup. 2016-08-17 16:52:09 -04:00
Scott Sanderson 91276c7274 ENH: Add support for downsampling.
Adds a new ``downsample`` method to all computable terms.  Computable
terms (Filters, Factors, and Classifiers) can be downsampled to yearly,
quarterly, monthly, or weekly frequency.

The result of ``term.downsample`` is a new term of the same
family (Filter/Factor/Classifier) as ``term``.  The downsampled term
computes by delegating to the original term, repeatedly calling its
``compute`` method with length-1 date ranges.

Downsampled terms take advantage of a new ``compute_extra_rows`` Term
method, which allows terms to dynamically request that additional extra
rows of themselves be computed based on the dates for which they're
being computed.  This ensures, for example, that a monthly-downsampled
term always computes at the start of a month, even when a
naively-calculated pipeline window would end in the middle of the month.
2016-08-17 16:52:09 -04:00
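The delegation described in this commit can be illustrated with a small pure-Python sketch. It is only an illustration of the idea, not the zipline code: `downsample_monthly` and `compute_one_day` are hypothetical names, and the sketch assumes `sessions` already begins on the first session of a month (the guarantee the commit attributes to `compute_extra_rows`).

```python
from datetime import date


def downsample_monthly(compute_one_day, sessions):
    """Hypothetical helper: recompute only when the month changes.

    `compute_one_day` takes a list of dates and returns one value per date.
    It is called with length-1 date ranges, and each result is carried
    forward until the next month begins.
    """
    results = []
    last_month = None
    current = None
    for session in sessions:
        month = (session.year, session.month)
        if month != last_month:
            # Delegate to the wrapped computation for a single session.
            current = compute_one_day([session])[0]
            last_month = month
        results.append(current)
    return results


sessions = [
    date(2016, 1, 4),
    date(2016, 1, 5),
    date(2016, 2, 1),
    date(2016, 2, 2),
]
# The toy computation returns the day of the month for each date.
monthly_values = downsample_monthly(lambda ds: [d.day for d in ds], sessions)
# monthly_values == [4, 4, 1, 1]
```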
Scott Sanderson 1444a78330 MAINT: Refactor in prep for downsampled terms.
- Split out extra_rows handling into an `ExecutionPlan` subclass.
  `ExecutionPlan` now requires the dates and calendar against which a
  set of terms will be computed, and now defers to a term's
  `compute_extra_rows` method when deciding how many extra rows are
  required to compute for that term. This will allow downsampled terms
  to request enough extra rows to guarantee that we can maintain consistent
  calculation dates.

  As a consequence of the above, `TermGraph` now only deals with logical
  dependencies, not with metadata surrounding extra row calculations.
  This means that TermGraph can be used to generate dependency
  visualizations in interactive contexts where we don't yet have a
  calendar or start/end dates.

- Refactored test_{filter,factor,classifier} to use check_terms instead
  of run_graph.  This makes it easier to make changes to TermGraph,
  since the testing interface is now to simply provide a dict of terms.

- Refactored BasePipelineTestCase to use fixtures to create an asset
  finder.  This fixes a potential leak of the test's asset db, which was
  not being explicitly cleaned up.

- Refactored test_technical to use BasePipelineTestCase.

- Added a new special term, `InputDates()`, which can be used to request
  date labels for inputs.  Like `AssetExists`, `InputDates` is provided
  in the initial workspace by default.

- Added a default (failing) `_compute` method to `AssetExists` which
  provides a more useful error than AttributeError.
2016-08-17 16:52:09 -04:00
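As a rough illustration of why a term might need to request extra rows based on the dates being computed, the sketch below rolls a naive window start back to the first session of its month. The function name, its parameters, and the list-based calendar are assumptions for this example and do not reflect the actual `compute_extra_rows` signature.

```python
from datetime import date


def extra_rows_for_monthly(all_sessions, start, min_extra_rows):
    """Hypothetical: rows needed before `start` so that the first computed
    session is the first session of a month in `all_sessions`."""
    start_ix = all_sessions.index(start)
    # Begin from the naive window start, clamped to the calendar's start.
    ix = max(start_ix - min_extra_rows, 0)
    # Walk back while the previous session falls in the same month.
    while ix > 0 and (
        (all_sessions[ix - 1].year, all_sessions[ix - 1].month)
        == (all_sessions[ix].year, all_sessions[ix].month)
    ):
        ix -= 1
    return start_ix - ix


all_sessions = [
    date(2016, 1, 4),
    date(2016, 1, 5),
    date(2016, 1, 6),
    date(2016, 2, 1),
    date(2016, 2, 2),
    date(2016, 2, 3),
]
# A naive window asking for 1 extra row before Feb 3 would start on Feb 2;
# rolling back to the start of the month requires 2 extra rows instead.
rows_needed = extra_rows_for_monthly(all_sessions, date(2016, 2, 3), 1)
```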
Scott Sanderson 765f9b6d57 MAINT: Improve/test errors for insufficient data. 2016-08-17 16:52:09 -04:00
Scott Sanderson 007e1f9cfb BUG/TEST: Fix stochastic oscillator test.
- Don't create unnecessary extra data (requires passing fastd_period=1
  to TA-Lib, or else it fills the FastK with NaNs even though it must
  have already computed them).

- Use random_sample instead of random_integers so that we're not
  dependent on integer arithmetic.

- Pass array_decimal to assert_equal so that we do almost equal checking
  on results.
2016-08-09 17:55:24 -04:00
Eddie Hebert e934c6aeaf TST: Make room for multiple calendars in tests.
When adding fixtures for futures data, there will be a need for multiple
calendars in the fixture ecosystem. For example, a test that includes both
equities and futures would need an overall calendar which encompasses
both equities and futures; however, the test data for equities should
still be limited to the bounds set by the NYSE calendar.

Make the fixtures that set up trading calendars and values derived from
the trading calendar (e.g. trading sessions) accept an iterable of
calendars which need to be created, then populate those values into a
dict keyed by the calendar name.

Change `WithNYSETradingDays` to include sessions in the name,
since we are moving to session as the name for the 'day' unit.

Provide `trading_days`, which is really "NYSE trading sessions", on
`WithTradingSessions` for backwards compatibility.
2016-08-05 12:17:27 -04:00
Jean Bredeche e6af4e4f1b ENH: made exchange a required parameter to Asset and its subclasses
This required updating a lot of tests.
2016-08-02 23:21:39 -04:00
Gil Wassermann 483397e554 ENH: Added AtLeastN filter 2016-08-02 16:34:32 -04:00
Joe Jevnik 4265a13edf Revert "Merge pull request #1354 from quantopian/revert-1302-point-in-time-asset-db"
This reverts commit 3b633011c6, reversing
changes made to 70ac5323de.
2016-08-02 14:25:10 -04:00
Scott Sanderson f13294de4e ENH: Rename StrictlyTrue to All and add Any().
Also, moved All() and Any() to `zipline.pipeline.filters.smoothing`.
2016-08-01 22:10:28 -04:00
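The rolling behaviour of these smoothing filters can be sketched for a single asset's boolean history as follows. This is a plain-Python illustration under assumed semantics (a result is produced only once a full window is available); `rolling_all` and `rolling_any` are hypothetical helpers, not the `All`/`Any` classes themselves.

```python
def rolling_all(values, window_length):
    """True where every one of the last `window_length` values is True."""
    return [
        all(values[i - window_length + 1:i + 1])
        for i in range(window_length - 1, len(values))
    ]


def rolling_any(values, window_length):
    """True where at least one of the last `window_length` values is True."""
    return [
        any(values[i - window_length + 1:i + 1])
        for i in range(window_length - 1, len(values))
    ]


passed_filter = [True, True, False, True, True, True]
all_result = rolling_all(passed_filter, 3)
# all_result == [False, False, False, True]

any_result = rolling_any([False, False, False, True, False], 2)
# any_result == [False, False, True, True]
```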
Gil Wassermann 574d7b197f TEST: test for rolling nature of smoothing filter 2016-08-01 15:35:22 -04:00
Gil Wassermann 7623c0f6eb MAINT: .sum() behaviour 2016-08-01 13:48:14 -04:00
Gil Wassermann c10af2a0b9 TEST: more thorough testing 2016-08-01 11:40:14 -04:00
Gil Wassermann 73de8e6182 STY: style changes and strictly_true_filter 2016-08-01 11:16:02 -04:00
Gil Wassermann 694d9e952a ENH: added smoothing to zipline 2016-08-01 08:20:10 -04:00
Joe Jevnik 814a2be7b7 Revert "Point in time asset db" 2016-07-27 23:29:08 -04:00
Jean Bredeche 3305933089 DEV: Change daily mode to use last minute of session instead of session itself. 2016-07-27 09:20:24 -04:00
Jean Bredeche 2462929368 Revert "Merge pull request #1340 from quantopian/by-daily-i-mean-minutely"
This reverts commit f4456719b0, reversing
changes made to 4be07e4628.
2016-07-26 16:20:14 -04:00
Joe Jevnik 7fd8c29880 ENH: add point in time aspect to equity symbol mapping
Changes the overlap behavior so that it is an error to write data which
would have two companies holding the same ticker. Other than one test
around which company would win in that case, all the other tests are
passing. That single test has been changed to check the write-time
error.
2016-07-26 13:34:58 -04:00
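A write-time check of the kind described here might look roughly like the sketch below. The tuple layout `(symbol, sid, start, end)`, the inclusive date ranges, and the function name are assumptions for illustration. The real asset writer works against a database, not an in-memory list.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations


def check_symbol_overlaps(mappings):
    """Hypothetical check: raise ValueError if two different sids hold the
    same symbol over overlapping (inclusive) date ranges."""
    by_symbol = defaultdict(list)
    for symbol, sid, start, end in mappings:
        by_symbol[symbol].append((start, end, sid))

    for symbol, intervals in by_symbol.items():
        # Compare every pair; fine for a sketch, quadratic per symbol.
        for (s1, e1, sid1), (s2, e2, sid2) in combinations(intervals, 2):
            if sid1 != sid2 and s1 <= e2 and s2 <= e1:
                raise ValueError(
                    f"Ambiguous ownership of {symbol!r}: "
                    f"sids {sid1} and {sid2} overlap"
                )


# Sequential ownership of a ticker is accepted.
check_symbol_overlaps([
    ("ABC", 1, date(2014, 1, 1), date(2014, 12, 31)),
    ("ABC", 2, date(2015, 1, 1), date(2015, 12, 31)),
])
```

Passing two rows for the same symbol with different sids and overlapping dates raises `ValueError` when the data is written, so the ambiguity never reaches a later lookup.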
Jean Bredeche bcb547d5a8 DEV: Change daily mode to use last minute of session instead of session itself. 2016-07-26 12:49:49 -04:00
Scott Sanderson 49bb8264dc ENH: Finish adding groupby to rank/top/bottom.
- Added test coverage for grouped and masked top/bottom.

- Added test coverage for grouped rank on datetime factors.

- Fixed an issue where grouped rank would fail on datetime inputs
  because unary-negative isn't defined for datetimes.  We now instead
  directly invoke a function from rank.pyx that does the normalizations
  as needed.

- Fixed an issue where GroupedRowTransform assumed that it produced the
  same dtype as its input.  This isn't true for rank() of a
  datetime-dtype factor.  GroupedRowTransform now takes a required dtype
  parameter.

- Similarly, fixed an issue where GroupedRowTransform assumed that its
  missing_value was the same as its parent's, which isn't true for
  rank() of a datetime-dtype factor.  GroupedRowTransform now takes a
  required missing_value parameter.

- Fixed an issue where Factor.demean() and Factor.zscore() weren't
  properly cached because their static_identity included a closure that
  was dynamically generated on each invocation.  They both now always
  use a function defined at module scope.
2016-07-26 02:57:35 -04:00
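The caching problem in the last bullet can be reproduced in isolation. In the sketch below, `identity` is a hypothetical stand-in for a term's static identity (any hashable key that includes the transform function), and the dictionaries stand in for a term cache. None of these names come from zipline.

```python
def make_demean_closure():
    # A brand-new function object is created on every call.
    def demean(row):
        mean = sum(row) / len(row)
        return [x - mean for x in row]
    return demean


def demean_module_scope(row):
    # Defined once at module scope, so it is always the same object.
    mean = sum(row) / len(row)
    return [x - mean for x in row]


def identity(transform, input_name):
    """Hypothetical stand-in for a term's static identity."""
    return (transform, input_name)


# Two logically identical terms built from closures get different keys,
# because functions compare and hash by object identity.
closure_cache = {}
closure_cache[identity(make_demean_closure(), "close")] = "first"
closure_cache[identity(make_demean_closure(), "close")] = "second"
closure_entries = len(closure_cache)

# With a module-level function the keys are equal, so the entry is reused.
module_cache = {}
module_cache[identity(demean_module_scope, "close")] = "first"
module_cache[identity(demean_module_scope, "close")] = "second"
module_entries = len(module_cache)
```

Here `closure_entries` is 2 and `module_entries` is 1, which is the difference between a cache miss and a cache hit for otherwise identical terms.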
Andrey Portnoy 9e3404646e add groupby to rank, top, and bottom 2016-07-25 23:53:33 -04:00
ChrisPappalardo 5888cf1657 ENH: add true range technical factor 2016-07-25 12:37:25 -04:00
Scott Sanderson 3cc1cf078a TEST: Parameterize over window_length. 2016-07-24 21:21:40 -04:00
Gil Wassermann ea01fb074a STY: style changes 2016-07-22 16:08:33 -04:00
Gil Wassermann b4aa0aecbb STY: Flake8 2016-07-22 15:08:34 -04:00
Gil Wassermann 36a727f4af ENH: sum vs nansum cleared up 2016-07-22 14:56:59 -04:00