Commit Graph

3008 Commits

Author SHA1 Message Date
Scott Sanderson ed4e4ffe5d Merge pull request #812 from quantopian/adjusted-array-dict
MAINT: Make load_adjusted_array return a dict.
2015-11-03 12:14:13 -05:00
Scott Sanderson 8cd4f7d100 MAINT: Make load_adjusted_array return a dict.
Rather than a list that's ordered the same as the received columns.
Most nontrivial loaders were constructing dicts internally and then
converting back to lists, only to have the engine convert **back again**
into a dict.  This cuts out the middleman, and prevents bugs due to
incorrect ordering of the output arrays.
2015-11-03 11:16:21 -05:00
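
The change described in this commit can be illustrated with a minimal sketch. The function names `load_as_list` and `load_as_dict` and the sample data are hypothetical and are not zipline's actual loader interface; they only contrast an order-dependent list return with a keyed dict return.

```python
import numpy as np


def load_as_list(columns, data_by_column):
    """Old style (hypothetical): one array per column, in `columns` order."""
    return [data_by_column[c] for c in columns]


def load_as_dict(columns, data_by_column):
    """New style (hypothetical): arrays keyed by column, so order is irrelevant."""
    return {c: data_by_column[c] for c in columns}


columns = ["close", "volume"]
data = {
    "volume": np.array([100.0, 200.0]),
    "close": np.array([10.0, 11.0]),
}

as_list = load_as_list(columns, data)  # caller must remember the ordering
as_dict = load_as_dict(columns, data)  # caller looks results up by column
```

With the dict form, a consumer reads `as_dict["close"]` directly instead of relying on the position of each array in the returned list.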
James Kirk f98349e24d Merge pull request #803 from quantopian/future-flip
Future flip fix
2015-11-02 10:13:41 -05:00
jfkirk a1584cebe7 STY: Factors-out event price handling 2015-11-02 10:02:58 -05:00
jfkirk 7d29bb6a67 BUG: Fixes failure to account for Futures transaction prices 2015-10-30 12:04:38 -04:00
warren-oneill 3f66e6c66a TST: adds future flip test 2015-10-30 10:07:46 -04:00
Richard Frank cb29f39688 Merge pull request #798 from quantopian/monthly_pipeline
ENH: Makes chunk_size configurable in attach_pipeline
2015-10-27 16:45:38 -04:00
Richard Frank d7add5b248 ENH: Makes chunk_size configurable in attach_pipeline
Default first chunk is smaller for more immediate results
2015-10-27 16:15:54 -04:00
Scott Sanderson e49cc3a6d1 Merge pull request #793 from quantopian/treasury-cleanup
Treasury cleanup
2015-10-25 17:55:42 -04:00
Scott Sanderson 653e7cbbf2 Merge pull request #690 from quantopian/time-tests
TEST: Run nosetests with timings.
2015-10-25 16:50:02 -04:00
Scott Sanderson 01888918dd MAINT: Use itemgetter instead of homegrown func. 2015-10-25 16:37:59 -04:00
Scott Sanderson 75f7c44223 BUG: Better check for last date.
Use get_loc to find the trading day that ended 2 days before now.
2015-10-25 16:37:59 -04:00
Scott Sanderson 8fd18e5aa6 DOC: Comment on treasury division by 100. 2015-10-25 16:37:59 -04:00
Scott Sanderson 0710062e6a DOC: Docstring edits. 2015-10-25 16:37:59 -04:00
Scott Sanderson cabe22ae8e ENH: Always use Adjusted Close for benchmarks.
Previously we were using Close, and we calculated returns on the first
day of a window against the Open for that day.  We now always look back
an extra day to get the previous day's close.
2015-10-25 16:37:59 -04:00
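
A rough sketch of the "look back an extra day" idea follows. The helper `benchmark_returns` and the sample prices are assumptions made for illustration, not the actual benchmark loader code.

```python
import pandas as pd


def benchmark_returns(adj_close, start):
    """Close-to-close daily returns from `start` onward (illustrative only)."""
    first = adj_close.index.get_loc(start)
    if first == 0:
        raise ValueError("no previous close available for the first day")
    # Include one extra row before `start` so the first return is measured
    # against the previous day's close rather than that day's open.
    window = adj_close.iloc[first - 1:]
    return window.pct_change().iloc[1:]


# Made-up adjusted closes for three consecutive days.
prices = pd.Series(
    [100.0, 110.0, 121.0],
    index=pd.date_range("2015-10-21", periods=3, tz="UTC"),
)
returns = benchmark_returns(prices, pd.Timestamp("2015-10-22", tz="UTC"))
```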
Scott Sanderson df4cda4dc9 ENH: Remove defaults from get_benchmark_data. 2015-10-25 16:37:59 -04:00
Scott Sanderson d82cfb1e64 MAINT: Final polish on loader rewrites.
- Fixes an issue with the Canadian treasury loader where it would always
  redownload, because it can only download data from the last 10 years
  and so never appeared to have enough cached data.
- Uses module objects directly instead of lazy imports.
- Adds lots of docstrings.
2015-10-25 16:37:59 -04:00
Scott Sanderson 71db6d3fdc MAINT: Remove unused loader_utils file. 2015-10-25 16:37:59 -04:00
Scott Sanderson 24d26f9e63 MAINT: Rewrite the benchmark loader. 2015-10-25 16:37:59 -04:00
Scott Sanderson 948196d2de MAINT: Remove unused loader_utils functions. 2015-10-25 16:37:59 -04:00
Scott Sanderson c9e165aa2d ENH: Rewrite Canadian treasury loader. 2015-10-25 16:37:59 -04:00
Scott Sanderson 8c38278783 ENH: Rewrite treasury loader using pandas.
Replaces our custom XML parsing with a single call to `pd.read_csv`
against the federal reserve's API.  This produces nearly identical
results as compared to the old loader, but it's dramatically simpler and
roughly 10x faster on my machine.

The average difference in magnitude between new and old is on the order
of 1e-7 or smaller, and only one entry is different to a degree greater
than the number of significant figures provided by treasury.gov.

Additionally, the new loader correctly ignores Columbus Day of 2010, for
which the old loader erroneously produced an all-NaN row.

This also changes the interface that treasury modules are
required to implement. Modules must now supply a `get_treasury_data`
function that returns a `DataFrame` with a daily `DatetimeIndex` and a
column for each supported treasury duration.

Detailed comparison between results from new and old loader::

    import pandas as pd
    from zipline.data.treasuries import get_treasury_data

    new = get_treasury_data()  # New implementation
    old = pd.read_csv(  # Previously cached data
        '/home/ssanderson/.zipline/data/treasury_curves.csv',
        parse_dates=[0],
        index_col=0,
    )
    # These columns were unused.
    del old['tid']
    del old['date']
    old = old.tz_localize('UTC')
    # old data erroneously contained an all-NaN entry for Columbus Day
    # in 2010.  Remove before comparing.
    old = old.dropna(how='all')

    In [25]: len(new) == len(old)
    Out[25]: True

    In [26]: abs(old - new).max()
    Out[26]:
    10year    2.000000e-04
    1month    6.938894e-18
    1year     1.000000e-04
    20year    1.000000e-04
    2year     2.000000e-04
    30year    1.000000e-04
    3month    1.000000e-03
    3year     1.000000e-04
    5year     1.387779e-17
    6month    1.000000e-04
    7year     1.000000e-04
    dtype: float64

    In [27]: abs(old - new).mean()
    Out[27]:
    10year    3.097414e-08
    1month    4.396534e-19
    1year     1.548707e-08
    20year    3.624502e-08
    2year     4.646120e-08
    30year    1.830496e-08
    3month    1.549427e-07
    3year     1.548707e-08
    5year     1.702619e-18
    6month    1.548707e-08
    7year     1.548707e-08
    dtype: float64

Since www.treasury.gov only reports values up to three significant
digits, we should only care about differences of 1e-3 or more.

There is exactly one such difference: the entry for the three month bond
on 1999-10-01::

    In [60]: new[(abs(new - old) >= 1e-3).any(axis=1)].T
    Out[60]:
    Time Period  1999-10-01 00:00:00+00:00
    1month                             NaN
    3month                          0.0498
    6month                          0.0501
    1year                           0.0530
    2year                           0.0573
    3year                           0.0583
    5year                           0.0590
    7year                           0.0622
    10year                          0.0600
    20year                          0.0657
    30year                          0.0615

    In [61]: old[(abs(new - old) >= 1e-3).any(axis=1)].T
    Out[61]:
            1999-10-01 00:00:00+00:00
    10year                     0.0600
    1month                        NaN
    1year                      0.0530
    20year                     0.0657
    2year                      0.0573
    30year                     0.0615
    3month                     0.0488
    3year                      0.0583
    5year                      0.0590
    6month                     0.0501
    7year                      0.0622

The US Treasury website (our old source) provides a value of 0.0488 here,
whereas the Federal Reserve site (our new source) provides a value of
0.0498.
2015-10-25 16:37:59 -04:00
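
The interface described in this commit (a `get_treasury_data` function returning a `DataFrame` with a daily `DatetimeIndex` and one column per duration) might look roughly like the sketch below. It reads an in-memory CSV instead of calling the Federal Reserve API, and the CSV layout, the percent scaling, and the `source` parameter are all assumptions. The single sample row reuses values from the 1999-10-01 comparison above.

```python
import io

import pandas as pd

# Assumed layout: a date column followed by one column per duration,
# with rates expressed in percent. This is NOT the real API response.
SAMPLE_CSV = """Time Period,3month,6month,10year
1999-10-01,4.98,5.01,6.00
"""


def get_treasury_data(source=SAMPLE_CSV):
    """Return rates as decimals, indexed by UTC dates (illustrative only)."""
    frame = pd.read_csv(
        io.StringIO(source),
        parse_dates=["Time Period"],
        index_col="Time Period",
    )
    # Convert percent to decimal and make the index timezone-aware.
    return frame.tz_localize("UTC") / 100.0


data = get_treasury_data()
```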
Scott Sanderson 3c954af08c MAINT: Just do searchsorted with the date.
Previously we were converting our date to a string, then calling
`searchsorted` on the DatetimeIndex with the string, which would cause
pandas to convert the string back into a date to actually do the lookup.
2015-10-25 16:37:59 -04:00
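
As a small illustration of the simplification, `DatetimeIndex.searchsorted` accepts a `Timestamp` directly, so no round trip through a string is needed. The index and date here are made up for the example.

```python
import pandas as pd

# Made-up daily index; the real code searches a trading-day index.
days = pd.date_range("2015-10-19", periods=5, freq="D", tz="UTC")
date = pd.Timestamp("2015-10-21", tz="UTC")

# Pass the Timestamp itself rather than a formatted string.
loc = days.searchsorted(date)
```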
Scott Sanderson 854b6638b2 MAINT: Remove default values from dump_treasury_curves.
We never call the function without passing them explicitly.
2015-10-25 16:37:59 -04:00
Jean Bredeche 0bc96a772d Merge pull request #794 from quantopian/vectorize-final-risk-calc
ENH: vectorize mean algorithm returns calculation
2015-10-25 10:07:26 -04:00
Jean Bredeche b0b159e12d ENH: vectorize mean algorithm returns calculation
In a sample backtest on my machine, this takes the final risk
calculations down from ~10 seconds to ~0.8 seconds.
2015-10-24 13:18:52 -04:00
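
The commit message does not show the calculation itself. As a hedged sketch of the general technique, assuming the quantity is a running mean of period returns, a per-element Python loop can be replaced by a cumulative sum divided by the observation count. Both function names are hypothetical.

```python
import numpy as np


def running_mean_loop(returns):
    """Recompute the mean of every prefix: O(n^2) work."""
    return np.array([returns[: i + 1].mean() for i in range(len(returns))])


def running_mean_vectorized(returns):
    """Same values from one cumulative sum: O(n) work."""
    return np.cumsum(returns) / np.arange(1, len(returns) + 1)


sample = np.array([0.01, -0.02, 0.03, 0.00])
looped = running_mean_loop(sample)
vectorized = running_mean_vectorized(sample)
```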
Stewart Douglas 358ab41569 TST: test history() in before_trading_start() 2015-10-23 13:32:36 -04:00
Stewart Douglas 6795ea74c9 ENH: Update next_market_minute() & previous_market_minute()
Previously we were not accounting for cases where we would invoke
next_market_minute() with a time on a trading day *before* the
market open, or previous_market_minute() with a time on a trading
day *after* the market close.
2015-10-23 10:30:06 -04:00
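
The two edge cases named above can be sketched for a single session as follows. This is a simplified illustration, not zipline's implementation: the signatures and the session times are assumptions, inputs are taken to be minute-aligned, and rolling over to an adjacent trading day is left out (the sketch returns `None` instead).

```python
import pandas as pd

ONE_MINUTE = pd.Timedelta(minutes=1)


def next_market_minute(dt, market_open, market_close):
    """Next market minute after `dt` within one session, else None."""
    if dt < market_open:
        # Before the open on a trading day: the next minute is the open.
        return market_open
    candidate = dt + ONE_MINUTE
    return candidate if candidate <= market_close else None


def previous_market_minute(dt, market_open, market_close):
    """Previous market minute before `dt` within one session, else None."""
    if dt > market_close:
        # After the close on a trading day: the previous minute is the close.
        return market_close
    candidate = dt - ONE_MINUTE
    return candidate if candidate >= market_open else None


# Assumed session bounds, for illustration only.
session_open = pd.Timestamp("2015-10-23 13:31", tz="UTC")
session_close = pd.Timestamp("2015-10-23 20:00", tz="UTC")
early = pd.Timestamp("2015-10-23 12:00", tz="UTC")
late = pd.Timestamp("2015-10-23 21:00", tz="UTC")
```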
John Ricklefs f599795d27 ENH: Allow pipelines to run with matching start/end dates 2015-10-22 14:23:00 -04:00
Scott Sanderson acce0779c9 DOC: Better docstring descriptions for mask. 2015-10-21 22:53:03 -04:00
Eddie Hebert 8543b32468 Merge pull request #791 from quantopian/pipeline-effective-dates
MAINT: Set dividend effective date to ex_date.
2015-10-21 16:44:07 -04:00
Eddie Hebert 55b25bdd3f MAINT: Set dividend effective date to ex_date.
The price shock occurs on the effective_date. We had changed the effective_date
to be the day before the ex_date in the belief that pipeline was applying values
up until the effective_date, but the lookback windows apply before the
effective_date. Thus, the price shock calculation should still use the previous
day's data but be dated on the ex_date to stay aligned with split and
merger dating.
2015-10-21 16:43:13 -04:00
llllllllll 420df53d78 ENH: pull sentinel construction into a function 2015-10-19 16:55:32 -04:00
llllllllll 5112421334 BUG: Fix the firstlineno of the validated functions 2015-10-19 16:55:32 -04:00
llllllllll b8452b88c3 TST: test case where there are more sids requested than available 2015-10-19 16:35:03 -04:00
llllllllll 1371bf2cd0 BUG: support case where there are more sids requested than available in a blaze dataset 2015-10-19 16:35:03 -04:00
llllllllll 4238391f6f DOC: docstring cleanup 2015-10-19 16:35:03 -04:00
llllllllll 7afe7d6b45 ENH: remove blazeloader repr, too verbose 2015-10-19 16:35:03 -04:00
llllllllll c714fe58d1 BUG: Makes macro dataset loader return a concrete ndarray.
We cannot use fancy strided arrays with pipeline yet.
2015-10-19 16:35:03 -04:00
llllllllll aedbbcc6f0 BLD: blaze ecosystem commits 2015-10-19 16:35:03 -04:00
llllllllll e4abddd286 ENH: updates tests to use first and last col 2015-10-19 16:35:03 -04:00
llllllllll 3fb91e4d39 MAINT: cleanup doctests 2015-10-19 16:35:03 -04:00
llllllllll c58f0137e4 MAINT: map(retrieve_asset) -> retrieve_all 2015-10-19 16:35:03 -04:00
llllllllll e9ec709453 MAINT: expect_value doctest and lambda over toolz 2015-10-19 16:35:03 -04:00
llllllllll 0fff04d9c1 DOC: update doctest 2015-10-19 16:35:03 -04:00
llllllllll 1db29a9f0f ENH: handle amendments between trading days 2015-10-19 16:35:03 -04:00
llllllllll 0183d0a914 ENH: Allows Float64Adjustments to act on a range of columns 2015-10-19 16:35:03 -04:00
llllllllll c7ca0166cc MAINT: refactor tmp_db_uri to use make_simple_assetinfo 2015-10-19 16:35:03 -04:00
llllllllll 971848f38b MAINT: reshuffle logic in from_blaze to make the control flow easier 2015-10-19 16:35:03 -04:00
llllllllll f897bcdfed rename expr to dataset in type error 2015-10-19 16:35:03 -04:00