Commit Graph

13 Commits

Author SHA1 Message Date
Scott Sanderson bd147a084e BUG: Fix crash on .latest for integer-typed columns.
Int columns get coerced to float on load, and we don't currently support
non-float columns from CustomFactors.
2015-09-21 13:20:26 -04:00
Scott Sanderson 26fd6fda8b ENH/BUG: Modeling API enhancements.
- Fixes an error where Modeling API data known as of the close of `day
  N` would be shown to algorithms during `before_trading_start` as of
  the close of the same day.  Algorithms should now only receive data
  during `before_trading_start/handle_data` that was known as of the
  simulation time at which the function would be called.

- All Term instances now have a `mask` attribute that must be a `Filter`
  or an instance of `AssetExists()`.  `mask` can be used to specify that
  a Factor should be computed in a manner that ignores the values that
  were not `True` in the mask.

- Changed the interface for `FFCLoader.load_adjusted_array` and
  `Term._compute` from `(columns, mask)`, with mask as a DataFrame, to
  `(columns, dates, assets, mask)`, where mask is a numpy array.  This
  is primarily to avoid having to reconstruct extra DataFrames when
  using masks produced by non `AssetExists` filters.

- Adds `BoundColumn.latest`, which gives the most-recently-known value
  of a column.
2015-09-16 01:47:11 -04:00
Scott Sanderson 46882bfcb9 TST: Remove unnecessary instance attr assignments. 2015-09-16 01:28:16 -04:00
Scott Sanderson 5730de25a4 DEV: Kill compute_from_{arrays,windows}.
All terms just implement `_compute` now. (We reserve `compute` for the
public API of `CustomFactor`.)

Also removed `TestingTermMixin` and its subclasses in favor of just
using `CustomFactor.`
2015-09-16 01:28:15 -04:00
Scott Sanderson 58ceb7b7bb DEV: Add zipline.utils.memoize.
- Moved zipline.utils.lazyval.
- Added `remember_last` which is just `lru_cache(1)` with simpler logic.
2015-09-16 01:28:15 -04:00
jfkirk 6e6ef447d2 TST: Adds tearDownClass methods to delete TradingEnvironments 2015-09-10 11:53:29 -04:00
jfkirk 35ed8c28a8 TST: Fixes modelling test to use new TradingEnvironment framework 2015-09-10 11:53:28 -04:00
Stewart Douglas d3516959a3 MAINT: Don't set string to upper before writing, remove unused libs 2015-09-10 11:53:27 -04:00
Stewart Douglas 1ef2274d11 MAINT: Update tests to conform to new reader/writer structure 2015-09-10 11:53:26 -04:00
Scott Sanderson 780263da06 ENH: Return asset-indexed DataFrame for data.factors.
This makes ordering with the returned assets much easier, and there's no
performance degradation for non-broadcasting operations on the Index.

Timings
-------

    from random import sample
    finder = AssetFinder(create_table=False, assets.db')
    assets = load_8000_assets(finder)
    AAPL = finder.retrieve_asset(24)
    RANDOM_ASSETS = sample(assets, 500)
    df = DataFrame(
        index=assets,
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )
    df_int = DataFrame(
        index=map(int, assets),
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )

    %timeit df.loc[24]
    %timeit df_int.loc[24]

    10000 loops, best of 3: 45.3 µs per loop
    10000 loops, best of 3: 44.7 µs per loop

    %timeit df.loc[AAPL]
    %timeit df_int.loc[AAPL]

    10000 loops, best of 3: 45.1 µs per loop
    10000 loops, best of 3: 44.8 µs per loop

    %timeit df.loc[RANDOM_ASSETS]
    %timeit df_int.loc[RANDOM_ASSETS]

    1000 loops, best of 3: 1.53 ms per loop
    100 loops, best of 3: 2.18 ms per loop

    %timeit df.sum()
    %timeit df_int.sum()

    10000 loops, best of 3: 56 µs per loop
    10000 loops, best of 3: 55.7 µs per loop

    %timeit df.index == 3
    %timeit df_int.index == 3

    1000 loops, best of 3: 253 µs per loop
    100000 loops, best of 3: 6.76 µs per loop

    %timeit df.iloc[:50]
    %timeit df_int.iloc[:50]

    10000 loops, best of 3: 44.3 µs per loop
    10000 loops, best of 3: 44 µs per loop
2015-08-26 18:33:54 -04:00
Richard Frank 30847a10a7 BUG: Interface of load_adjusted_array is to return a list of arrays
but MultiColumnLoader was returning a list of lists of arrays in some
cases.
2015-08-19 10:12:19 -04:00
Scott Sanderson 7bb20eb297 MAINT: Check dates before computing factor_matrix.
In SimpleFFCEngine.factor_matrix barf with a useful error if end_date <=
start_date.
2015-08-03 12:06:24 -04:00
Scott Sanderson ef4f642e62 ENH: Compute engine architecture for FFC API.
This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation.  It contains:

A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs.  Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`.  Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.

A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.

New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`.  Computed
filter results control which assets are available in the factor matrix
on each day.
2015-07-29 12:30:46 -04:00