Commit Graph

4 Commits

Author SHA1 Message Date
Scott Sanderson 780263da06 ENH: Return asset-indexed DataFrame for data.factors.
This makes ordering with the returned assets much easier, and there's no
performance degradation for non-broadcasting operations on the Index.

Timings
-------

    from random import sample
    finder = AssetFinder(create_table=False, assets.db')
    assets = load_8000_assets(finder)
    AAPL = finder.retrieve_asset(24)
    RANDOM_ASSETS = sample(assets, 500)
    df = DataFrame(
        index=assets,
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )
    df_int = DataFrame(
        index=map(int, assets),
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )

    %timeit df.loc[24]
    %timeit df_int.loc[24]

    10000 loops, best of 3: 45.3 µs per loop
    10000 loops, best of 3: 44.7 µs per loop

    %timeit df.loc[AAPL]
    %timeit df_int.loc[AAPL]

    10000 loops, best of 3: 45.1 µs per loop
    10000 loops, best of 3: 44.8 µs per loop

    %timeit df.loc[RANDOM_ASSETS]
    %timeit df_int.loc[RANDOM_ASSETS]

    1000 loops, best of 3: 1.53 ms per loop
    100 loops, best of 3: 2.18 ms per loop

    %timeit df.sum()
    %timeit df_int.sum()

    10000 loops, best of 3: 56 µs per loop
    10000 loops, best of 3: 55.7 µs per loop

    %timeit df.index == 3
    %timeit df_int.index == 3

    1000 loops, best of 3: 253 µs per loop
    100000 loops, best of 3: 6.76 µs per loop

    %timeit df.iloc[:50]
    %timeit df_int.iloc[:50]

    10000 loops, best of 3: 44.3 µs per loop
    10000 loops, best of 3: 44 µs per loop
2015-08-26 18:33:54 -04:00
Richard Frank 30847a10a7 BUG: Interface of load_adjusted_array is to return a list of arrays
but MultiColumnLoader was returning a list of lists of arrays in some
cases.
2015-08-19 10:12:19 -04:00
Scott Sanderson 7bb20eb297 MAINT: Check dates before computing factor_matrix.
In SimpleFFCEngine.factor_matrix barf with a useful error if end_date <=
start_date.
2015-08-03 12:06:24 -04:00
Scott Sanderson ef4f642e62 ENH: Compute engine architecture for FFC API.
This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation.  It contains:

A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs.  Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`.  Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.

A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.

New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`.  Computed
filter results control which assets are available in the factor matrix
on each day.
2015-07-29 12:30:46 -04:00