Commit Graph

25 Commits

Author SHA1 Message Date
Scott Sanderson bd147a084e BUG: Fix crash on .latest for integer-typed columns.
Int columns get coerced to float on load, and we don't currently support
non-float columns from CustomFactors.
2015-09-21 13:20:26 -04:00
Scott Sanderson 26fd6fda8b ENH/BUG: Modeling API enhancements.
- Fixes an error where Modeling API data known as of the close of `day
  N` would be shown to algorithms during `before_trading_start` as of
  the close of the same day.  Algorithms should now only receive data
  during `before_trading_start/handle_data` that was known as of the
  simulation time at which the function would be called.

- All Term instances now have a `mask` attribute that must be a `Filter`
  or an instance of `AssetExists()`.  `mask` can be used to specify that
  a Factor should be computed in a manner that ignores the values that
  were not `True` in the mask.

- Changed the interface for `FFCLoader.load_adjusted_array` and
  `Term._compute` from `(columns, mask)`, with mask as a DataFrame, to
  `(columns, dates, assets, mask)`, where mask is a numpy array.  This
  is primarily to avoid having to reconstruct extra DataFrames when
  using masks produced by non `AssetExists` filters.

- Adds `BoundColumn.latest`, which gives the most-recently-known value
  of a column.
2015-09-16 01:47:11 -04:00
Scott Sanderson 46882bfcb9 TST: Remove unnecessary instance attr assignments. 2015-09-16 01:28:16 -04:00
Scott Sanderson 5730de25a4 DEV: Kill compute_from_{arrays,windows}.
All terms just implement `_compute` now. (We reserve `compute` for the
public API of `CustomFactor`.)

Also removed `TestingTermMixin` and its subclasses in favor of just
using `CustomFactor.`
2015-09-16 01:28:15 -04:00
Scott Sanderson 58ceb7b7bb DEV: Add zipline.utils.memoize.
- Moved zipline.utils.lazyval.
- Added `remember_last` which is just `lru_cache(1)` with simpler logic.
2015-09-16 01:28:15 -04:00
Thomas Wiecki 6f2e1672d7 BUG: Forward initialize args and kwargs to user-defined function.
The initialize method of TradingAlgorithm no longer accepts and
silently ignores args and kwargs, but instead forwards them
to the user-defined function referenced by self._initialize.

To avoid passing unexpected arguments to self._initialize, the
following additional adjustments are made:

 - pop 'namespace' from the kwargs supplied to TradingAlgorithm
   rather than simply get()ing it

 - do not pass an AssetFinder to the TradingAlgorithm in
   test_modelling_algo.py, as this has been deprecated and will
   cause self._initialize to fail
2015-09-14 10:42:20 -04:00
jfkirk 4770c5956c TST: Fixes modelling test tearDown 2015-09-10 11:53:29 -04:00
jfkirk 6e6ef447d2 TST: Adds tearDownClass methods to delete TradingEnvironments 2015-09-10 11:53:29 -04:00
jfkirk 262f0b7d09 MAINT: Removes mutable default method args
Also removes accidental modifications to Jenkins
2015-09-10 11:53:29 -04:00
jfkirk 265bffbe01 TST: Updates modelling test base for new TradingEnvironment 2015-09-10 11:53:29 -04:00
jfkirk 3c1eea4ca1 MAINT: Removes references to trading.environment 2015-09-10 11:53:29 -04:00
jfkirk 35ed8c28a8 TST: Fixes modelling test to use new TradingEnvironment framework 2015-09-10 11:53:28 -04:00
jfkirk dc964a7e7d MAINT: Removes the ability to reference a global TradingEnvironment
This commit removes the ability to reference a shared TradingEnvironment through the zipline.finance.trading module. In place, the classes that require a TradingEnvironment, or its child AssetFinder, contain their own references to those objects.

This commit also adds serialization utilities that allow for the pickling/unpickling of objects without unintentionally their TradingEnvironments or AssetFinders.
2015-09-10 11:53:28 -04:00
Stewart Douglas d3516959a3 MAINT: Don't set string to upper before writing, remove unused libs 2015-09-10 11:53:27 -04:00
Stewart Douglas 1ef2274d11 MAINT: Update tests to conform to new reader/writer structure 2015-09-10 11:53:26 -04:00
Scott Sanderson 6e8a4b8144 ENH: Improvements to rank().
- Add an `ascending=True` keyword to `rank()`.

- Add `top(N)` and `bottom(N)` methods to Factor.  These return Filters
  that pass the top and bottom N elements each day.

- Add a slightly faster path for rank(method='ordinal').  I had
  originally thought the fast path was 2-3x faster because I had my
  benchmark data axes flipped.  The actual speedup is only 5-10%, which
  means it probably wasn't worth the effort to Cythonize...but we have a
  slightly faster version now so we might as well use it.

- Refactor test_filter and test_factor to make it easier to implement
  and test transformations on factors.  These tests now subclass
  BaseFFCTestCase, which provides facilities for passing a dict of terms
  and an "initial_workspace", the values for which are used by
  SimpleFFCEngine rather than needing to manually manage the inputs and
  outputs of each term.
2015-08-31 00:32:33 -04:00
Scott Sanderson a04dcfa6b8 TEST: Rename test. 2015-08-29 23:55:59 -04:00
Scott Sanderson 90e81d0df0 MAINT: Add TermGraph class.
Use a subclass of networkx.DiGraph to encapsulate the state of our
dependency graph.
2015-08-29 23:55:59 -04:00
Scott Sanderson 780263da06 ENH: Return asset-indexed DataFrame for data.factors.
This makes ordering with the returned assets much easier, and there's no
performance degradation for non-broadcasting operations on the Index.

Timings
-------

    from random import sample
    finder = AssetFinder(create_table=False, assets.db')
    assets = load_8000_assets(finder)
    AAPL = finder.retrieve_asset(24)
    RANDOM_ASSETS = sample(assets, 500)
    df = DataFrame(
        index=assets,
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )
    df_int = DataFrame(
        index=map(int, assets),
        data=np.random.randn(len(assets), 4),
        columns=['a', 'b', 'c', 'd'],
    )

    %timeit df.loc[24]
    %timeit df_int.loc[24]

    10000 loops, best of 3: 45.3 µs per loop
    10000 loops, best of 3: 44.7 µs per loop

    %timeit df.loc[AAPL]
    %timeit df_int.loc[AAPL]

    10000 loops, best of 3: 45.1 µs per loop
    10000 loops, best of 3: 44.8 µs per loop

    %timeit df.loc[RANDOM_ASSETS]
    %timeit df_int.loc[RANDOM_ASSETS]

    1000 loops, best of 3: 1.53 ms per loop
    100 loops, best of 3: 2.18 ms per loop

    %timeit df.sum()
    %timeit df_int.sum()

    10000 loops, best of 3: 56 µs per loop
    10000 loops, best of 3: 55.7 µs per loop

    %timeit df.index == 3
    %timeit df_int.index == 3

    1000 loops, best of 3: 253 µs per loop
    100000 loops, best of 3: 6.76 µs per loop

    %timeit df.iloc[:50]
    %timeit df_int.iloc[:50]

    10000 loops, best of 3: 44.3 µs per loop
    10000 loops, best of 3: 44 µs per loop
2015-08-26 18:33:54 -04:00
Scott Sanderson f7039d6f52 ENH: Make data available in before_trading_start. 2015-08-21 12:37:17 -04:00
Richard Frank 30847a10a7 BUG: Interface of load_adjusted_array is to return a list of arrays
but MultiColumnLoader was returning a list of lists of arrays in some
cases.
2015-08-19 10:12:19 -04:00
Scott Sanderson b89fc0c028 BUG: Fix error from RequiredWindowLengthMixin.
WindowLengthNotSpecified expects an argument.
2015-08-04 01:41:03 -04:00
Scott Sanderson 7bb20eb297 MAINT: Check dates before computing factor_matrix.
In SimpleFFCEngine.factor_matrix barf with a useful error if end_date <=
start_date.
2015-08-03 12:06:24 -04:00
Scott Sanderson 5da03d2df5 BUG: Make NumExprFilter return ndarray.
- Previously it was returning a DataFrame because of how we applied an &
  with a DataFrame mask.  The error was masked by the fact that
  `np.assert_array_equal` coerces inputs to arrays before comparing.

- Added `zp.utils.test_utils.check_arrays`, which checks type equality
  before calling `np.assert_array_equal`.
2015-08-03 11:59:11 -04:00
Scott Sanderson ef4f642e62 ENH: Compute engine architecture for FFC API.
This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation.  It contains:

A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs.  Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`.  Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.

A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.

New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`.  Computed
filter results control which assets are available in the factor matrix
on each day.
2015-07-29 12:30:46 -04:00