lookup_symbol_resolve_multiple was identical to lookup_symbol, except that lookup_symbol performed upper-casing of the input string and lookup_symbol would return None. Now, lookup_symbol has a kwarg 'default_None=True', and all symbols are upper-cased on insertion and on request.
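A minimal sketch of the behaviour described above, using a toy in-memory table rather than the real AssetFinder. The `SymbolTable` class and its `insert` method are hypothetical, and raising `KeyError` when `default_None=False` is an assumption; the message above only states that the kwarg exists.

```python
class SymbolTable:
    """Toy symbol table for illustration; not zipline's AssetFinder."""

    def __init__(self):
        self._by_symbol = {}

    def insert(self, symbol, sid):
        # Upper-case on insertion so stored keys are normalized.
        self._by_symbol[symbol.upper()] = sid

    def lookup_symbol(self, symbol, default_None=True):
        # Upper-case on request so lookups are case-insensitive.
        try:
            return self._by_symbol[symbol.upper()]
        except KeyError:
            if default_None:
                return None
            # Assumption: without default_None the miss is an error.
            raise


table = SymbolTable()
table.insert('aapl', 24)
print(table.lookup_symbol('AaPl'))
print(table.lookup_symbol('msft'))
```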
- Fixes an error where Modeling API data known as of the close of `day
N` would be shown to algorithms during `before_trading_start` as of
the close of the same day. Algorithms should now only receive data
during `before_trading_start/handle_data` that was known as of the
simulation time at which the function would be called.
- All Term instances now have a `mask` attribute that must be a `Filter`
or an instance of `AssetExists()`. `mask` can be used to specify that
a Factor should be computed in a manner that ignores the values that
were not `True` in the mask.
- Changed the interface for `FFCLoader.load_adjusted_array` and
`Term._compute` from `(columns, mask)`, with mask as a DataFrame, to
`(columns, dates, assets, mask)`, where mask is a numpy array. This
is primarily to avoid having to reconstruct extra DataFrames when
using masks produced by non `AssetExists` filters.
- Adds `BoundColumn.latest`, which gives the most-recently-known value
of a column.
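As a rough illustration of what computing "in a manner that ignores the values that were not `True` in the mask" can mean, the sketch below takes a values array and a boolean numpy mask of the same shape (rows as dates, columns as assets) and averages each row over the unmasked entries only. `masked_row_mean` is a hypothetical helper, not part of the Modeling API.

```python
import numpy as np


def masked_row_mean(values, mask):
    """Average each row of `values`, skipping entries where mask is False."""
    # Entries outside the mask become NaN, which nanmean then ignores.
    masked = np.where(mask, values, np.nan)
    return np.nanmean(masked, axis=1)


values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
mask = np.array([[True, False, True],
                 [True, True, False]])

row_means = masked_row_mean(values, mask)
print(row_means)
```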
All terms just implement `_compute` now. (We reserve `compute` for the
public API of `CustomFactor`.)
Also removed `TestingTermMixin` and its subclasses in favor of just
using `CustomFactor`.
The initialize method of TradingAlgorithm no longer accepts and
silently ignores args and kwargs, but instead forwards them
to the user-defined function referenced by self._initialize.
To avoid passing unexpected arguments to self._initialize, the
following additional adjustments are made:
- pop 'namespace' from the kwargs supplied to TradingAlgorithm
rather than simply get()ing it
- do not pass an AssetFinder to the TradingAlgorithm in
test_modelling_algo.py, as this has been deprecated and will
cause self._initialize to fail
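The forwarding behaviour can be sketched as follows. `Algorithm` is a hypothetical stand-in for TradingAlgorithm, and the way leftover constructor kwargs are stored in `_init_kwargs` is an assumption made only to keep the example self-contained.

```python
class Algorithm:
    """Hypothetical stand-in for TradingAlgorithm; not zipline's code."""

    def __init__(self, initialize, **kwargs):
        # pop() removes 'namespace' from kwargs, so it cannot leak into
        # the user's initialize function; get() would leave it in place.
        self.namespace = kwargs.pop('namespace', {})
        self._initialize = initialize
        self._init_kwargs = kwargs

    def initialize(self, *args, **kwargs):
        # Forward arguments instead of silently discarding them.
        return self._initialize(self, *args, **kwargs)


def user_initialize(context, lookback=10):
    context.lookback = lookback


algo = Algorithm(user_initialize, namespace={'x': 1}, lookback=20)
algo.initialize(**algo._init_kwargs)
print(algo.lookback)
```

Because arguments are now forwarded, an argument the user-defined function does not accept surfaces as a `TypeError` instead of being ignored.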
This commit removes the ability to reference a shared TradingEnvironment through the zipline.finance.trading module. Instead, the classes that require a TradingEnvironment, or its child AssetFinder, contain their own references to those objects.
This commit also adds serialization utilities that allow for the pickling/unpickling of objects without unintentionally serializing their TradingEnvironments or AssetFinders.
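One common way to keep a referenced object out of a pickle is to drop it in `__getstate__`. The sketch below shows that general technique only; it is not the serialization utilities added by this commit, and `Environment` and `Tracker` are hypothetical classes.

```python
import pickle


class Environment:
    """Hypothetical heavyweight object that should not be pickled."""

    def __init__(self, name):
        self.name = name


class Tracker:
    """Hypothetical object holding its own reference to an Environment."""

    def __init__(self, env, cash):
        self.env = env
        self.cash = cash

    def __getstate__(self):
        # Copy the instance dict and leave the environment out.
        state = self.__dict__.copy()
        state.pop('env', None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # The caller is expected to re-attach an environment after loading.
        self.env = None


env = Environment('test')
tracker = Tracker(env, 1000.0)
restored = pickle.loads(pickle.dumps(tracker))
print(restored.cash, restored.env)
```

After unpickling, the caller re-attaches whichever environment is appropriate, for example `restored.env = env`.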
The write_data method invokes the relevant AssetDBWriter subclass
to write data to the database. update_asset_finder is no longer
a relevant method since the AssetFinder is strictly a reader class.
The AssetDBWriter class and its subclasses will
ultimately be responsible for creating the SQLite
database tables and writing data to these tables.
In the longer term AssetDBWriter and AssetFinder will
be decoupled, sharing only an SQLite connection.
However, for backward compatibility reasons this has
not yet been fully implemented.
Modify tests since AssetFinder no longer has a
metadata_cache attribute.
- Add an `ascending=True` keyword to `rank()`.
- Add `top(N)` and `bottom(N)` methods to Factor. These return Filters
that pass the top and bottom N elements each day.
- Add a slightly faster path for rank(method='ordinal'). I had
originally thought the fast path was 2-3x faster because I had my
benchmark data axes flipped. The actual speedup is only 5-10%, which
means it probably wasn't worth the effort to Cythonize...but we have a
slightly faster version now so we might as well use it.
- Refactor test_filter and test_factor to make it easier to implement
and test transformations on factors. These tests now subclass
BaseFFCTestCase, which provides facilities for passing a dict of terms
and an "initial_workspace", the values for which are used by
SimpleFFCEngine rather than needing to manually manage the inputs and
outputs of each term.
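The semantics of `rank(ascending=...)`, `top(N)` and `bottom(N)` for a single day's row can be sketched in plain Python. The functions below are hypothetical illustrations, not the Factor methods themselves, and they ignore NaN handling and the Cython fast path.

```python
def ordinal_rank(values, ascending=True):
    """Return 1-based ordinal ranks for one day's row of values."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not ascending)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def top(values, n):
    """Boolean filter that passes the n largest values."""
    return [rank <= n for rank in ordinal_rank(values, ascending=False)]


def bottom(values, n):
    """Boolean filter that passes the n smallest values."""
    return [rank <= n for rank in ordinal_rank(values, ascending=True)]


row = [0.5, -1.0, 2.0, 1.5]
print(ordinal_rank(row))
print(top(row, 2))
print(bottom(row, 2))
```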
This makes ordering with the returned assets much easier, and there's no
performance degradation for non-broadcasting operations on the Index.
Timings
-------
from random import sample
import numpy as np
from pandas import DataFrame
finder = AssetFinder('assets.db', create_table=False)
assets = load_8000_assets(finder)
AAPL = finder.retrieve_asset(24)
RANDOM_ASSETS = sample(assets, 500)
df = DataFrame(
index=assets,
data=np.random.randn(len(assets), 4),
columns=['a', 'b', 'c', 'd'],
)
df_int = DataFrame(
index=map(int, assets),
data=np.random.randn(len(assets), 4),
columns=['a', 'b', 'c', 'd'],
)
%timeit df.loc[24]
%timeit df_int.loc[24]
10000 loops, best of 3: 45.3 µs per loop
10000 loops, best of 3: 44.7 µs per loop
%timeit df.loc[AAPL]
%timeit df_int.loc[AAPL]
10000 loops, best of 3: 45.1 µs per loop
10000 loops, best of 3: 44.8 µs per loop
%timeit df.loc[RANDOM_ASSETS]
%timeit df_int.loc[RANDOM_ASSETS]
1000 loops, best of 3: 1.53 ms per loop
100 loops, best of 3: 2.18 ms per loop
%timeit df.sum()
%timeit df_int.sum()
10000 loops, best of 3: 56 µs per loop
10000 loops, best of 3: 55.7 µs per loop
%timeit df.index == 3
%timeit df_int.index == 3
1000 loops, best of 3: 253 µs per loop
100000 loops, best of 3: 6.76 µs per loop
%timeit df.iloc[:50]
%timeit df_int.iloc[:50]
10000 loops, best of 3: 44.3 µs per loop
10000 loops, best of 3: 44 µs per loop
If lookup_future_chain was provided with an as_of_date or knowledge_date that was pandas.NaT, the query we formed was not the one we wanted. Now, if knowledge_date is NaT, as_of_date is used in its place unless it is also NaT, and if both are NaT, no date filtering is done in the query.
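The date-resolution rule can be sketched without pandas. Here `NAT` is a hypothetical sentinel standing in for pandas.NaT, and `resolve_dates` is an illustrative helper, not a function in lookup_future_chain. The case where only as_of_date is missing is not described above, so the sketch leaves those inputs unchanged.

```python
from datetime import date

# Hypothetical sentinel standing in for pandas.NaT.
NAT = object()


def resolve_dates(as_of_date, knowledge_date):
    """Return (as_of_date, knowledge_date, apply_date_filter)."""
    # A missing knowledge_date falls back to as_of_date when that is set.
    if knowledge_date is NAT and as_of_date is not NAT:
        knowledge_date = as_of_date
    # With neither date available, the query skips date filtering.
    apply_date_filter = not (as_of_date is NAT and knowledge_date is NAT)
    return as_of_date, knowledge_date, apply_date_filter


day = date(2015, 6, 1)
print(resolve_dates(day, NAT))
print(resolve_dates(NAT, NAT))
```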