The write_data methods invokes the relevant AssetDBWriter subclass
to write data to the database. update_asset_finder is no longer
a relevant method since the AssetFinder is strictly a reader class.
- Add an `ascending=True` keyword to `rank()`.
- Add `top(N)` and `bottom(N)` methods to Factor. These return Filters
that pass the top and bottom N elements each day.
- Add a slightly faster path for rank(method='ordinal'). I had
originally thought the fast path was 2-3x faster because I had my
benchmark data axes flipped. The actual speedup is only 5-10%, which
means it probably wasn't worth the effort to Cythonize...but we have a
slightly faster version now so we might as well use it.
- Refactor test_filter and test_factor to make it easier to implement
and test transformations on factors. These tests now subclass
BaseFFCTestCase, which provides facilities for passing a dict of terms
and an "initial_workspace", the values for which are used by
SimpleFFCEngine rather than needing to manually manage the inputs and
outputs of each term.
- Previously it was returning a DataFrame because of how we applied an &
with a DataFrame mask. The error was masked by the fact that
`np.assert_array_equal` coerces inputs to arrays before comparing.
- Added `zp.utils.test_utils.check_arrays`, which checks type equality
before calling `np.assert_array_equal`.
This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation. It contains:
A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs. Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`. Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.
A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.
New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`. Computed
filter results control which assets are available in the factor matrix
on each day.
Test sources are now defined by the sim_params period_start and period_end, rather than by the period_start and a defined 'count' of bars. This allows us to consider the sim_params.period_end as the canonical definition of the end of a simulation.
The correct thing to look at to figure out where the root of the
zipline tree is, is `zipline.__file__`, not `zipline.__path__`. The
latter could contain multiple directories in it, and is not intended
to be `os.path.join()`ed as the previous code was doing.
Rather than drop files temporarily into the master security lists
directory during unit tests, create temporary directories for the
tests. This avoids issues when the tests are being run at the same
time as other code that uses the real security lists data.
Remove pieces that are no longer used now that the simple transforms are
wrappers around history via the SIDData object.
Move window length related pieces into batch_transform, since the rest
of the utils module is no longer used.
Previously the class SerializeableZiplineObject was used to
house basic __setstate__ and __getstate__ methods. It wasn't
really doing much that was helpful, so it is now gone.
Some unit tests for test_tradingcalendar failed on 2015-03-01, because the
addition of 365 days put the end date at 2016-02-29; when the replaces
the year on that date it fails because there is no 2017-02-29.
Instead use relativedelta with a year argument which accounts for leap
years.
Fixes the following test failure:
```
======================================================================
ERROR: test_day_after_thanksgiving (tests.test_tradingcalendar.TestTradingCalendar)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./tests/test_tradingcalendar.py", line 211, in test_day_after_thanksgiving
tradingcalendar.end.replace(year=tradingcalendar.end.year + 1)
File "tslib.pyx", line 297, in pandas.tslib.Timestamp.replace (pandas/tslib.c:7325)
ValueError: day is out of range for month
----------------------------------------------------------------------
Ran 1 test in 0.001s
```
Limited use of `pandas` data structures in both `HistoryContainer` and
`RollingPanel`. Where possible, methods were amended to return raw
`ndarrays` with the indexing logic done separately. This allows us to
cut down the number of times pandas objects are created both as returns
and intermediate values. The separation of indexing from data access
allowed us to minimize the times we’d make use of pandas indexes.
This required that that certain methods like `NDFrame.ffill` be replaced
with versions that work with `ndarrays`. Some of this was done via
straight numpy methods and others by access pandas internal
machinery. Outside of allowing us to use faster ndarrays, many of these
function provided speedups over their pandas counterparts as we didn’t
require the extra features like handling multiple dtypes. i.e. np.isnan
is faster than pd.isnull, but only works with certain dtypes.
and added a new test case
was not iterating over lookup date directory names, and
therefore mising all by one list of stocks.
discovered because of differing sort orders between
my local machine, other devs, and travis ci.
getting filled with the wrong datetimes and causing errors.
Updates the logic for addressing missing datetimes and adds unit tests
for the 2 main cases (no missing datetimes, and some missing datetimes).
script.
Adds an option kwarg to TradingAlgorithm named 'algo_filename' that
represents the file where the algoscript came from (if any). The
run_algo.py script will pass this argument with the value passed to the
'-f' flag. The default name is '<string>' to represent that the script
is coming from a string in python and not a file. This matches the
behavior of exec and the python convention for compiling code objects.
makes it an offset from 13:30 UTC.
This is to be more consistent with the market_close, which is an offset
from 20:00 UTC.
This also makes market_open and market_close cache the dt to offset from
for each day.
Previously, all specs had to be pre-allocated by using the 'add_history'
function. This is now no longer required and instead serves as a hint to
the HistoryContainer to pre-allocate the space for the given spec.
History can grow by increasing the length for a frequency, adding a
frequency, or adding a field. It can grow with any combination of
these.
HistoryContainer now is aware of the data_frequency of the algorithm,
and no longer uses the daily_at_midnight flag; instead, this is the
default behavior.