This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation. It contains:
A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs. Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`. Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.
A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.
New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`. Computed
filter results control which assets are available in the factor matrix
on each day.
Attack the startup bottleneck of creating the asset finders caches for a
large universe, which was between 1-2 seconds on development and
production machines.
Instead, allow the AssetFinder to be passed a sqlite3 file that has
already been populated and then hydrate asset objects only when an
equity is referenced for the first time.
To create aforementioned sqlite3, create an AssetFinder with an db_path
and `create_table` set to True. If `create_table` is set to False, the
prepopulated data in the sqlite file found at db_path will be used.
Default behavior is to use an in memory database.
Behavior that changes:
- Fuzzy lookup now only works on one character, that character needs to be
specified at write/metadata consumption time, since the fuzzy lookup key
is created by dropping the character from each symbol.
- Overwriting partially written metadata is no longer
supported. i.e. some unit tests allowed for inserting just the identifier,
and then later updating the symbol, end_date, etc.
Instead of building an upsert behavior at this time, this patch
changes the unit tests so that the data for each asset is only
inserted once.
Other notes:
- populate_cache is now removed, since there is no longer a two step
process of inserting metadata and then realizing that metadata into
assets. _spawn_asset is rolled into insert_metadata, so that a call to
insert_metadata both converts the metadata and makes it available in
the data store.
The lookup of future contract by individual symbol is a constraint on
incoming changes of changing how the asset finder stores data.
(i.e. the asset finder is changing so that there are separate tables for
both futures and equities.)
Since this lookup is not yet fully supported, we can add it back in on
top of the new asset finder.
Test sources are now defined by the sim_params period_start and period_end, rather than by the period_start and a defined 'count' of bars. This allows us to consider the sim_params.period_end as the canonical definition of the end of a simulation.
Remove pieces that are no longer used now that the simple transforms are
wrappers around history via the SIDData object.
Move window length related pieces into batch_transform, since the rest
of the utils module is no longer used.
Currently, `order_percent()` and `order_target_percent()` both operate as a percentage of `self.portfolio.portfolio_value`. This PR lets them operate as percentages of other important MVs.
(also adds `context.get_market_value()`, which enables this functionality)
For example:
```python
order_percent('AAPL', 0.5)
order_percent('AAPL', 0.5, percent_of='cash')
order_target_percent('MSFT', 0.1, percent_of='shorts')
tech_stocks = ('AAPL', 'MSFT', 'GOOGL')
tech_filter = lambda p: p.sid in tech_stocks
for stock in tech_stocks:
order_target_percent(stock, 1/3, percent_of_fn=tech_filter)
```
- NotHalfDay only worked at midnight
- week_(start|end) were actually month_(start|end)
- Removes check_args from api.
- Default offset of 30mins for market_(open|close)
schedule_function takes a date rule, a time rule, and a function and
will call the function, passing context and data only when the two rules
fire. This allows for code that is conditional to the datetime of the
algo.
This is implemented internally with `Event` objects which are pairings
of `EventRule`s and callbacks.
handle_data becomes a special event with a rule that always fires. This
makes the logic for handling events more complete and compact.
The initialization of perf_tracker had been moved from __init__
in TradingAlgorithm to _create_generator. This caused perf_tracker
to not be ready when portfolio requested it. portfolio was consequently
not ready for access in init. portfolio can now be accessed in init
again, assuming valid sim_params are passed. Otherwise it will be
available in handle_data() after _create_generator() is called.
Created a new flag in TradingAlgorithm that enables subclasses to
decide if they want to handle setting self.initialized = True.
Before it was the responsibility of an overriding subclass to set
initalized = True. This was causing problems because it's easy to
forget this. Now it is the responsibility of TradingAlgorithm
unless explicity stated otherwise.
Previously, calling order() in initalize resulted in a weird
stack trace. It now returns a well formulated error that is
readable to the user through the API. Adding a slippage
kwarg to test_algorithm and simfactor was necessary because
slippage can only be called during init. Previously initaliazed
was never set to true and calls to init-only function were sprinkled
around the code in non-init sections. Code changes were to enforce
init-only rules.
There were sevaral places you could supply sim_params
in TradingAlgorithm (__init__, run). This got confusing
as its not clear who updated what and which one was the
correct one to use at each time.
Then there were to ways to define data_frequency, one in
__init__() and one in the sim_params which also added code
complexity.
This refactor makes it explicit that sim_params are to be
passed to __init__() only. Moreover, data_frequency is
only stored in sim_params. For backwards compatibility,
it can still be supplied separately but will link to
the one in sim_params.
For example, you could create new sim params via:
sim_params = create_simulation_parameters(data_frequency='minute')
algo = MyAlgo(sim_params)
algo.run(data)
In addition, perf_tracker only gets initialized in one place:
_create_generator() which should also make the various ways
of running an algorithm more deterministic.
This also fixes a bug with SimulationParameters where
you could not change the period_start. Unfortunately, the
current implementation still requieres an implicit call to
update the internal variables.
Filter out empty lists from `get_open_orders` so that we have consistent
behavior between the case where a user has never placed an order and the case
where the user has placed an order but it has been executed or cancelled.
A nice side-effect, which was the impetus for this change, is that you can
check if you have any open orders by doing:
```
len(get_open_orders()) == 0
```
Also adds a test for the behavior of `get_open_orders`, which was previously
lacking.
Adds four new methods to the Zipline API that can be used as circuit-breakers
to interrupt the execution of an algorithm. The API methods are:
`set_max_position_size`
`set_max_order_size`
`set_max_order_count`
`set_long_only`
Internally, these methods are implemented by each registering a TradingControl
callback object with the TradingAlgorithm. During
TradingAlgorithm.__validate_order_params (and thus before any side-effects of
the order call occur), each callback's `validate` method is called with
information about the order to be placed and the algorithm's current state,
raising an exception if the callback detects that an error condition has been breached.
Add a test case in test_algorithms to verify that appropriate exceptions are
thrown if an algorithm makes a call to the order api with a stop/limit price
and a style.
Add `style` parameter to order_value, order_percent, order_target,
order_target_percent, and order_target_value methods. The style parameter is
forwarded to the underlying call to `order`.
Adds a test algorithm that tries to buy with very high limit prices/very low
stop prices and tries to sell with very low limit prices/very high stop prices.