Reduce the attribute-access overhead of grabbing process_event and
process_trade from the algorithm object and from the perf_tracker or
blotter, by assigning those functions to a variable once per snapshot.
For more discussion see https://github.com/quantopian/zipline/pull/485.
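
As a rough illustration of the idea (not the actual zipline code), the sketch below binds a method to a local name once per snapshot instead of looking it up on every event. The `Tracker` class and `handle_snapshot` function are hypothetical stand-ins.

```python
class Tracker(object):
    """Hypothetical stand-in for an object such as the perf tracker."""

    def __init__(self):
        self.count = 0

    def process_event(self, event):
        self.count += 1


def handle_snapshot(tracker, snapshot):
    # Look the bound method up once per snapshot ...
    process_event = tracker.process_event
    for event in snapshot:
        # ... so the loop body avoids a repeated attribute lookup.
        process_event(event)


tracker = Tracker()
handle_snapshot(tracker, range(5))
```

The saving per call is small, so this only pays off in loops that run once per event.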
Basically, 1000 was just a number that was supposed to be high enough when no volume was available. It turns out that number is actually very low, so we are increasing it so that volume restrictions should not matter. 1e9 shares ought to be enough for anybody ;).
Thanks to @jlowin for pointing that out.
Currently, `order_percent()` and `order_target_percent()` both operate as a percentage of `self.portfolio.portfolio_value`. This PR lets them operate as percentages of other important market values.
(It also adds `context.get_market_value()`, which enables this functionality.)
For example:
```python
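# Default: percentage of total portfolio value.
order_percent('AAPL', 0.5)
# Percentage of available cash.
order_percent('AAPL', 0.5, percent_of='cash')
# Target percentage relative to short positions.
order_target_percent('MSFT', 0.1, percent_of='shorts')

# Target an equal weight within a custom subset of positions.
tech_stocks = ('AAPL', 'MSFT', 'GOOGL')
tech_filter = lambda p: p.sid in tech_stocks
for stock in tech_stocks:
    # 1.0 / 3 avoids integer division under Python 2.
    order_target_percent(stock, 1.0 / 3, percent_of_fn=tech_filter)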
```
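
To make the filter-based sizing concrete, here is a minimal standalone sketch of how a market value restricted by a position filter might be computed. It is not the PR's implementation: the `Position` tuple and the `market_value` and `target_value` helpers are hypothetical names introduced only for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified position record (not zipline's Position class).
Position = namedtuple('Position', ['sid', 'amount', 'last_sale_price'])


def market_value(positions, position_filter=None):
    """Sum amount * price over the positions accepted by the filter."""
    if position_filter is None:
        position_filter = lambda p: True
    return sum(p.amount * p.last_sale_price
               for p in positions if position_filter(p))


def target_value(positions, percent, position_filter=None):
    """Dollar value corresponding to `percent` of the filtered market value."""
    return percent * market_value(positions, position_filter)


positions = [
    Position('AAPL', 10, 100.0),
    Position('MSFT', 20, 50.0),
    Position('XOM', 5, 80.0),
]
tech_stocks = ('AAPL', 'MSFT', 'GOOGL')
tech_filter = lambda p: p.sid in tech_stocks

tech_value = market_value(positions, tech_filter)
half_of_tech = target_value(positions, 0.5, tech_filter)
```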
Limited the use of `pandas` data structures in both `HistoryContainer`
and `RollingPanel`. Where possible, methods were amended to return raw
`ndarrays`, with the indexing logic done separately. This allows us to
cut down the number of times pandas objects are created, both as return
values and as intermediate values. The separation of indexing from data
access allowed us to minimize the times we’d make use of pandas indexes.
This required that certain methods like `NDFrame.ffill` be replaced
with versions that work with `ndarrays`. Some of this was done via
straight numpy methods and some by accessing pandas internal
machinery. Besides allowing us to use faster ndarrays, many of these
functions provided speedups over their pandas counterparts, as we didn’t
require extra features like handling multiple dtypes. E.g. np.isnan
is faster than pd.isnull, but only works with certain dtypes.
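
As an illustration of the "straight numpy" route, the sketch below forward-fills NaNs down the columns of a 2-D float array using `np.isnan` and `np.maximum.accumulate`. The helper name `ffill_2d` is hypothetical, and this is one common way to do it, not necessarily the replacement used in this change.

```python
import numpy as np


def ffill_2d(values):
    """Forward-fill NaNs down each column of a 2-D float array."""
    values = np.asarray(values, dtype=float)
    mask = np.isnan(values)
    # Row number of each valid cell; 0 where the cell is NaN.
    rows = np.arange(values.shape[0])[:, None]
    row_idx = np.where(~mask, rows, 0)
    # The running maximum gives the row of the most recent valid cell.
    row_idx = np.maximum.accumulate(row_idx, axis=0)
    cols = np.arange(values.shape[1])
    return values[row_idx, cols]


data = np.array([[1.0, np.nan],
                 [np.nan, 2.0],
                 [3.0, np.nan]])
filled = ffill_2d(data)
```

Leading NaNs in a column have nothing to fill from, so they stay NaN.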
The code was not iterating over lookup date directory names, and was
therefore missing all but one list of stocks.
Discovered because of differing sort orders between
my local machine, other devs' machines, and Travis CI.
Also added a new test case.
Alleviates a bottleneck caused by re-indexing into a pd.Series during a
tight loop, by keeping track of the index value into the underlying
`.values` in a lookup table.
Based on suggestion from @dalejung
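
A minimal sketch of the lookup-table idea, assuming a Series with unique labels: positions are computed once, and the loop then reads from the raw `.values` array instead of indexing the Series. The names `position_of` and `fast_get` are hypothetical.

```python
import pandas as pd

prices = pd.Series([10.0, 20.0, 30.0], index=['AAPL', 'MSFT', 'GOOGL'])

# Build the label -> integer position table once, outside the loop.
values = prices.values
position_of = {label: i for i, label in enumerate(prices.index)}


def fast_get(label):
    # Plain dict lookup plus ndarray indexing; no pandas indexing machinery.
    return values[position_of[label]]


total = 0.0
for label in ['MSFT', 'AAPL', 'MSFT']:
    total += fast_get(label)
```

The table has to be rebuilt whenever the index changes, so this fits best where the index is stable across many lookups.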
Risk calculations are robust to NaNs, except for
beta, which calls numpy with the complete list of
algorithm_returns. If NaNs are present, the result
of the covariance calculation will be NaN.
This is fixed by filtering out NaNs in
algorithm_returns.
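
The following sketch shows the kind of filtering described, under the assumption that beta is the covariance of algorithm and benchmark returns divided by the benchmark variance. The `beta` helper is hypothetical, and dropping the matching benchmark entries to keep the two arrays aligned is also an assumption, not something the message above states.

```python
import numpy as np


def beta(algorithm_returns, benchmark_returns):
    """Beta computed only over periods where the algorithm return is not NaN."""
    algo = np.asarray(algorithm_returns, dtype=float)
    bench = np.asarray(benchmark_returns, dtype=float)
    # Filter NaNs in the algorithm returns; keep the benchmark aligned.
    valid = ~np.isnan(algo)
    algo, bench = algo[valid], bench[valid]
    if len(algo) < 2:
        return np.nan
    cov = np.cov(algo, bench, ddof=1)
    # cov[0, 1] is cov(algo, bench); cov[1, 1] is var(bench).
    return cov[0, 1] / cov[1, 1]


bench_returns = [0.01, 0.02, -0.01, 0.03]
algo_returns = [0.02, np.nan, -0.02, 0.06]

unfiltered_cov = np.cov(algo_returns, bench_returns)[0, 1]
filtered_beta = beta(algo_returns, bench_returns)
```

Without the filter, the single NaN propagates through `np.cov`; with it, the remaining three periods give a finite beta.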