Instead of having separate ExchangeCalendar and TradingSchedule objects, we
now just have TradingCalendar. The TradingCalendar keeps track of each
session (defined as a contiguous set of minutes between an open and a close).
It's also responsible for handling the grouping logic of any given minute
to its containing session, or the next/previous session if it's not a market
minute for the given calendar.
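The grouping of a minute to a session can be sketched roughly as below. This is only an illustration and not the TradingCalendar implementation: the `opens`/`closes` arrays, the `session_position` helper, and its `direction` argument are hypothetical names, and the timestamps are timezone-naive for brevity.

```python
import pandas as pd

# Hypothetical session bounds: one open and one close per session.
opens = pd.DatetimeIndex(["2016-01-04 14:31", "2016-01-05 14:31"])
closes = pd.DatetimeIndex(["2016-01-04 21:00", "2016-01-05 21:00"])


def session_position(minute, direction="next"):
    """Return the position of the session for `minute` (illustrative only)."""
    # Position of the first session whose close is at or after the minute.
    idx = int(closes.searchsorted(minute, side="left"))
    if idx < len(closes) and opens[idx] <= minute:
        # The minute lies between an open and a close: a market minute.
        return idx
    if direction == "next":
        if idx == len(opens):
            raise ValueError("no session after %s" % minute)
        return idx
    if direction == "previous":
        if idx == 0:
            raise ValueError("no session before %s" % minute)
        return idx - 1
    raise ValueError("direction must be 'next' or 'previous'")


in_session = session_position(pd.Timestamp("2016-01-04 15:00"))
after_close_next = session_position(pd.Timestamp("2016-01-04 22:00"), "next")
after_close_prev = session_position(pd.Timestamp("2016-01-04 22:00"), "previous")
```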
This commit removes the ability to reference a shared TradingEnvironment through the zipline.finance.trading module. Instead, the classes that require a TradingEnvironment, or its child AssetFinder, contain their own references to those objects.
This commit also adds serialization utilities that allow for the pickling/unpickling of objects without unintentionally serializing their TradingEnvironments or AssetFinders.
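One common way to keep such references out of a pickle is to filter them in `__getstate__`. The sketch below shows that general pattern only; it is not the utility added by this commit, and the `AlgoState` class and its attribute names are hypothetical.

```python
import pickle


class AlgoState:
    """Hypothetical object holding references to heavyweight collaborators."""

    # Attribute names that should never be written into a pickle.
    _not_pickled = ("env", "asset_finder")

    def __init__(self, name, env=None, asset_finder=None):
        self.name = name
        self.env = env
        self.asset_finder = asset_finder

    def __getstate__(self):
        # Copy the instance dict and drop the excluded references.
        state = self.__dict__.copy()
        for key in self._not_pickled:
            state.pop(key, None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # The caller is expected to re-attach these after unpickling.
        self.env = None
        self.asset_finder = None


# A lambda cannot be pickled, so this only works because `env` is excluded.
original = AlgoState("example", env=lambda: None, asset_finder=object())
restored = pickle.loads(pickle.dumps(original))
```

After the round trip, `restored.name` is preserved while `restored.env` and `restored.asset_finder` are `None` until the caller supplies them again.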
The minutely calculation of risk metrics was removed in a previous
patch; remove the vestigial references.
Remove a test which tested the behavior of updating the second minute of
a day.
Remove the logic that changed the datetime index of the risk metrics
depending on emission rate; now only trading_days are needed.
Remove `returns_frequency` parameter since both minute and daily
data frequency always use daily returns.
Instead of using the pandas.Series datetime index for every single
vector, get the index at the beginning of the update loop based on the
dt and then use that index to set the values.
Also, since the dt lookup is no longer needed, store the values as numpy
arrays, which are more lightweight.
Locally, this patch cuts out about 60% of the time spent in the update
method.
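A minimal sketch of the pattern described above, assuming a fixed index of trading days. The names `trading_days`, `update`, and the two example vectors are illustrative and are not the actual attributes of the risk classes.

```python
import numpy as np
import pandas as pd

trading_days = pd.date_range("2014-01-02", periods=5, freq="B")

# Preallocated, NaN-filled arrays instead of one pd.Series per metric.
algorithm_returns = np.full(len(trading_days), np.nan)
benchmark_returns = np.full(len(trading_days), np.nan)


def update(dt, algo_return, bench_return):
    # One datetime lookup at the start of the update...
    i = trading_days.get_loc(dt)
    # ...then plain positional writes into each array.
    algorithm_returns[i] = algo_return
    benchmark_returns[i] = bench_return
    return i


pos = update(trading_days[2], 0.01, 0.005)
```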
The calculations that are expected to change are:
- cumulative.beta
- cumulative.alpha
- cumulative.information
- cumulative.sharpe
- period.sortino
* Explanation of how risk calculations are changing
** Risk Fixes for Both Period and Cumulative
*** Downside Risk
Use sample instead of population for standard deviation.
Add a rounding factor, so that if the two values are close for a given
dt, they do not count as a downside value, which would throw off
the denominator of the standard deviation of the downside diffs.
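The two changes can be sketched as follows. This is a simplified illustration rather than the code in the risk module: the `downside_risk` name, the `decimals=8` rounding precision, and the omission of annualization are all assumptions.

```python
import numpy as np


def downside_risk(returns, minimum_acceptable_return, decimals=8):
    """Sample standard deviation of the returns that fall below the target."""
    returns = np.asarray(returns, dtype=float)
    # Round the differences so that near-equal values become zero and are
    # not counted as downside observations.
    diffs = np.round(returns - minimum_acceptable_return, decimals)
    downside = diffs[diffs < 0]
    if downside.size < 2:
        # A sample standard deviation needs at least two observations.
        return 0.0
    return float(np.std(downside, ddof=1))


# The last return is only 1e-12 below the target, so it is ignored.
example = downside_risk([0.01, -0.02, -0.04, -1e-12], 0.0)
```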
*** Standard Deviation Type
Across the board the standard deviation has been standardized to using
a 'sample' calculation, whereas before cumulative risk was mostly using
'population'. Using `ddof=1` with `np.std` calculates as if the values
are a sample.
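For reference, the difference between the two calculations on a small, made-up vector of returns:

```python
import numpy as np

daily_returns = np.array([0.01, -0.02, 0.03, 0.00])
n = len(daily_returns)

population_std = np.std(daily_returns)      # default ddof=0, divides by N
sample_std = np.std(daily_returns, ddof=1)  # divides by N - 1

# The two differ by a factor of sqrt(N / (N - 1)).
ratio = sample_std / population_std
```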
** Cumulative Risk Fixes
*** Beta
Use the daily algorithm returns and benchmarks instead of annualized
mean returns.
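A sketch of beta computed from the daily return vectors, using the textbook definition of covariance over benchmark variance. The function name and the zero fallbacks are assumptions, not necessarily what the risk module does.

```python
import numpy as np


def beta(algorithm_returns, benchmark_returns):
    """Covariance of daily returns divided by benchmark variance."""
    algo = np.asarray(algorithm_returns, dtype=float)
    bench = np.asarray(benchmark_returns, dtype=float)
    if algo.size < 2:
        # Not enough observations for a sample covariance.
        return 0.0
    cov = np.cov(algo, bench, ddof=1)  # 2x2 sample covariance matrix
    benchmark_variance = cov[1, 1]
    if benchmark_variance == 0:
        return 0.0
    return float(cov[0, 1] / benchmark_variance)


bench = np.array([0.01, -0.02, 0.03, 0.005])
algo = 2.0 * bench  # an algorithm that moves exactly twice the benchmark
example_beta = beta(algo, bench)
```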
*** Volatility
Use sample instead of population with standard deviation.
The volatility is an input to other calculations so this change affects
Sharpe and Information ratio calculations.
*** Information Ratio
The benchmark returns input is changed from annualized benchmark returns
to the annualized mean returns.
*** Alpha
The benchmark returns input is changed from annualized benchmark returns
to the annualized mean returns.
** Period Risk Fixes
*** Sortino
Use the downside risk of the daily return vs. the mean algorithm returns
for the minimum acceptable return instead of the treasury return.
The above required adding the calculation of the mean algorithm returns
for period risk.
Also, use algorithm_period_returns and treasury_period_return as the
cumulative Sortino does, instead of using algorithm returns for both
inputs into the Sortino calculation.
* Other Supporting Changes
** answer_key
Add new mappings for downside risk and Sortino as well as
re-address the index mappings because of changes to the answer key
spreadsheet.
** test_risk_cumulative
Change the decimal precision to expect higher precision.
The calculations are now more aligned with the answer key, so we can
expect higher precision. In particular now that the standard deviation
type matches everywhere in both the Python implementation and the answer
sheet, the precision of the first value no longer has to be glossed over.
** test_events_through_risk
Change the results which are used as a canary for risk changes,
since we do expect Sharpe to change with this change.
Change the algorithm volatility test to use the same iterkv style
as the rest of the suite, as it was useful to be able to zero in
on the offending date when debugging changes to the risk module.
The risk unit tests were using the public Yahoo! data instead
of the returns from the answer key spreadsheet. Change the RiskPeriods
created in tests to use the values in the benchmark returns
column of the answer key.
Also, change the spreadsheet's benchmark volatility calculation
to use sample.
The use of population was exposed when the input values were
corrected.
The input into max drawdown was incorrect, causing the bad results,
i.e. the `compounded_log_returns` were not values representative of
the algorithm's total return at a given time, though
`calculate_max_drawdown` was treating the values as if they were.
Instead, use the `algorithm_period_returns` series, which does provide
the total return.
Update risk answer key with an Excel calculation of max drawdown
to help corroborate the calculations.
Also, remove `compounded_log_returns` (which actually had stopped
being the `compounded_log_returns` at some point), since the max
drawdown was the only calculation using the values in that series.
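To illustrate why the input must represent total return, here is one way to compute max drawdown from a series of cumulative returns. This is a sketch and not the body of `calculate_max_drawdown`; reporting the drawdown as a positive fraction and starting from a portfolio value of 1.0 are assumptions.

```python
import numpy as np


def max_drawdown(cumulative_returns):
    """Largest peak-to-trough decline, as a positive fraction."""
    # cumulative_returns[i] is the total return since the start, so
    # 1 + return is the portfolio value relative to a starting value of 1.
    returns = np.asarray(cumulative_returns, dtype=float)
    values = np.concatenate(([1.0], 1.0 + returns))
    running_peak = np.maximum.accumulate(values)
    drawdowns = 1.0 - values / running_peak
    return float(drawdowns.max())


# Peak at +10%, trough at -1%: the decline is 1 - 0.99 / 1.10.
example_drawdown = max_drawdown([0.0, 0.10, -0.01, 0.05])
```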
Python 3 requires submodules to have more explicit pathing, so use
the dot syntax to declare submodules which are in the same directory
as another module.
Remove the lists of DailyReturn objects in favor of using pd.Series
to store the return values.
Should make it easier to inspect the values when stepping through,
make the windowing of data to a certain range more facile by using,
and have some performance increases due to removing object creation
and member access.
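As an illustration of the windowing point, a date-indexed pd.Series can be cut to a range with label-based slicing. The variable names and values below are made up for the example.

```python
import pandas as pd

days = pd.date_range("2014-01-02", periods=6, freq="B")
algorithm_returns = pd.Series(
    [0.01, -0.02, 0.005, 0.0, 0.015, -0.01], index=days
)

# Label-based slicing includes both endpoints.
window = algorithm_returns.loc[days[1]:days[3]]
```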
So that the units match the other risk calculations, also
use annualized returns for beta and alpha.
Update answer key to match values calculated on the first day.
Also, update performance tracker test so that the returns used
are fractional instead of > 1, so that the annualized numbers are
more in line with real world values.
This could perhaps be labelled BUG, as well.
Change the Sharpe (and algorithm volatility) value used to compare
algorithms/backtests so that it is annualized and uses daily returns.
Previously, the Sharpe metric was using the same calculation style
as the fixed size periods, i.e. 3 Month, 6 Month, etc., which can
use the geometric mean when comparing against the risk free.
Change the Sharpe calculation to use the arithmetic mean difference
against the risk free rate, using daily (non-compounded) values.
Also, use annualized mean returns.
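A sketch of an annualized Sharpe ratio built from daily, non-compounded returns. The function name, the 252-day year, the use of a daily risk-free rate, and the sample standard deviation are assumptions made for illustration; the risk module may handle the risk-free input differently.

```python
import numpy as np

TRADING_DAYS_PER_YEAR = 252


def annualized_sharpe(daily_returns, daily_risk_free=0.0):
    """Annualized arithmetic mean excess return over annualized volatility."""
    returns = np.asarray(daily_returns, dtype=float)
    if returns.size < 2:
        return 0.0
    excess = returns - daily_risk_free
    # Arithmetic mean of daily excess returns, scaled to a year.
    annualized_mean_excess = excess.mean() * TRADING_DAYS_PER_YEAR
    # Daily volatility scaled by the square root of time.
    annualized_volatility = np.std(returns, ddof=1) * np.sqrt(TRADING_DAYS_PER_YEAR)
    if annualized_volatility == 0:
        return 0.0
    return float(annualized_mean_excess / annualized_volatility)


sample_returns = np.array([0.01, -0.02, 0.03, 0.00])
example_sharpe = annualized_sharpe(sample_returns)
```

With both numerator and denominator annualized, the net effect is the daily ratio multiplied by sqrt(252).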
Indexes to risk answers were pointing to a previous version.
Also, provide the risk cumulative answers as a pd.Series,
so that it is easier to compare to values produced by risk class.
Correct the annualization factor from being 1/sqrt(252), since
the annualization was applied to the volatility, by including
252 in the Sharpe's numerator.
Currently, just provide a way to render some of the data extracted.
Intended to have more thorough documentation of the spreadsheet,
explaining derivation/calculations in each sheet and column.
Add indexes for Sharpe, returns and other values needed for
reading answers for cumulative risk metrics.
Prepare for unit test and matching change of implementation.
Refactor the reading of values from the Excel spreadsheet so that
parsers are configurable by index.
Needed so that we can parse columns that have dates, in addition
to floats as previously.
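The idea of per-index parsers can be sketched as a mapping from column index to a parsing function. Everything here is hypothetical, including the column layout and the use of ISO date strings; the real answer key reads cell values from the Excel spreadsheet.

```python
from datetime import date, datetime


def parse_date(value):
    # Assumes ISO-formatted strings purely for this illustration.
    return datetime.strptime(value, "%Y-%m-%d").date()


# Hypothetical layout: column 0 holds dates, everything else holds floats.
PARSERS = {0: parse_date}


def parse_row(row, parsers=PARSERS, default=float):
    """Parse each cell with the parser registered for its column index."""
    return [parsers.get(i, default)(cell) for i, cell in enumerate(row)]


parsed = parse_row(["2006-01-03", "0.5", "1"])
```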
Instead of using the indexes defined in the answer key class
to index back into the answer key object, populate the answers
so that they are available as members of the answer key object.
Update period risk test to use new answer key structure.
Also, remove the rounding behavior from the answer sheet, leaving
the rounding to the consumer of the answer key values, so that
the values can be retrieved from the spreadsheet during answer
key __init__ without knowledge of the decimal point that the calling
code expects.
Correspondingly, change period risk tests to use
np.testing.assert_almost_equal when doing floating point comparison.
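For example, with the answer key returning unrounded values, the test chooses the precision at the point of comparison. The values below are made up.

```python
import numpy as np

answer_key_value = 0.123456789  # unrounded value, as read from the sheet
computed_value = 0.1234568      # value produced by the code under test

# Passes: the two values agree to 6 decimal places.
np.testing.assert_almost_equal(computed_value, answer_key_value, decimal=6)
```

Requesting more precision than the values share, such as `decimal=9` here, raises an `AssertionError`.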
Move to a new notebook. The AnswerKeyLink notebook will be for a
permalink to the current version of the answer key, the output of which
won't be too noisy in git.
The annotation notebook will also be kept in source control, but
without output, since the table html output is large.
For an improved viewing experience via nbviewer.ipython.org,
include the output of the notebook.
When saving/updating this file, a fresh kernel and evaluation of
the entire notebook should be used so that the cell numbers stay
in order.
Read tests.risk.answer_key module into an IPython notebook, to start
a base to which answer key values and explanations can be added and
displayed without access to Excel.
For now, the notebook just provides the latest download link for
the spreadsheet.
Update answer key for cumulative risk:
- Annualization of Sharpe
- Use 10 year period
- Use of daily returns vectors instead of compounded return scalar.
No tests or risk module code are currently reading off of this sheet,
but developing it ahead of work in risk module so that the sheet can
be examined and vetted.