Currently, this just provides a way to render some of the extracted data.
The intent is to add more thorough documentation of the spreadsheet,
explaining the derivation/calculations in each sheet and column.
Add indexes for Sharpe, returns and other values needed for
reading answers for cumulative risk metrics.
Prepare for unit test and matching change of implementation.
Refactor the reading of values from the Excel spreadsheet so that
parsers are configurable by index.
Needed so that we can parse columns that have dates, in addition
to floats as previously.
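A minimal sketch of what "parsers configurable by index" could look like. The `PARSERS` mapping, the `parse_row` helper, and the string-valued cells are hypothetical names and data chosen for illustration; they are not taken from the answer key module, and real spreadsheet cells may arrive in a different raw form.

```python
from datetime import date, datetime

# Hypothetical layout: column index -> callable that parses the raw cell.
PARSERS = {
    0: lambda raw: datetime.strptime(raw, "%Y-%m-%d").date(),  # date column
    1: float,                                                  # returns column
    2: float,                                                  # Sharpe column
}


def parse_row(row, parsers=PARSERS):
    """Apply the parser registered for each column index to its raw cell."""
    return [parsers[i](cell) for i, cell in enumerate(row)]


# Assumed raw values; a real sheet reader may yield other types.
row = ["2013-01-02", "0.0125", "1.5"]
parsed = parse_row(row)
```

Keying the parsers by index keeps the float-only path unchanged while letting a date column opt into its own parser.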
Instead of using the indexes defined in the answer key class
to index back into the answer key object, populate the answers
so that they are available as members of the answer key object.
Update period risk test to use new answer key structure.
Also, remove the rounding behavior from the answer sheet, leaving
the rounding to the consumer of the answer key values, so that
the values can be retrieved from the spreadsheet during answer
key __init__ without knowledge of the decimal precision that the calling
code expects.
Correspondingly, change period risk tests to use
np.testing.assert_almost_equal when doing floating point comparison.
Move to a new notebook. The AnswerKeyLink notebook will serve as a permalink
to the current version of the answer key, and its output
won't be too noisy in git.
The annotation notebook will also be kept in source control, but
without output, since the table html output is large.
For an improved viewing experience via nbviewer.ipython.org,
include the output of the notebook.
When saving/updating this file, a fresh kernel and evaluation of
the entire notebook should be used so that the cell numbers stay
in order.
Read the tests.risk.answer_key module into an IPython notebook, to start
a base to which answer key values and explanations can be added and
displayed without access to Excel.
For now, the notebook just provides the latest download link for
the spreadsheet.
Update answer key for cumulative risk:
- Annualization of Sharpe
- Use 10 year period
- Use of daily returns vectors instead of compounded return scalar.
No tests or risk module code currently read from this sheet,
but it is being developed ahead of work in the risk module so that the
sheet can be examined and vetted.
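For readers vetting the sheet, a rough sketch of an annualized Sharpe ratio computed from a vector of daily returns. The formula details are assumptions for illustration only (252 trading days per year, a zero risk-free rate, sample standard deviation); the spreadsheet may define these differently, and `annualized_sharpe` is a hypothetical name.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumed annualization factor


def annualized_sharpe(daily_returns, days_per_year=TRADING_DAYS_PER_YEAR):
    """Mean over sample standard deviation of daily returns, annualized.

    Assumes a zero risk-free rate, which the answer key may not.
    """
    mean = statistics.mean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return mean / stdev * math.sqrt(days_per_year)


# Made-up daily returns, used only to exercise the function.
daily_returns = [0.01, -0.005, 0.002, 0.007]
sharpe = annualized_sharpe(daily_returns)
```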
Instead of using a pandas Series with dictionaries as the
values for the treasury curves, use a DataFrame, which more naturally fits
a timeseries with multiple values per date.
This should allow easier slicing/manipulation of the treasury curves,
e.g. getting the 10 year curve would now be:
```
treasury_curves['10year']
```
Previously, benchmark returns on the first day were set
to 0. This commit instead calculates the first-day benchmark
return from open to close.
According to @eherbert this is also what the answer key does.
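As a sketch of the open-to-close calculation described above: the function name and the prices are hypothetical, and the simple (non-log) return is an assumption about which return definition is meant.

```python
def open_to_close_return(open_price, close_price):
    """Simple return earned between the open and the close of one day."""
    return (close_price - open_price) / open_price


# Made-up first-day benchmark prices.
first_day_return = open_to_close_return(100.0, 101.5)
```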
zipline.__version__ is now present. Closes #94.
Moreover, git master should have a .dev version string according
to convention. Releases then get the .dev label removed.
Also remove test that compares risk metrics batch to iterative,
since the 'iterative' calculations, replaced by the cumulative
calculations, will intentionally drift from the results in the risk
report due to annualization and other factors.
Work towards having separate calculations for the fixed periods versus
the cumulative/headline risk metrics.
Different submodules for each type should help make the calculation
types distinct and easier to find.
In anticipation of splitting apart the different risk classes
into their own submodules, a distinct risk module should help
organize those new classes.
For consistency, datetimes returned by the trading calendar should
always show HHMMSS of midnight UTC. Not only is this useful for
consistency, but it also allows us to check if a particular date() is
in an array of these datetimes, because they will hash to the same
thing. For example:
    early_closes = get_early_closes()
    # ... later ...
    if current_bar_datetime.date() in early_closes:
        # ... today closes early ...
If the datetimes returned by the trading calendar functions don't
have 00:00:00 for HHMMSS, then the "in" check above will fail because
the date and the datetimes in early_closes won't hash to the same
thing.
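A self-contained sketch of why the midnight convention matters. It uses plain standard-library datetimes rather than the trading calendar's own return type, so the bar is normalized to midnight UTC instead of calling `.date()`; `to_midnight_utc` and the dates are hypothetical, and the bar is assumed to be UTC-aware already.

```python
from datetime import datetime, timezone


def to_midnight_utc(dt):
    """Zero out the time of day on a datetime that is already in UTC."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)


# Calendar entries stored with 00:00:00 UTC, as the convention requires.
early_closes = {
    datetime(2013, 7, 3, tzinfo=timezone.utc),
    datetime(2013, 11, 29, tzinfo=timezone.utc),
}

current_bar_datetime = datetime(2013, 7, 3, 17, 0, tzinfo=timezone.utc)

# The raw bar time does not match any calendar entry...
raw_lookup = current_bar_datetime in early_closes
# ...but the normalized value hashes to the stored midnight datetime.
closes_early = to_midnight_utc(current_bar_datetime) in early_closes
```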
If a stock stops getting updated values, e.g. if a stock rolls out
of a universe strategy, the underlying batch transform
for TALib may currently contain nans (which is another issue that could be
addressed). The nans cause crashes when passed to some TALib functions,
e.g. Bollinger Bands are incompatible with all-nan values.
So, drop sids that only have nan values for the current data panel.
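The filtering step can be sketched as follows. This uses a plain dict of lists as a stand-in for the data panel, and `drop_all_nan_sids` and the sids are hypothetical; the actual change operates on the batch transform's pandas data.

```python
import math


def drop_all_nan_sids(prices_by_sid):
    """Keep only the sids that have at least one non-nan value."""
    return {
        sid: values
        for sid, values in prices_by_sid.items()
        if not all(math.isnan(v) for v in values)
    }


nan = float("nan")
# Made-up window: the second sid stopped receiving updates.
window = {
    24: [10.0, 10.5, nan],
    8554: [nan, nan, nan],
}
cleaned = drop_all_nan_sids(window)
```

A sid with some nans is kept; only sids whose entire window is nan are dropped.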
Since these modules are not requirements, make the name clearer
about the distinction. Especially so that build scripts do not pick
up this file when including wildcards with a requirements prefix.
The defaultdict behavior was allowing both algo code and
TradingAlgorithm wrappers to add unintended keys.
Remove use of defaultdict in favor of a dictionary whose values are
explicitly added in tradesimulation; otherwise, allow a KeyError
if the bar is indexed with a sid that doesn't exist.
Also, when iterating over the keys in the data bar, only return
those keys that have pricing data.
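A rough sketch of the behavior described above. The `BarData` class here is a hypothetical stand-in written for illustration, and the use of a `"price"` field to detect pricing data is an assumption, not the actual implementation.

```python
class BarData(object):
    """Hypothetical data bar backed by a plain dict instead of defaultdict."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, sid, value):
        # Values are added explicitly, e.g. by the simulation loop.
        self._data[sid] = value

    def __getitem__(self, sid):
        # Unknown sids raise KeyError rather than silently creating a key.
        return self._data[sid]

    def __contains__(self, sid):
        return sid in self._data

    def __iter__(self):
        # Only yield sids that carry pricing data (assumed "price" field).
        return (sid for sid, bar in self._data.items() if "price" in bar)


data = BarData()
data[24] = {"price": 10.0}
data[8554] = {}  # present, but without pricing data

try:
    data[1]
    missing_raises = False
except KeyError:
    missing_raises = True
```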
The deepcopy of events into the EventWindow's ticks was causing
a significant increase in memory consumption, e.g. for an algorithm with
almost 200 sids and 14 vwaps, removing the deepcopy reduces the amount
of memory consumed by about 40%.
The downside is that if an event's properties are changed later on, which
is not advised, then the signal derived from vwap etc.
may be changed.
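The tradeoff can be illustrated with a small sketch. The event dict and the two lists are hypothetical stand-ins for the EventWindow's ticks, not the actual data structures.

```python
import copy

# Made-up event, standing in for what gets appended to the window's ticks.
event = {"sid": 24, "price": 10.0, "volume": 100}

ticks_by_reference = [event]           # new behavior: no copy, less memory
ticks_by_copy = [copy.deepcopy(event)]  # old behavior: independent copy

# Mutating the event afterwards (not advised) leaks into the shared tick.
event["price"] = 99.0

shared_price = ticks_by_reference[0]["price"]
copied_price = ticks_by_copy[0]["price"]
```

The referenced tick sees the later mutation while the deep copy does not, which is the downside accepted in exchange for the memory savings.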
For maintainer use; requires AWS credentials for the account where
the `zipline-test-data` bucket is hosted.
The script performs the following steps, which used to be manual:
- Create a key name based on the md5 of the answer key file.
- Upload the answer key to the S3 bucket.
- Make the file publicly downloadable over HTTP.
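The first of the steps above can be sketched with the standard library alone. The key layout (`prefix/<md5><suffix>`), the `answer_key_name` helper, and the sample bytes are assumptions for illustration; the upload and permission steps are left out because they need credentials and an S3 client.

```python
import hashlib


def answer_key_name(content, prefix="risk", suffix=".xlsx"):
    """Build a key name from the md5 hex digest of the file's bytes.

    The prefix/suffix layout is hypothetical, not the script's actual scheme.
    """
    digest = hashlib.md5(content).hexdigest()
    return "{0}/{1}{2}".format(prefix, digest, suffix)


# Placeholder bytes; the real script would read the answer key file.
content = b"example answer key bytes"
key_name = answer_key_name(content)
```

Deriving the name from the content hash means an unchanged file maps to the same key, while any edit produces a new one.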