Use date sorted sources instead of sorting with the second
argument of Event, etc., since `heapq.merge` compares the second
part of the tuple, which requires a richer set of comparison
methods that would only be used in the test context.
Use `date_sorted_sources` instead, so that sorting is done on algo time
and source id.
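
A minimal sketch of the idea, not the actual `date_sorted_sources`
implementation: each event is decorated with a (dt, source id) key, so
`heapq.merge` never needs to compare the events themselves. The `Event`
stand-in and the helper names are hypothetical.

```python
import heapq
from datetime import datetime


class Event(object):
    """Hypothetical stand-in that defines no comparison methods."""

    def __init__(self, dt, source_id):
        self.dt = dt
        self.source_id = source_id


def with_sort_key(source):
    # Decorate each event with the (dt, source_id) key used for ordering.
    for event in source:
        yield (event.dt, event.source_id), event


def merge_by_dt_and_source(*sources):
    # Each source must already be sorted by dt. Source ids are assumed to
    # be unique per source, so the keys of two different sources never tie
    # and the Event objects are never compared.
    decorated = [with_sort_key(source) for source in sources]
    for _key, event in heapq.merge(*decorated):
        yield event
```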
Python 3 removes the `.message` attribute of exceptions, so use `str` instead.
Also, the divide by zero message has changed slightly between versions,
so just check for the exception type, instead of also checking the message.
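
As a rough illustration (the helper name is hypothetical), the message
can be read with `str` on both versions, while the check itself only
looks at the exception type:

```python
def capture_error(func):
    # Run func and return the exception type and its message as text.
    try:
        func()
    except Exception as exc:
        # str(exc) works on Python 2 and 3; exc.message is Python 2 only.
        return type(exc), str(exc)
    return None, None


exc_type, message = capture_error(lambda: 1 / 0)
# The wording of the message differs between versions, so compare types.
assert exc_type is ZeroDivisionError
```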
six does not provide the rename of walk, so detect it by catching
the import error.
Also, callback behavior changes slightly between the two versions,
so instead iterate over the walked files and directly call what was
formerly a callback as a function.
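
A sketch of the iteration half only, assuming `os.walk` (available on
both versions) as the walker; the import fallback is not shown.
`check_file` is a hypothetical stand-in for the former callback.

```python
import os
import tempfile


def check_file(dirname, filename):
    # Hypothetical former callback, now called as a plain function.
    return os.path.join(dirname, filename)


def visit_files(root):
    # Iterate over the walked files instead of passing a callback.
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            results.append(check_file(dirpath, filename))
    return results
```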
Python 3 uses the `__next__` method instead of `next`,
and uses the syntax of `next(foo)` accordingly.
Add `__next__` and `next` side-by-side so both Python 2 and 3 have
a method that can be used during iteration.
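
For example, a toy iterator (hypothetical, not from the code base) that
works under both versions by aliasing the two names:

```python
class Countdown(object):
    """Toy iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 iteration protocol.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 iteration protocol looks up `next` instead.
    next = __next__
```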
Python 3 requires submodules to have more explicit pathing, so use
the dot syntax to declare submodules which are in the same directory
as another module.
Use the six module to import functions and types that are
consistent between Python 2 and 3, so that one code base can
support both versions.
- Use integer_types instead of int and long.
- Use string_types instead of basestring.
- Account for iteritems, itervalues, iterkeys.
- Use six.moves for filter, zip and reduce.
- Use compatible bytes for md5 hasher.
- xrange and range
- Use `print()` function for all print calls
- Fix strip and format calls that were on the outside of the
print function for some reason, which was breaking in Python 3
because print returns None.
- Remove commented out print calls.
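
To illustrate the strip/format fix (the function name and message are
made up for the example): the string is fully built first, and only
then handed to `print()`.

```python
def format_line(name, value):
    # Call strip/format on the string itself, not on the result of print().
    return "{0}: {1}".format(name.strip(), value)


# Broken under Python 3, because print() returns None:
#     print("{0}: {1}").format(name, value)
print(format_line("  sharpe ", 1.5))
```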
Note that the calendar test is decorated with @nottest (as per the other calendar test functions). I've run the test to confirm the calendar works. The differences between the env (Yahoo Finance of GSPTSE) and the calendar are illustrated in the tradingcalendar_tse file and are confirmed to be errors on Yahoo Finance's part.
A bug in the create_random_simulation_parameters allows the period
start to be a non-trading day.
That bug was causing the commission tests to randomly fail, e.g.
when the period start was on Good Friday, because the commission was
created on hour three of Good Friday, instead of the next Monday.
When that case is hit, the test commission is never processed.
Defend against that bug by using the first open of the simulation
parameters, which is more reliably during market hours,
when creating the test commission.
This is in place of fixing the bug in the random parameters function
or making the parameters non-random, which are other potential fixes.
Changes to trading calendar and environments for supporting market
minutes, etc. have made the non-NYSE stock exchange support lag.
Disabling the test, with the intent of bringing support back up to
parity with NYSE.
So that we can more clearly demarcate each case of buy/sell and
price compared to stop, and their expected outputs.
Also, add comment about the current behavior versus the behavior
that will be moved to in an upcoming fix.
Remove the lists of DailyReturn objects in favor of using pd.Series
to store the return values.
Should make it easier to inspect the values when stepping through,
make windowing the data to a certain range easier,
and give some performance increases by removing object creation
and member access.
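
A small sketch of the windowing this enables, assuming pandas is
available; the dates and return values are made up for illustration.

```python
import pandas as pd

# Daily returns indexed by date, in place of a list of DailyReturn objects.
dates = pd.date_range("2013-01-02", periods=5, freq="D")
returns = pd.Series([0.01, -0.02, 0.005, 0.0, 0.015], index=dates)

# Windowing to a date range is a label slice, inclusive on both ends.
window = returns.loc["2013-01-03":"2013-01-05"]
```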
These tests use the random simulation parameters, which is leading
to an intermittent failure.
We may want to consider removing the randomness, but in the meantime
the randomness is exposing a case where the cost basis is not the value
expected, so log the sim parameter values to help track down which
parameters cause the failure.
So that the units match the other risk calculations, also
use annualized returns for beta and alpha.
Update answer key to match values calculated on the first day.
Also, update performance tracker test so that the returns used
are fractional instead of > 1, so that the annualized numbers are
more in line with real world values.
This could perhaps be labelled BUG, as well.
Change the Sharpe (and algorithm volatility) value used to compare
algorithms/backtests so that it is annualized and uses daily returns.
Previously, the Sharpe metric was using the same calculation style
as the fixed size periods, i.e. 3 Month, 6 Month, etc., which can
use the geometric mean when comparing against the risk free rate.
Change the Sharpe calculation to use the arithmetic mean difference
against the risk free rate, using daily (non-compounded) values.
Also, use annualized mean returns.
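
A minimal sketch of the calculation described above, not the actual
risk module code. The 252 trading days per year and the use of the
sample standard deviation are assumptions, as are the names.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_sharpe(daily_returns, daily_risk_free):
    n = len(daily_returns)
    # Arithmetic mean of the daily (non-compounded) excess returns.
    excess = [r - rf for r, rf in zip(daily_returns, daily_risk_free)]
    mean_excess = sum(excess) / n
    # Sample standard deviation of the daily returns (assumption: ddof=1).
    mean_return = sum(daily_returns) / n
    variance = sum((r - mean_return) ** 2 for r in daily_returns) / (n - 1)
    annual_volatility = math.sqrt(variance) * math.sqrt(TRADING_DAYS)
    # Annualize the mean excess return as well as the volatility.
    return (mean_excess * TRADING_DAYS) / annual_volatility
```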
Most of the functions in date_utils can be done via pandas.
The other functions are no longer used for loading, etc. so remove
the date_utils module to reduce the total surface area of Zipline core.
Continue on path of converting values stored inside of risk metrics
to use a DataFrame instead of storing multiple lists.
Also, the need for latest_dt in getting the current volatility for
the Sharpe calculation shows that we need to set latest_dt at
the beginning of the update loop.
Indexes to risk answers were pointing to a previous version.
Also, provide the risk cumulative answers as a pd.Series,
so that it is easier to compare to values produced by risk class.
to account for minimum price variation.
On an order to buy, between .05 below to .95 above a penny, use that penny.
On an order to sell, between .05 above to .95 below a penny, use that penny.
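
One way to express the two rules above, as a sketch rather than the
actual implementation. It uses `Decimal` to avoid float noise, and how
the exact boundary values are handled is an assumption.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR

PENNY = Decimal("0.01")
SLACK = Decimal("0.0005")  # .05 of a penny


def round_order_price(price, is_buy):
    price = Decimal(str(price))
    if is_buy:
        # Buy: from .05 of a penny below up to .95 above maps to that penny.
        pennies = ((price + SLACK) / PENNY).to_integral_value(rounding=ROUND_FLOOR)
    else:
        # Sell: from .95 of a penny below up to .05 above maps to that penny.
        pennies = ((price - SLACK) / PENNY).to_integral_value(rounding=ROUND_CEILING)
    return pennies * PENNY
```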
Eventually, all cumulative metrics (alpha, beta, etc.) will be
stored in the same DataFrame.
For easier tracking of dt to values during debugging; there should be
some performance gains as well.
Correct the annualization factor, which was effectively 1/sqrt(252)
because the annualization was applied only to the volatility, by
including 252 in the Sharpe's numerator.
Currently, just provide a way to render some of the data extracted.
The intent is to have more thorough documentation of the spreadsheet,
explaining derivation/calculations in each sheet and column.
Add indexes for Sharpe, returns and other values needed for
reading answers for cumulative risk metrics.
Prepare for unit test and matching change of implementation.
Refactor the reading of values from the Excel spreadsheet so that
parsers are configurable by index.
Needed so that we can parse columns that have dates, in addition
to floats as previously.
Instead of using the indexes defined in the answer key class
to index back into the answer key object, populate the answers
so that they are available as members of the answer key object.
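
Roughly, the shape of that change could look like the sketch below.
The class name, the index mapping, and the row layout are all
hypothetical and only illustrate populating members from indexes.

```python
class AnswerKeySketch(object):
    # Hypothetical mapping of answer name to spreadsheet column index.
    INDEXES = {"RETURNS": 0, "SHARPE": 1}

    def __init__(self, rows):
        # Populate each answer as a member, so callers do not need to
        # index back into the answer key with INDEXES themselves.
        for name, index in self.INDEXES.items():
            setattr(self, name, [row[index] for row in rows])
```

Callers would then read `key.RETURNS` rather than looking up an index first.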
Update period risk test to use new answer key structure.
Also, remove the rounding behavior from the answer sheet, leaving
the rounding to the consumer of the answer key values, so that
the values can be retrieved from the spreadsheet during answer
key __init__ without knowledge of the decimal point that the calling
code expects.
Correspondingly, change period risk tests to use
np.testing.assert_almost_equal when doing floating point comparison.
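
For illustration, assuming numpy is available: the answer key value
stays unrounded and the calling test picks the precision. The helper
name and the numbers are made up.

```python
import numpy as np


def matches(actual, expected, decimal):
    # The consumer of the answer key chooses how many decimals matter.
    try:
        np.testing.assert_almost_equal(actual, expected, decimal=decimal)
    except AssertionError:
        return False
    return True


answer_key_value = 0.123456789  # hypothetical unrounded spreadsheet value
computed_value = 0.1234568      # hypothetical value from the risk module
```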
Move to a new notebook; the AnswerKeyLink will be for a permalink
to the current version of the answer key, the output of which
won't be too noisy in git.
The annotation notebook will also be kept in source control, but
without output, since the table html output is large.