Now that the tradesimulation loop has changed to use benchmarks
as a 'clock', the logic for setting the current time can be grouped
together at the beginning of each iteration instead of the date
and snapshot grouping.
Also, remove snapshot_dt and use simulation_dt instead, since the
two variables were keeping track of the same value.
It is also no longer necessary to peek into the data to get the first
simulation_dt, now that simulation_dt is set at the beginning of each
loop iteration.
The check to filter out orders for zero shares wasn't truncating the
number of shares to an integer before checking, so a fractional
amount less than 1 wasn't being filtered out even though it should
have been. This is now fixed.
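A minimal sketch of the truncate-then-check idea described above. The
helper name `is_zero_share_order` is hypothetical and is not taken from
the codebase; it only illustrates why truncation has to happen before
the comparison.

```python
def is_zero_share_order(amount):
    """Return True when the order amounts to zero whole shares.

    Hypothetical helper: int() truncates toward zero, so fractional
    amounts with magnitude below 1 (e.g. 0.4 or -0.9) become 0.
    """
    return int(amount) == 0


def filter_orders(amounts):
    """Keep only the amounts that represent at least one whole share."""
    return [a for a in amounts if not is_zero_share_order(a)]


if __name__ == "__main__":
    # Without the int() truncation, 0.4 != 0 and would slip through.
    print(filter_orders([0.4, 1.9, -0.9, 10, 0]))
```

Comparing the raw amount against zero is what let fractional orders
through; truncating first makes the check match the number of shares
that would actually be ordered.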
The override is meant to filter out symbols not in the universe,
but it was returning false positives.
To remove the false positives, after the contains check passes,
also ensure that the key exists in the _data member.
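A sketch of the two-step membership check, assuming a container that
holds a `_data` dict plus an optional universe predicate. The class name
`UniverseData` and the `universe_func` parameter are hypothetical and
only stand in for the real objects.

```python
class UniverseData(object):
    """Hypothetical container with a universe-aware `in` check."""

    def __init__(self, data, universe_func=None):
        self._data = dict(data)
        # Assumed predicate: returns True if a symbol is in the universe.
        self._universe_func = universe_func

    def __contains__(self, name):
        # Step 1: the universe filter may reject the symbol outright.
        if self._universe_func is not None and not self._universe_func(name):
            return False
        # Step 2: a passing filter is not enough; the key must also exist
        # in _data, otherwise `name in obj` would be a false positive.
        return name in self._data

    def __getitem__(self, name):
        return self._data[name]
```

With this ordering, a symbol that passes the universe filter but has no
entry yet no longer reports as present.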
In the batch_transform we were incrementing the trading_days counter whenever
there was a new day event. With a window_length of 1 and daily bars, this
updates the batch_transform on the first day, which is correct. With minute
bars, however, it updates on the first minute bar of the day, which is not.
This is fixed by calculating the market_close explicitly and checking whether
event.dt is on or past it.
I also added a unit test for the correct behavior.
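A sketch of the market-close comparison for minute bars. It assumes
naive exchange-local datetimes and a fixed 16:00 close; both are
illustrative assumptions, and the function name is hypothetical. The
real code may derive the close from a trading calendar.

```python
from datetime import datetime, time

# Assumed close time in exchange-local time (no early closes handled).
MARKET_CLOSE_TIME = time(16, 0)


def is_on_or_past_close(event_dt):
    """Return True if event_dt is at or after that day's market close."""
    market_close = datetime.combine(event_dt.date(), MARKET_CLOSE_TIME)
    return event_dt >= market_close


def count_trading_days(event_dts):
    """Count days only once a bar on or past the close has been seen."""
    counted = set()
    for dt in event_dts:
        if is_on_or_past_close(dt):
            counted.add(dt.date())
    return len(counted)
```

Counting on "new day" would fire on the 09:31 bar; counting on "on or
past the close" waits until the day's data is complete.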
Before the change to the RollingPanel, window_length
specified the number of days that should be in a window.
The previous commit broke this if data was minute resolution.
By passing bar='minute' to the batch_transform we internally
multiply the window_length by 60*6.5 to have a full day.
Also adds a (still rudimentary) test for batch_transform
with minute data.
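The conversion described above amounts to the following arithmetic. The
function name `bars_in_window` is hypothetical; only the `bar='minute'`
argument and the 60*6.5 factor come from the message.

```python
# 6.5 trading hours per day at 60 minutes each.
MINUTES_PER_TRADING_DAY = int(60 * 6.5)


def bars_in_window(window_length, bar='daily'):
    """Translate a window_length given in days into a number of bars."""
    if bar == 'minute':
        return window_length * MINUTES_PER_TRADING_DAY
    if bar == 'daily':
        return window_length
    raise ValueError("bar must be 'daily' or 'minute', got %r" % (bar,))
```

Note that this assumes full 6.5-hour sessions; shortened trading days
would contain fewer minute bars than the factor implies.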
The new example is almost identical to the dual_moving_average one.
However, instead of our in-house mavg transform it uses the new
talib exponential moving average (EMA).
When setting timeperiod in the talib function, 1 is subtracted from it.
We then used this reduced value to set the window_length in the
batch_transform, which therefore did not pass a big enough panel.
Ultimately this caused the talib transforms to always return NaNs.
This also makes the unit test more stringent by explicitly comparing
the output of the wrapped TALib moving average to pandas rolling_mean().
Finally, this also allows passing window_length instead of timeperiod,
so that the same interface as before can be used.
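To illustrate why a panel that is one row too short yields only NaNs,
here is a pure-Python stand-in for a moving average. It is not the
TA-Lib implementation, and the function name is hypothetical; it only
mimics the convention that an output is NaN until a full timeperiod of
inputs is available.

```python
import math


def simple_moving_average(values, timeperiod):
    """Moving average that is NaN until `timeperiod` values are seen."""
    out = []
    for i in range(len(values)):
        if i + 1 < timeperiod:
            # Not enough history yet for a full window.
            out.append(math.nan)
        else:
            window = values[i + 1 - timeperiod:i + 1]
            out.append(sum(window) / timeperiod)
    return out


if __name__ == "__main__":
    timeperiod = 3
    # Panel sized timeperiod - 1: every output is NaN.
    print(simple_moving_average([1.0, 2.0], timeperiod))
    # Panel sized timeperiod: the last output is a real value.
    print(simple_moving_average([1.0, 2.0, 3.0], timeperiod))
```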
BarData should, at least for the time being, be compatible with
existing algorithms that had worked against the prior usage of
an ndict as data, which provided `has_key`.
Of note, the Python language has deprecated `has_key` in favor
of using `in` and `__contains__`.
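A sketch of how a `has_key` shim can delegate to `__contains__`. The
class below is a hypothetical minimal stand-in, not the actual BarData
implementation.

```python
class CompatBarData(object):
    """Hypothetical stand-in showing a backwards-compatible has_key."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __contains__(self, name):
        return name in self._data

    def has_key(self, name):
        # Legacy spelling kept for older algorithms; delegates to `in`
        # so both spellings always agree.
        return name in self

    def __getitem__(self, name):
        return self._data[name]
```

Older algorithms can keep calling `data.has_key(sid)`, while new code
uses `sid in data`.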
The datetime was only being set while updating the universe with the
bar's trade events.
Now that benchmarks are used as a clock, it is possible to have
benchmarks without having trade bars during that dt, so the datetime
should be updated via benchmark as well.
Set the data_frequency member of an algorithm on the sim_params
configuration object.
Though the extra setting is slightly redundant, it is needed to
ensure that the same data_frequency is used throughout.
Should fix a bug where an algo that was intended to be run in minute
mode was operating as if it were daily in performance.
Possible TODO: Remove data_frequency as a param to TradingAlgorithm,
in favor of only being a property of sim_params.
With the benchmark returns marked at midnight, the performance packet
for a day was emitted *before* any events for that day were processed.
Fix by expecting benchmarks marked at the market close, for backtests
that use minute data but emit performance results daily, so that the
benchmark is handled at the end of the day.
TST: Also, add test that exercises the event loop with minutely data,
(with benchmarks that are marked end of day), since that combination
was previously uncovered.
Working towards performance and risk logic being aware of
data frequency, as different handling of order of events based
on the data frequency is needed.
Backing out slice vs. valid(), because of an incompatibility with
starting a minutely emitted session mid-day, since the midday start
date is not yet wired through SimulationParameters.
The slicing syntax is more explicit about declaring:
'get all returns up until the current dt'.
Also, protects against NaNs that occur before the current dt
being silently ignored.
i.e. the *_returns_cont series *should* have values from start
to current dt, but the .valid() call was occluding a bug where
it wasn't.
The leading date of the date range was never called with update,
because in the main loop the todays_date variable was
incremented before update was called.
Fix by moving the increment to the next trading day to after the
call to update.
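The ordering fix can be sketched as below. The function name
`run_updates` is hypothetical, and stepping by one calendar day is a
simplifying assumption; the real loop advances to the next trading day.

```python
from datetime import date, timedelta


def run_updates(start, end, update):
    """Call update for every date from start to end, inclusive."""
    todays_date = start
    while todays_date <= end:
        # Update first, so the leading date of the range is included.
        update(todays_date)
        # Only then advance (assumed here: plain calendar-day steps).
        todays_date += timedelta(days=1)


if __name__ == "__main__":
    seen = []
    run_updates(date(2013, 1, 2), date(2013, 1, 4), seen.append)
    print(seen)
```

If the increment came before the call, `update` would first be invoked
with the second date and the leading date would be skipped.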
scipy build was getting too heavyweight for the Travis-CI build.
Removes test_examples from Travis-CI run, since the examples
depend on scipy and other compiled libraries.
Please, still run test_examples before submitting patches.
So that stepping through a debugger is a little easier, with
respect to having easy access to the algorithm object, and seeing
which step in `self.gen` the interpreter is currently at.
Both risk and performance now calculate performance since inception
(cumulative) and since the open. Both periods are updated intraday
and both are reported.
Batch risk for periods starting after the end of the treasury curve
history now uses the most recent curve.
It is critical that trade events come last, so that the perf tracker's
position information is updated with both the transaction and
the trade for last_sale.
Otherwise, the first transaction would be recorded with a last_sale
of 0.
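One way to express "trade events last" is a sort key that ranks event
types within the same dt. The dict-based events, the type names, and the
priority table below are hypothetical and only illustrate the ordering,
not the actual event classes.

```python
# Assumed priorities: lower values are processed first within a dt.
EVENT_PRIORITY = {'TRANSACTION': 0, 'TRADE': 1}


def order_events(events):
    """Sort by dt, then place trade events after transactions."""
    return sorted(
        events,
        key=lambda e: (e['dt'], EVENT_PRIORITY[e['type']]),
    )


if __name__ == "__main__":
    events = [
        {'dt': 1, 'type': 'TRADE'},
        {'dt': 1, 'type': 'TRANSACTION'},
    ]
    for event in order_events(events):
        print(event['dt'], event['type'])
```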
With @fawce
Work towards running an algorithm against 'live' data, which can't
be bound to the available benchmarks and treasuries, since the
benchmarks and treasury curves for that day won't be published
until that night.