Keeps TALib available, while smoothing out the ability to
run tests despite some issues that bear investigating.
- Ignore MAVP during tests.
- Temporarily use a "regular" member instead of __doc__ string.
(TODO: look into using `type` to generate the class)
- During tests wait until a window exists.
In the previous implementation of batch transform, a window_length
of `0` caused the transform to update on every bar. For the time being
that behavior should be retained, though the new rolling implementation
aligns more correctly with the term 'period', so a period of 1 would
achieve the same effect.
When moving BatchTransform off of EventWindow as a base object,
the checking of window length was lost. Restore that check using
the same function as EventWindow.
The use of np.allclose introduced a severe performance penalty,
caused by the creation of two `np.array`s for each check.
Instead create and use a similar check which maintains tolerance
to floating point rounding, but operates only on scalars.
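A minimal sketch of such a scalar check, assuming the same comparison that
`np.allclose` documents (`|a - b| <= atol + rtol * |b|`). The function name
and the default tolerances are illustrative and not taken from the change
itself:

```python
def tolerant_equals(a, b, atol=1e-8, rtol=1e-5):
    """Scalar-only closeness check; no arrays are allocated.

    Hypothetical helper: the name and defaults are assumptions.
    """
    return abs(a - b) <= atol + rtol * abs(b)


# Floating point rounding is tolerated, real differences are not.
print(tolerant_equals(0.1 + 0.2, 0.3))  # True
print(tolerant_equals(1.0, 1.1))        # False
```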
Instead of creating a list of benchmarks in the risk module,
stream benchmarks through the system as events, starting from the
algorithm generator.
Works towards more easily setting arbitrary pricing data as
a benchmark, as well as towards live minutely benchmarks.
The batch_transform decorator wipes out the doc string of the function
it wraps. Decorate the creator with functools.wraps to preserve function
metadata.
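A rough illustration of the fix, using a toy decorator in place of the real
one. Only the `functools.wraps` call reflects the change described above; the
body of `create` and the `mean_price` function are placeholders invented for
the sketch:

```python
import functools


def batch_transform(func):
    """Toy stand-in for the decorator; only metadata handling is shown."""

    @functools.wraps(func)  # copies __name__, __doc__, etc. from func
    def create(*args, **kwargs):
        # The real creator would build a transform object; this sketch
        # simply calls through so that it stays runnable.
        return func(*args, **kwargs)

    return create


@batch_transform
def mean_price(prices):
    """Return the mean of a list of prices."""
    return sum(prices) / len(prices)


print(mean_price.__name__)  # mean_price
print(mean_price.__doc__)   # Return the mean of a list of prices.
```

Without the `functools.wraps` line, both prints would describe `create`
instead of the wrapped function.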
- perf modified to let non-performance related events flow through.
- changes to support streaming non-trading data through batch transforms
and for mixing in sids with just custom data.
- allowing CUSTOM events to flow through to transforms.
- Added logic to maintain pre-specified sid filter.
- repeated calls with the same data window do not update batch transform
windows.
- repeated calls with the same data and same supplemental parameters do
not update batch transform results
- repeated calls with the same data and different supplemental params
do update batch transform results
- removed use_panel
- default for refresh_period is now 0
- refresh_period will only affect the recreation of the datapanel
- user's transform method is invoked on every call to batch transform
- the underlying deque shouldn't be modified, so forwarding it to the
user function is a bit misleading
- if we want to provide a deque we should consider another class
or an EventWindow decorator.
Adds a check to see if the s_squared value is near 0.
When the number was a very small negative floating point value
very near 0, the sqrt threw a 'math domain error'. The check
prevents that case.
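One way such a guard might look, as a sketch only. The helper name and the
tolerance value are assumptions; the source states only that values near 0
are checked before taking the square root:

```python
import math


def safe_sqrt(s_squared, tolerance=1e-12):
    """Hypothetical helper: treat values within `tolerance` of 0 as 0."""
    if abs(s_squared) < tolerance:
        # Rounding error can leave a tiny negative number here, which
        # math.sqrt would reject with "math domain error".
        return 0.0
    return math.sqrt(s_squared)


print(safe_sqrt(-1e-17))  # 0.0
print(safe_sqrt(4.0))     # 2.0
```

A genuinely negative input (beyond the tolerance) still raises, which keeps
real errors visible.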
The call to `Panel.dropna` after the fillna was deleting all values,
if a stock stopped trading mid run and thus provided volume 0.
i.e. if any sid had 0 non-null values the entire panel of frames
would be truncated.
It's possible to avoid the collapse by adding the `how='all'` flag
to `dropna`, however with the current tick based creation of the panel,
the `dropna` with `how='all'` should be functionally equivalent to
not dropping at all.
The dropna has been dropped in favor of leaving the drop to algorithm
code.
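The collapse can be reproduced on a small scale. `Panel` is no longer part of
current pandas, so this sketch uses a `DataFrame` with one column per sid
instead; the sid names and prices are made up:

```python
import numpy as np
import pandas as pd

# Sid "B" has no non-null values at all, e.g. it never traded in the window.
frame = pd.DataFrame({
    "A": [10.0, 11.0, 12.0],
    "B": [np.nan, np.nan, np.nan],
})

# Default how='any': every row contains a NaN, so every row is dropped.
dropped_any = frame.dropna()

# how='all': only rows where every value is NaN are dropped, so none are.
dropped_all = frame.dropna(how="all")

print(len(dropped_any), len(dropped_all))  # 0 3
```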
The recent change to the creation of the data panel ended up with
a panel with the dtype of 'object', which was causing numpy ufuncs
like `log` to crash out on an `AttributeError`.
This forces all frames in the panel to use a dtype of 'float'.
We may want to look at setting a dtype on a frame by frame basis,
e.g. 'volume' may more accurately be 'int'.
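A small sketch of the dtype fix, using a `DataFrame` since `Panel` is no
longer available in current pandas. The column names and values are
illustrative only:

```python
import numpy as np
import pandas as pd

# A frame built from mixed Python objects ends up with dtype 'object'.
frame = pd.DataFrame(
    {"price": [1.0, 2.0], "volume": [100, 200]}, dtype=object
)

# Forcing float lets numpy ufuncs such as np.log operate normally.
as_float = frame.astype(float)
log_prices = np.log(as_float["price"])

# The per-field alternative mentioned above: a dtype for each column.
per_field = frame.astype({"price": float, "volume": int})

print(as_float.dtypes)
print(per_field.dtypes)
```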
Global state for the financial simulation environment is accessed through the
zipline.finance.trading module, which now contains a module variable:
environment.
Parameters are passed into an algorithm as a keyword argument, sim_params.
SimulationParameters creates a trading day index for the test period that
can be used to find trading days, calculate distance between trading days,
and other common operations. The sim params index is just selected from the
global state.
================
Details:
- adding delorean to the requirements.
- made index symbol a parameter for loading the benchmark data. changed
messagepack storage to be symbol specific.
- ported risk, performance, algorithm, transforms, batch transforms
and associated tests to use simulation parameters and global environment
- factory and sim factory use global state and sim params
- factory method parameter names now reflect the class expected
For the case where the window isn't covered by the data streaming
through the simulator.
e.g. in a case where the stocks being iterated over change every
quarter, the supplemental data will fill in the 'gap' missing from
the transform since the 'new' stocks were not streaming before
the beginning of the quarter.
Of note, test cases are covered by internal suites, but this could
use tests with completely mocked data.
So that algorithms that specify market_aware, days and delta
as args can be transitioned to just specifying a window_length kwarg.
Moving towards removing market_aware and delta completely.
The eventual goal is to remove the market_aware and delta kwargs,
but removing the kwarg completely would break the init
method of EventWindow based classes for existing algorithms.
In the meantime, this ensures that market_aware is only ever
set to True and raises an error if it is False.
Also raises exceptions if values are set for delta, which was only
used if market_aware was False.
Also, since the logic for checking window length is changed,
because in market aware mode we should always be checking the
window length, this adds some sanity checking to the window length.
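The argument handling described above might be sketched as follows. The class
name, the exception type, and the specific window length checks (an integer
that is not negative) are assumptions for illustration; the source does not
spell them out:

```python
class EventWindowSketch:
    """Hypothetical stand-in showing only the argument validation."""

    def __init__(self, market_aware=True, window_length=None, delta=None):
        if not market_aware:
            raise ValueError("market_aware=False is no longer supported")
        if delta is not None:
            raise ValueError("delta was only used when market_aware was False")
        # Assumed sanity checks on the window length.
        if not isinstance(window_length, int) or window_length < 0:
            raise ValueError("window_length must be a non-negative integer")
        self.window_length = window_length


def rejects(**kwargs):
    """Return True if the sketch refuses the given keyword arguments."""
    try:
        EventWindowSketch(**kwargs)
    except ValueError:
        return True
    return False


print(rejects(window_length=5))                      # False
print(rejects(market_aware=False, window_length=5))  # True
print(rejects(window_length=5, delta=1))             # True
print(rejects())                                     # True
```

Keeping the old keyword arguments in the signature means existing algorithms
that pass `market_aware=True` continue to construct their windows unchanged.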
The trading day index is all business days in range minus the
non trading days we are already calculating.
Also, uses trading calendar indexes for batch transform, since the
batch transform was the only use of non_trading_days.
Instead of constantly adding and removing holidays to do market
day delta math, uses pandas DatetimeIndex to get the index of the dates
and uses the index difference to calculate market days.
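Both ideas, building the trading day index and measuring distance by index
position, can be sketched with pandas. The date range and the list of
non-trading days below are illustrative, and `market_day_delta` is a
hypothetical helper name:

```python
import pandas as pd

# Illustrative non-trading days within the range.
non_trading_days = pd.DatetimeIndex(["2012-01-02", "2012-01-16"])

# All business days in range, minus the non-trading days.
business_days = pd.bdate_range("2012-01-02", "2012-01-31")
trading_days = business_days.difference(non_trading_days)


def market_day_delta(start, end):
    """Number of trading days between two trading dates, by index position."""
    start_loc = trading_days.get_loc(pd.Timestamp(start))
    end_loc = trading_days.get_loc(pd.Timestamp(end))
    return end_loc - start_loc


print(len(trading_days))                             # 20
print(market_day_delta("2012-01-03", "2012-01-17"))  # 9
```

Note that `get_loc` raises a `KeyError` for a date that is not in the index,
so this sketch only handles dates that are themselves trading days.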
Though the addition of tracking multiple values in the window
is powerful, the changes broke behavior of existing algorithms
by changing method signatures and names.
So temporarily reverting these changes, to be pulled back in when
a way to have the multiple fields tracked with the existing API
is written, or a cutover of the API is figured out and determined.