In the previous implementation of batch transform, a window_length
of `0` caused the transform to update on every bar. For the time
being that behavior should be retained, though the new rolling
implementation aligns more correctly with the term 'period',
so a period of 1 would achieve the same effect.
When BatchTransform was moved off of EventWindow as a base object,
the window length check was lost. Restore that check using
the same function as EventWindow.
Instead of creating a list of benchmarks in the risk module,
stream benchmarks through the system as events, starting from the
algorithm generator.
Works towards more easily setting arbitrary pricing data as
a benchmark, as well as towards live minutely benchmarks.
The batch_transform decorator wipes out the doc string of the function
it wraps. Decorate the creator with functools.wraps to preserve function
metadata.
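A minimal sketch of the fix, assuming a simplified decorator:
`batch_transform_sketch` and `price_spread` are hypothetical names used
only for illustration, and the real decorator builds a BatchTransform
rather than forwarding the call.

```python
import functools


def batch_transform_sketch(func):
    """Hypothetical stand-in for the batch_transform decorator."""

    @functools.wraps(func)  # copies __name__, __doc__, etc. from func
    def creator(*args, **kwargs):
        # Assumption: the real creator would construct a transform object
        # here; this sketch simply forwards the call.
        return func(*args, **kwargs)

    return creator


@batch_transform_sketch
def price_spread(prices):
    """Return the spread between the highest and lowest price."""
    return max(prices) - min(prices)
```

Without the `functools.wraps` line, `price_spread.__doc__` would be the
docstring of `creator` (here, `None`) instead of the wrapped function's.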
- perf modified to let non-performance related events flow through.
- changes to support streaming non-trading data through batch transforms
and for mixing in sids with just custom data.
- allowing CUSTOM events to flow through to transforms.
- Added logic to maintain pre-specified sid filter.
- repeated calls with the same data window do not update batch transform
windows.
- repeated calls with the same data and same supplemental parameters do
not update batch transform results
- repeated calls with the same data and different supplemental params
do update batch transform results
- removed use_panel
- default for refresh_period is now 0
- refresh_period will only affect the recreation of the datapanel
- user's transform method is invoked on every call to batch transform
- the underlying deque shouldn't be modified, so forwarding it to the
user function is a bit misleading
- if we want to provide a deque we should consider another class
or an EventWindow decorator.
The call to `Panel.dropna` after the fillna was deleting all values
if a stock stopped trading mid-run and thus provided a volume of 0,
i.e. if any sid had 0 non-null values, the entire panel of frames
would be truncated.
It's possible to avoid the collapse by adding the `how='all'` flag
to `dropna`; however, with the current tick-based creation of the panel,
`dropna` with `how='all'` should be functionally equivalent to
not dropping at all.
The dropna has been removed in favor of leaving the drop to algorithm
code.
The recent change to the creation of the data panel produced
a panel with a dtype of 'object', which caused numpy ufuncs
like `log` to fail with an `AttributeError`.
This forces all frames in the panel to use a dtype of 'float'.
We may want to look at setting a dtype on a frame-by-frame basis,
e.g. 'volume' may more accurately be 'int'.
Global state for the financial simulation environment is accessed through the
zipline.finance.trading module, which now contains a module variable:
environment.
Parameters are passed into an algorithm as a keyword argument, sim_params.
SimulationParameters creates a trading day index for the test period that
can be used to find trading days, calculate distance between trading days,
and other common operations. The sim params index is just selected from the
global state.
================
Details:
- adding delorean to the requirements.
- made index symbol a parameter for loading the benchmark data. changed
messagepack storage to be symbol specific.
- ported risk, performance, algorithm, transforms, batch transforms
and associated tests to use simulation parameters and global environment
- factory and sim factory use global state and sim params
- factory method parameter names now reflect the class expected
Supplemental data covers the case where the window isn't covered by
the data streaming through the simulator.
For example, when the stocks being iterated over change every
quarter, the supplemental data fills in the 'gap' missing from
the transform, since the 'new' stocks were not streaming before
the beginning of the quarter.
Of note, test cases are covered by internal suites, but this could
use tests with completely mocked data.
The eventual goal is to remove the market_aware and delta kwargs,
but removing the kwargs completely would break the init
method of EventWindow-based classes for existing algorithms.
In the meantime, this ensures that market_aware is only ever
set to True and raises an error if it is False.
It also raises an exception if a value is set for delta, which was
only used if market_aware was False.
Also, since in market-aware mode the window length should always be
checked, the logic for checking window length has changed, and this
adds some sanity checking to the window length.
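The argument rules described above could look roughly like the
following sketch. `check_window_args` is a hypothetical helper, not the
function used by EventWindow, and the specific window length checks
(must be provided, must be an integer, must not be negative) are
assumptions, since the message only says "some sanity checking".

```python
def check_window_args(market_aware=True, window_length=None, delta=None):
    """Hypothetical validator mirroring the rules described above."""
    if not market_aware:
        # market_aware is kept only for backwards compatibility.
        raise ValueError("market_aware=False is no longer supported")
    if delta is not None:
        # delta was only ever used when market_aware was False.
        raise ValueError("delta cannot be set when market_aware is True")

    # Assumed sanity checks on the window length.
    if window_length is None:
        raise ValueError("window_length must be provided")
    if isinstance(window_length, bool) or not isinstance(window_length, int):
        raise TypeError("window_length must be an integer")
    if window_length < 0:
        raise ValueError("window_length must not be negative")
    return window_length
```

Keeping both kwargs in the signature means existing algorithms that
pass `market_aware=True` keep working, while unsupported combinations
fail loudly at construction time.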
The trading day index is all business days in range minus the
non-trading days we are already calculating.
Also, uses trading calendar indexes for batch transform, since the
batch transform was the only use of non_trading_days.
Instead of constantly adding and removing holidays to do market
day delta math, uses pandas DatetimeIndex to get the index of the dates
and uses the index difference to calculate market days.
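The idea of counting market days by index position can be illustrated
with the standard library alone. This sketch deliberately swaps the
pandas DatetimeIndex for a sorted list plus `bisect`; the function
names and the sample holiday are assumptions made for the example.

```python
from bisect import bisect_left
from datetime import date, timedelta


def trading_days(start, end, non_trading_days):
    """Weekdays in [start, end] that are not in non_trading_days."""
    closed = set(non_trading_days)
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5 and current not in closed:
            days.append(current)
        current += timedelta(days=1)
    return days


def position(index, day):
    """Position of a trading day in the sorted index."""
    pos = bisect_left(index, day)
    if pos == len(index) or index[pos] != day:
        raise KeyError(day)
    return pos


def market_day_delta(index, first, second):
    """Trading days between two dates, as a difference of positions."""
    return position(index, second) - position(index, first)


# 2013-07-04 is treated as a market holiday in this example.
index = trading_days(date(2013, 7, 1), date(2013, 7, 12), [date(2013, 7, 4)])
delta = market_day_delta(index, date(2013, 7, 3), date(2013, 7, 8))
```

Here `delta` is 2: the holiday and the weekend between the two dates
never enter the index, so no holiday arithmetic is needed at lookup
time.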
When run over large amounts of data, ndict's gets and sets
become a large bottleneck: around 1/5th of the CPU time is spent
in ndict's __setattr__, __getattr__, etc.
By switching to an object for an event,
we reduce the penalty significantly.
Removes asserts that check for event being an ndict, as well as those
that assume a certain behavior of the __contains__ method for events.
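As a rough illustration of the object-based approach, the sketch below
stores event fields as ordinary instance attributes. The class body is
an assumption for the example, not the actual Event class; in
particular the `__contains__` method is included only as a convenience.

```python
class Event(object):
    """Hypothetical minimal event: fields are plain instance attributes."""

    def __init__(self, initial_values=None):
        if initial_values:
            # Copy the initial fields straight into the instance dict.
            self.__dict__.update(initial_values)

    def __contains__(self, name):
        return name in self.__dict__

    def __repr__(self):
        return "Event({0!r})".format(self.__dict__)


event = Event({"sid": 133, "price": 10.5})
event.volume = 200  # ordinary attribute assignment
```

Because the class defines no custom `__getattr__` or `__setattr__`,
reads and writes go through Python's default attribute machinery
rather than through extra Python-level method calls.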
Previously, keys that mapped to functions would be set as field names.
Attempting to assign the datapanel slot to a function causes an error.
This limits the extracted field names to those that map to an int
or a float.
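The filtering rule can be sketched as follows. `SampleEvent` and
`extract_field_names` are hypothetical names, and the attributes on the
sample event are made up for the example.

```python
class SampleEvent(object):
    """Hypothetical event with numeric, string, and function values."""

    def __init__(self):
        self.price = 10.5
        self.volume = 200
        self.source_id = "test"
        self.to_series = lambda: None  # a key that maps to a function


def extract_field_names(event):
    """Keep only the names whose values are an int or a float."""
    names = []
    for name, value in vars(event).items():
        # Note: bool is a subclass of int, so boolean fields also pass.
        if isinstance(value, (int, float)):
            names.append(name)
    return sorted(names)


fields = extract_field_names(SampleEvent())
```

With this filter `fields` contains only `price` and `volume`, so the
function-valued and string-valued keys never reach the data panel.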