Reduces use of ndict on the path to fixing a memory leak,
so that an ndict instance pulled up with objgraph is more likely
to be one of the uses of ndict that is causing the leak.
Using a class level constant here suffices for the dot access
desired on messages.
Though the addition of tracking multiple values in the window
is powerful, the changes broke the behavior of existing algorithms
by changing method signatures and names.
So these changes are temporarily reverted, to be pulled back in once
there is a way to track multiple fields with the existing API,
or once a cutover of the API has been worked out and agreed on.
Since the position amount and price ndarrays are one dimensional
and use real numbers, we do not need the overhead of the extra
case handling provided by numpy.vdot, which comes at a cost of
performance.
With thanks to @jlowin, for pointing out the better fit of numpy.dot.
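A minimal sketch of the point above, using made-up amounts and prices
(the array names are illustrative, not the module's actual attributes).
For one-dimensional real arrays numpy.dot and numpy.vdot return the same
scalar; vdot additionally flattens its inputs and conjugates the first one.

```python
import numpy as np

# Hypothetical position data: share amounts and last sale prices.
amounts = np.array([100.0, -50.0, 25.0])
last_sale_prices = np.array([10.0, 20.0, 4.0])

# Plain dot product: sum of amount * price over all positions.
positions_value = np.dot(amounts, last_sale_prices)

# vdot agrees here because the inputs are 1-D and real, so its
# flattening and complex-conjugate handling buy us nothing.
same_value = np.vdot(amounts, last_sale_prices)
```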
Gets almost 100x speed up over iterating over the values and
summing up the values in Python.
Farms out the work to numpy and atlas by using the vector dot
product of the amounts and last sale prices.
Adds some wiring of keeping track of an index into the numpy arrays
for each position, so that value can be overwritten as events update
those amounts and sale prices.
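A rough sketch of the index bookkeeping described above. The class name,
attribute names, and the grow-by-doubling strategy are assumptions made
for illustration, not the actual performance module code.

```python
import numpy as np


class PositionValues:
    """Hypothetical holder for per-position amounts and prices."""

    def __init__(self, capacity=4):
        self.amounts = np.zeros(capacity)
        self.last_sale_prices = np.zeros(capacity)
        self.index_of = {}  # sid -> slot in the two arrays

    def update(self, sid, amount, price):
        # Assign the next free slot the first time a sid is seen.
        idx = self.index_of.setdefault(sid, len(self.index_of))
        if idx >= self.amounts.size:
            # Out of room: double the arrays, keeping existing values.
            pad = np.zeros(self.amounts.size)
            self.amounts = np.concatenate([self.amounts, pad])
            self.last_sale_prices = np.concatenate(
                [self.last_sale_prices, pad])
        # Overwrite in place as events update the position.
        self.amounts[idx] = amount
        self.last_sale_prices[idx] = price

    def value(self):
        # Unused slots hold zeros, so they do not affect the sum.
        return np.dot(self.amounts, self.last_sale_prices)


values = PositionValues()
values.update("A", 10, 5.0)
values.update("B", 2, 50.0)
values.update("A", 10, 6.0)  # overwrites the slot already held by "A"
total = values.value()
```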
Instead of doing the rollover by creating a new PerformancePeriod,
introduces a `rollover` method that resets the values that need
to be fresh in a new period, and moves the ending values to starting
values, and leaves positions intact.
This isn't a major runtime improvement in and of itself, but it does
allow us to more easily keep track of position values from period
to period, which other improvements will use.
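A sketch of what such a rollover could look like. The class and its field
names are stand-ins chosen for illustration; the real PerformancePeriod
may track different values.

```python
class Period:
    """Hypothetical stand-in for PerformancePeriod."""

    def __init__(self, starting_value=0.0, starting_cash=0.0):
        self.starting_value = starting_value
        self.ending_value = starting_value
        self.starting_cash = starting_cash
        self.ending_cash = starting_cash
        self.period_cash_flow = 0.0
        self.pnl = 0.0
        self.positions = {}

    def rollover(self):
        # Ending values of the old period start the new one.
        self.starting_value = self.ending_value
        self.starting_cash = self.ending_cash
        # Values that must be fresh in a new period are reset.
        self.period_cash_flow = 0.0
        self.pnl = 0.0
        # self.positions is deliberately left untouched.


period = Period(starting_cash=1000.0)
period.positions["A"] = 10
period.ending_value = 250.0
period.ending_cash = 750.0
period.pnl = 12.5
period.rollover()
```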
Instead of creating a new ndict for each position on every event,
we change the values in the object that held the previous position.
The creation of new objects on each event was incurring too much
overhead.
Changes the position type returned by the performance module.
For improved speed, changes from ndict to a simple Python object,
since setting ndict values is too expensive for the
number of times that positions are returned.
Also changes the containing type of the positions to a dictionary
with __missing__ overridden, instead of the ndict that had that
behavior, to reduce the penalty of using ndicts.
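For illustration, a dict subclass with __missing__ overridden might look
like the sketch below. The Position fields are assumed names, not
necessarily those used by the performance module.

```python
class Position:
    """Plain object; attribute writes are cheap compared to ndict."""

    def __init__(self, sid):
        self.sid = sid
        self.amount = 0
        self.cost_basis = 0.0
        self.last_sale_price = 0.0


class Positions(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent:
        # create an empty position, store it, and return it.
        pos = Position(key)
        self[key] = pos
        return pos


positions = Positions()
positions["A"].amount += 10  # first access creates the position
```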
The creation of a new portfolio ndict on each call of handle_data
was creating a very high performance overhead.
Instead, we use the same portfolio object for each event,
and replace the values contained within.
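The reuse pattern, sketched with hypothetical names (Portfolio, Tracker,
and the three fields are assumptions): one object is allocated up front
and its values are overwritten on every event.

```python
class Portfolio:
    """Hypothetical plain object handed to handle_data."""

    def __init__(self):
        self.cash = 0.0
        self.positions_value = 0.0
        self.portfolio_value = 0.0


class Tracker:
    def __init__(self):
        # Allocated once, instead of once per handle_data call.
        self._portfolio = Portfolio()

    def as_portfolio(self, cash, positions_value):
        portfolio = self._portfolio
        # Replace the contained values; the object itself is reused.
        portfolio.cash = cash
        portfolio.positions_value = positions_value
        portfolio.portfolio_value = cash + positions_value
        return portfolio


tracker = Tracker()
first = tracker.as_portfolio(1000.0, 0.0)
second = tracker.as_portfolio(400.0, 650.0)
```

One consequence of this design is that callers holding a reference from an
earlier event see the updated values, so they should copy anything they
need to keep.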
Gains some performance by using a 'regular' object instead of
an ndict.
Also, directly sets up the values that we return, instead of going
through __core_dict as an intermediate and then removing values.
Taken as a whole, performance.as_portfolio is currently the
biggest bottleneck, so this works on reducing time spent in that function.
Uses heapq.merge to sort input from multiple sources instead of
our own sort module.
Profiling shows heapq.merge is more efficient than our own efforts.
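A small sketch of merging already-sorted sources with heapq.merge. The
event tuples are invented for illustration, and the key argument assumes
Python 3.5 or later.

```python
import heapq
from operator import itemgetter

# Two hypothetical sources, each already sorted by timestamp.
trades = [(1, "trade", "A"), (4, "trade", "B")]
quotes = [(2, "quote", "A"), (3, "quote", "B")]

# heapq.merge lazily yields items in sorted order without loading
# every source into memory or re-sorting the combined stream.
merged = list(heapq.merge(trades, quotes, key=itemgetter(0)))
```

heapq.merge only produces sorted output if each input is itself sorted.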
update_universe is a bottleneck on large data sets.
A large portion of that bottleneck is the call to getitem while
looping over the keys, so this uses update and passes along the
internal __dict__ instead.
Seeing about a 40% improvement.
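The two approaches, sketched side by side. Event is a hypothetical
ndict-like object whose fields live in __dict__; the function names are
made up for the comparison and are not the actual update_universe code.

```python
class Event:
    """Hypothetical ndict-like object backed by __dict__."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __getitem__(self, key):
        return self.__dict__[key]

    def keys(self):
        return self.__dict__.keys()


def update_by_keys(target, event):
    # Before: one Python-level __getitem__ call per key.
    for key in event.keys():
        target[key] = event[key]


def update_by_dict(target, event):
    # After: a single dict.update over the internal __dict__.
    target.update(event.__dict__)


event = Event(sid="A", price=10.5, volume=300)
slow, fast = {}, {}
update_by_keys(slow, event)
update_by_dict(fast, event)
```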