14 Commits

Author SHA1 Message Date
Dale Jung 38e8d5214d PERF: History Perf Enhancements
Limited use of `pandas` data structures in both `HistoryContainer` and
`RollingPanel`. Where possible, methods were amended to return raw
`ndarrays` with the indexing logic done separately. This allows us to
cut down the number of times pandas objects are created both as returns
and intermediate values. The separation of indexing from data access
allowed us to minimize the times we’d make use of pandas indexes.

This required that certain methods like `NDFrame.ffill` be replaced
with versions that work with `ndarrays`. Some of this was done via
straight numpy methods and others by accessing pandas' internal
machinery. Besides allowing us to use faster ndarrays, many of these
functions provided speedups over their pandas counterparts, as we didn’t
require extra features like handling multiple dtypes. For example,
`np.isnan` is faster than `pd.isnull`, but only works with certain dtypes.
2015-02-11 06:25:53 -05:00
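As a rough illustration of replacing `NDFrame.ffill` with an ndarray-only version, the sketch below forward-fills a 2-D float array using plain numpy. The helper name `ffill_2d` is hypothetical and is not taken from the commit; the commit's actual replacement may instead rely on pandas internals.

```python
import numpy as np


def ffill_2d(values):
    """Forward-fill NaNs down each column of a 2-D float array.

    Hypothetical helper for illustration; not the function used in the commit.
    """
    values = np.asarray(values, dtype=float)
    n_rows, n_cols = values.shape
    # For every cell, record its own row number if it holds a value, else 0.
    row_idx = np.where(~np.isnan(values), np.arange(n_rows)[:, None], 0)
    # A running maximum turns that into "row of the most recent valid value".
    np.maximum.accumulate(row_idx, axis=0, out=row_idx)
    # Gather the values from those rows; leading NaNs stay NaN.
    return values[row_idx, np.arange(n_cols)]


example = np.array([[1.0, np.nan],
                    [np.nan, 2.0],
                    [3.0, np.nan]])
filled = ffill_2d(example)
```

Because the input is known to be a float array, `np.isnan` is sufficient here; mixed or object dtypes would need `pd.isnull` instead.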
Joe Jevnik c2aae2e0f4 BUG: rolling panel data became misaligned after extend_back 2014-11-12 16:47:44 -05:00
Joe Jevnik 8df1a49031 BUG: When increasing the length dynamically, the rolling panel was
getting filled with the wrong datetimes and causing errors.

Updates the logic for addressing missing datetimes and adds unit tests
for the 2 main cases (no missing datetimes, and some missing datetimes).
2014-11-11 13:29:57 -05:00
Joe Jevnik f8f7f2fc4c ENH: Allows history to be dynamic and grow the container at runtime.
Previously, all specs had to be pre-allocated by using the 'add_history'
function. This is now no longer required and instead serves as a hint to
the HistoryContainer to pre-allocate the space for the given spec.

History can grow by increasing the length for a frequency, adding a
frequency, or adding a field. It can grow with any combination of
these.

HistoryContainer now is aware of the data_frequency of the algorithm,
and no longer uses the daily_at_midnight flag; instead, this is the
default behavior.
2014-11-03 15:57:44 -05:00
Scott Sanderson 235954d480 DEV: Overhaul core history logic.
Overhaul the core HistoryContainer logic to be more robust to changing
universes.

Major Changes
-------------
* Remove `return_frame` cache.  The original purpose of using
  return_frames was to avoid having to create new DataFrames on each
  iteration of handle_data, but we ended up having to copy the return
  frames anyway because user code could mutate the frames in place.
  Removing the return_frames reduces unnecessary copying, and reduces
  the logic of `get_history` to just forward-filling and concatenating
  two DataFrames.

* Use a `MultiIndex`ed DataFrame to represent
  `last_known_prior_values`.  This makes lookups faster and greatly
  simplifies the logic of adding and dropping sids.

* HistoryContainer no longer attempts to determine its universe based on
  the contents of its internal buffers.  The TradingAlgorithm
  controlling the container is now responsible for explicitly calling
  `add_sids` or `drop_sids` when securities enter or leave the
  algorithm's universe.  These methods, along with the internal
  `_realign` method, provide a clean interface for changing the universe
  of securities managed by the container.

* Refactor index mutation logic in `RollingPanel` into a
  `MutableIndexRollingPanel` subclass.  Maintenance of the old behavior
  is regrettably necessary to support `BatchTransform`.

* Refactor shared logic from `roll` and `get_history` into a single
  `aggregate_ohlcv_panel` method that's responsible for collapsing an
  OHLCV buffer into a frame.
2014-09-29 14:42:57 -04:00
Thomas Wiecki 96bdb22db9 BUG: RollingPanel was not behaving correctly in corner cases.
There were quite a few bugs in certain corner cases. Dropping of obsolete
axes was not working correctly, and rollover could cause obsolete axes
to not drop. The tests are much more stringent now as well.
2014-06-14 21:07:02 +02:00
Scott Sanderson bad4c9a439 ENH: Prep work for supporting '1m' history.
Overhauls `HistoryContainer` in prep for support of more than one frequency.

Major changes:

   - Methods/variables referring to "day" have been renamed/generalized.
     - `current_day_panel` became `buffer_panel`, which is now a `RollingPanel`
     - `prior_day_panel` became a dictionary mapping `Frequency` objects to
       "digest panels", which are instances of `RollingPanel`.

   - Hard-coded daily rollover replaced with a notion of a "current window" for
     each unique frequency managed by the panel.

     - When the end of the current window is reached for a given frequency, we
       compute an aggregate bar (code refers to this as a "digest"), which is
       appended to a panel associated with that frequency.

     - Window rollover dates are managed by a pair of dictionaries,
       `cur_window_starts` and `cur_window_closes`.  The `Frequency` class is
       responsible for computing window bounds based on the open/close of the
       previous window.

   - Semantic change to the `open_price` field: `open_price` now always
     contains the price of the first trade occurring in the given window.
     Previously it contained the price of the first minute in the window,
     returning NaN if the security happened not to trade in the first minute.
2014-06-05 15:25:48 -04:00
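The "digest" computation and the new `open_price` semantics described above can be sketched as follows for a single security. This is a minimal illustration that assumes minutes with no trade are stored as NaN; the function name `digest_bar` and every field name other than `open_price` are hypothetical.

```python
import numpy as np


def digest_bar(opens, highs, lows, closes, volumes):
    """Collapse one window of minute bars into a single aggregate bar.

    Hypothetical sketch: minutes without a trade are assumed to be NaN.
    """
    opens = np.asarray(opens, dtype=float)
    closes = np.asarray(closes, dtype=float)
    has_open = ~np.isnan(opens)
    has_close = ~np.isnan(closes)
    if not has_open.any() or not has_close.any():
        # The security never traded in this window.
        return {"open_price": np.nan, "high": np.nan, "low": np.nan,
                "close_price": np.nan, "volume": 0.0}
    return {
        # First trade in the window, not the first minute of the window.
        "open_price": opens[has_open][0],
        "high": np.nanmax(highs),
        "low": np.nanmin(lows),
        "close_price": closes[has_close][-1],
        "volume": float(np.nansum(volumes)),
    }


# The security does not trade in the first minute of the window.
bar = digest_bar(
    opens=[np.nan, 10.0, 11.0],
    highs=[np.nan, 12.0, 13.0],
    lows=[np.nan, 9.0, 10.0],
    closes=[np.nan, 11.0, 12.0],
    volumes=[0.0, 100.0, 200.0],
)
```

Under the previous semantics the example's `open_price` would have been NaN, because the first minute had no trade.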
Thomas Wiecki b89886297f STY: autopep8 codebase. 2013-08-08 16:46:44 -04:00
Thomas Wiecki a7818f853a DOC: Add note about performance issue when updating. 2013-06-20 19:36:36 -04:00
Thomas Wiecki 102cddb4cb ENH: Use smarter matching for updating RollingPanel. 2013-06-20 19:36:36 -04:00
Thomas Wiecki 236fe92a53 ENH: Make RollingPanel update itself if new fields arrive.
Before we preinitialized the BT's fields and sids.
Thus, no new ones could be added after initialization.
This should be fixed now.
2013-06-20 19:36:22 -04:00
Eddie Hebert aca338c9e5 MAINT: Remove unused NaiveRollingPanel from rolling panel module. 2013-06-20 18:00:13 -04:00
Thomas Wiecki 2be7014d51 ENH: Rewrite of batch_transform to use rolling panel.
- Added unittest to test for newly appearing sids.
- Fixed logic bug where window was only full after
  window_length+1 events got passed.
2013-04-29 15:30:40 -04:00
Wes McKinney c5f4d00bf1 ENH: prototype data structure for managing a rolling datapanel
Manage a rolling window collection of panels
for computation purposes.
2013-04-29 15:19:02 -04:00
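One common way to manage a rolling window without reallocating on every update is to over-allocate a buffer and copy the tail back to the front only when the buffer fills. The sketch below shows that idea on a 2-D ndarray; the class `RollingWindow` and its parameters are hypothetical and are not claimed to match the prototype in this commit.

```python
import numpy as np


class RollingWindow:
    """Fixed-length rolling window over rows, backed by an oversized buffer.

    Hypothetical sketch; names and layout are assumptions for illustration.
    """

    def __init__(self, window, n_cols, cap_multiple=2):
        if window < 1 or cap_multiple < 2:
            raise ValueError("window must be >= 1 and cap_multiple >= 2")
        self.window = window
        self.cap = window * cap_multiple
        self.buffer = np.full((self.cap, n_cols), np.nan)
        self.pos = 0  # index of the next row to write

    def add(self, row):
        if self.pos == self.cap:
            # Buffer is full: move the newest `window` rows to the front.
            self.buffer[:self.window] = self.buffer[-self.window:]
            self.pos = self.window
        self.buffer[self.pos] = row
        self.pos += 1

    def current(self):
        """Return a copy of the most recent rows (at most `window` of them)."""
        start = max(self.pos - self.window, 0)
        return self.buffer[start:self.pos].copy()


rolling = RollingWindow(window=3, n_cols=1)
for i in range(10):
    rolling.add([float(i)])
latest = rolling.current()
```

With `cap_multiple=2`, the copy happens once every `window` appends rather than on every append, which is the main reason to over-allocate.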