Commit Graph

17 Commits

Author SHA1 Message Date
Joe Jevnik bc0b117dc9 MAINT: make the data loading apis more consistent.
Changes BcolzDailyBarWriter to not be an abc, data is passed as an
iterator of (sid, dataframe) pairs to the write method.

Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types, callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.

Adds many new fixtures and updates some existing fixtures to use the new
ones:

WithDefaultDateBounds
  A fixture that provides the suite a START_DATE and END_DATE. This is
  meant to make it easy for other fixtures to synchronize their date
  ranges without depending on eachother in strange ways. For example,
  WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
  both have data for the same dates, so they may use depend on
  WithDefaultDates without forcing a dependency between them.

WithTmpDir, WithInstanceTmpDir
  Provides the suite or individual test case a temporary directory.

WithBcolzDailyBarReader
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzDailyBarWriter.write

WithBcolzDailyBarReaderFromCSVs
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from a
  collection of CSV files and then converted into the bcolz data through
  BcolzDailyBarWriter.write_csvs

WithBcolzMinuteBarReader
  Provides the suite a BcolzMinuteBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzMinuteBarWriter.write

WithAdjustmentReader
  Provides the suite a SQLiteAdjustmentReader which reads from an in
  memory sqlite database. The data will be read from dataframes and then
  converted into sqlite with SQLiteAdjustmentWriter.write

WithDataPortal
  Provides each test case a DataPortal object with data from temporary
  resources.
2016-04-15 23:46:10 -04:00
Richard Frank 4e84b2c5ca TST: Added doctest 2016-02-04 21:58:57 -05:00
Richard Frank ede1eb7aa0 PERF: Look up expired futures from in-memory Futures
instead of queries to the db.
2016-02-04 18:55:34 -05:00
Dale Jung 38e8d5214d PERF: History Perf Enhancements
Limited use of `pandas` data structures in both `HistoryContainer` and
`RollingPanel`. Where possible, methods were amended to return raw
`ndarrays` with the indexing logic done separately. This allows us to
cut down the number of times pandas objects are created both as returns
and intermediate values. The separation of indexing from data access
allowed us to minimize the times we’d make use of pandas indexes.

This required that that certain methods like `NDFrame.ffill` be replaced
with versions that work with `ndarrays`. Some of this was done via
straight numpy methods and others by access pandas internal
machinery. Outside of allowing us to use faster ndarrays, many of these
function provided speedups over their pandas counterparts as we didn’t
require the extra features like handling multiple dtypes. i.e. np.isnan
is faster than pd.isnull, but only works with certain dtypes.
2015-02-11 06:25:53 -05:00
Joe Jevnik c2aae2e0f4 BUG: rolling panel data became misaligned after extend_back 2014-11-12 16:47:44 -05:00
Joe Jevnik 8df1a49031 BUG: When increasing the length dynamically, the rolling panel was
getting filled with the wrong datetimes and causing errors.

Updates the logic for addressing missing datetimes and adds unit tests
for the 2 main cases (no missing datetimes, and some missing datetimes).
2014-11-11 13:29:57 -05:00
Joe Jevnik f8f7f2fc4c ENH: Allows history to be dynamic and grow the container at runtime.
Previously, all specs had to be pre-allocated by using the 'add_history'
function. This is now no longer required and instead serves as a hint to
the HistoryContainer to pre-allocate the space for the given spec.

History can grow by increasing the length for a frequency, adding a
frequency, or adding a field. It can grow with any combination of
these.

HistoryContainer now is aware of the data_frequency of the algorithm,
and no longer uses the daily_at_midnight flag; instead, this is the
default behavior.
2014-11-03 15:57:44 -05:00
Scott Sanderson 235954d480 DEV: Overhaul core history logic.
Overhaul the core HistoryContainer logic to be more robust to changing
universes.

Major Changes
-------------
* Remove `return_frame` cache.  The original purpose of using
  return_frames was to avoid having to create new DataFrames on each
  iteration of handle_data, but we ended up having to copy the return
  frames anyway because user code could mutate the frames in place.
  Removing the return_frames reduces unnecessary copying, and reduces
  the logic of `get_history` to just forward-filling and concatenating
  two DataFrames.

* Use a `MultiIndex`ed DataFrame to represent
  `last_known_prior_values`.  This makes lookups faster and greatly
  simplifies the logic of adding and dropping sids.

* HistoryContainer no longer attempts to determine its universe based on
  the contents of its internal buffers.  The TradingAlgorithm
  controlling the container is now responsible for explicitly calling
  `add_sids` or `drop_sids` when securities enter or leave the
  algorithm's universe.  These methods, along with the internal
  `_realign` method, provide a clean interface for changing the universe
  of securities managed by the container.

* Refactor index mutation logic in `RollingPanel` into a
  `MutableIndexRollingPanel` subclass.  Maintenance of the old behavior
  is regrettably necessary to support `BatchTransform`.

* Refactor shared logic from `roll` and `get_history` into a single
  `aggregate_ohlcv_panel` method that's responsible for collapsing an
  OHLCV buffer into a frame.
2014-09-29 14:42:57 -04:00
Thomas Wiecki 96bdb22db9 BUG: RollingPanel was not behaving correctly in corner cases.
There quite some bugs in certain corner cases. Dropping of obsolete
axes was not working correctly, roll over could cause obsolete axes
to not drop. The tests are much more stringent now as well.
2014-06-14 21:07:02 +02:00
Scott Sanderson bad4c9a439 ENH: Prep work for supporting '1m' history.
Overhauls `HistoryContainer` in prep for support of more than one frequency.

Major changes:

   - Methods/variables referring to "day" have been renamed/generalized.
     - `current_day_panel` became `buffer_panel`, which is now a `RollingPanel`
     - `prior_day_panel` became a dictionary mapping `Frequency` objects to
       "digest panels", which are instances of `RollingPanel`.

   - Hard-coded daily rollover replaced with a notion of a "current window" for
     each unique frequency managed by the panel.

     - When the end of the current window is reached for a given frequency, we
       compute an aggregate bar (code refers to this as a "digest"), which is
       appended to a panel associated with that frequency.

     - Window rollover dates are managed by a pair of dictionaries,
       `cur_window_starts` and `cur_window_closes`.  The `Frequency` class is
       responsible for computing window bounds based on the open/close of the
       previous window.

   - Semantic change to the `open_price` field: `open_price` now always
     contains the price of the first trade occurring in the given window.
     Previously it contained the price of the first minute in the window,
     returning NaN it the security happened not to trade in the first minute.
2014-06-05 15:25:48 -04:00
Thomas Wiecki b89886297f STY: autopep8 codebase. 2013-08-08 16:46:44 -04:00
Thomas Wiecki a7818f853a DOC: Add note about performance issue when updating. 2013-06-20 19:36:36 -04:00
Thomas Wiecki 102cddb4cb ENH: Use smarter matching for updating RollingPanel. 2013-06-20 19:36:36 -04:00
Thomas Wiecki 236fe92a53 ENH: Make RollingPanel update itself if new fields arrive.
Before we preinitialized the BT's fields and sids.
Thus, no new ones could be added after initialization.
This should be fixed now.
2013-06-20 19:36:22 -04:00
Eddie Hebert aca338c9e5 MAINT: Remove unused NaiveRollingPanel from rolling panel module. 2013-06-20 18:00:13 -04:00
Thomas Wiecki 2be7014d51 ENH: Rewrite of batch_transform to use rolling panel.
- Added unittest to test for newly appearing sids.
- Fixed logic bug where window was only full after
  window_length+1 events got passed.
2013-04-29 15:30:40 -04:00
Wes McKinney c5f4d00bf1 ENH: prototype data structure for managing a rolling datapanel
Manage a rolling window collection of collection of panels
for computation purposes.
2013-04-29 15:19:02 -04:00