Commit Graph

6 Commits

Author SHA1 Message Date
Eddie Hebert 75213ac176 MAINT: Write open and closes for minute bar format
Write arrays representing corresponding market opens and market closes,
which will eventually replace the `minute_index` field.

The market closes are being added for incoming work on another branch
which will use the market closes to generate a list of non-market
minutes to filter out when returning data from `unadjusted_window`.
2016-03-24 23:18:42 -04:00
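
A minimal sketch of how market opens and closes could yield the non-market minutes mentioned in the commit above. The function name `non_market_minutes` and its signature are hypothetical, chosen for illustration; they are not taken from the commit, and the branch referenced there may compute this differently.

```python
from datetime import datetime, timedelta


def non_market_minutes(market_opens, market_closes, start, end):
    """Return every minute in [start, end] that falls outside all sessions.

    Hypothetical helper: market_opens and market_closes are assumed to be
    parallel lists of datetimes, one (open, close) pair per session.
    """
    sessions = list(zip(market_opens, market_closes))
    excluded = []
    minute = start
    while minute <= end:
        # A minute is a market minute if any session contains it.
        if not any(o <= minute <= c for o, c in sessions):
            excluded.append(minute)
        minute += timedelta(minutes=1)
    return excluded


# Example: a session that closes early at 13:00.
opens = [datetime(2016, 3, 24, 9, 31)]
closes = [datetime(2016, 3, 24, 13, 0)]
skipped = non_market_minutes(
    opens, closes, datetime(2016, 3, 24, 12, 59), datetime(2016, 3, 24, 13, 2)
)
```

The linear scan keeps the sketch short; a reader working over many sessions would likely want a vectorized or bisect-based lookup instead.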
Eddie Hebert 0f14972e08 ENH: Unadjusted window data for minute bars.
Add a method to minute bar reader which returns the OHLCV for all
requested fields for a list of assets over the specified start and end
minutes.

Initial usage is intended for use by a loader which consumes minute bar
data to resample into daily bars, but may also be used when aggregating
minute data during '1d' history calls in Q2.0.

This iteration does not account for early closes.
2016-03-14 21:52:01 -04:00
Eddie Hebert 27f94f83fa ENH: Allow passing of numpy arrays to writer.
For faster parsing and writing workflows, do not require a DataFrame.
2016-02-02 14:03:42 -05:00
Eddie Hebert 488721e805 ENH: Add padding method to minute bars writer.
So that consumers can write empty days' worth of data, without needing
to construct a DataFrame with zero data to force a write.

The internal loader uses `last_date_in_output_for_sid` to signify that
data retrieval has been attempted for all dates up to that date, so
that when resuming a job, retrieval of data for those dates is not
re-attempted.

Also used to make the write logic cleaner, by making it only
necessary to create an array large enough for the given df.
2016-02-01 14:18:22 -05:00
Eddie Hebert 984e934e83 BUG: Fix OSError when creating sids that share dir
Fix a bug where creating a sid bcolz file when the containing directory
was already occupied by another sid caused an OSError when attempting
to create the directory, because it already existed.

e.g. if there were two sids, `1` and `2`, the paths would be
`00/00/000001.bcolz` and `00/00/000002.bcolz` which share the same
directory `00/00`.

Fixed by checking for directory existence before calling `makedirs`.

Add test coverage which exercises writing of sids that are siblings in
the sid directory structure.
2016-01-25 10:37:50 -05:00
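
The fix described in the commit above can be sketched as follows. The helper names `sid_path` and `ensure_sid_dir` are hypothetical, and the path scheme (six-digit zero padding, split into two-character directory levels) is an assumption inferred from the example paths `00/00/000001.bcolz` and `00/00/000002.bcolz`; the actual writer may build paths differently.

```python
import os
import tempfile


def sid_path(root, sid):
    """Hypothetical helper: map a sid to a path like 00/00/000001.bcolz."""
    padded = "{:06d}".format(sid)
    return os.path.join(root, padded[0:2], padded[2:4], padded + ".bcolz")


def ensure_sid_dir(path):
    """Create the containing directory only if it does not exist yet."""
    sid_dir = os.path.dirname(path)
    # Checking first avoids the OSError that makedirs raises when a
    # sibling sid has already created the shared directory.
    if not os.path.exists(sid_dir):
        os.makedirs(sid_dir)
    return sid_dir


root = tempfile.mkdtemp()
dir_1 = ensure_sid_dir(sid_path(root, 1))
dir_2 = ensure_sid_dir(sid_path(root, 2))  # sibling sid, same directory
```

Both calls resolve to the same `00/00` directory, and the second call is a no-op instead of an error.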
Eddie Hebert d5c3b5a15c ENH: Add writer for minute bcolz format.
Implement a writer for minute data into a format comprised of multiple
ctables, one for each individual asset, with a common 'index' shared by
all ctables where a given dt maps to the same array index for all
equities and fields.

This format is pulled from the lazy-mainline/Q2.0 branch, with some
changes to the interface.

Add basic retrieval of values at a given dt to reader. Not yet used by
Zipline simulations, but added to support unit tests.

Also, rename stubbed out us_equity_minutes to minute_bars, since the
writer can be agnostic to asset type.
2016-01-21 10:54:27 -05:00
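
One way to picture the common index from the commit above: if every session is assumed to occupy a fixed number of slots, a dt converts to an array position by arithmetic alone, so the same position is valid in every asset's ctable. The sketch below is illustrative only. `MINUTES_PER_DAY`, `minute_position`, and the fixed-slot layout are assumptions, not details confirmed by the commit message.

```python
from bisect import bisect_right
from datetime import datetime

# Assumption: each session occupies a fixed number of slots.
MINUTES_PER_DAY = 390


def minute_position(market_opens, dt):
    """Hypothetical helper: map a dt to its position in the shared index.

    market_opens is assumed to be a sorted list of session open datetimes.
    """
    # Find the session whose open is the latest one at or before dt.
    day = bisect_right(market_opens, dt) - 1
    # Whole minutes elapsed since that session's open.
    offset = int((dt - market_opens[day]).total_seconds() // 60)
    return day * MINUTES_PER_DAY + offset


market_opens = [datetime(2016, 1, 4, 14, 31), datetime(2016, 1, 5, 14, 31)]
first = minute_position(market_opens, datetime(2016, 1, 4, 14, 31))
later = minute_position(market_opens, datetime(2016, 1, 5, 14, 33))
```

Here `first` is 0 and `later` is 392 (one full session of 390 slots plus two minutes), regardless of which asset's data is being read.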