So that consumers can write empty days worth of data, without needing
to construct a DataFrame with zero data force a write.
The internal loader uses `last_date_in_output_for_sid` to signify that
data has been attempted to be retrieved for all dates up until that, so
that when resuming a job those retrieval of data for those dates are not
re-attempted.
Also, used to make the write logic cleaneer, by making it only
necessary to create an array large enough for the given df.
Fix a bug where creating a sid bcolz file when the containing directory
was already occupied by a sid caused an OSError on attempt of creating
the directory because it already existed.
e.g. if there were two sids, `1` and `2`. The paths would be
`00/00/000001.bcolz` and `00/00/000002.bcolz` which share the same
directory `00/00`.
Fixed by checking for directory existence before calling `makedirs`.
Add test coverage which exercises writing of sids that are siblings in
the sid directory structure.
Implement a writer for minute data into a format comprised of multiple
ctables, one for each individual asset, with a common 'index' shared by
all ctables where a given a dt maps to the same array index for all
equities and fields.
This format is pulled from the lazy-mainline/Q2.0 branch, with some
changes to the interface.
Add basic retrieval of values at a given dt to reader. Not yet used by
Zipline simulations, but added to support unit tests.
Also, rename stubbed out us_equity_minutes to minute_bars, since the
writer can be agnostic to asset type.
Return -1 when there is a zero value for a spot price.
Intended for use by the incoming data portal changes. When the data
portal will see a -1 value, the portal will seek back a trading day
until a non-negative value is returned.
Volumes were incorrectly having the thousands factor applied, however
the volume is written as is (without the factor, since it volume is an
int, not float value.)
Fix by adding a special case for volume which returns the price as is.
Put the logic for reading and writing the equity price and adjustment
data into a module located in data, making it distinct from the pipeline
loader usage of the formats.
This prepares for both incoming changes of how adjustments are written,
(which includes using the bcolz daily reader as an input), as well as
eventually providing the readers to a DataPortal object.