- Return a value from `verify_all_indices_unique` so that `panel` isn't
unconditionally `None` in `PanelDailyBarReader`.
- Fix a bug where we always set the volume of every asset to `1e9`.
- Add minimal suite of tests for get_spot_value, which catch both of the
above.
NOTE: There are still several issues with `PanelDailyBarReader`. The
docstring for `get_spot_value` claims that it will return -1 on days
where an asset didn't trade, which isn't the case. It also claims that
it will raise `NoDataOnDate` when a request is made outside the panel
range, but it just raises a KeyError. We also still have no coverage
for `load_raw_arrays`, so it's likely that there are more bugs lurking.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
LabelArray is conceptually similar to pandas.Categorical, in that it
stores data with many duplicate values as indices into an array of
unique values. For string data with many duplicates (e.g. time-series
of tickers or or industry classifications), this provides multiple
orders of magnitude of improvement when doing string operations,
especially string comparison/matching operations.
- Adds a new generic object "specialization" for `AdjustedArrayWindow`,
and a corresponding ObjectOverwrite adjustment.
- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
This method is called on the final result of any pipeline expression
after screen filtering has occurred. The default implementation of
``postprocess`` is identity, but Classifier overrides it to coerce
string columns into pandas.Categoricals before presenting them to the
user.
Adds the data bundle concept which makes it easy for users to register
loading functions to build out minute and daily data along with an
assets db and adjustments db. By default we have provided a `quandl`
bundle which pulls from the public domain WIKI dataset. Users may
register new bundles by decorating an ingest function with
`zipline.data.bundles.register(<name>)`. This also provides a
`yahoo_equities` function for creating an ingestion function that will
load a static set of assets from yahoo.
The cli is now structured as a couple of subcommands and has been
changed to `python -m zipline`. The old behavior of `run_algo.py` has
been moved to the `run` subcommand. This is almost entirely the same
except that it now takes the name of the data bundle to use, defaulting
to `quandl`.
The next subcommand is `ingest` which takes the name of
a data bundle to ingest. This will run the loading machinery and write
the data to a specified location that `run` can find.
There is also a `clean` subcommand which deletes the data that was
written with `ingest`.
Extensions have also been added to zipline. This is an experimental
feature where users can provide an extra set of python files to run at
the start of the process. These can be used to configure aspects of
zipline. Right now the only thing that is supported in an extension file
is the registration of a new data bundle.
Changes BcolzDailyBarWriter to not be an abc, data is passed as an
iterator of (sid, dataframe) pairs to the write method.
Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types, callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.
Adds many new fixtures and updates some existing fixtures to use the new
ones:
WithDefaultDateBounds
A fixture that provides the suite a START_DATE and END_DATE. This is
meant to make it easy for other fixtures to synchronize their date
ranges without depending on eachother in strange ways. For example,
WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
both have data for the same dates, so they may use depend on
WithDefaultDates without forcing a dependency between them.
WithTmpDir, WithInstanceTmpDir
Provides the suite or individual test case a temporary directory.
WithBcolzDailyBarReader
Provides the suite a BcolzDailyBarReader which reads from bcolz data
written to a temporary directory. The data will be read from
dataframes and then converted to bcolz files with
BcolzDailyBarWriter.write
WithBcolzDailyBarReaderFromCSVs
Provides the suite a BcolzDailyBarReader which reads from bcolz data
written to a temporary directory. The data will be read from a
collection of CSV files and then converted into the bcolz data through
BcolzDailyBarWriter.write_csvs
WithBcolzMinuteBarReader
Provides the suite a BcolzMinuteBarReader which reads from bcolz data
written to a temporary directory. The data will be read from
dataframes and then converted to bcolz files with
BcolzMinuteBarWriter.write
WithAdjustmentReader
Provides the suite a SQLiteAdjustmentReader which reads from an in
memory sqlite database. The data will be read from dataframes and then
converted into sqlite with SQLiteAdjustmentWriter.write
WithDataPortal
Provides each test case a DataPortal object with data from temporary
resources.
loaders
This allows people to set their cutoff time to the time they will
actually execute 'before_trading_start'. Currently this is just passed
to the constructor of the loader; however, I would like to make this
managed by the algorithm simulation runner. This would help keep all of
the loaders in sync and lock 'before_trading_start's execution to the
time the data is queried for.
EarningsCalendar loader.
- Moves most of AdjustedArray back into Python. The window iterator is
the only part that's performance-intensive.
- Adds a bootleg templating system for creating specialized versions of
AdjustedArrayWindow for each concrete type we care about.
- Adds support for differently dtyped terms in pipeline. This allows us
to use datetime64s which are needed in the EarningsCalendar.
- Adds EarningsCalendar dataset for the next and previous earnings
announcements in pipeline.
- Adds in memory loader for EarningsCalendar.
- Adds blaze loader for EarningsCalendar.
Previously we have capitalized input strings at different levels in
our code: in the user-facing API methods and in the asset finder.
This commit moves input string capitalization exclusively to the API
method to which the string was supplied. Specifically, the string is
capitalized by a preprocess API method decorator. The preprocess
decorator passes the input string to the newly defined
ensure_upper_case() method, which returns a TypeError if the argument
supplied is not a string.
ensure_upper_case() is defined in a new file, zipline/utils/input_validation.py.
The existing expect_types() method is also moved there.
Various tests in tests/test_assets.py are modified to account for the
fact that the asset finder method lookup_symol() no longer capitalizes
its supplied argument.