Adds the data bundle concept which makes it easy for users to register
loading functions to build out minute and daily data along with an
assets db and adjustments db. By default we have provided a `quandl`
bundle which pulls from the public domain WIKI dataset. Users may
register new bundles by decorating an ingest function with
`zipline.data.bundles.register(<name>)`. This also provides a
`yahoo_equities` function for creating an ingestion function that will
load a static set of assets from Yahoo.
The CLI is now structured as a set of subcommands and is invoked as
`python -m zipline`. The old behavior of `run_algo.py` has
been moved to the `run` subcommand. This is almost entirely the same
except that it now takes the name of the data bundle to use, defaulting
to `quandl`.
The next subcommand is `ingest` which takes the name of
a data bundle to ingest. This will run the loading machinery and write
the data to a specified location that `run` can find.
There is also a `clean` subcommand which deletes the data that was
written with `ingest`.
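As a rough picture of the subcommand layout described above, here is a sketch built on the standard library's `argparse`. It is a stand-in, not zipline's CLI code: the `--bundle` option name, the positional `bundle` arguments, and the `build_parser` helper are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a toy parser with `run`, `ingest`, and `clean` subcommands."""
    parser = argparse.ArgumentParser(prog="python -m zipline")
    subcommands = parser.add_subparsers(dest="command")

    run = subcommands.add_parser("run", help="run an algorithm")
    # Hypothetical option name; the text only says `run` takes a bundle
    # name that defaults to `quandl`.
    run.add_argument("--bundle", default="quandl")

    ingest = subcommands.add_parser("ingest", help="ingest a data bundle")
    ingest.add_argument("bundle")  # assumed to be positional

    clean = subcommands.add_parser("clean", help="delete ingested data")
    clean.add_argument("bundle")  # assumed; the text does not list arguments
    return parser
```

Parsing `["run"]` with this parser yields the default bundle name, while `["ingest", "my-bundle"]` selects the `ingest` subcommand.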
Extensions have also been added to zipline. This is an experimental
feature where users can provide an extra set of Python files to run at
the start of the process. These can be used to configure aspects of
zipline. Right now the only thing that is supported in an extension file
is the registration of a new data bundle.
Changes BcolzDailyBarWriter so that it is no longer an ABC; data is now
passed as an iterator of (sid, dataframe) pairs to the write method.
Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types; callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.
Adds many new fixtures and updates some existing fixtures to use the new
ones:
WithDefaultDateBounds
    A fixture that provides the suite a START_DATE and END_DATE. This is
    meant to make it easy for other fixtures to synchronize their date
    ranges without depending on each other in strange ways. For example,
    WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
    both have data for the same dates, so they may both depend on
    WithDefaultDateBounds without forcing a dependency between them.
WithTmpDir, WithInstanceTmpDir
    Provides the suite or individual test case a temporary directory.
WithBcolzDailyBarReader
    Provides the suite a BcolzDailyBarReader which reads from bcolz data
    written to a temporary directory. The data will be read from
    dataframes and then converted to bcolz files with
    BcolzDailyBarWriter.write.
WithBcolzDailyBarReaderFromCSVs
    Provides the suite a BcolzDailyBarReader which reads from bcolz data
    written to a temporary directory. The data will be read from a
    collection of CSV files and then converted into the bcolz data through
    BcolzDailyBarWriter.write_csvs.
WithBcolzMinuteBarReader
    Provides the suite a BcolzMinuteBarReader which reads from bcolz data
    written to a temporary directory. The data will be read from
    dataframes and then converted to bcolz files with
    BcolzMinuteBarWriter.write.
WithAdjustmentReader
    Provides the suite a SQLiteAdjustmentReader which reads from an
    in-memory sqlite database. The data will be read from dataframes and
    then converted into sqlite with SQLiteAdjustmentWriter.write.
WithDataPortal
    Provides each test case a DataPortal object with data from temporary
    resources.
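The idea of sharing date bounds through a common base fixture can be sketched with plain mixin classes. Every class name below is a hypothetical stand-in and the dates are arbitrary; this is not the code of the zipline fixtures themselves.

```python
import datetime
import unittest


class WithDateBounds(object):
    """Hypothetical stand-in for a fixture that only defines date bounds."""
    START_DATE = datetime.date(2006, 1, 3)
    END_DATE = datetime.date(2006, 12, 29)


class WithDailyData(WithDateBounds):
    """Toy daily-data fixture; depends only on the shared bounds."""
    @classmethod
    def daily_range(cls):
        return (cls.START_DATE, cls.END_DATE)


class WithMinuteData(WithDateBounds):
    """Toy minute-data fixture; knows nothing about WithDailyData."""
    @classmethod
    def minute_range(cls):
        return (cls.START_DATE, cls.END_DATE)


class ExampleTestCase(WithDailyData, WithMinuteData, unittest.TestCase):
    # Overriding the bound once changes it for both data fixtures.
    START_DATE = datetime.date(2006, 6, 1)

    def test_ranges_agree(self):
        self.assertEqual(self.daily_range(), self.minute_range())
```

Neither data fixture refers to the other, yet a suite that mixes in both sees a single consistent date range.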
Upgrade Logbook to 0.12.5. This required changing a usage of
`logbook.NullHandler()` which passed `bubble=True`, since
`NullHandler` no longer supports the `bubble` argument.
As of logbook 0.10.0, logbook no longer installs a default handler,
which means that if the application doesn't install one, log messages
disappear into the ether.
Therefore, all of our scripts with `__main__` endpoints need to push a
`logbook.StderrHandler` if they're not already pushing some other
handler.
This commit modifies the DataFrameSource and DataPanelSource to accept only Int64Indexes on the incoming data and makes TradingAlgorithm.run() responsible for mapping user identifiers.
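To make the identifier-mapping step concrete, here is a plain-Python sketch of assigning integer ids to user-supplied identifiers before the data reaches a source. How TradingAlgorithm.run() actually performs the mapping is not shown in this message; the helper name and the sorted-order assignment are assumptions.

```python
def map_identifiers_to_sids(columns_by_identifier):
    """Assign each user identifier an integer sid, in sorted order.

    Hypothetical helper for illustration; returns the identifier -> sid
    mapping and the same data re-keyed by integer sid.
    """
    sids = {
        identifier: sid
        for sid, identifier in enumerate(sorted(columns_by_identifier))
    }
    by_sid = {
        sids[identifier]: values
        for identifier, values in columns_by_identifier.items()
    }
    return sids, by_sid
```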
When zipline is imported it checks whether
it runs in the IPython notebook. If it does,
it registers a %%zipline magic that takes the
same arguments as the CLI with the addition of
a -o for specifying the output variable to store
the performance frame in.
The algo code in the cell is, as of yet, executed
in its own environment rather than that of the
IPython NB which is probably what we want.
Also adds a CLI option to save the perf dataframe
to a pickle file.
Also adds an IPython notebook buyapple example.
Add a CLI that reads in an algorithm, loads data,
runs the algorithm, and outputs performance metrics.
The examples are adapted to the new zipline API and
analyses are split into separate files.
Also add config files that run the example
algorithms with preset settings.
This is a step towards the goal of uniting Quantopian scripts
and zipline.
To make the syntax of zipline identical to Quantopian
we break out the API methods (like order) and turn them into
functions. To access the algo object we add a thread local reference
to the current algorithm that is accessed in the API functions.
TradingAlgorithm now takes either a string or two functions
(initialize and handle_data) that it executes.
Use api method decorator for methods available in algoscript.
Ported appropriate algorithm tests from internal code.
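The thread-local approach described above can be sketched as follows. This is an illustration under stated assumptions, not zipline's code: `_context`, `set_current_algo`, `get_current_algo`, and `ToyAlgorithm` are hypothetical names, and only the module-level `order` function mirrors a name from the text.

```python
import threading

_context = threading.local()  # hypothetical holder for the running algorithm


def set_current_algo(algo):
    _context.algo = algo


def get_current_algo():
    algo = getattr(_context, "algo", None)
    if algo is None:
        raise RuntimeError("API function called outside a running algorithm")
    return algo


def order(sid, amount):
    """Module-level API function that forwards to the current algorithm."""
    return get_current_algo().order(sid, amount)


class ToyAlgorithm(object):
    """Stand-in for an algorithm object; it only records orders."""

    def __init__(self):
        self.orders = []

    def order(self, sid, amount):
        self.orders.append((sid, amount))
        return len(self.orders)

    def run(self, handle_data):
        set_current_algo(self)  # expose this algorithm to the API functions
        try:
            handle_data(self, {})
        finally:
            set_current_algo(None)  # clear the reference when done


def handle_data(context, data):
    order(24, 10)  # no explicit reference to the algorithm object
```

Keeping the reference thread-local means algorithms running in different threads do not see each other's state.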
So that the 1-Month Sharpe ratio has a curve to use during calculation,
use data from 2002, since the Treasury returns 1 Month data starting
in July, 2001.
- Use `print()` function for all print calls
- Fix strip and format calls that were placed outside of the
  print call (these broke in Python 3 because print returns None).
- Remove commented-out print calls.
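A small illustration of the second item: chaining string methods onto the result of `print()` fails on Python 3, so the string has to be fully built before it is printed. The message text here is made up for the example.

```python
message = "  total return: {0:.2%}  "

# Broken on Python 3: print() returns None, so the chained calls raise
# AttributeError.
#   print(message).strip().format(0.1234)

# Fixed: build the final string first, then print it.
formatted = message.strip().format(0.1234)
print(formatted)
```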
Remove the lists of DailyReturn objects in favor of using pd.Series
to store the return values.
This should make it easier to inspect the values when stepping through,
make windowing the data to a certain range easier,
and give some performance increase by removing object creation
and member access.
To support multiple sids the TALib transforms now return a dict,
instead of a float. Accordingly, the TALib example script now needs
to index into the transform result.
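The change in return shape can be pictured with a plain-Python sketch. It does not call TA-Lib: the `ema` helper is a simple recursive average seeded with the first price, and all names and sample prices are assumptions for illustration.

```python
def ema(prices, timeperiod):
    """Plain-Python exponential moving average with alpha = 2 / (n + 1).

    Seeded with the first price; this is NOT TA-Lib's implementation.
    """
    alpha = 2.0 / (timeperiod + 1)
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1 - alpha) * value
    return value


def ema_transform(prices_by_sid, timeperiod):
    """Return one EMA value per sid, keyed by sid."""
    return {sid: ema(prices, timeperiod)
            for sid, prices in prices_by_sid.items()}


result = ema_transform({24: [10.0, 10.0, 10.0], 5061: [1.0, 2.0, 3.0]},
                       timeperiod=3)
value_for_sid_24 = result[24]  # callers index by sid, not a bare float
```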
The new example is almost identical to the dual_moving_average one.
However, instead of our in-house mavg transform it uses the new
talib exponential moving average (EMA).
Fix crash caused by 'delay' no longer being supported.
Fixed by removing the SlippageModel override, since current configs
should be functionally equivalent to FixedSlippage.
Uses a method called `record` that takes key/value pairs,
instead of providing keys to extract from context.
The variables are stored internally to the algorithm in a dictionary,
and not just stored as a property of the algorithm.
The main intent behind this change is to make the API more user friendly.
The previous recorded_variables relied on the value being set
in the algorithm's context/self; the hope is that only having to use
the `record` method means fewer moving parts and a more understandable
API.
i.e., instead of:
```
def initialize(self):
    recorded_variables('foo', 'bar')

def handle_data(self, data):
    self.foo = 1
    self.bar = 2
```
The API is now:
```
def initialize(self):
    pass

def handle_data(self, data):
    self.record(foo=1, bar=2)
```
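One way such a `record` method could keep its values in an internal dictionary is sketched below. The class name, the `_recorded_vars` attribute, and the `recorded_vars` property are assumptions for illustration, not the actual TradingAlgorithm code.

```python
class RecordingAlgorithm(object):
    """Hypothetical stand-in showing dictionary-backed recording."""

    def __init__(self):
        self._recorded_vars = {}  # internal store, separate from attributes

    def record(self, **kwargs):
        """Store or overwrite each named value for the current bar."""
        self._recorded_vars.update(kwargs)

    @property
    def recorded_vars(self):
        # Return a copy so callers cannot mutate the internal store.
        return dict(self._recorded_vars)
```

Calling `record(foo=1, bar=2)` on an instance stores both values without creating `foo` or `bar` attributes on the algorithm.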