Commit Graph

217 Commits

Author SHA1 Message Date
Joe Jevnik 8622993358 DOC: update docs based on Rich's feedback 2016-05-12 17:07:02 -04:00
Joe Jevnik d888c4faaa DOC: update docs for api functions 2016-05-06 15:25:30 -04:00
Scott Sanderson 5f190395ad ENH: Add support for strings in Pipeline.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
  LabelArray is conceptually similar to pandas.Categorical, in that it
  stores data with many duplicate values as indices into an array of
  unique values.  For string data with many duplicates (e.g. time-series
  of tickers or industry classifications), this provides multiple
  orders of magnitude of improvement when doing string operations,
  especially string comparison/matching operations.

- Adds a new generic object "specialization" for `AdjustedArrayWindow`,
  and a corresponding ObjectOverwrite adjustment.

- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
  This method is called on the final result of any pipeline expression
  after screen filtering has occurred. The default implementation of
  ``postprocess`` is identity, but Classifier overrides it to coerce
  string columns into pandas.Categoricals before presenting them to the
  user.
2016-05-04 15:50:52 -04:00
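The storage idea described for ``LabelArray`` in 5f190395ad (unique values stored once, with the data held as integer indices into them) can be sketched with plain NumPy. This is only an illustration of the technique, not the ``LabelArray`` implementation; the sample tickers and the `equals` helper are assumptions made up for the example.

```python
import numpy as np

# Hypothetical data: a series of tickers with many repeated values.
tickers = np.array(["AAPL", "MSFT", "AAPL", "IBM", "MSFT", "AAPL"])

# `categories` holds each distinct string once (sorted);
# `codes` holds, for every element, its index into `categories`.
categories, codes = np.unique(tickers, return_inverse=True)


def equals(categories, codes, value):
    """Element-wise equality against `value`.

    Strings are compared only against the small array of unique values;
    the full-length comparison is done on integers.
    """
    matches = np.flatnonzero(categories == value)
    if matches.size == 0:
        # The value never occurs, so nothing can match.
        return np.zeros(codes.shape, dtype=bool)
    return codes == matches[0]


is_aapl = equals(categories, codes, "AAPL")

# The original strings can always be recovered from the two arrays.
roundtrip = categories[codes]
```

With many duplicates, the string comparison touches only the unique values, which is where the speedup mentioned in the commit message would come from.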
Joe Jevnik 59c8e371a2 ENH: Updates the cli, data bundles and extensions.
Adds the data bundle concept which makes it easy for users to register
loading functions to build out minute and daily data along with an
assets db and adjustments db. By default we have provided a `quandl`
bundle which pulls from the public domain WIKI dataset. Users may
register new bundles by decorating an ingest function with
`zipline.data.bundles.register(<name>)`. This also provides a
`yahoo_equities` function for creating an ingestion function that will
load a static set of assets from yahoo.

The cli is now structured as a couple of subcommands and has been
changed to `python -m zipline`. The old behavior of `run_algo.py` has
been moved to the `run` subcommand. This is almost entirely the same
except that it now takes the name of the data bundle to use, defaulting
to `quandl`.

The next subcommand is `ingest` which takes the name of
a data bundle to ingest. This will run the loading machinery and write
the data to a specified location that `run` can find.

There is also a `clean` subcommand which deletes the data that was
written with `ingest`.

Extensions have also been added to zipline. This is an experimental
feature where users can provide an extra set of python files to run at
the start of the process. These can be used to configure aspects of
zipline. Right now the only thing that is supported in an extension file
is the registration of a new data bundle.
2016-05-03 18:38:24 -04:00
dmichalowicz 8d1ecb508a Use string_types 2016-04-28 15:28:19 -04:00
Scott Sanderson 85ae664d8c BUG: Don't crash on dataframes with assets in index. 2016-04-28 15:19:57 -04:00
Andrew Liang 5809ae17f1 DEV: Better error message for sid= in get_open_orders
Let the user know to use asset= instead
2016-04-26 12:23:57 -04:00
Jean Bredeche c404c60d68 BUG: don't allow ordering in before_trading_start 2016-04-26 10:56:36 -04:00
Joe Jevnik bc0b117dc9 MAINT: make the data loading apis more consistent.
Changes BcolzDailyBarWriter to not be an abc, data is passed as an
iterator of (sid, dataframe) pairs to the write method.

Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types, callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.

Adds many new fixtures and updates some existing fixtures to use the new
ones:

WithDefaultDateBounds
  A fixture that provides the suite a START_DATE and END_DATE. This is
  meant to make it easy for other fixtures to synchronize their date
  ranges without depending on each other in strange ways. For example,
  WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
  both have data for the same dates, so they may both depend on
  WithDefaultDateBounds without forcing a dependency between them.

WithTmpDir, WithInstanceTmpDir
  Provides the suite or individual test case a temporary directory.

WithBcolzDailyBarReader
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzDailyBarWriter.write

WithBcolzDailyBarReaderFromCSVs
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from a
  collection of CSV files and then converted into the bcolz data through
  BcolzDailyBarWriter.write_csvs

WithBcolzMinuteBarReader
  Provides the suite a BcolzMinuteBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzMinuteBarWriter.write

WithAdjustmentReader
  Provides the suite a SQLiteAdjustmentReader which reads from an in
  memory sqlite database. The data will be read from dataframes and then
  converted into sqlite with SQLiteAdjustmentWriter.write

WithDataPortal
  Provides each test case a DataPortal object with data from temporary
  resources.
2016-04-15 23:46:10 -04:00
Jean Bredeche fac5905c10 Merge pull request #1114 from quantopian/handle-data-optional
ENH: make handle_data optional
2016-04-13 09:31:41 -04:00
Richard Frank 70befd490b MAINT: Don't store data portal everywhere
Removed lots of data portal references that participated in ref cycles
and prevented deterministic cleanup of dbs.
2016-04-12 19:33:22 -04:00
Jean Bredeche 4203c54417 ENH: make handle_data optional 2016-04-07 09:50:09 -04:00
Jean Bredeche dc01c45dc4 DEV: Apply adjustments for portfolio and account in BTS
completely copied from https://github.com/quantopian/zipline/pull/1104/

All credit goes to Andrew Liang (@lianga888)
2016-04-05 11:37:34 -04:00
Eddie Hebert 16fd6681a6 ENH: Rewrite of Zipline to use lazy access pattern
More documentation to follow in release notes.

Based on the lazy-mainline branch; see that branch for more details.

Also-By: Jean Bredeche <jean@quantopian.com>
Also-By: Andrew Liang <aliang@quantopian.com>
Also-By: Abhijeet Kalyan <akalyan@quantopian.com>
2016-04-04 16:12:58 -04:00
Joe Jevnik 4f0babc558 DOC: update TradingAlgorithm docstring 2016-03-11 11:38:43 -05:00
jfkirk db1e62971a ENH: Adds tick_size and renames futures multiplier 2016-01-22 14:56:30 -05:00
Richard Frank daf05c6b59 BUG: Ensure that current_sids() returns Assets instead of identifiers
Also batch lookup sids in algo.run
2016-01-21 10:32:07 -05:00
llllllllll c2091cf79e MAINT: fix error messages for set_(commission|slippage)
These called the functions 'update_(commission|slippage)' leading to
confusing messages when you called them post init.
2015-12-18 13:05:03 -05:00
Scott Sanderson c763d8f4d5 STY: Remove unused import. 2015-11-13 18:26:54 -05:00
Scott Sanderson 654edaa851 BUG: Clear asset caches when mapping DataFrame.
Our DataFrame index resolution logic relies on failed lookups **not**
being cached, but not caching failed lookups is a nontrivial performance
hit when repeatedly looking up sids.  The "solution" here is to clear
the caches after writing in new assets.

The real fix for this is either:

1. Don't construct an AssetFinder until we have the datasource in hand
   in run(), or
2. Don't symbol-map the user's input source if it's a DataFrame.
   Instead we should make our data loaders pre-map the data.
2015-11-13 18:26:54 -05:00
llllllllll 0cb4c38717 ENH: Allow users to pass a context manager to wrap all scheduled
functions.

This includes handle_data.
2015-11-11 14:19:13 -05:00
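A minimal sketch of the idea in 0cb4c38717, wrapping every scheduled function (including handle_data) in a user-supplied context manager. The `wrap_with_context` helper and the use of a factory that creates a fresh context manager per call are assumptions for illustration; the commit message does not say how zipline applies the wrapper internally.

```python
from contextlib import contextmanager

calls = []


@contextmanager
def record_calls():
    # Stand-in for a user-provided context manager (e.g. timing or logging).
    calls.append("enter")
    try:
        yield
    finally:
        calls.append("exit")


def wrap_with_context(func, context_factory):
    """Return a function that runs `func` inside a new context each call."""
    def wrapped(*args, **kwargs):
        with context_factory():
            return func(*args, **kwargs)
    return wrapped


def handle_data(context, data):
    calls.append("handle_data")
    return data


wrapped_handle_data = wrap_with_context(handle_data, record_calls)
output = wrapped_handle_data(None, 42)
```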
puppy a82415c77b BUG: Address issue #801 and add test. Pass panel directly to
object instead of data.
2015-11-05 10:18:57 -05:00
puppy e50858315b BUG: Issue #801 Initializing TradingAlgorithm Object Does Not Set
_analyze Method in algorithm.py

Added one line to check for the keyword argument 'analyze' and set the
_analyze method when a TradingAlgorithm object is initialized
within a script.
2015-11-05 10:16:48 -05:00
Scott Sanderson 7eeacbe0e9 Merge pull request #796 from quantopian/prevent-history-in-initialize
ENH: Fail when history() is called in initialize.
2015-11-04 22:29:16 -05:00
Richard Frank d7add5b248 ENH: Makes chunk_size configurable in attach_pipeline
Default first chunk is smaller for more immediate results
2015-10-27 16:15:54 -04:00
Scott Sanderson 4826787475 ENH: Fail when history() is called in initialize. 2015-10-26 12:04:31 -04:00
Richard Frank ee26a21855 MAINT: Renamed loader_dispatch to get_loader
Now it raises a KeyError instead of returning None,
if loader not found.
2015-10-12 16:13:55 -04:00
Richard Frank 83bd1310d9 PERF: Using pipeline_loader_dispatch to group by loader
instead of dataset
2015-10-12 10:48:29 -04:00
Stewart Douglas 4e2039c9b0 ENH: Coerce user input with API method decorator
Previously we have capitalized input strings at different levels in
our code: in the user-facing API methods and in the asset finder.
This commit moves input string capitalization exclusively to the API
method to which the string was supplied. Specifically, the string is
capitalized by a preprocess API method decorator. The preprocess
decorator passes the input string to the newly defined
ensure_upper_case() method, which raises a TypeError if the argument
supplied is not a string.

ensure_upper_case() is defined in a new file, zipline/utils/input_validation.py.
The existing expect_types() method is also moved there.

Various tests in tests/test_assets.py are modified to account for the
fact that the asset finder method lookup_symbol() no longer capitalizes
its supplied argument.
2015-10-08 15:41:33 -04:00
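The decorator-based coercion described in 4e2039c9b0 can be sketched as follows. This is a simplified, Python 3 only illustration: the `preprocess` implementation, the `(argname, arg)` processor signature and the `symbol` function are assumptions written for this example, not the code in zipline/utils/input_validation.py.

```python
import functools
import inspect


def ensure_upper_case(argname, arg):
    """Upper-case a string argument; reject anything that is not a string."""
    if isinstance(arg, str):
        return arg.upper()
    raise TypeError(
        "%s must be a string, got %s" % (argname, type(arg).__name__)
    )


def preprocess(**processors):
    """Decorator that passes the named arguments through processor functions."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call so positional and keyword use are treated alike.
            bound = sig.bind(*args, **kwargs)
            for argname, processor in processors.items():
                if argname in bound.arguments:
                    bound.arguments[argname] = processor(
                        argname, bound.arguments[argname]
                    )
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@preprocess(symbol_str=ensure_upper_case)
def symbol(symbol_str):
    # By the time the body runs, the input has already been upper-cased.
    return symbol_str


upper = symbol("aapl")
```

Doing the coercion once at the API boundary means lower layers, such as the asset finder, can assume upper-cased input.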
Stewart Douglas 3ef0ddf0c6 ENH: Add future_symbol API method 2015-10-05 11:19:04 -04:00
Scott Sanderson 557bdcd69d MAINT: Don't name pipelines.
`Pipeline()` no longer takes a name.
`attach_pipeline` now takes a name.

This is mainly for uniformity with how `Factors` and `Filters` are
handled.
2015-10-02 16:31:29 -04:00
Scott Sanderson 096a0d49fd MAINT: Rename drain_pipeline -> pipeline_output.
More boring, but *drain* carries a connotation of "get everything",
which is misleading.
2015-10-01 18:03:54 -04:00
Scott Sanderson 1f6c7ff31f MAINT: Rename ffc_loader -> pipeline_loader. 2015-10-01 18:03:54 -04:00
Scott Sanderson 2d683961bd MAINT: More renaming.
s/FFCEngine/PipelineEngine/
s/FFCLoader/PipelineLoader/
2015-10-01 18:03:54 -04:00
Scott Sanderson f82a01841b MAINT: Rename ALL the things.
zipline.modelling.* -> zipline.pipeline.*
zipline.data.ffc.loaders -> zipline.pipeline.loaders
tests/modelling -> tests/pipeline
2015-10-01 18:03:53 -04:00
Scott Sanderson 8e59d12daf ENH: Pipeline API
- Adds `zipline.pipeline.Pipeline`, a new user-facing class for managing
  pipelines of Modeling API expressions.

- Adds `attach_pipeline` and `drain_pipeline` as API methods

- Removes `add_factor` and `add_filter` as API methods.  These have been
  replaced by two new methods on `Pipeline`: `add`, and `apply_screen`.

- Adding a `Filter` as a column no longer implicitly truncates rows from
  the Modeling API output.  It simply causes a new column, of dtype
  `bool` to show up in the output. Removal of rows is now handled by the
  new `apply_screen` method of `Pipeline`.

- Refactors the existing Modeling API tests to reflect the new APIs.
2015-10-01 18:03:53 -04:00
Richard Frank 136eb8aa0d BUG: Running an algo with a df/panel of Assets was raising SidNotFound 2015-09-24 21:49:45 -04:00
jfkirk 082bc4f906 MAINT: Removes default_none from lookup_symbol 2015-09-16 14:55:42 -04:00
jfkirk d84bdefef8 MAINT: Removes lookup_symbol_resolve_multiple method
lookup_symbol_resolve_multiple was identical to lookup_symbol, except that lookup_symbol performed upper-casing of the input string and lookup_symbol would return Nones. Now, lookup_symbol has a kwarg 'default_None=True' and all symbols are upper-cased on insertion and request.
2015-09-16 09:54:37 -04:00
Scott Sanderson 26fd6fda8b ENH/BUG: Modeling API enhancements.
- Fixes an error where Modeling API data known as of the close of `day
  N` would be shown to algorithms during `before_trading_start` as of
  the close of the same day.  Algorithms should now only receive data
  during `before_trading_start/handle_data` that was known as of the
  simulation time at which the function would be called.

- All Term instances now have a `mask` attribute that must be a `Filter`
  or an instance of `AssetExists()`.  `mask` can be used to specify that
  a Factor should be computed in a manner that ignores the values that
  were not `True` in the mask.

- Changed the interface for `FFCLoader.load_adjusted_array` and
  `Term._compute` from `(columns, mask)`, with mask as a DataFrame, to
  `(columns, dates, assets, mask)`, where mask is a numpy array.  This
  is primarily to avoid having to reconstruct extra DataFrames when
  using masks produced by non `AssetExists` filters.

- Adds `BoundColumn.latest`, which gives the most-recently-known value
  of a column.
2015-09-16 01:47:11 -04:00
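To illustrate what computing "in a manner that ignores the values that were not `True` in the mask" (26fd6fda8b) can look like when the mask is a numpy array, here is a small sketch. The data and the choice of a per-date mean are assumptions made up for the example; it does not reproduce how `Term._compute` uses its mask.

```python
import numpy as np

# Hypothetical inputs: rows are dates, columns are assets.
values = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
])
mask = np.array([
    [True, True, False, True],
    [False, True, True, True],
])

# Replace masked-out entries with NaN so they drop out of the reduction.
masked_values = np.where(mask, values, np.nan)

# Per-date mean over only the assets that passed the mask.
row_means = np.nanmean(masked_values, axis=1)
```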
Thomas Wiecki 6f2e1672d7 BUG: Forward initialize args and kwargs to user-defined function.
The initialize method of TradingAlgorithm no longer accepts and
silently ignores args and kwargs, but instead forwards them
to the user-defined function referenced by self._initialize.

To avoid passing unexpected arguments to self._initialize, the
following additional adjustments are made:

 - pop 'namespace' from the kwargs supplied to TradingAlgorithm
   rather than simply get()ing it

 - do not pass an AssetFinder to the TradingAlgorithm in
   test_modelling_algo.py, as this has been deprecated and will
   cause self._initialize to fail
2015-09-14 10:42:20 -04:00
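The difference between pop()ing and get()ing 'namespace' in 6f2e1672d7 matters once the remaining kwargs are forwarded to the user's function. A minimal sketch, assuming a much-simplified `Algorithm` class that is hypothetical and stands in for TradingAlgorithm only for illustration:

```python
class Algorithm:
    """Hypothetical, heavily simplified stand-in for TradingAlgorithm."""

    def __init__(self, initialize, *args, **kwargs):
        # pop() removes 'namespace' from kwargs, so it is not forwarded.
        # With get() it would stay in kwargs and reach the user's function
        # as an unexpected keyword argument.
        self.namespace = kwargs.pop("namespace", {})
        self._initialize = initialize
        self.initialize_args = args
        self.initialize_kwargs = kwargs

    def initialize(self):
        # Forward the remaining args and kwargs instead of dropping them.
        return self._initialize(
            self, *self.initialize_args, **self.initialize_kwargs
        )


def user_initialize(context, lookback, threshold=0.5):
    context.lookback = lookback
    context.threshold = threshold


algo = Algorithm(user_initialize, 20, threshold=0.1, namespace={"x": 1})
algo.initialize()
```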
jfkirk 6e6ef447d2 TST: Adds tearDownClass methods to delete TradingEnvironments 2015-09-10 11:53:29 -04:00
jfkirk dc964a7e7d MAINT: Removes the ability to reference a global TradingEnvironment
This commit removes the ability to reference a shared TradingEnvironment through the zipline.finance.trading module. In place, the classes that require a TradingEnvironment, or its child AssetFinder, contain their own references to those objects.

This commit also adds serialization utilities that allow for the pickling/unpickling of objects without unintentionally serializing their TradingEnvironments or AssetFinders.
2015-09-10 11:53:28 -04:00
Stewart Douglas 7be2cf8652 MAINT: Allow algo.run() to write to db 2015-09-10 11:53:27 -04:00
Stewart Douglas 8ccdae9870 MAINT: Change defaults when parsing kwargs in TradingEnvironment 2015-09-10 11:53:26 -04:00
Stewart Douglas 501fd58fdf ENH: Replace update_asset_finder with write_data
The write_data method invokes the relevant AssetDBWriter subclass
to write data to the database. update_asset_finder is no longer
a relevant method since the AssetFinder is strictly a reader class.
2015-09-10 11:53:24 -04:00
Stewart Douglas ad31e1ff6e MAINT: Coerce user input to Timestamps, catching errors 2015-09-08 11:01:04 -04:00
Stewart Douglas c2159d429b ENH: Allow user to set the symbol lookup date
Previously symbols were resolved to sids based on the end of
simulation date. This commit allows the user to specify the
date for which resolution will take place using a new
set_symbol_lookup_date() API method.

If the user does not use this method the lookup date will
default back to the simulation end date.
2015-09-08 11:01:04 -04:00
jfkirk cf41373f8f BUG: Symbol look-up now uses the sim_params.period_end as a look-up date 2015-09-01 12:39:03 -04:00
Scott Sanderson f7039d6f52 ENH: Make data available in before_trading_start. 2015-08-21 12:37:17 -04:00