3810 Commits

Author SHA1 Message Date
Joe Jevnik 8622993358 DOC: update docs based on Rich's feedback 2016-05-12 17:07:02 -04:00
Joe Jevnik 87c20373b4 DOC: document that docs should be built with py3 2016-05-06 15:32:48 -04:00
Joe Jevnik d888c4faaa DOC: update docs for api functions 2016-05-06 15:25:30 -04:00
Joe Jevnik 0562179060 Merge pull request #1178 from quantopian/quantopian-quandl
ENH: Adds quantopian-quandl bundle as new default.
2016-05-06 12:53:07 -04:00
Scott Sanderson f618cc94a2 Merge pull request #1187 from quantopian/test-daily-bar-reader-a-bit
BUG: Fix multiple bugs in PanelDailyBarReader.
2016-05-06 11:42:59 -04:00
Jean Bredeche 0687584c21 Merge pull request #1176 from quantopian/eod_cancel_refactor
BUG: DAY_END action not emitted during minute emission
2016-05-06 11:01:04 -04:00
Scott Sanderson 3395b33f1e BUG: Fix multiple bugs in PanelDailyBarReader.
- Return a value from `verify_all_indices_unique` so that `panel` isn't
  unconditionally `None` in `PanelDailyBarReader`.

- Fix a bug where we always set the volume of every asset to `1e9`.

- Add minimal suite of tests for get_spot_value, which catch both of the
  above.

NOTE: There are still several issues with `PanelDailyBarReader`.  The
docstring for `get_spot_value` claims that it will return -1 on days
where an asset didn't trade, which isn't the case.  It also claims that
it will raise `NoDataOnDate` when a request is made outside the panel
range, but it just raises a KeyError.  We also still have no coverage
for `load_raw_arrays`, so it's likely that there are more bugs lurking.
2016-05-06 10:59:14 -04:00
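The first fix in the commit above is an instance of a common Python pitfall: a validation helper that ends without a `return` yields `None`, so a caller that assigns its result silently loses the validated object. A minimal sketch of the pattern, assuming hypothetical helpers `verify_unique_buggy` and `verify_unique` that operate on plain lists rather than the real `verify_all_indices_unique` and pandas objects:

```python
from collections import Counter


def verify_unique_buggy(labels):
    """Raise on duplicates, but forget to hand the input back."""
    if len(set(labels)) != len(labels):
        raise ValueError("duplicate labels found")
    # Falls off the end here, so the call evaluates to None.


def verify_unique(labels):
    """Raise on duplicates, otherwise return the input unchanged."""
    duplicates = sorted(k for k, n in Counter(labels).items() if n > 1)
    if duplicates:
        raise ValueError("duplicate labels: %r" % (duplicates,))
    return labels


labels = ["AAPL", "MSFT"]
panel_like = verify_unique_buggy(labels)  # None: the data is lost
panel_like = verify_unique(labels)        # the original list
```

Returning the input lets callers write `x = verify(x)` safely, which is presumably why the fix took that shape.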
Andrew Liang 7641247b41 BUG: DAY_END action not emitted during minute emission
Refactor AlgorithmSimulator so that DAY_END is emitted for both
minute and daily emission, and so that the handling of end-of-minute
and end-of-day is separated.
2016-05-06 10:25:44 -04:00
Jean Bredeche a068eb374a Merge pull request #1182 from quantopian/no-more-dups
DEV: Ensure there are no duplicates in the data passed into TradingAlgorithm.run
2016-05-06 09:55:23 -04:00
Jean Bredeche 7a65ae3e78 Merge pull request #1184 from quantopian/no-more-dups-2-electric-boogaloo
TEST/MAINT: Refactor unique axis verification.
2016-05-06 09:22:12 -04:00
Joe Jevnik 120d60fe27 STY: unused import 2016-05-05 18:23:03 -04:00
Joe Jevnik f7a522e3c9 ENH: update --show-progress message in the quantopian-quandl loader 2016-05-05 18:22:14 -04:00
Joe Jevnik a26802efd2 DOC: Update docs for bundles and fix the whatsnew 2016-05-05 18:22:14 -04:00
Joe Jevnik d819721d96 ENH: use more human readable format for bundle ingest directories
We are now using isoformats with ':' replaced with ';'. We cannot use a
normal isoformat because windows does not allow files or directories
with ':' in the name.
2016-05-05 18:22:13 -04:00
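The directory naming scheme described in the commit above can be sketched as follows. The function names `to_bundle_dirname` and `from_bundle_dirname` are hypothetical and are not taken from the zipline source; only the "isoformat with ':' replaced by ';'" rule comes from the commit message.

```python
from datetime import datetime


def to_bundle_dirname(timestamp):
    """Format a datetime as an isoformat string that is a legal Windows name."""
    return timestamp.isoformat().replace(":", ";")


def from_bundle_dirname(name):
    """Invert to_bundle_dirname (requires Python 3.7+ for fromisoformat)."""
    return datetime.fromisoformat(name.replace(";", ":"))


ingest_time = datetime(2016, 5, 5, 18, 22, 13)
dirname = to_bundle_dirname(ingest_time)  # '2016-05-05T18;22;13'
assert from_bundle_dirname(dirname) == ingest_time
```

Because isoformat strings sort lexicographically in chronological order, directory names built this way still sort by ingest time.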
Joe Jevnik 0b3a35891e ENH: fix the quality of life issues in the CLI
Fixes the issues presented in #1181 by @ssanderson around the new
command line interface.
2016-05-05 18:22:13 -04:00
Joe Jevnik 89542e33bd ENH: Adds quantopian-quandl bundle as new default.
This data bundle will use the quantopian mirror of the quandl WIKI data
instead of downloading from quandl directly. This dramatically improves
the speed because we do not pay the rate limiting for quandl and we can
send the data in the format zipline expects.
2016-05-05 18:22:13 -04:00
Scott Sanderson bd0f138081 TEST/MAINT: Refactor unique axis verification.
Break it into a standalone function that handles any pandas type.
2016-05-05 14:20:47 -04:00
Jean Bredeche 69972992c0 Merge pull request #1183 from quantopian/extract-fetcher-method
DEV: extract fetcher method for easier downstream use
2016-05-05 13:23:49 -04:00
Jean Bredeche 9c291cfa28 DEV: extract fetcher method for easier downstream use 2016-05-05 13:06:14 -04:00
Jean Bredeche 3f1b0f79f2 DEV: Ensure there are no duplicates in the data passed into TradingAlgorithm.run 2016-05-05 11:54:39 -04:00
Scott Sanderson 402fb2aa99 Merge pull request #1174 from quantopian/string-classifiers
ENH: Add support for strings in Pipeline.
2016-05-05 02:32:33 -04:00
Scott Sanderson 9fd8ec180d BUG: View with specific int dtype.
Just viewing as int is broken on win32.
2016-05-05 02:13:14 -04:00
Scott Sanderson e0aeda4c3e BUG: Fix bytes/unicode issues in py3. 2016-05-05 01:46:35 -04:00
Scott Sanderson 2ceeac1237 BUG: Use compat unicode. 2016-05-04 19:58:55 -04:00
Scott Sanderson a29da32252 TEST: Don't assert particular numpy error.
They change from version to version.
2016-05-04 19:40:50 -04:00
Scott Sanderson bd49647ce0 BUG: Fix failure on pandas >= 0.17. 2016-05-04 19:38:28 -04:00
Scott Sanderson 7a4e9fd61a ENH: Make None the default for string columns. 2016-05-04 19:10:19 -04:00
Scott Sanderson b78501e54a BUG: Fix broken isnull() on string classifiers.
Adds a special case in NullFilter to handle LabelArrays correctly.
2016-05-04 17:26:27 -04:00
Scott Sanderson 317ecc8aa8 DOC: Add whatsnew. 2016-05-04 16:31:58 -04:00
Scott Sanderson 5a1ed7b1d3 ENH: Make element_of work for ints too. 2016-05-04 16:31:58 -04:00
Scott Sanderson 4357673221 MAINT: Add unicode to __all__. 2016-05-04 15:56:09 -04:00
Scott Sanderson 0922714bac DOC: Clarify test docstrings. 2016-05-04 15:54:51 -04:00
Scott Sanderson ce4378416a MAINT: Remove lazy imports of Latest.
They're no longer needed to break import cycles.
2016-05-04 15:54:51 -04:00
Scott Sanderson 17b402666c DOC: Fixup docstring. 2016-05-04 15:54:51 -04:00
Scott Sanderson 4d42cddae4 ENH: Fail fast on outputs in CustomClassifier.
We don't support multiple outputs for CustomClassifier because we use
LabelArrays for string classifiers.
2016-05-04 15:54:50 -04:00
Scott Sanderson 620d7648b0 BUG: Tests/bugfixes for LabelArray slicing.
- Fixes a bug where __setitem__ was not called when setting with a slice
  on Python 2 (__setslice__ was called instead), which caused strange
  behavior when setting an empty string.  This is fixed by overriding
  __setslice__ and forwarding to __setitem__.

- Fixes a bug where __getitem__ returned an instance of np.void when
  returning a scalar.  We now correctly return an entry from our
  categoricals.
2016-05-04 15:54:50 -04:00
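The `__setslice__` forwarding described in the first fix above can be illustrated with a small sketch. It uses a hypothetical `list` subclass, `SliceForwarding`, rather than the real `LabelArray`. On Python 2, built-in sequence types define `__setslice__`, so a simple `obj[i:j] = value` bypasses an overridden `__setitem__` unless `__setslice__` is overridden too. Python 3 never calls `__setslice__` implicitly, so the sketch calls it explicitly to show the forwarding.

```python
class SliceForwarding(list):
    """List subclass that routes every assignment through __setitem__."""

    def __init__(self, *args):
        super(SliceForwarding, self).__init__(*args)
        self.assigned_keys = []  # record of every key passed to __setitem__

    def __setitem__(self, key, value):
        self.assigned_keys.append(key)
        super(SliceForwarding, self).__setitem__(key, value)

    def __setslice__(self, i, j, value):
        # Python 2 calls this for ``obj[i:j] = value``; forward it so that
        # any custom logic in __setitem__ still runs.
        self.__setitem__(slice(i, j), value)


arr = SliceForwarding(["a", "b", "c"])
arr[0:2] = ["x", "y"]                # Python 3 goes straight to __setitem__
arr.__setslice__(1, 3, ["p", "q"])   # what Python 2 would have invoked
```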
Scott Sanderson 4dbc7eac56 MAINT: Remove byteswap and newbyteorder from LabelArray. 2016-05-04 15:54:50 -04:00
Scott Sanderson 8de45540f2 ENH: NaN semantics for LabelArray missing values. 2016-05-04 15:54:50 -04:00
Scott Sanderson 2395cbb671 ENH: Use np.void for labelarray storage.
This disables most broken ufuncs.
2016-05-04 15:54:50 -04:00
Scott Sanderson 7a65121e6e BUG: contains was renamed to has_substring 2016-05-04 15:54:50 -04:00
Scott Sanderson 5cd7d79818 MAINT: Restore support for bytes/unicode AdjustedArrays. 2016-05-04 15:54:50 -04:00
Scott Sanderson 6b1f0caafc DOC: Clean up comment on `postprocess`. 2016-05-04 15:54:50 -04:00
Scott Sanderson 47e9b107ec DOC: Clean up docstring cruft. 2016-05-04 15:54:50 -04:00
Scott Sanderson 23324b4218 DOC: Add docstring for LabelArray. 2016-05-04 15:54:50 -04:00
Scott Sanderson 1a2ed2724b BUG: Pass correct class to super call. 2016-05-04 15:54:50 -04:00
Scott Sanderson c40bbfae03 TEST: More tests for string predicates. 2016-05-04 15:54:50 -04:00
Scott Sanderson bb6f908036 TEST: Add test for categorical postprocessing. 2016-05-04 15:54:50 -04:00
Scott Sanderson 5f190395ad ENH: Add support for strings in Pipeline.
- Adds a new class, ``LabelArray``, which is a subclass of np.ndarray.
  LabelArray is conceptually similar to pandas.Categorical, in that it
  stores data with many duplicate values as indices into an array of
  unique values.  For string data with many duplicates (e.g. time-series
  of tickers or industry classifications), this provides multiple
  orders of magnitude of improvement when doing string operations,
  especially string comparison/matching operations.

- Adds a new generic object "specialization" for `AdjustedArrayWindow`,
  and a corresponding ObjectOverwrite adjustment.

- Adds a new ``postprocess`` method to ``zipline.pipeline.term.Term``.
  This method is called on the final result of any pipeline expression
  after screen filtering has occurred. The default implementation of
  ``postprocess`` is identity, but Classifier overrides it to coerce
  string columns into pandas.Categoricals before presenting them to the
  user.
2016-05-04 15:50:52 -04:00
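The storage idea behind `LabelArray` in the commit above (integer codes indexing into an array of unique values) can be sketched in plain Python. This is an illustration only: `LabelCodes` is a hypothetical class, it is not an `np.ndarray` subclass, and it ignores missing values.

```python
class LabelCodes(object):
    """Store repeated strings as integer codes into a list of unique values."""

    def __init__(self, values):
        self.categories = sorted(set(values))
        index = {label: code for code, label in enumerate(self.categories)}
        self.codes = [index[v] for v in values]

    def __getitem__(self, i):
        return self.categories[self.codes[i]]

    def equals(self, label):
        """Elementwise equality: one string lookup, then integer compares."""
        if label not in self.categories:
            return [False] * len(self.codes)
        code = self.categories.index(label)
        return [c == code for c in self.codes]

    def startswith(self, prefix):
        """Run the string operation once per unique value, then map codes."""
        matches = [cat.startswith(prefix) for cat in self.categories]
        return [matches[c] for c in self.codes]


tickers = LabelCodes(["AAPL", "MSFT", "AAPL", "AMZN", "AAPL"])
is_aapl = tickers.equals("AAPL")
starts_with_a = tickers.startswith("A")
```

The speedup claimed in the commit message comes from the same property shown in `startswith`: each string operation runs once per unique value instead of once per element.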
Eddie Hebert 8756bf2c91 Merge pull request #1177 from quantopian/limit-minute-carrays
PERF: Cap memory usage by minute bar carrays.
2016-05-04 12:47:18 -04:00
Eddie Hebert 1248dcde36 PERF: Cap memory usage by minute bar carrays.
Instead of letting the cache of carrays grow unbounded, use an LRUCache
to cap the number of equities for any given column.

Tested with a cache size of 1000 on an algo using pipeline with over
3000 equities: runtimes were similar, and memory usage was
successfully capped at around 1.2GB.

Also tested with an algorithm that bought and held just one equity; no
major slowdown was seen when using the LRUCache vs. a dictionary.

We may want to follow this up with an extension to `carray` which is not
as memory hungry per column; e.g. by not loading repeated/similar
metadata or releasing the last read chunk after a certain amount of
time.
2016-05-04 12:08:50 -04:00
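The capping strategy in the commit above can be sketched with a small least-recently-used cache. This is not the `LRUCache` used by the commit. `BoundedCache` and its `load` callback are hypothetical names, and the integer keys stand in for equities whose carrays would be opened on demand.

```python
from collections import OrderedDict


class BoundedCache(object):
    """Keep at most ``maxsize`` entries, evicting the least recently used."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, load):
        """Return the cached value for ``key``, calling ``load(key)`` on a miss."""
        try:
            value = self._data.pop(key)
        except KeyError:
            value = load(key)
        self._data[key] = value  # re-insert so the key is most recently used
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop the oldest entry
        return value

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


loads = []


def open_column(sid):
    """Stand-in for an expensive load; records which keys were loaded."""
    loads.append(sid)
    return "column-%d" % sid


cache = BoundedCache(maxsize=2)
for sid in (1, 2, 1, 3):
    cache.get(sid, open_column)
```

After the loop, key 2 has been evicted because key 1 was touched more recently, and the cache never holds more than two entries. That bound is the trade-off the commit describes: capped memory in exchange for occasional reloads.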