Commit Graph

24 Commits

Author SHA1 Message Date
Maya Tydykov 0191d9d903 MAINT: move filtering for null date rows back to dataframe
TST: test both next and prev event frame loading and use EventsLoader.

BUG: remove extra arg

MAINT: call list on zip for compatibility with python 3
2016-04-25 16:11:12 -04:00
Maya Tydykov e41c99d077 MAINT: add an event date col field to each loader
MAINT: add event date col field and filter rows where this field is null

TST: modify tests to filter nulls in event date col

MAINT: calculate value repeats by vectorized computation on separate start and end dates.

MAINT: pass DatetimeIndex instead of list of strings
2016-04-25 11:42:08 -04:00
Eddie Hebert a13e336ef5 Merge pull request #1157 from quantopian/use-carray-instead-of-read-all-on-small-size
PERF: Improve read time for smaller num of assets.
2016-04-21 22:25:01 -04:00
Eddie Hebert 66d05aa563 PERF: Improve read time for smaller num of assets.
The BcolzDailyBarReader was optimized for the pipeline case of reading
all assets at once.

Now that the reader is also used to support daily history, the case of
reading data for a small number of assets is more common, particularly
in algorithms that use the history API and have a high rotation of
assets (e.g. an algorithm which uses pipeline to set the active
universe).

Remove the bottleneck in reading a small number of assets by
conditionally reading the slice for each asset from the carray, instead
of reading the data for all equities and then indexing into that full
array. Above a certain number of assets, it is still better to read all
the data at once. On the Quantopian dataset, which holds data for about
20000 equities over the last 10 years (where not all equities trade
over the full range), stored in 118 blosc blp files per column, the
tipping point where the 'read all' mode wins out is between 3000 and
4000 assets.

That number was tested by trying to exercise a worst case scenario where
the equities were spread out evenly across the blp files, by stepping
along a sorted list of assets that were alive over a query range which
spanned 70 trading days.
```
size = 3000
# Floor division keeps the step an integer on both Python 2 and Python 3.
sids = [assets[i] for i in range(0, len(assets), len(assets) // size)][:size]
```

Also, add parameter to WithBcolzDailyBarReader fixture which allows the
test to specify what the threshold count for reading all data should be,
so that the test_us_equity_pricing can be forced into either mode to
make sure that both branches in logic are covered by all test cases.

On local dev machine this patch improves the read time of `load_raw_array`
for one asset from 100 ms to 96.5 µs (roughly a 10^3 improvement). Reading
only one asset per call is an observed common case when populating the
non-cached values in USEquityHistoryLoader.
2016-04-21 20:43:52 -04:00
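The conditional read described in 66d05aa563 can be sketched roughly as follows. This is a minimal illustration, not the BcolzDailyBarReader code: the function name `load_rows`, its parameters, and the default threshold of 3000 are assumptions made for the example, and a NumPy array stands in for the bcolz carray.

```python
import numpy as np


def load_rows(column, first_rows, last_rows, read_all_threshold=3000):
    """Return one 1-D array per requested asset from a flat column.

    `column` is any sliceable 1-D container holding every asset's rows
    back to back (a stand-in for a carray). `first_rows` and `last_rows`
    are the inclusive row bounds of each requested asset.
    `read_all_threshold` is a hypothetical tuning knob.
    """
    bounds = list(zip(first_rows, last_rows))
    if len(bounds) >= read_all_threshold:
        # Many assets: materialize the whole column once, then index into it.
        data = np.asarray(column[:])
        return [data[lo:hi + 1] for lo, hi in bounds]
    # Few assets: read only the slice that each asset needs.
    return [np.asarray(column[lo:hi + 1]) for lo, hi in bounds]
```

Making the threshold a parameter is what lets a test force either branch, for example by passing `read_all_threshold=0` to always read everything.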
Richard Frank 8c92f2d241 TST: What if we don't gc...
Looks like we removed ref cycles elsewhere, so windows builds are
passing without this.
2016-04-21 18:41:57 -04:00
Jean Bredeche 9d1e15ddde BUG: Fetcher wasn't working properly in before_trading_start.
We were trying to use the previous day in before_trading_start because
we were looking for the previous market minute, then normalizing it.  That's
no longer the case, as we want to use today's date for fetcher lookups
in before_trading_start.

Also refactored a bit how dataportal determines if a query should be
routed to the fetcher data structures.
2016-04-21 15:09:14 -04:00
Maya Tydykov 1531568899 ENH: add custom dataset for estimize
MAINT: alphabetize constants

MAINT: remove obsolete column

TST: refactor tests to use common code

MAINT: remove unneeded fields from dataset

MAINT: remove obsolete earnings estimates columns and refactor
2016-04-19 11:29:03 -04:00
Joe Jevnik bc0b117dc9 MAINT: make the data loading apis more consistent.
Changes BcolzDailyBarWriter to not be an abc, data is passed as an
iterator of (sid, dataframe) pairs to the write method.

Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types; callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.

Adds many new fixtures and updates some existing fixtures to use the new
ones:

WithDefaultDateBounds
  A fixture that provides the suite a START_DATE and END_DATE. This is
  meant to make it easy for other fixtures to synchronize their date
  ranges without depending on each other in strange ways. For example,
  WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
  both have data for the same dates, so they may both depend on
  WithDefaultDateBounds without forcing a dependency between them.

WithTmpDir, WithInstanceTmpDir
  Provides the suite or individual test case a temporary directory.

WithBcolzDailyBarReader
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzDailyBarWriter.write

WithBcolzDailyBarReaderFromCSVs
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from a
  collection of CSV files and then converted into the bcolz data through
  BcolzDailyBarWriter.write_csvs

WithBcolzMinuteBarReader
  Provides the suite a BcolzMinuteBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzMinuteBarWriter.write

WithAdjustmentReader
  Provides the suite a SQLiteAdjustmentReader which reads from an in
  memory sqlite database. The data will be read from dataframes and then
  converted into sqlite with SQLiteAdjustmentWriter.write

WithDataPortal
  Provides each test case a DataPortal object with data from temporary
  resources.
2016-04-15 23:46:10 -04:00
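The way such fixtures compose can be sketched with plain `unittest` mixins. This is only an illustration of the pattern described in bc0b117dc9, not the zipline.testing.fixtures implementation: the class bodies, the dates, and the `WithBoundsFile` fixture are hypothetical, and the sketch assumes Python 3.8+ for `addClassCleanup`.

```python
import os
import shutil
import tempfile
import unittest
from datetime import date


class WithDefaultDateBounds:
    # Shared date range that other fixtures read instead of each other.
    START_DATE = date(2006, 1, 3)
    END_DATE = date(2006, 12, 29)


class WithTmpDir:
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.tmpdir = tempfile.mkdtemp()
        # Remove the directory once every test in the class has run.
        cls.addClassCleanup(shutil.rmtree, cls.tmpdir)


class WithBoundsFile(WithTmpDir, WithDefaultDateBounds):
    # Hypothetical fixture that needs both a directory and the dates.
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.bounds_path = os.path.join(cls.tmpdir, "bounds.txt")
        with open(cls.bounds_path, "w") as f:
            f.write(f"{cls.START_DATE},{cls.END_DATE}")


class ExampleTestCase(WithBoundsFile, unittest.TestCase):
    def test_bounds_file(self):
        with open(self.bounds_path) as f:
            self.assertEqual(f.read(), "2006-01-03,2006-12-29")
```

Each mixin calls `super().setUpClass()` first, so the method resolution order sets up the temporary directory before any fixture that writes into it.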
Richard Frank 32a400a9fb BUG: Fixing bitness issues on 32-bit systems
by being explicit with sizes
2016-04-12 17:07:50 -04:00
Joe Jevnik e9498a73f4 Merge pull request #1065 from quantopian/extra-guards-in-fixtures
TST: adds guards for calling class methods in instance setup
2016-04-05 12:51:58 -04:00
Eddie Hebert 16fd6681a6 ENH: Rewrite of Zipline to use lazy access pattern
More documentation to follow in release notes.

Based on lazy-mainline branch, see for more details.

Also-By: Jean Bredeche <jean@quantopian.com>
Also-By: Andrew Liang <aliang@quantopian.com>
Also-By: Abhijeet Kalyan <akalyan@quantopian.com>
2016-04-04 16:12:58 -04:00
Maya Tydykov e8185a1512 MAINT: reorganize - move testing mixin to fixtures
BUG: correctly create asset finder

MAINT: rename fixture

STY: fixes for flake8

STY: add space around assignment

MAINT: add var back to constructor

MAINT: remove unused import

MAINT: compare var with None directly

MAINT: fix merge errors
2016-03-29 13:15:16 -04:00
Maya Tydykov 6b3560ade8 MAINT: remove redundant function and move to utils 2016-03-29 13:12:51 -04:00
Maya Tydykov 06dd6e958d TST: refactor tests to use fixtures
MAINT: use np.array

MAINT: return cols rather than modifying attribute
2016-03-29 13:12:50 -04:00
Scott Sanderson 872b84e09a ENH: Implement Factor.quantiles. 2016-03-25 15:11:18 -04:00
Scott Sanderson 16c5aecba6 DEV: Add utility for permuting rows in an array.
Useful for testing rank-order functions on arrays.
2016-03-25 15:11:18 -04:00
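A row-permuting helper of the kind described in 16c5aecba6 might look like the sketch below. The name `permute_rows` and its `(seed, array)` signature are assumptions made for the example, and it uses NumPy's `default_rng`, which may differ from what the commit actually does.

```python
import numpy as np


def permute_rows(seed, array):
    """Shuffle the values within each row of a 2-D array independently.

    `seed` makes the shuffle reproducible across test runs.
    """
    rng = np.random.default_rng(seed)
    # rng.permutation returns a shuffled copy of each 1-D row it is given.
    return np.apply_along_axis(rng.permutation, 1, array)
```

A rank-order test can then check that the function under test gives the same per-row set of ranks for the original and permuted inputs.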
Scott Sanderson b85eb36da8 TEST: Add test for demean example. 2016-03-25 15:11:18 -04:00
Richard Frank dd8175b1d9 TST: Forward arguments to numpy 2016-03-23 15:26:53 -04:00
Maya Tydykov 4164ffdcb0 BUG: call correct method 2016-03-21 16:41:23 -04:00
Joe Jevnik 639ff62731 TST: adds guards for calling class methods in instance setup 2016-03-18 14:29:29 -04:00
Scott Sanderson 56942e4598 DOC: Fix typo in docstring. 2016-03-15 20:34:40 -04:00
Scott Sanderson 73562a159d MAINT: Cleanups while reading fixtures. 2016-03-15 20:25:37 -04:00
Joe Jevnik 4f247e3aaa TST: Adds WithLogger and WithTradingEnvironment 2016-03-14 19:32:55 -04:00
Joe Jevnik 721dd36116 TST: move test_utils and adds test fixture classes
Renames zipline.utils.test_utils to zipline.testing

Adds zipline.testing.fixtures.ZiplineTestCase to manage setup and
teardown and adds mixins to define fixtures like an asset finder or
trading calendar.
2016-03-10 15:39:52 -05:00