22 Commits

Author SHA1 Message Date
Conner Fromknecht 99efa7a9f3 Fixed catalyst tests except example tests 2017-06-19 14:43:10 -07:00
Andrew Daniels b3c1cd5535 MAINT: Display diff if input to daily bar writer has gaps/extra bars 2017-05-04 10:19:09 -04:00
Andrew Daniels 0d9f4d29f5 MAINT: Handle gaps in input to daily bars writer (#1778)
Previously, a dataframe passed into BcolzDailyBarWriter.write that was
missing an expected session between its first and last sessions would be
written incorrectly. Upon converting the dataframe to a ctable, the
values for all days following the gap would be shifted backwards, and
nans would be shifted in at the end.

This commit handles the issue by asserting that the number of rows in
the input table matches the number of sessions in the calendar between
the table's first and last sessions.

Also fixes a test that was mistakenly using minutes_in_range where it
should have been using sessions_in_range (uncovered by this change).
2017-05-03 20:49:22 -04:00
Eddie Hebert 098d38ac76 MAINT: Return nan from daily bcolz get_value.
Match the behavior of the minute bar reader, now that the session and
minute bar readers share a common interface.

isnull is slightly slower than checking against -1; however, n cases
where we check against illiquid trades in a tight loop, volume is
checked which is not using nan. The change here should be marginal with
regards to performance.
2016-10-25 11:25:09 -04:00
Eddie Hebert 151c3e45a7 TST: Fix get_last_traded_dt on bcolz daily reader.
Remove special handling for the last session of an asset, which was
moving the last traded back a session.

If the asset has data on a session, `get_last_traded_dt` should always
return that session if it is the parameter to the method.
2016-08-31 14:59:58 -04:00
Eddie Hebert 71a34bf7ac MAINT: Standardize reader get value methods.
The daily/session bar reader's `spot_price` took the same parameters and
returned the same kind of output as the minute bar reader's `get_value`.

Standardize on one method to make a common interface, which may be
formally factored out in a later patch; to help enable writing reader
implementations or mixins which can be agnostic to the bar frequency.
2016-08-24 12:46:36 -04:00
Joe Jevnik 4265a13edf Revert "Merge pull request #1354 from quantopian/revert-1302-point-in-time-asset-db"
This reverts commit 3b633011c6, reversing
changes made to 70ac5323de.
2016-08-02 14:25:10 -04:00
Joe Jevnik 814a2be7b7 Revert "Point in time asset db" 2016-07-27 23:29:08 -04:00
Joe Jevnik 7fd8c29880 ENH: add point in time aspect to equity symbol mapping
Changes the overlap behavior so that it is an error to write data which
would have two companies holding the same ticker. Other than one test
around which company would win in that case, all the other tests are
passing. That single test has been changed to check the write-time
error.
2016-07-26 13:34:58 -04:00
Jean Bredeche 5a0f840917 Clean up daily bar reader/writer to take advantage of new trading calendar. The reader
is backwards-compatible with the previous format.

In USEquityLoader, use dailyreader's trading_calendar.

This is backwards compatible and will fall back to the NYSE calendar if
the reader doesn’t have a calendar specified.
2016-07-15 15:13:57 -04:00
Jean Bredeche 6fb4923cc7 Re-implemented the Calendar API.
Instead of having separate ExchangeCalendar and TradingSchedule objects, we
now just have TradingCalendar.  The TradingCalendar keeps track of each
session (defined as a contiguous set of minutes between an open and a close).
It's also responsible for handling the grouping logic of any given minute
to its containing session, or the next/previous session if it's not a market
minute for the given calendar.
2016-07-12 13:13:50 -04:00
Eddie Hebert 51eda06323 MAINT: Add equity to naming of bar data classes.
In preparation of adding futures, add equity to the names of both the
classes and methods for writing bcolz data. Futures data will use a
different minutes per day with a separate reader. This change will allow
both equity and futures fixtures to be side by side.

Also, break out the method which generates the dataframes and trading
days member into fixtures (`EquityMinuteBarData` and
`EquityDailyBarData`) on which the `*BarReader` fixture depends.  This
fixture is separated out to enable reader/writers in different formats
to use the same data setup. (There is internal code which needs to write
minute and daily bar data in a database format.)
2016-06-30 08:21:42 -04:00
jfkirk c8304e8601 ENH: Adds ExchangeCalendar, TradingSchedule, and implementations
Conflicts:
	tests/data/test_minute_bars.py
	tests/data/test_us_equity_pricing.py
	tests/finance/test_slippage.py
	tests/pipeline/test_engine.py
	tests/pipeline/test_us_equity_pricing_loader.py
	tests/serialization_cases.py
	tests/test_algorithm.py
	tests/test_assets.py
	tests/test_bar_data.py
	tests/test_benchmark.py
	tests/test_exception_handling.py
	tests/test_fetcher.py
	tests/test_finance.py
	tests/test_history.py
	tests/test_perf_tracking.py
	tests/test_security_list.py
	tests/utils/test_events.py
	zipline/algorithm.py
	zipline/data/data_portal.py
	zipline/data/us_equity_loader.py
	zipline/errors.py
	zipline/finance/trading.py
	zipline/testing/core.py
	zipline/utils/events.py
2016-06-08 13:34:18 -04:00
Andrew Daniels 8e6c98e9aa BUG: Fixes reading and writing of daily bars first_trading_day attr
When writing first_trading_day, it is already in the correct frame of
reference (seconds since epoch) and does not need to be transformed
further. Adjusts the reader to expect this value.
2016-06-02 13:41:09 -04:00
Joe Jevnik 59c8e371a2 ENH: Updates the cli, data bundles and extensions.
Adds the data bundle concept which makes it easy for users to register
loading functions to build out minute and daily data along with an
assets db and adjustments db. By default we have provided a `quandl`
bundle which pulls from the public domain WIKI dataset. Users may
register new bundles by decorating an ingest function with
`zipline.data.bundles.register(<name>)`. This also provides a
`yahoo_equities` function for creating an ingestion function that will
load a static set of assets from yahoo.

The cli is now structured as a couple of subcommands and has been
changed to `python -m zipline`. The old behavior of `run_algo.py` has
been moved to the `run` subcommand. This is almost entirely the same
except that it now takes the name of the data bundle to use, defaulting
to `quandl`.

The next subcommand is `ingest` which takes the name of
a data bundle to ingest. This will run the loading machinery and write
the data to a specified location that `run` can find.

There is also a `clean` subcommand which deletes the data that was
written with `ingest`.

Extensions have also been added to zipline. This is an experimental
feature where users can provide an extra set of python files to run at
the start of the process. These can be used to configure aspects of
zipline. Right now the only thing that is supported in an extension file
is the registration of a new data bundle.
2016-05-03 18:38:24 -04:00
Eddie Hebert 66d05aa563 PERF: Improve read time for smaller num of assets.
The BcolzDailyBarReader was optimized for the pipeline case of reading
all assets at once.

Now that the reader is also used to support daily history the case of
reading a data for a small number of assets is more common, particularly
in algorithms that use the history API which have a high rotation of
assets (e.g. an algorithm which pipeline uses to set the active
universe)

Remove the bottleneck in reading a small number of assets by
conditionally reading the slice for each asset from the carray, instead
of reading the data for all equities and then indexing into that full
array. On a certain number of assets, it is still better to read all the
data at once. On the Quantopian dataset, which holds data for 20000
about for the last 10 years of equity data (where not all equities trade
over the full range), stored in 118 blosc blp files per column, the
tipping point where the 'read all' mode wins out between 3000-4000
assets.

That number was tested by trying to exercise a worst case scenario where
the equities were spread out evenly across the blp files, by stepping
along a sorted list of assets that were alive over a query range which
spanned 70 trading days.
```
size = 3000
sids = [assets[i] for i in range(0, len(assets), len(assets) /
size)][:size]
```

Also, add parameter to WithBcolzDailyBarReader fixture which allows the
test to specify what the threshold count for reading all data should be,
so that the test_us_equity_pricing can be forced into either mode to
make sure that both branches in logic are covered by all test cases.

On local dev machine this patch improves the read time of `load_raw_array`
for one asset from 100 ms to 96.5 µs. (10^5 improvement.) With reading
only asset per call a being an observed common case when populating the
non-cached values in USEquityHistoryLoader.
2016-04-21 20:43:52 -04:00
Joe Jevnik bc0b117dc9 MAINT: make the data loading apis more consistent.
Changes BcolzDailyBarWriter to not be an abc, data is passed as an
iterator of (sid, dataframe) pairs to the write method.

Changes the AssetsDBWriter to be a single class which accepts an engine
at construction time and has a `write` method for writing dataframes for
the various tables. We no longer support writing the various other data
types, callers should coerce their data into a dataframe themselves. See
zipline.assets.synthetic for some helpers to do this.

Adds many new fixtures and updates some existing fixtures to use the new
ones:

WithDefaultDateBounds
  A fixture that provides the suite a START_DATE and END_DATE. This is
  meant to make it easy for other fixtures to synchronize their date
  ranges without depending on eachother in strange ways. For example,
  WithBcolzMinuteBarReader and WithBcolzDailyBarReader by default should
  both have data for the same dates, so they may use depend on
  WithDefaultDates without forcing a dependency between them.

WithTmpDir, WithInstanceTmpDir
  Provides the suite or individual test case a temporary directory.

WithBcolzDailyBarReader
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzDailyBarWriter.write

WithBcolzDailyBarReaderFromCSVs
  Provides the suite a BcolzDailyBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from a
  collection of CSV files and then converted into the bcolz data through
  BcolzDailyBarWriter.write_csvs

WithBcolzMinuteBarReader
  Provides the suite a BcolzMinuteBarReader which reads from bcolz data
  written to a temporary directory. The data will be read from
  dataframes and then converted to bcolz files with
  BcolzMinuteBarWriter.write

WithAdjustmentReader
  Provides the suite a SQLiteAdjustmentReader which reads from an in
  memory sqlite database. The data will be read from dataframes and then
  converted into sqlite with SQLiteAdjustmentWriter.write

WithDataPortal
  Provides each test case a DataPortal object with data from temporary
  resources.
2016-04-15 23:46:10 -04:00
Joe Jevnik 721dd36116 TST: move test_utils and adds test fixture classes
Renames zipline.utils.test_utils to zipline.testing

Adds zipline.testing.fixtures.ZiplineTestCase to manage setup and
teardown and adds mixins to define fixtures like an asset finder or
trading calendar.
2016-03-10 15:39:52 -05:00
Eddie Hebert 5f81acea05 ENH: Return -1 for missing spot prices.
Return -1 when there is a zero value for a spot price.
Intended for use by the incoming data portal changes. When the data
portal will see a -1 value, the portal will seek back a trading day
until a non-negative value is returned.
2015-11-25 11:32:36 -05:00
Eddie Hebert 53dae6320c BUG: Fix volume value returned by daily spot price
Volumes were incorrectly having the thousands factor applied, however
the volume is written as is (without the factor, since it volume is an
int, not float value.)

Fix by adding a special case for volume which returns the price as is.
2015-11-25 10:19:52 -05:00
Eddie Hebert 5338c8e611 ENH: Add spot_price to BcolzDailyBarReader.
Add new method to BcolzDailyBarReader, `spot_price` which returns the
unadjusted price for the specified day and sid.
2015-10-10 07:19:03 -04:00
Eddie Hebert e33f6dcdcd MAINT: Move equity data formats out of loader.
Put the logic for reading and writing the equity price and adjustment
data into a module located in data, making it distinct from the pipeline
loader usage of the formats.

This prepares for both incoming changes of how adjustments are written,
(which includes using the bcolz daily reader as an input), as well as
eventually providing the readers to a DataPortal object.
2015-10-09 17:20:19 -04:00