Commit Graph

62 Commits

Author SHA1 Message Date
Richard Frank 30847a10a7 BUG: Interface of load_adjusted_array is to return a list of arrays
but MultiColumnLoader was returning a list of lists of arrays in some
cases.
2015-08-19 10:12:19 -04:00
Scott Sanderson ef4f642e62 ENH: Compute engine architecture for FFC API.
This patch lays the groundwork for a compute engine designed to
facilitate construction of factor-based universe screening and portfolio
allocation.  It contains:

A new module, `zipline.modelling`, containing entities that can be used
to express computations as dependency graphs.  Each node in such a graph
is an instance of the base `Term` class, defined in
`zipline.modelling.term`.  Dependency graphs are executed by instances
of `FFCEngine`, defined in `zipline.modelling.engine`.

A new module, `zipline.data.ffc`, containing loaders and dataset
definitions for inputs to the modelling API.

New `TradingAlgorithm` api methods: `add_factor`, and `add_filter`.
These methods can only be called from `initialize`, and are used to
inform the algorithm that each day it should compute the given terms.
Computed factor results are made available through a new attribute of
the `data` object in `before_trading_start` and `handle_data`.  Computed
filter results control which assets are available in the factor matrix
on each day.
2015-07-29 12:30:46 -04:00
jfkirk b84ac01cbf ENH: Adds futures trading and asset management logic to TradingAlgorithm and performance classes 2015-06-11 11:35:49 -04:00
Eddie Hebert 0fa44471be MAINT: Change expected type of treasury curves from load to DataFrame.
Instead of converting the curves back and forth from dictionaries to
DataFrame and back, use the DataFrame format when passing to
environment.
2015-04-20 10:26:09 -04:00
Benjamin Berman ef598c7130 BUG: Handle a ValueError on from_csv calls
The cached market data could be corrupted. Pandas raises a ValueError in
that case, and this change handles it.
2015-04-14 12:40:37 -04:00
Scott Sanderson 885db87dea MAINT: Use logger instead of printing in loader.py
Makes it easier to filter logs when they're not desired.
2015-04-14 12:40:37 -04:00
warren-oneill b62fadc76f adding NYSE trading_day and trading_days as default in load_market_data() 2015-04-08 16:57:23 -04:00
warren-oneill aa872afdf4 adding updates from master 2015-04-08 16:57:12 -04:00
warren-oneill 49c168b3d0 adding trading_day and trading_days as variables to load_market_data 2015-04-08 16:56:13 -04:00
Jonathan Kamens e942275108 STY: Flake8
Upgrade the version of the flake8, pep8, and mccabe PyPI packages, and
make the code changes necessary for compatibility with the updated
packages.
2015-03-19 17:21:25 -04:00
Jonathan Kamens c46a3afa3c BUG: Don't download benchmarks / treasury curves unnecessarily
Fix an off-by-one error which was causing us to download the benchmark
and treasury curves over and over again even when they weren't needed.
2015-03-08 09:31:50 -04:00
Luke Schiefelbein 1542b41fbd BUG: Fix price caching for tickers with '/' char
On Ubuntu (assume this is true for all posix) tickers containing a slash char ("CRD/A", "BRK/A", both valid tickers with yahoo api accessible timeseries) lead to a path error in loader.py line 286.
2014-11-19 11:26:27 +01:00
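
The path error in the commit above comes from using the raw ticker as a file name: on POSIX systems `/` is the path separator, so `CRD/A` is read as a file `A` inside a directory `CRD`. A minimal sketch of one way to avoid this, assuming a hypothetical helper `cache_filename`, a hypothetical `cache` directory, and an arbitrary replacement string (none of these are taken from loader.py):

```python
import os


def cache_filename(symbol, cache_dir="cache"):
    """Hypothetical helper: build a cache path safe for tickers like 'BRK/A'."""
    # Replace path separators so the symbol stays a single path component.
    safe_symbol = symbol.replace("/", "--").replace(os.sep, "--")
    return os.path.join(cache_dir, safe_symbol + ".csv")


path = cache_filename("BRK/A")
```

The resulting path has exactly one component below the cache directory, so no intermediate directory is implied by the ticker.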
Thomas Wiecki 820115f7be MAINT: Replace iterkv with iteritems.
iterkv is being deprecated as of pandas 0.14.
2014-10-22 17:25:37 +02:00
Jonathan Kamens 1d3a8759bc MAINT: Remove use of deprecated getchildren method on xml element
Rather than calling getchildren on xml.etree.ElementTree elements,
we're now supposed to just iterate over the elements.
2014-10-08 11:39:48 -04:00
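
The change described above can be illustrated with the standard library alone. `getchildren` was removed from `xml.etree.ElementTree` in Python 3.9, and iterating over the element gives the same children. The XML snippet below is invented for the example and is not the treasury feed format:

```python
import xml.etree.ElementTree as ET

# Invented sample document, for illustration only.
root = ET.fromstring("<curve><rate>0.05</rate><rate>0.07</rate></curve>")

# Deprecated style: for child in root.getchildren(): ...
# Preferred style: iterate over the element directly.
rates = [float(child.text) for child in root]
```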
twiecki 5cb2919b10 STY: pep8 fixes. 2014-04-10 10:57:12 -04:00
twiecki 4bdecd6402 STY: PEP8 fixes. 2014-03-26 20:46:20 +09:00
Eddie Hebert 71cda461c5 BUG: Fix check for cached public data for Python 2.7
Python 2 and 3 throw different exception types when a file does
not exist.

Catch both exception types to trigger the download, so that the
loader works under both Python versions.
2014-01-07 17:19:16 -05:00
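
The commit above does not name the exception types involved. As a sketch, assuming the cache is read with the built-in `open` (the loader itself may go through pandas), Python 2 raises `IOError` for a missing file while Python 3 raises `FileNotFoundError`, a subclass of `OSError`. The helper name `load_cached` is hypothetical:

```python
import os
import tempfile


def load_cached(path):
    """Hypothetical helper: return cached text, or None if a download is needed."""
    try:
        with open(path) as f:
            return f.read()
    except (IOError, OSError):
        # Python 2 raises IOError; Python 3 raises FileNotFoundError,
        # a subclass of OSError. Catching both covers either version.
        return None


# A path inside a freshly created temp directory is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "benchmarks.csv")
result = load_cached(missing)
```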
Eddie Hebert 46ab748dd2 MAINT: Use pandas for data cache file I/O
The compatibility between the two versions was made easier by
letting pandas handle the heavy lifting, so pass filenames to the
pandas serialization methods, instead of doing the file
handling and reading/writing within the data module.
2014-01-07 12:01:08 -05:00
Eddie Hebert ccb05acf5c MAINT: Read text in Python 3 instead of bytes when fetching public data.
Account for byte/string compatibility when consuming response from
requests module.
2014-01-07 12:00:04 -05:00
Eddie Hebert b4959e46cf MAINT: Use six for Python 3 compatible names and behavior.
Use the six module to import functions and types that are
consistent between Python 2 and 3, so that one code base can
support both versions.

- Use integer types instead of int and long.
- Use string_types instead of basestring.
- Account for iteritems, itervalues, iterkeys.
- Use six.moves for filter and zip, reduce
- Use compatible bytes for md5 hasher.
- xrange and range
2014-01-07 11:33:50 -05:00
Eddie Hebert 54ddd1c109 MAINT: print function clean up in preparation for Python 3
- Use `print()` function for all print calls
- Fix strip and format calls that were on the outside of the
  print function for some reason.
  (Which were breaking in Python 3 because of print returning None.)
- Remove commented out print calls.
2014-01-04 20:55:43 -05:00
Eddie Hebert df8464308d MAINT: Update URL for free benchmark data.
Keep pace with Yahoo!'s change from ichart.yahoo.com to
ichart.finance.yahoo.com
2014-01-02 19:22:19 -05:00
David Stephens e45528458f ENH: Added functionality to download Canadian treasury curves.
Added automatic switching of treasury curves based on index sent to environment.
2013-12-27 13:27:43 -05:00
Eddie Hebert 50800a9863 BUG: Fix data cache filepath on Windows.
Prevent the ':' char, generated by converting a datetime to a string,
from creating an incompatible filepath for Windows.
2013-11-18 20:37:45 -05:00
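
A small sketch of the problem described above: converting a `datetime` to a string yields a `:` in the time portion, which Windows does not allow in file names. The file-name pattern and the `strftime` format here are assumptions chosen for illustration, not the ones used by the loader:

```python
from datetime import datetime

start = datetime(2013, 11, 18, 20, 37, 45)

# Implicit string conversion gives '2013-11-18 20:37:45', which contains ':'.
unsafe_name = "SPY_{0}.csv".format(start)

# Formatting explicitly keeps the ':' character out of the file name.
safe_name = "SPY_{0}.csv".format(start.strftime("%Y-%m-%d_%H%M%S"))
```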
Eddie Hebert 43b85cffb0 MAINT: Calculate tradingcalendar with days beyond the current day.
To make 'next open' calculations more straightforward, calculate more
than enough days in the trading calendar.
2013-11-11 15:48:44 -05:00
Eddie Hebert 797cb8ece3 BUG: Fix bad reference to benchmark timezone in loader. 2013-11-11 14:39:11 -05:00
Eddie Hebert 89793e371c MAINT: Protect loader against Series saved with no tz.
Checking for tz.UTC is not sufficient, since it is possible for
the index.tz value to be None.
2013-11-11 14:17:14 -05:00
Eddie Hebert c45c1a22e1 BUG: Only localize benchmark index if it is naive.
Check whether the index's timezone is UTC before attempting to
localize, since an already localized index throws an error when
tz_localize is called.
2013-10-29 13:17:58 -04:00
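
The timezone handling described in the two commits above can be sketched as follows, assuming a pandas `DatetimeIndex`. The helper name `ensure_utc` is hypothetical; it checks for a missing timezone (`tz is None`) rather than comparing against UTC, and converts instead of localizing when the index is already aware:

```python
import pandas as pd


def ensure_utc(index):
    """Hypothetical helper: return a UTC-aware version of a DatetimeIndex."""
    if index.tz is None:
        # Naive index: attach UTC. tz_localize raises on an aware index.
        return index.tz_localize("UTC")
    # Already aware: convert rather than localize.
    return index.tz_convert("UTC")


naive = pd.DatetimeIndex(["2013-10-28", "2013-10-29"])
aware = naive.tz_localize("US/Eastern")

from_naive = ensure_utc(naive)
from_aware = ensure_utc(aware)
```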
Eddie Hebert 2d64ab8bfe BUG: Fix naive timestamps in benchmarks.
Always convert the benchmarks to UTC, not just on reload.
2013-10-29 08:36:53 -04:00
Eddie Hebert cdbafc534a BUG: Fix mismatch of stored benchmark timestamps.
Normalize the date, so that there is not an EST/EDT and UTC mismatch.
2013-10-20 08:00:17 -04:00
Eddie Hebert 37c56b9aa4 MAINT: Use Series throughout for daily returns.
Remove the lists of DailyReturn objects in favor of using pd.Series
to store the return values.

Should make it easier to inspect the values when stepping through,
make the windowing of data to a certain range more facile by using,
and have some performance increases due to removing object creation
and member access.
2013-10-19 23:06:18 -04:00
Eddie Hebert 71f03e9537 BUG: Ensure loading benchmarks include latest dates.
The Series `.append` does not update in-place, assign the value
to `saved_benchmarks` so that we update the newest benchmarks.
2013-10-07 12:17:26 -04:00
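
The bug above is the general pattern of discarding the result of an operation that returns a new object. `Series.append` has since been removed from pandas (in 2.0), so this sketch uses `pd.concat` as a stand-in with the same non-mutating behavior. The sample values are invented:

```python
import pandas as pd

saved_benchmarks = pd.Series(
    [0.01, 0.02], index=pd.to_datetime(["2013-10-03", "2013-10-04"])
)
new_benchmarks = pd.Series([0.03], index=pd.to_datetime(["2013-10-07"]))

# Without assignment, the combined Series is discarded and
# saved_benchmarks is left unchanged.
pd.concat([saved_benchmarks, new_benchmarks])
length_without_assignment = len(saved_benchmarks)

# Assigning the result back is what picks up the newest benchmarks.
saved_benchmarks = pd.concat([saved_benchmarks, new_benchmarks])
```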
Eddie Hebert 6ac5d49573 MAINT: Remove duplicated treasury loading code.
The dump and update of curves were both using the entire history.
So instead of having the update use a different code path, always
use dump and overwrite.
2013-10-02 11:10:15 -04:00
Eddie Hebert 5ddc134379 ENH: Cache daily data to eliminate repeat network calls.
Both unit tests and repeated runs while developing an algorithm
can benefit from having a local copy of the Yahoo data, instead
of doing a network call each time.

Store the web request results as a csv file in a cache directory,
named by symbol and date range.
2013-10-01 15:04:02 -04:00
Eddie Hebert b44fc20e4e MAINT: Remove msgpack as a dependency.
Now that the data serialization uses pandas, msgpack is no longer
needed.
2013-10-01 14:28:11 -04:00
Thomas Wiecki a66f45b598 MAINT: Moving yahoo loader from factory to utils. 2013-10-01 14:09:26 -04:00
Eddie Hebert b65f7f42c0 BUG: Fix updating treasury curves.
A transpose back to the serialization shape was left out.

Also, fixes empty return from update.
2013-10-01 11:57:04 -04:00
Eddie Hebert bfd72355bd MAINT: Remove loader_tool
This utility was referring to functions that had been long since
removed in the loader module.

If the utility is still needed by some, it can be added back in,
but using the pandas read/write instead of msgpack.
2013-09-30 11:51:04 -04:00
Eddie Hebert 956107a846 MAINT: Use pandas instead of msgpack for benchmarks and treasuries.
Instead of writing our own serialization using msgpack, leverage
the csv serialization provided by pandas.

Also, lessens the need for msgpack and functions in date_utils.
2013-09-30 11:27:35 -04:00
Thomas Wiecki b89886297f STY: autopep8 codebase. 2013-08-08 16:46:44 -04:00
Ben McCann eae5803910 BUG: Calculate benchmark returns for first day
Before we were setting benchmark returns on the first day
to 0. This commit changes this by calculating the benchmark
return from open to close.

According to @eherbert this is also what the answer key does.
2013-08-08 12:20:04 -04:00
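
As a rough illustration of the behavior described above, the first day's return can be taken from open to close, with later days measured from the previous close. The function name and the sample prices are assumptions for this sketch, not code from the repository:

```python
def benchmark_returns(opens, closes):
    """Hypothetical sketch: first day is open-to-close, later days close-to-close."""
    # The first day has no previous close, so use its own open as the base.
    first = (closes[0] - opens[0]) / opens[0]
    rest = [
        (today - previous) / previous
        for previous, today in zip(closes, closes[1:])
    ]
    return [first] + rest


returns = benchmark_returns(opens=[100.0, 104.0], closes=[102.0, 107.1])
```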
Ben McCann 2751e98d1a ENH: Add function to download 10 year treasury data to use as a benchmark 2013-07-19 19:37:24 -04:00
Ben McCann efe50f8494 BUG: Fix get_benchmark_returns.
It should calculate the return off the previous day's close, instead
of current day's open.
2013-07-15 15:35:09 -04:00
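
The distinction in the fix above is between measuring a day's return against its own open and measuring it against the prior day's close. A minimal sketch of the close-to-close calculation, with invented prices and a hypothetical function name:

```python
def close_to_close_returns(closes):
    """Hypothetical sketch: return of each day relative to the previous close."""
    return [
        (today - previous) / previous
        for previous, today in zip(closes, closes[1:])
    ]


closes = [100.0, 110.0, 99.0]
returns = close_to_close_returns(closes)
```

Because each return needs a previous close, the result has one fewer entry than the input.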
Eddie Hebert a968b5827c MAINT: Use print function instead of print statement.
The loader module printed some warning messages, these could
be changed to use a logger, but for now convert to use the print
function for compatibility with Python 3.
2013-07-02 21:33:40 -04:00
Eddie Hebert 4f5b2d6298 MAINT: Change relative library imports to use dot syntax.
Testing with a Python 3 virtualenv uncovered more relative imports
that did not explicitly use the dot syntax.
2013-07-02 21:31:59 -04:00
Eddie Hebert 158988d184 MAINT: Use explicit syntax for relative imports.
Python 3 requires using dot syntax for relative imports,
otherwise the import is treated as an absolute import, i.e.
an import of a module from outside of the project.

By using dot syntax now, imports should be compatible with both
Python 2.7 and Python 3.
2013-07-02 15:54:12 -04:00
Eddie Hebert 097c225c6b BUG: Fix treasury loading.
Make adjustments for using Python's built-in ElementTree instead of the
lxml-based one.

lxml was edited out during pulling in of memory friendly loading of
treasury curves, however some of the use of ETree was lxml specific.

Mea culpa.
2013-04-22 14:03:25 -04:00
Richard Frank 4ba35d7d46 ENH: Stream benchmark and treasury data when downloading
Instead of loading entire csv or xml into memory.
2013-04-22 12:35:17 -04:00
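
One way to process XML without building the whole tree in memory is `ElementTree.iterparse` from the standard library. The sketch below is not the commit's implementation; the document structure, tag names, and the `iter_rates` helper are invented for illustration:

```python
import io
import xml.etree.ElementTree as ET

# Invented sample document standing in for a downloaded response body.
xml_bytes = (
    b"<feed><entry><rate>1.5</rate></entry>"
    b"<entry><rate>1.7</rate></entry></feed>"
)


def iter_rates(stream):
    """Hypothetical helper: yield one rate at a time from a file-like object."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "entry":
            yield float(elem.findtext("rate"))
            # Release the entry's children once they have been consumed.
            elem.clear()


rates = list(iter_rates(io.BytesIO(xml_bytes)))
```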
Richard Frank 2dbafd5162 BUG: Zero out the microsecond attribute of datetimes
wherever we zero out the second attribute.  Otherwise, we can be
off by some microseconds from midnight, etc.
2013-04-15 10:44:44 -04:00
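
The off-by-microseconds problem above is easy to reproduce with the standard library. The timestamp is invented for the example:

```python
from datetime import datetime

dt = datetime(2013, 4, 15, 0, 0, 37, 123456)

# Zeroing only the second leaves the value 123456 microseconds past midnight.
partly_truncated = dt.replace(second=0)

# Zeroing both attributes lands exactly on midnight.
truncated = dt.replace(second=0, microsecond=0)
```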
Eddie Hebert 05a03bcf21 BUG: Fix error during benchmark update over empty period.
On ranges with missing data from Yahoo, e.g.:
On 2013-04-2 the date range of April 2013-03-29 failed because
of the first day in the range being Good Friday, and the API not
yet updating for the Monday after.

Handle the 404 that is found by raising and warning that no
benchmark data was found, but continuing on.
2013-04-02 11:13:26 -04:00