2019-07-18 17:37:40 -04:00

EDGAR Python Web Scraper

This repository contains Gary Pang's Python web scraper, which parses fund holdings pulled from EDGAR, the SEC's filing website, and writes the data to a .tsv file.

Requirements

Getting Started

  • pip install -r requirements.txt (or pipenv install if you are using pipenv)
  • python scraper.py (or pipenv run python scraper.py)
  • When prompted, enter the 10-digit CIK number of a mutual fund
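The CIK prompt can be illustrated with a minimal sketch. The function names `normalize_cik` and `build_filing_search_url` are hypothetical and do not come from scraper.py, and the EDGAR company-browse URL with its query parameters is an assumption about how a filing list might be requested; the actual script may do this differently.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint: EDGAR's company-browse page (not confirmed by scraper.py).
EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def normalize_cik(raw: str) -> str:
    """Return the CIK zero-padded to 10 digits, or raise ValueError."""
    cik = raw.strip()
    if not re.fullmatch(r"[0-9]{1,10}", cik):
        raise ValueError(f"CIK must be 1 to 10 digits, got {raw!r}")
    return cik.zfill(10)


def build_filing_search_url(raw_cik: str, form_type: str = "13F-HR") -> str:
    """Build a URL that lists a filer's filings of one form type."""
    params = {
        "action": "getcompany",
        "CIK": normalize_cik(raw_cik),
        "type": form_type,
        "owner": "include",
        "count": "40",
    }
    return f"{EDGAR_BROWSE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # 1166559 is an arbitrary example value typed without leading zeros.
    print(build_filing_search_url("1166559"))
```

Padding the input means a CIK typed without its leading zeros is still accepted, while anything containing non-digit characters is rejected before a request is made.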

Key Dependencies

  • Requests, Python library for making HTTP requests
  • lxml, Python library for processing XML and HTML
  • Beautiful Soup, Python library for scraping information from Web pages
  • re, Python module for using regular expressions
  • csv, Python module for parsing and writing CSV and TSV files
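How the parsing and writing steps fit together can be sketched as follows. This is not the code in scraper.py: it uses the standard library's `xml.etree.ElementTree` in place of lxml and Beautiful Soup so that it runs without extra installs, the sample XML is invented, and the element names (`infoTable`, `nameOfIssuer`, `cusip`, and so on) are assumptions based on the usual layout of a 13F information table. The helpers `parse_holdings` and `write_tsv` are hypothetical names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample data; real filings contain many infoTable entries.
SAMPLE_XML = """<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>EXAMPLE CORP</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>000000000</cusip>
    <value>1234</value>
    <shrsOrPrnAmt>
      <sshPrnamt>100</sshPrnamt>
      <sshPrnamtType>SH</sshPrnamtType>
    </shrsOrPrnAmt>
  </infoTable>
</informationTable>"""

# Columns to keep (an assumed selection, not the author's).
FIELDS = ["nameOfIssuer", "titleOfClass", "cusip", "value", "sshPrnamt"]


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_holdings(xml_text: str) -> list:
    """Return one dict per infoTable element, keyed by FIELDS."""
    root = ET.fromstring(xml_text)
    holdings = []
    for node in root.iter():
        if local_name(node.tag) != "infoTable":
            continue
        row = dict.fromkeys(FIELDS, "")
        # Walk nested children too, since sshPrnamt sits inside shrsOrPrnAmt.
        for child in node.iter():
            name = local_name(child.tag)
            if name in row:
                row[name] = (child.text or "").strip()
        holdings.append(row)
    return holdings


def write_tsv(holdings: list, out) -> None:
    """Write a header row and the holdings as tab-separated values."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(holdings)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_tsv(parse_holdings(SAMPLE_XML), buffer)
    print(buffer.getvalue(), end="")
```

Matching on local tag names keeps the sketch independent of the exact namespace URI, and the only change from ordinary CSV output is passing `delimiter="\t"` to the csv writer.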

Contributor

References

Description
Python web scraper for parsing 13F filings (mutual fund holdings) from the SEC website
Readme MIT 283 KiB