
SEC Python Web Scraper

This repository contains a Python web scraper that parses 13F filings (mutual fund holdings) from EDGAR, the SEC's filing website, and writes the data to a .tsv file.

Requirements

Getting Started

  • pip install -r requirements.txt (or pipenv install if you are using pipenv)
  • python scraper.py (or pipenv run python scraper.py)
  • When prompted, enter the 10-digit CIK number of a mutual fund
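
The README does not show how the entered CIK is handled, so the following is only a minimal sketch of one plausible approach: validate the input, left-pad it with zeros to 10 digits, and build an EDGAR company-browse URL filtered to 13F-HR filings. The names `normalize_cik` and `filings_url` are hypothetical and are not taken from scraper.py, and the URL parameters are an assumption about what the script requests.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint: EDGAR's company browse page. scraper.py may use a different one.
EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def normalize_cik(raw):
    """Return the CIK as a zero-padded 10-digit string, or raise ValueError."""
    candidate = raw.strip()
    if not re.fullmatch(r"[0-9]{1,10}", candidate):
        raise ValueError(f"CIK must be 1 to 10 digits, got {raw!r}")
    return candidate.zfill(10)


def filings_url(cik, form_type="13F-HR"):
    """Build a browse URL listing filings of the given form type for one CIK."""
    query = urlencode(
        {"action": "getcompany", "CIK": normalize_cik(cik), "type": form_type}
    )
    return f"{EDGAR_BROWSE_URL}?{query}"


if __name__ == "__main__":
    # "12345" is a placeholder, not a real fund's CIK.
    print(filings_url("12345"))
```

Padding lets a user type a CIK without its leading zeros while the rest of the script works with one canonical 10-digit form.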

Key Dependencies

  • Requests, Python library for making HTTP requests
  • lxml, Python library for processing XML and HTML
  • Beautiful Soup, Python library for scraping information from Web pages
  • re, Python module for using regular expressions
  • csv, Python module for parsing and writing CSV and TSV files
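
As a rough illustration of how the parsing and writing dependencies might fit together, the sketch below extracts holdings from a 13F information table and serializes them as tab-separated text. It is not the repository's code. The sample XML is made-up data, the selected fields are an assumption about which columns matter, and the helper names (`local_name`, `parse_holdings`, `to_tsv`) are hypothetical. It uses lxml when installed and falls back to the standard library's ElementTree otherwise; Requests and Beautiful Soup are left out so the example runs offline.

```python
import csv
import io

try:
    from lxml import etree  # the XML library listed above
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree

# Made-up sample shaped like a 13F information table; not real filing data.
SAMPLE_XML = b"""<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>EXAMPLE CORP</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>000000000</cusip>
    <value>1234</value>
  </infoTable>
  <infoTable>
    <nameOfIssuer>SAMPLE INC</nameOfIssuer>
    <titleOfClass>CL A</titleOfClass>
    <cusip>111111111</cusip>
    <value>56</value>
  </infoTable>
</informationTable>"""

# Assumed column selection; the real scraper may write different fields.
FIELDS = ["nameOfIssuer", "titleOfClass", "cusip", "value"]


def local_name(element):
    """Return the tag name without its XML namespace prefix."""
    tag = element.tag
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def parse_holdings(xml_bytes):
    """Return one row per <infoTable> entry, in FIELDS order."""
    root = etree.fromstring(xml_bytes)
    rows = []
    for node in root.iter():
        if local_name(node) != "infoTable":
            continue
        values = {local_name(child): (child.text or "").strip() for child in node}
        rows.append([values.get(field, "") for field in FIELDS])
    return rows


def to_tsv(rows):
    """Serialize a header plus rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_tsv(parse_holdings(SAMPLE_XML)))
```

Passing `delimiter="\t"` to `csv.writer` is what turns the csv module's output into TSV. Matching on local tag names avoids hard-coding the XML namespace.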

Contributor

References
