An ETL pipeline, written in Python, that extracts historical pricing and financial indicators for all S&P 500 tickers from the Yahoo Finance API and loads them into a PostgreSQL database.
- Programming Language: Python
- API: Yahoo Finance
- Database: PostgreSQL
- Libraries: BeautifulSoup, pandas, yfinance, psycopg2
- S&P 500 Ticker List: Wikipedia - List of S&P 500 Companies
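
As a rough illustration of how the ticker list might be pulled from the Wikipedia page with BeautifulSoup, here is a minimal sketch. It is not the code in `1_get_sp500_tickers.py`. The function names, the `constituents` table id, and the assumption that the symbol sits in the first column are all illustrative and should be checked against the live page. The symbol conversion reflects that Yahoo Finance writes share classes with a dash (`BRK-B`) where Wikipedia uses a dot (`BRK.B`).

```python
from __future__ import annotations


def to_yahoo_symbol(symbol: str) -> str:
    """Convert a Wikipedia-style symbol (BRK.B) to Yahoo Finance style (BRK-B)."""
    return symbol.strip().upper().replace(".", "-")


def parse_tickers(html: str, table_id: str = "constituents") -> list[str]:
    """Return the symbols found in the first column of the constituents table.

    `table_id` is an assumption about the page markup, not something
    this README states.
    """
    # Imported lazily so to_yahoo_symbol() stays dependency-free.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"no table with id={table_id!r} found")

    tickers = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        cell = row.find("td")
        if cell is not None:
            tickers.append(to_yahoo_symbol(cell.get_text()))
    return tickers
```

`parse_tickers` takes the page HTML as a string, so the download step (and any caching or retry policy) stays separate from the parsing.
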
| Script | Description | Data List (Schema Link) | DB View (Screenshot) |
|---|---|---|---|
| 1_get_sp500_tickers.py | Scrapes S&P 500 ticker list from Wikipedia | sp500_tickers.csv | |
| 2_1_get_sp500_prices.py | Extracts OHLCV pricing, splits, dividends | pricing_data.csv | Pricing Table |
| 3_1_get_sp500_indicators.py | Extracts 180+ financial indicators | indicators.csv | Indicators Table |
| 4_1_recommendations.py | Extracts analyst recommendations for tickers | recommendations.csv | Recommendations |
| 5_1_options.py | Extracts options chains data | options_chain.csv | Options Chain |
| 6_1_get_BS_IS_CF.py | Extracts Balance Sheet, Income Statement, Cash Flow data | financial_statements.csv | Financials |
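
The extract-and-load pattern behind the pricing step could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `2_1_get_sp500_prices.py`. The `pricing_data` table, its column names, and the unique constraint on `(ticker, trade_date)` that `ON CONFLICT` relies on are hypothetical. The calls used are `yfinance.Ticker(...).history(...)` and `psycopg2.extras.execute_values`.

```python
from __future__ import annotations

from datetime import date, datetime

# Hypothetical target table; assumes a unique constraint on (ticker, trade_date).
INSERT_SQL = """
    INSERT INTO pricing_data (ticker, trade_date, open, high, low, close, volume)
    VALUES %s
    ON CONFLICT (ticker, trade_date) DO NOTHING
"""


def fetch_history(ticker: str, period: str = "1y"):
    """Extract: download OHLCV history for one ticker as a pandas DataFrame."""
    import yfinance as yf  # lazy import: only needed when hitting the API

    return yf.Ticker(ticker).history(period=period)


def to_rows(ticker: str, records) -> list[tuple]:
    """Transform: turn (timestamp, row) pairs into plain tuples for insertion.

    `records` can be `frame.iterrows()` from a yfinance DataFrame, or any
    iterable of (datetime-like, mapping) pairs.
    """
    rows = []
    for ts, r in records:
        rows.append((
            ticker,
            ts.date(),
            float(r["Open"]),
            float(r["High"]),
            float(r["Low"]),
            float(r["Close"]),
            int(r["Volume"]),
        ))
    return rows


def load_rows(conn, rows: list[tuple]) -> None:
    """Load: bulk-insert rows over an open psycopg2 connection."""
    from psycopg2.extras import execute_values  # lazy import

    with conn.cursor() as cur:
        execute_values(cur, INSERT_SQL, rows)
    conn.commit()


if __name__ == "__main__":
    # Offline demo of the transform step with one made-up record.
    sample = [(datetime(2025, 1, 2),
               {"Open": 1.0, "High": 2.0, "Low": 0.5, "Close": 1.5, "Volume": 100})]
    print(to_rows("AAPL", sample))
```

With a live connection, the three steps would chain as `load_rows(conn, to_rows(t, fetch_history(t).iterrows()))` for each ticker `t`. Keeping the transform free of pandas and database objects makes it easy to check without network or database access.
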
Author: *Nicholas Papadimitris*
Created on: 01/03/2025 9:00 PM (UTC)
Project ID: YF_ETL_01_Mar2025
GitHub: My GitHub