A comprehensive machine learning platform for financial market prediction, portfolio optimization, and risk analysis.
This project aims to build a production-ready finance ML platform that demonstrates:
- Machine Learning: From classical models to deep learning
- Financial Engineering: Technical analysis, portfolio theory, risk management
- Data Engineering: ETL pipelines, feature engineering, data quality
- MLOps: Model deployment, monitoring, CI/CD
- Cloud Infrastructure: Scalable, production-ready deployment
Phase: Phase 0 - MVP Setup
Progress: 0%
Last Updated: 2026-02-12
See docs/tracking/PROJECT_STATUS.md for detailed progress tracking.
- Project Roadmap - 12-18 month development plan
- Project Summary - Quick reference guide
- Setup Guide - Installation and configuration
- Run Guide - How to run different components
- Project Status - Current progress and metrics
- Features - Implemented and planned features
- TODO - Task checklist
- All Documentation - Complete docs directory
financeTool/
├── data/                # Data storage
│   ├── raw/             # Original data
│   ├── processed/       # Processed features
│   └── external/        # Third-party data
├── notebooks/           # Jupyter notebooks
├── src/                 # Source code
│   ├── data/            # Data collection
│   ├── features/        # Feature engineering
│   ├── models/          # ML models
│   ├── backtesting/     # Backtesting engine
│   ├── api/             # API endpoints
│   └── utils/           # Utilities
├── tests/               # Test suite
├── configs/             # Configuration files
├── dashboards/          # Web dashboards
├── docs/                # Documentation
├── models/              # Saved models
└── logs/                # Application logs
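If the directories do not exist yet, the layout above could be scaffolded with a few lines of standard-library Python. This is only a sketch: the directory names are copied from the tree, while `PROJECT_DIRS` and `create_skeleton` are hypothetical names and not part of the project.

```python
from pathlib import Path

# Directory layout copied from the project tree; names below are illustrative.
PROJECT_DIRS = [
    "data/raw",
    "data/processed",
    "data/external",
    "notebooks",
    "src/data",
    "src/features",
    "src/models",
    "src/backtesting",
    "src/api",
    "src/utils",
    "tests",
    "configs",
    "dashboards",
    "docs",
    "models",
    "logs",
]


def create_skeleton(root):
    """Create every project directory under `root` and return the paths."""
    root = Path(root)
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)  # idempotent: safe to re-run
        created.append(path)
    return created
```

Calling `create_skeleton("financeTool")` from the parent directory would create any missing folders and leave existing ones untouched.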
- Python 3.9+ - Primary language
- pandas & numpy - Data manipulation
- scikit-learn - Classical ML
- PyTorch/TensorFlow - Deep learning
- yfinance - Market data
- Transformers - NLP and sentiment analysis
- FastAPI - REST API
- Streamlit - Interactive dashboards
- MLflow - Experiment tracking
- Docker - Containerization
- AWS/GCP - Cloud deployment
- Python 3.9 or higher
- pip package manager
- Virtual environment (venv or conda)
# Clone or navigate to project
cd /Users/carterchan/Documents/self-projects/financeTool
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # macOS/Linux
# OR
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Verify installation
python -c "import pandas; import numpy; import sklearn; print('Setup successful!')"
See docs/guides/SETUP_GUIDE.md for detailed installation instructions.
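The one-line check stops at the first failed import. A slightly more informative variant, sketched here, reports every missing package at once. It assumes the same three import names as the one-liner; `REQUIRED_MODULES` and `missing_modules` are hypothetical names.

```python
import importlib.util

# Import names taken from the verification one-liner (pandas, numpy, sklearn).
REQUIRED_MODULES = ("pandas", "numpy", "sklearn")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("Setup successful!")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.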
This project is designed as a learning journey through:
- Phase 0 (Weeks 1-3): Basic ML and data handling
- Phase 1 (Weeks 4-8): Financial engineering and backtesting
- Phase 2 (Weeks 9-16): Deep learning for time series
- Phase 3 (Weeks 17-24): NLP and alternative data
- Phase 4 (Weeks 25-32): Portfolio optimization and risk
- Phase 5 (Weeks 33-40): Cloud deployment and MLOps
- Phase 6+ (Months 12+): Continuous expansion
- Collect historical stock data for 1-3 assets
- Compute basic features (returns, moving averages, volatility)
- Train simple models (Logistic Regression, Random Forest)
- Visualize predictions vs actuals
- Install dependencies
- Select target stocks (e.g., AAPL, MSFT, SPY)
- Create data collection script
- Implement feature engineering
- Train baseline models
See docs/PROJECT_ROADMAP.md for complete phase details.
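As a rough illustration of the Phase 0 steps, the sketch below computes the basic features named above (returns, a moving average, rolling volatility) and scores a naive baseline by directional accuracy, the metric behind the >55% target listed further down. It uses only the standard library so it runs without dependencies; the project itself would more likely use pandas. The prices are made up, and all function names are hypothetical.

```python
import statistics


def simple_returns(prices):
    """Period-over-period simple returns: r_t = p_t / p_{t-1} - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def moving_average(values, window):
    """Trailing mean over `window` points; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def rolling_volatility(returns, window):
    """Trailing sample standard deviation; `window` must be at least 2."""
    out = []
    for i in range(len(returns)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(statistics.stdev(returns[i + 1 - window : i + 1]))
    return out


def directional_accuracy(predicted_up, actual_returns):
    """Fraction of periods where the predicted direction matches the realized sign."""
    hits = sum(pred == (ret > 0) for pred, ret in zip(predicted_up, actual_returns))
    return hits / len(actual_returns)


# Made-up closing prices, for illustration only (not real market data).
prices = [100.0, 101.0, 102.0, 101.0, 103.0, 104.0, 103.0, 105.0]
returns = simple_returns(prices)
ma3 = moving_average(prices, window=3)
vol3 = rolling_volatility(returns, window=3)

# Naive momentum baseline: predict "up" tomorrow if today was up.
predicted_up = [r > 0 for r in returns[:-1]]
accuracy = directional_accuracy(predicted_up, returns[1:])
print(f"Baseline directional accuracy: {accuracy:.2%}")
```

A baseline like this gives the trained models a reference point: a Logistic Regression or Random Forest is only interesting if it beats the naive rule on data it has not seen.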
# Run all tests
pytest tests/
# Run with coverage
pytest --cov=src tests/
# Run specific test
pytest tests/test_data_collection.py -v
- ✅ Historical data collection
- ✅ Basic feature engineering
- ✅ Classical ML models
- ✅ Simple backtesting
- 🚧 LSTM/GRU models
- 🚧 Sentiment analysis
- 🚧 Multimodal predictions
- ⏸️ Portfolio optimization
- ⏸️ Risk management
- ⏸️ REST API
- ⏸️ Cloud deployment
See docs/tracking/FEATURES.md for complete feature tracking.
# Collect historical data
python src/data/collect_data.py --symbols AAPL MSFT SPY --start 2020-01-01
# Train a Random Forest model
python src/models/train_random_forest.py --data data/processed/AAPL_features.csv
# Run a backtest
python src/backtesting/run_backtest.py --strategy threshold --model models/rf_model.pkl
See RUN_GUIDE.md for detailed usage instructions.
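This README does not define the `threshold` strategy. One plausible reading, sketched below as an assumption rather than as the project's actual implementation, is: hold the asset for a period when the model's predicted probability of an up move exceeds a threshold, and stay in cash otherwise. The function name, its parameters, and the sample numbers are all hypothetical, and transaction costs are ignored.

```python
def threshold_backtest(probabilities, next_returns, threshold=0.5):
    """Go long for one period when P(up) exceeds `threshold`, else hold cash.

    probabilities[i] is the model's P(up) for the period whose realized
    return is next_returns[i]. Returns the equity curve (starting at 1.0)
    and the number of periods spent in the market.
    """
    if len(probabilities) != len(next_returns):
        raise ValueError("probabilities and next_returns must have equal length")
    equity = [1.0]
    periods_long = 0
    for prob, ret in zip(probabilities, next_returns):
        position = 1.0 if prob > threshold else 0.0  # 1 = invested, 0 = cash
        periods_long += int(position)
        equity.append(equity[-1] * (1.0 + position * ret))
    return equity, periods_long


# Made-up model outputs and returns, for illustration only.
probs = [0.62, 0.48, 0.55, 0.40]
rets = [0.01, -0.02, 0.03, 0.01]
equity_curve, periods_long = threshold_backtest(probs, rets, threshold=0.5)
print(f"Final equity: {equity_curve[-1]:.4f} after {periods_long} periods long")
```

Each probability must be produced using only information available before the return it is paired with; otherwise the backtest leaks future data and overstates performance.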
- Setup Guide - Installation and environment setup
- Run Guide - How to run all components
- Project Roadmap - Long-term development plan
- Features - Feature tracking and status
- Chat Context - Development decisions log
- Project Status - Current progress metrics
This is a personal learning project, but suggestions and feedback are welcome!
- Create feature branch
- Implement changes with tests
- Update documentation
- Submit for review
This project is for educational purposes.
- Build end-to-end ML pipeline
- Implement multiple model architectures
- Deploy production-ready API
- Achieve >55% directional accuracy
- Master time series ML
- Understand financial engineering
- Gain MLOps experience
- Build cloud-native applications
- Portfolio showcase piece
- Demonstrate full-stack ML skills
- Show continuous learning
- Prove production readiness
- Check CHAT_CONTEXT.md for past discussions
- Review SETUP_GUIDE.md for troubleshooting
- See RUN_GUIDE.md for usage help
- Cryptocurrency prediction
- Real-time streaming data
- Reinforcement learning traders
- Explainable AI
- Multi-timeframe analysis
- Macroeconomic indicators
- Automated reporting
- Mobile app interface
Start your journey: Follow the Setup Guide and begin Phase 0!
Last Updated: 2026-02-12