manuel-reyes-ml/README.md

Hi, I'm Manuel 👋

Building Production Data Systems | Business Operations → Data → ML → LLM Engineering
Currently: Stage 1 (Data Analyst) | Building in Public | 37-Month Journey

LinkedIn Email Portfolio


💼 What Makes Me Different

Most entry-level candidates have tutorial projects. I have production code with measurable business impact.

My Unique Position:

| Most Candidates | What I Bring |
| --- | --- |
| Tutorial projects | Production ETL system (live, saving $15K/year) |
| No domain expertise | 15+ years data experience across multiple industries |
| Basic skills | 8 years finance + 6 years trading domain expertise |
| Vague portfolios | Public code with synthetic data (privacy-conscious) |
| Unclear trajectory | Systematic 37-month roadmap to Senior LLM Engineer |
| Private learning | Building in public (transparent, accountable) |

The Combination: Deep domain expertise (data + finance + trading) + Production systems + Technical skills + Clear growth trajectory = Immediate value + long-term potential


🎯 Quick Navigation

For Recruiters (START HERE):

  1. 💼 Projects Portfolio - Production + learning projects with business impact
  2. 🧾 1099 ETL Pipeline - Live production system (public code)
  3. 🔗 LinkedIn Profile - Professional background & recommendations

For Technical Review:

For Collaboration:


🚀 Production Highlight

Status: ✅ Live in production | 🌐 Public repository with synthetic data

Business Context:
Manual reconciliation of retirement plan distributions took 4-6 hours weekly, was error-prone, and blocked critical 1099-R tax reporting at Daybright Financial.

Technical Solution:
Automated Python ETL pipeline extracting, transforming, validating, and reconciling data between two financial systems (Relius and Matrix).
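
A minimal sketch of what the reconcile step can look like in pandas. It is an illustration under assumptions, not the production code: the column names `txn_id` and `gross_amt`, the `reconcile` helper, and the sample rows are all hypothetical, and the real Relius and Matrix exports have their own schemas.

```python
import pandas as pd


def reconcile(relius: pd.DataFrame, matrix: pd.DataFrame) -> pd.DataFrame:
    """Return rows that are missing from one system or whose amounts differ.

    Assumes both frames share a hypothetical key column `txn_id` and a
    hypothetical amount column `gross_amt`.
    """
    merged = relius.merge(
        matrix,
        on="txn_id",
        how="outer",                      # keep rows that exist in only one system
        suffixes=("_relius", "_matrix"),
        indicator=True,                   # adds _merge: left_only / right_only / both
    )

    in_both = merged["_merge"] == "both"
    # Compare in whole cents so float rounding noise is not flagged.
    diff_cents = (
        (merged["gross_amt_relius"] - merged["gross_amt_matrix"])
        .abs()
        .mul(100)
        .round()
    )

    merged["issue"] = "ok"
    merged.loc[~in_both, "issue"] = "missing_in_one_system"
    merged.loc[in_both & (diff_cents > 0), "issue"] = "amount_mismatch"

    columns = ["txn_id", "gross_amt_relius", "gross_amt_matrix", "issue"]
    return merged.loc[merged["issue"] != "ok", columns].reset_index(drop=True)


# Synthetic sample data, in the same spirit as the public repository.
relius = pd.DataFrame({"txn_id": [1, 2, 3], "gross_amt": [100.00, 250.50, 75.25]})
matrix = pd.DataFrame({"txn_id": [1, 2, 4], "gross_amt": [100.00, 255.50, 40.00]})

exceptions = reconcile(relius, matrix)
print(exceptions)
```

The outer join with `indicator=True` reports records that exist in only one system in the same pass as amount mismatches, so the output is a single exceptions list to review.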

Measurable Impact:

  • 95% time reduction (4-6 hours → 15 minutes weekly)
  • 💰 $15,000+ annual savings in labor costs
  • 📊 10x scalability (300+ accounts vs. 30 handled manually)
  • Zero errors since deployment (comprehensive validation)
  • 🔒 Production-ready code visible on GitHub (smart privacy with synthetic data)
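
A quick check of the time and scale figures above, using only the numbers stated in this section. The 52-week working year is an assumption made for the annual estimate, and `percent_reduction` is a helper written for this example.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Percentage of time saved when a task drops from before to after."""
    return (before_minutes - after_minutes) / before_minutes * 100


AFTER_MINUTES = 15
WEEKS_PER_YEAR = 52  # assumption: the reconciliation runs every week of the year

# Manual process took 4-6 hours weekly, so check both ends of the range.
low = percent_reduction(4 * 60, AFTER_MINUTES)
high = percent_reduction(6 * 60, AFTER_MINUTES)

# Hours saved per year at each end of the range.
hours_saved_per_year = (
    (4 * 60 - AFTER_MINUTES) / 60 * WEEKS_PER_YEAR,
    (6 * 60 - AFTER_MINUTES) / 60 * WEEKS_PER_YEAR,
)

# Scale factor: 300 accounts automated vs. 30 handled manually.
scale_factor = 300 / 30

print(f"Time reduction: {low:.2f}% to {high:.2f}%")
print(f"Hours saved per year: {hours_saved_per_year[0]:.0f} to {hours_saved_per_year[1]:.0f}")
print(f"Scale factor: {scale_factor:.0f}x")
```

The range works out to roughly 94-96%, which is where the 95% headline figure comes from.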

Why This Matters:

  • Real enterprise system (not tutorial project)
  • Measurable business ROI
  • Professional data governance (public code, private data)
  • Demonstrates production best practices from day one

Tech Stack: Python • pandas • openpyxl • Excel • matplotlib • data validation • faker

→ View Code & Technical Documentation


📊 Current Stage: Data Analyst

Stage 1 Capabilities (Months 1-5)

What I Can Do Right Now:

# Data Analysis & ETL
✅ Python data pipelines (production-deployed)
✅ SQL queries (SELECT, JOIN, WHERE, aggregations)
✅ Pandas data manipulation & cleaning
✅ Data validation & error handling
✅ Excel automation (openpyxl)
✅ API integration & data extraction
✅ Data visualization (Matplotlib, Seaborn, Plotly)

# Business Skills
✅ Process automation & efficiency analysis
✅ Financial data reconciliation
✅ Stakeholder reporting
✅ ROI analysis & business impact measurement
✅ Production system deployment

Currently Building:

  • 📈 Trading Attention Tracker - Multi-source data pipeline correlating trading volume with public attention
  • 🎓 Certifications: CS50, Python for Everybody, IBM Data Analyst, Google Data Analytics

Next Milestone:

  • 🎯 Land Data Analyst role (Target: Month 5)
  • 📊 Complete Stage 1 capstone projects

🗺️ 37-Month Progression Plan

Systematic evolution from Data Analyst to Senior LLM Engineer:

| Stage | Duration | Role | Key Skills | Status |
| --- | --- | --- | --- | --- |
| 1 | Months 1-5 | Data Analyst | Python, SQL, Statistics, Visualization | 🟢 ACTIVE |
| 2 | Months 6-15 | Data Engineer | AWS, Airflow, PySpark, Data Pipelines | ⚪ Planned |
| 3 | Months 16-29 | ML Engineer | scikit-learn, TensorFlow, MLOps, Deployment | ⚪ Planned |
| 4 | Months 30-34 | LLM Specialist | LangChain, RAG, Vector DBs, Fine-tuning | ⚪ Planned |
| 5 | Months 35-37 | Senior LLM Engineer | Production AI Systems, Leadership | 🎯 Goal |

Current Focus: Stage 1 (Data Analyst)
Ultimate Goal: Production AI Trading Assistant (Month 37)

→ View Interactive Roadmap


🛠️ Technical Skills

Current Proficiency:

Languages:     Python 3.11+, SQL
Data Analysis: pandas, NumPy, Matplotlib, Seaborn, Plotly
Databases:     SQLite, PostgreSQL (learning)
Tools:         Jupyter, Git/GitHub, VS Code, Excel (advanced)
APIs & Web:    REST APIs, JSON/XML, BeautifulSoup
Production:    ETL pipelines, data validation, error handling

Domain Expertise:

Data Operations: 15+ years working with data (manufacturing, bookkeeping, finance)
Financial Services: 8 years (retirement plans, compliance, reconciliation)
Trading: 6 years active trading (technical analysis, quantitative strategies)
Business: Process automation, ROI analysis, stakeholder communication

Stage-Based Skill Progression:

Stage 2 (Data Engineer): AWS, Airflow, PySpark, Redshift, Docker
Stage 3 (ML Engineer): scikit-learn, TensorFlow, PyTorch, MLOps
Stage 4 (LLM Specialist): LangChain, RAG, vector databases, fine-tuning
Stage 5 (Senior): Production AI systems, architecture, leadership


📚 Featured Projects

📈 Trading Attention Tracker

Status: 🚧 Active Development | Phase 2 (v1.1)

Multi-source data pipeline testing the hypothesis that spikes in public attention correlate with increases in trading volume.

Technical Highlights:

  • End-to-end pipeline: Multiple APIs → SQLite → Analysis → Visualization
  • Data sources: yfinance, Wikipedia API, RSS feeds, web scraping
  • Normalized database design (5 related tables)
  • Time-series correlation analysis
  • Sentiment analysis integration
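
One way the time-series correlation step can be approached is a lagged Pearson correlation between daily attention and daily volume. The sketch below is an assumption-laden illustration, not the project's code: `lagged_correlations` is a hypothetical helper, and the data is synthetic, built so that volume follows attention by exactly one day.

```python
import pandas as pd


def lagged_correlations(
    attention: pd.Series, volume: pd.Series, max_lag: int = 3
) -> pd.Series:
    """Pearson correlation between volume on day t and attention on day t - lag."""
    return pd.Series(
        # Series.corr ignores the NaN pairs that shift() introduces at the start.
        {lag: volume.corr(attention.shift(lag)) for lag in range(max_lag + 1)},
        name="pearson_r",
    )


# Synthetic data: two attention spikes, with volume reacting one day later.
days = pd.date_range("2024-01-01", periods=10, freq="D")
attention = pd.Series([10, 12, 50, 14, 11, 13, 60, 15, 12, 11], index=days, dtype=float)
volume = (1000 + 20 * attention.shift(1)).fillna(1200.0)

correlations = lagged_correlations(attention, volume)
best_lag = int(correlations.idxmax())

print(correlations)
print(f"Strongest correlation at lag {best_lag} day(s)")
```

Checking several lags matters because attention may lead volume rather than move with it on the same day. A same-day correlation alone can miss the relationship.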

Evolution Path:

  • v1.0: Core pipeline, 3 tickers, basic analysis
  • 🚧 v1.1: Expand to 10+ tickers, CSV exports, enhanced validation
  • 📅 v2.0: Interactive Streamlit dashboard
  • 📅 v3.0: ML models predicting volume from attention

Tech Stack: Python • SQLite • pandas • yfinance • Wikipedia API • BeautifulSoup • Matplotlib

→ View Project


💼 Projects Portfolio

Central repository linking all projects with business context, technical details, and impact metrics.

Categories:

  • 🏭 Production Systems: ETL pipelines, automation tools
  • 📈 Finance & Trading: Market analysis, quantitative strategies
  • 📚 Learning Projects: Capstone projects, certifications
  • 🧪 Experiments: Proof-of-concepts, testing new tools

→ View All Projects


🎓 Education & Certifications

Active Learning (Stage 1):

  • 🚧 CS50: Introduction to Computer Science (Harvard)
  • 🚧 Python for Everybody Specialization (University of Michigan)
  • 🚧 Google Data Analytics Professional Certificate
  • 🚧 IBM Data Analyst Professional Certificate (11 courses)
  • 📅 Statistics with Python Specialization (University of Michigan)

Planned by Stage:

  • Stage 2: AWS Certified Data Engineer, Solutions Architect Associate
  • Stage 3: TensorFlow Developer Certificate, Deep Learning Specialization
  • Stage 4: Advanced LLM Engineering, RAG Systems
  • Stage 5: Leadership & Architecture courses

💡 Why I'm Making This Transition

I've worked with data my entire career—manufacturing operations, bookkeeping, financial reporting, financial services, and 6 years of active trading. One truth became undeniable: data-driven decisions consistently outperform gut feeling.

I watched companies make million-dollar mistakes by ignoring their numbers. I saw traders succeed or fail based purely on their data analysis discipline. The pattern was crystal clear.

But I hit a ceiling: I could analyze data brilliantly, but I couldn't build the automated systems to scale these insights. I couldn't create the real-time pipelines, ML models, or AI assistants I envisioned.

So I'm building them myself—following a systematic 37-month path from Data Analyst to Senior LLM Engineer. I'm doing it publicly because career transformation shouldn't be mysterious—it should inspire others to do the same.


💭 Building in Public Philosophy

Why share everything openly?

Transparency: Real learning is messy—showing actual process, not just polished results
Accountability: Public commits = public commitment (can't fake progress)
Community: Others learn from my journey, I learn from theirs
Portfolio: This profile IS proof of ability, work ethic, and trajectory
Authenticity: Career transitions should be documented honestly

What makes this different:

  • Not just completing courses → Enhancing exercises beyond requirements
  • Not just learning → Deploying production systems with measurable impact
  • Not just building projects → Solving real business problems
  • Not just showing code → Explaining business context and ROI
  • Not staying private → Sharing progress, challenges, and breakthroughs

🎯 The Ultimate Goal: AI Trading Assistant

Building toward a production-grade LLM-powered trading system that:

  • 🔍 Analyzes markets in real-time using multi-source data
  • 🤖 Generates trading signals with ML models
  • 💬 Provides natural language insights (LLM-powered)
  • ⚡ Executes algorithmic strategies automatically
  • 📊 Learns and adapts continuously

Why This Is The Perfect Target:

  • Combines deep domain expertise (data + finance + trading) with cutting-edge AI
  • Solves real problem (trading analysis is time-consuming and emotional)
  • Rare skill combination in the market
  • Demonstrates end-to-end capability (data engineering → ML → LLM → production)
  • Foundation for consulting/startup opportunities

Progressive Build Across 5 Stages:

  • Stage 1: Data pipeline + market analysis (Current)
  • Stage 2: Cloud infrastructure + real-time data streams
  • Stage 3: ML models + backtesting frameworks
  • Stage 4: LLM integration + natural language insights
  • Stage 5: Production deployment + monetization strategy

📊 GitHub Activity

GitHub Streak

Top Languages


🌐 Let's Connect

LinkedIn Email Portfolio

I'm Open To:

💼 Data Analyst Opportunities

  • Remote positions preferred
  • Business operations or finance/trading sectors
  • Teams that value production experience and domain expertise

🤝 Professional Connections

  • Data professionals and traders
  • Code reviews and technical discussions
  • Collaborations on trading + tech projects

🎓 Knowledge Exchange

  • Mentorship (giving or receiving)
  • Career transition advice
  • Data-driven decision making discussions

Let's Connect If You:

  • Value production code over tutorial completions
  • Are building data-driven trading systems
  • Believe in transparent career development
  • Want to discuss data + AI + finance intersection
  • Are on a similar learning journey
  • Are hiring Data Analysts with proven delivery capability

⚡ Quick Facts

  • 📈 6+ years active trading (swing & day strategies)
  • 🌅 4:30 AM club (early morning focused study)
  • ♟️ Chess enthusiast (strategy translates to markets!)
  • 🤖 Fascinated by LLMs transforming financial analysis
  • 📚 Reading: Machine Learning for Algorithmic Trading + Hands-On LLMs
  • 🎯 Obsessed with data-driven decision making
  • 💪 Proving systematic learning beats raw talent

📌 Repository Guide

Production Systems:

Active Projects:

Learning Documentation:


💡 "From financial analyst to LLM engineering—systematic learning + production focus = career transformation"

⭐️ Star repos if you find them useful!
🔔 Follow for updates on the 37-month journey!
💬 Connect to collaborate or discuss data + trading + tech!


Current Stage: Stage 1 (Data Analyst) - Months 1-5
Status: 🟢 Active • Building in Public • Deploying Production Code

→ View Live Progress & Roadmap

Pinned

  1. data-portfolio (Public)

    Portfolio of data analysis & engineering projects. Transitioning from trading/operations to data science & AI. Python, SQL, pandas.

  2. 1099_reconciliation_pipeline (Public)

    Automated ETL + analytics pipeline to reconcile Relius and Matrix retirement plan distributions and generate 1099-R correction files before mailing. Cuts manual reconciliation time (up to 95%) and …

    Jupyter Notebook

  3. trading_attention_tracker (Public)

    Analyze how news headlines, sentiment, and Wikipedia attention relate to stock trading volume using Python, SQLite, pandas, and public APIs.

  4. learning_journey (Public)

    37-month learning roadmap from Financial Services Professional to LLM Engineer. Includes comprehensive course notes (CS50, Python, SQL, IBM DA) and enhanced project implementations. Active learning…

    Python