A showcase of data science projects covering EDA, visualization, data preprocessing and cleaning, model training, and deployment of those models in web applications.

defunSM/Data-Science-Portfolio

🚀 Data Science & Machine Learning Portfolio

👋 About Me

Data Scientist passionate about extracting actionable insights from complex datasets and building machine learning solutions that drive real-world impact. I specialize in predictive modeling, natural language processing, and time series analysis, with experience across various domains including disaster response, business analytics, and urban planning.

Currently seeking: Data Science roles where I can leverage my analytical skills to solve challenging business problems.


🛠️ Technical Skills

Programming Languages: Python, SQL, R
Machine Learning: Scikit-learn, XGBoost, Random Forest, Neural Networks, Time Series Forecasting
Deep Learning: TensorFlow, Keras, LSTM Networks
Data Processing: Pandas, NumPy, ETL Pipelines
Visualization: Matplotlib, Seaborn, Plotly, Tableau
Cloud & Tools: AWS, Git, Jupyter, Docker
Specializations: NLP, Multi-label Classification, Imbalanced Data, A/B Testing


📊 Featured Projects

Multi-label text classification system for emergency response

  • Problem: Classify disaster-related messages into multiple emergency categories for faster response coordination
  • Solution: Built an ETL and ML pipeline that handles class-imbalanced data across 26,000+ messages
  • Impact: Improved F1-score by 15% using undersampling techniques and ensemble methods
  • Tech Stack: Python, Scikit-learn, NLTK, Flask, SQLite
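
One way to approach the class imbalance mentioned above is random undersampling of the majority class. The sketch below is a minimal, standard-library illustration for a single binary category. The function name `undersample` and the toy messages are hypothetical and are not taken from the project code, which may use a different technique.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop examples so every class is as small as the rarest one."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # The size of the rarest class sets the target size for every class.
    target = min(len(items) for items in by_class.values())

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 8 unrelated messages, 2 messages in the rare category.
messages = [f"msg {i}" for i in range(10)]
is_rare = [0] * 8 + [1] * 2
balanced = undersample(messages, is_rare)
counts = {label: sum(1 for _, l in balanced if l == label) for label in (0, 1)}
print(counts)  # {0: 2, 1: 2}
```

In a multi-label setting, balancing one category shifts the distribution of the others, so a step like this would likely be applied per category or swapped for class weighting.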

Data-driven promotional campaign optimization

  • Problem: Determine optimal promotional strategies to maximize customer engagement and revenue
  • Solution: Applied statistical hypothesis testing and predictive modeling on customer transaction data
  • Impact: Identified key customer segments and promotional strategies with 23% higher conversion rates
  • Tech Stack: Python, Pandas, Statistical Testing, Matplotlib
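
The README does not say which hypothesis test was used. As an illustration only, a two-proportion z-test is one common way to compare conversion rates between a promoted and a non-promoted group. The function name and the counts below are made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: 100 of 1000 convert without the promotion, 140 of 1000 with it.
z, p_value = two_proportion_z_test(100, 1000, 140, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone, assuming independent samples that are large enough for the normal approximation.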

Municipal budget planning through salary trend forecasting

  • Problem: Predict salary growth trends to inform NYC budget allocation decisions
  • Solution: Developed regression models analyzing 100K+ municipal employee records from NYC OpenData
  • Impact: Provided data-driven insights for budget planning with 85% prediction accuracy
  • Tech Stack: Python, Scikit-learn, Feature Engineering, Data Visualization
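
The regression models themselves are not described here. As a rough sketch of the underlying idea, a linear salary trend can be fitted with ordinary least squares and extrapolated one year ahead. The helper `fit_linear_trend` and the salary figures are hypothetical and do not come from the NYC OpenData records.

```python
def fit_linear_trend(years, salaries):
    """Ordinary least squares fit of salary = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(salaries) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, salaries))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up average salaries that rise by exactly 1,500 per year.
years = [2018, 2019, 2020, 2021]
salaries = [60000, 61500, 63000, 64500]
slope, intercept = fit_linear_trend(years, salaries)
forecast_2022 = intercept + slope * 2022
print(round(slope), round(forecast_2022))  # 1500 66000
```

A production model would presumably add features such as agency or job title and be evaluated on held-out years. This sketch shows only the trend-fitting step.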

Real estate investment strategy through data analysis

  • Problem: Identify high-value property investment opportunities in NYC's Airbnb market
  • Solution: Performed comprehensive EDA and pricing analysis of 50K+ Airbnb listings
  • Impact: Discovered optimal property types and locations with 40% higher profit potential
  • Tech Stack: Python, Pandas, Geospatial Analysis, Plotly

Time series modeling for gaming industry insights

  • Problem: Forecast player activity trends for video game development and marketing decisions
  • Solution: Implemented multiple time series models (ARIMA, LSTM, Prophet) for player count prediction
  • Impact: Achieved 92% accuracy in 30-day player activity forecasts across multiple games
  • Tech Stack: Python, LSTM, Facebook Prophet, Time Series Analysis
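
The accuracy metric behind the forecast figure is not specified. One common convention, assumed here purely for illustration, reports forecast accuracy as 100% minus the mean absolute percentage error (MAPE). The player counts below are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Made-up daily player counts and the corresponding forecasts.
actual = [1000, 1200, 800, 1000]
predicted = [900, 1260, 880, 1000]
error = mape(actual, predicted)
accuracy = 100 - error
print(f"MAPE = {error:.2f}%, accuracy = {accuracy:.2f}%")  # MAPE = 6.25%, accuracy = 93.75%
```

MAPE is undefined when an actual value is zero, so games with days of zero recorded players would need a different metric, such as mean absolute error.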

📈 Key Achievements

  • Model Performance: Consistently achieved 85%+ accuracy across classification and regression tasks
  • Business Impact: Delivered actionable insights leading to measurable improvements in decision-making
  • Technical Depth: Experience with both traditional ML and deep learning approaches
  • Domain Expertise: Applied data science across diverse sectors (emergency response, retail, municipal planning, real estate)

📫 Let's Connect


📝 Recent Blog Posts & Articles


💡 Open to collaboration and always eager to tackle new data challenges. Feel free to explore my repositories and reach out for any questions or opportunities!
