Data Scientist passionate about extracting actionable insights from complex datasets and building machine learning solutions that drive real-world impact. I specialize in predictive modeling, natural language processing, and time series analysis, with experience across domains including disaster response, business analytics, and urban planning.
Currently seeking: Data Science roles where I can leverage my analytical skills to solve challenging business problems.
Programming Languages: Python, SQL, R
Machine Learning: Scikit-learn, XGBoost, Random Forest, Neural Networks, Time Series Forecasting
Deep Learning: TensorFlow, Keras, LSTM Networks
Data Processing: Pandas, NumPy, ETL Pipelines
Visualization: Matplotlib, Seaborn, Plotly, Tableau
Cloud & Tools: AWS, Git, Jupyter, Docker
Specializations: NLP, Multi-label Classification, Imbalanced Data, A/B Testing
Multi-label text classification system for emergency response
- Problem: Classify disaster-related messages into multiple emergency categories for faster response coordination
- Solution: Built an ETL and ML pipeline to handle class-imbalanced data from 26,000+ messages
- Impact: Improved F1-score by 15% using undersampling techniques and ensemble methods
- Tech Stack: Python, Scikit-learn, NLTK, Flask, SQLite
Data-driven promotional campaign optimization
- Problem: Determine optimal promotional strategies to maximize customer engagement and revenue
- Solution: Applied statistical hypothesis testing and predictive modeling on customer transaction data
- Impact: Identified key customer segments and promotional strategies with 23% higher conversion rates
- Tech Stack: Python, Pandas, Statistical Testing, Matplotlib
Municipal budget planning through salary trend forecasting
- Problem: Predict salary growth trends to inform NYC budget allocation decisions
- Solution: Developed regression models analyzing 100K+ municipal employee records from NYC OpenData
- Impact: Provided data-driven insights for budget planning with 85% prediction accuracy
- Tech Stack: Python, Scikit-learn, Feature Engineering, Data Visualization
Real estate investment strategy through data analysis
- Problem: Identify high-value property investment opportunities in NYC's Airbnb market
- Solution: Conducted comprehensive EDA and pricing analysis of 50K+ Airbnb listings
- Impact: Discovered optimal property types and locations with 40% higher profit potential
- Tech Stack: Python, Pandas, Geospatial Analysis, Plotly
Time series modeling for gaming industry insights
- Problem: Forecast player activity trends for video game development and marketing decisions
- Solution: Implemented multiple time series models (ARIMA, LSTM, Prophet) for player count prediction
- Impact: Achieved 92% accuracy in 30-day player activity forecasts across multiple games
- Tech Stack: Python, LSTM, Facebook Prophet, Time Series Analysis
- Model Performance: Consistently achieved 85%+ accuracy across classification and regression tasks
- Business Impact: Delivered actionable insights leading to measurable improvements in decision-making
- Technical Depth: Experience with both traditional ML and deep learning approaches
- Domain Expertise: Applied data science across diverse sectors (emergency response, retail, municipal planning, real estate, gaming)
- LinkedIn: LinkedIn Profile
- Email: salmanhossain500@gmail.com
- Portfolio Website: www.defunsm.com
- GitHub: You're already here! 😊
💡 Open to collaboration and always eager to tackle new data challenges. Feel free to explore my repositories and reach out with any questions or opportunities!
