This project builds an end-to-end machine learning pipeline to predict whether it will rain on a given day in the Melbourne region using historical weather data. The goal is to demonstrate practical machine learning skills including feature engineering, model pipelines, hyperparameter tuning, and evaluation.
Accurate rainfall prediction supports daily planning and risk management, which is why the target here is a simple yes/no outcome: whether measurable rainfall occurs on a given day.
The dataset contains daily weather observations from Australia between 2008 and 2017.
Sources:
- Australian Bureau of Meteorology (BOM)
- Kaggle: Weather Dataset (Rattle Package)
The analysis focuses on the following locations to reduce geographic variability:
- Melbourne
- Melbourne Airport
- Watsonia
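The location filter can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: it assumes the CSV exposes a `Location` column, the spelling `MelbourneAirport` is an assumption about how the dataset writes that station, and the sample rows are made up.

```python
import pandas as pd

# Assumed spellings of the three locations in the `Location` column;
# adjust if the CSV spells them differently (e.g. "Melbourne Airport").
MELBOURNE_AREA = ["Melbourne", "MelbourneAirport", "Watsonia"]


def filter_locations(df: pd.DataFrame, locations=MELBOURNE_AREA) -> pd.DataFrame:
    """Keep only the rows whose Location is in the given list."""
    return df[df["Location"].isin(locations)].reset_index(drop=True)


# Made-up sample rows, for illustration only.
sample = pd.DataFrame({
    "Location": ["Melbourne", "Sydney", "Watsonia", "Perth", "MelbourneAirport"],
    "Rainfall": [0.0, 1.2, 3.4, 0.0, 0.6],
})

melbourne_df = filter_locations(sample)
print(melbourne_df)
```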
Key challenges:
- Handling missing data
- Preventing data leakage by redefining the prediction target
- Managing categorical and numerical features
- Class imbalance in rainfall prediction
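The class imbalance challenge above can be handled in several ways. The sketch below shows one common approach (a stratified split plus balanced class weights); it is an assumption for illustration and not necessarily what the notebook does. The labels are synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Made-up labels: 25 rainy days (1) and 75 dry days (0).
y = np.array([1] * 25 + [0] * 75)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature

# stratify=y keeps the rainy/dry ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# "balanced" weights are n_samples / (n_classes * class_count),
# so the rarer rainy class receives the larger weight.
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_train)
class_weights = dict(zip([0, 1], weights))
print(class_weights)
```

Scikit-learn classifiers such as `LogisticRegression` and `RandomForestClassifier` also accept `class_weight="balanced"` directly, which applies the same weighting formula.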
Preprocessing steps:
- Dropped features with excessive missing values
- Renamed rainfall labels to avoid target leakage
- Engineered a seasonal feature from date information
- Filtered data by geographically close locations
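The preprocessing steps above might look something like the sketch below. Several details are assumptions made for illustration: the 40% missing-value threshold, the column names (`Date`, `RainToday`, `RainTomorrow`, `Sunshine`, `Humidity3pm`), and the specific renaming used to shift the target. The sample rows are made up.

```python
import pandas as pd


def month_to_season(month: int) -> str:
    """Map a month number to a Southern Hemisphere season."""
    if month in (12, 1, 2):
        return "Summer"
    if month in (3, 4, 5):
        return "Autumn"
    if month in (6, 7, 8):
        return "Winter"
    return "Spring"


def preprocess(df: pd.DataFrame, max_missing: float = 0.4) -> pd.DataFrame:
    # Drop columns whose share of missing values exceeds the (assumed) threshold.
    keep = df.columns[df.isna().mean() <= max_missing]
    df = df[keep].copy()

    # Assumed renaming: shift the labels back one day so that the target
    # ("RainToday") is predicted only from information known beforehand.
    df = df.rename(columns={"RainToday": "RainYesterday",
                            "RainTomorrow": "RainToday"})

    # Replace the raw date with a coarse seasonal feature.
    df["Season"] = pd.to_datetime(df["Date"]).dt.month.map(month_to_season)
    return df.drop(columns="Date")


# Made-up rows, for illustration only.
sample = pd.DataFrame({
    "Date": ["2010-01-15", "2010-07-01", "2010-10-20", "2010-04-05"],
    "Humidity3pm": [40.0, 80.0, 55.0, 60.0],
    "Sunshine": [None, None, None, 7.5],  # 75% missing, so it is dropped
    "RainToday": ["No", "Yes", "No", "No"],
    "RainTomorrow": ["Yes", "No", "No", "Yes"],
})

clean = preprocess(sample)
print(clean)
```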
Two supervised classification models were implemented and compared:
Random Forest:
- Robust to feature interactions
- Tuned using GridSearchCV
- Achieved strong overall accuracy
Logistic Regression:
- Interpretable baseline model
- Improved recall for rainy days
- Better performance on minority class prediction
Both models were trained using a unified preprocessing and modeling pipeline.
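A unified pipeline of this kind could be sketched as below. The transformers (`StandardScaler`, `OneHotEncoder`), the feature names, the parameter grids, and the synthetic data are all assumptions chosen for illustration; only the use of a shared pipeline and `GridSearchCV` comes from the project description.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the weather data (hypothetical feature names).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Humidity3pm": rng.uniform(20, 100, n),
    "Pressure3pm": rng.normal(1015, 7, n),
    "Season": rng.choice(["Summer", "Autumn", "Winter", "Spring"], n),
})
y = (X["Humidity3pm"] + rng.normal(0, 10, n) > 70).astype(int)

# Shared preprocessing: scale numeric columns, one-hot encode categoricals.
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), ["Humidity3pm", "Pressure3pm"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Season"]),
])

# Each model gets its own (small, illustrative) hyperparameter grid.
models = {
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"classifier__n_estimators": [25, 50],
         "classifier__max_depth": [None, 5]},
    ),
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"classifier__C": [0.1, 1.0],
         "classifier__class_weight": [None, "balanced"]},
    ),
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
results = {}
for name, (estimator, grid) in models.items():
    pipeline = Pipeline([
        ("preprocessor", preprocessor),
        ("classifier", estimator),
    ])
    search = GridSearchCV(pipeline, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    results[name] = search
    print(name, search.best_params_, round(search.best_score_, 3))
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are fitted only on each training fold during cross-validation, which avoids leaking information from the validation fold.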
Evaluation metrics:
- Accuracy
- Precision, Recall, and F1-score
- Confusion Matrix
- Feature Importance (Random Forest)
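The metrics listed above can all be computed with scikit-learn. The sketch below uses a synthetic dataset and placeholder feature names, so the numbers it prints say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for the weather features.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3,
    weights=[0.75, 0.25], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholders

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy, per-class precision/recall/F1, and the confusion matrix.
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=["No rain", "Rain"])
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

# Impurity-based feature importances from the fitted forest, highest first.
importances = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"Accuracy: {accuracy:.3f}")
print(report)
print(cm)
for feature, score in importances:
    print(f"{feature}: {score:.3f}")
```

With imbalanced classes, the per-class recall in the report is usually more informative than accuracy alone, since a model can score well overall while missing most rainy days.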
Key results:
- Overall accuracy: ~84%
- Logistic Regression showed slightly better recall for rainfall events
- Seasonal patterns and humidity-related features were among the most influential predictors
Tools used:
- Python
- Pandas, NumPy
- Scikit-learn
- Matplotlib, Seaborn
Project structure:
rainfall-prediction-melbourne/
│
├── rainfall_prediction_melbourne.ipynb
├── README.md
├── requirements.txt
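To run the project:
- Clone the repository: `git clone https://github.com/amarkumar55/rainfall-prediction-melbourne.git`
- Change into the project directory: `cd rainfall-prediction-melbourne`
- Install the dependencies: `pip install -r requirements.txt`
- Launch the notebook: `jupyter notebook rainfall_prediction_melbourne.ipynb`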
This project was completed independently as part of my learning journey.
All code, analysis, and explanations are my own.