This repository contains a basic machine learning project designed to compare the performance of classic classification algorithms on a multi-class dataset. The core of the project is the `predictionModel.ipynb` Jupyter notebook, which walks through the complete workflow: data preparation, exploratory analysis, model training with imbalance handling, hyperparameter tuning, and comparative evaluation.
## 🚀 Key Features

**Multi-Class Classification:** The goal is to predict `Depression_Type` (a multi-class target with 12 categories) from a set of mental health indicators.
**Data Preprocessing:** Handles categorical features via one-hot encoding and scales numerical features using `StandardScaler`.
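The preprocessing step can be sketched with a `ColumnTransformer` that routes numerical and categorical columns to the right transformer. The column names below are illustrative, not the notebook's actual schema.

```python
# Minimal preprocessing sketch; column names are placeholders, not the
# dataset's real fields.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "sleep_hours": [6.5, 8.0, 4.0, 7.0],       # numerical feature
    "mood": ["low", "stable", "low", "high"],  # categorical feature
})

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["sleep_hours"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["mood"]),
])

X = preprocess.fit_transform(df)
print(X.shape)  # 1 scaled column + 3 one-hot columns
```

`handle_unknown="ignore"` keeps the transformer from raising on categories seen only at prediction time, which matters once the pipeline is reused on held-out data.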
**Imbalance Handling:** Explicitly addresses the significant class imbalance in the target variable by calculating balanced class weights and applying them during model training.
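Balanced weighting inverts each class's frequency, so under-represented classes contribute more to the loss. A small sketch with synthetic labels:

```python
# Sketch of balanced class-weight computation; labels here are synthetic,
# not the notebook's Depression_Type values.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0, 0, 0, 0, 1, 1, 2])  # imbalanced toy labels
classes = np.unique(y)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
weight_map = dict(zip(classes, weights))

# Each weight is n_samples / (n_classes * count_of_class),
# so the rarest class receives the largest weight.
print(weight_map)
```

The resulting mapping can be passed as `class_weight=` to estimators such as `LogisticRegression` or `RandomForestClassifier`, or used to build per-sample weights for models that take `sample_weight`.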
**Model Comparison:** Benchmarks and compares three classification models:

- Logistic Regression
- Random Forest Classifier
- XGBoost Classifier
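A benchmarking loop of this kind can be sketched as below, on synthetic data rather than the notebook's dataset. XGBoost is shown commented out so the sketch runs with scikit-learn alone.

```python
# Sketch of the model-comparison loop; data and settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000,
                                             class_weight="balanced"),
    "RandomForest": RandomForestClassifier(class_weight="balanced",
                                           random_state=0),
    # Requires the xgboost package:
    # "XGBoost": xgboost.XGBClassifier(eval_metric="mlogloss"),
}

# Fit each model on the same split and record its held-out accuracy.
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}
print(scores)
```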
**Hyperparameter Tuning:** Uses `GridSearchCV` to optimize the Random Forest and XGBoost models for better performance.
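A minimal `GridSearchCV` sketch for the Random Forest; the parameter grid here is illustrative, not the notebook's actual search space.

```python
# Sketch of hyperparameter tuning on synthetic data; the grid is a placeholder.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_classes=3,
                           n_informative=5, random_state=0)

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [4, None]},
    scoring="f1_macro",  # macro F1 treats all classes equally, useful under imbalance
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The same pattern applies to `XGBClassifier` with a grid over parameters such as `learning_rate` and `max_depth`.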
**Comprehensive Evaluation:** Models are evaluated using Accuracy, F1-Score, and ROC-AUC Score to provide a robust performance assessment.
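For a multi-class target, the three metrics need averaging choices; a sketch on synthetic data, assuming macro-averaged F1 and one-vs-rest ROC-AUC (the notebook may use different settings):

```python
# Sketch of the multi-class evaluation metrics; data and model are synthetic.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_proba = clf.predict_proba(X_te)  # per-class probabilities for ROC-AUC

acc = accuracy_score(y_te, y_pred)
f1 = f1_score(y_te, y_pred, average="macro")            # unweighted class mean
auc = roc_auc_score(y_te, y_proba, multi_class="ovr")   # one-vs-rest ROC-AUC
print(acc, f1, auc)
```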