This is a credit score classifier for the final project of CS 363M: Principles of Machine Learning I. We use Decision Trees, KNN, Neural Networks, Random Forests, Boosting, and clustering algorithms such as K-Means and DBSCAN (density-based spatial clustering) to classify a person's credit score as "good", "standard", or "poor" from a wide variety of features, including age, occupation, and the number of credit cards, bank accounts, and loans. We also did extensive data preprocessing, feature engineering, and data visualization to make the dataset easier to understand. Finally, we used PCA to reduce dimensionality and downsampled the data to correct for class imbalance. In the end, our top models reached about 72% accuracy.
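As a rough illustration of the downsampling step mentioned above, here is a minimal sketch using only the Python standard library. It is not the code from credit_score.ipynb: the `downsample` helper, the toy rows, and the class counts are all made up for this example, and it assumes that downsampling means randomly dropping rows until every class matches the size of the rarest one.

```python
import random
from collections import Counter, defaultdict


def downsample(rows, labels, seed=0):
    """Randomly drop rows so every class is as small as the rarest class.

    Hypothetical helper for illustration; the notebook may balance differently.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced_rows, balanced_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(group, target):
            balanced_rows.append(row)
            balanced_labels.append(label)
    return balanced_rows, balanced_labels


# Toy, made-up data: 6 "standard", 3 "poor", and 2 "good" rows.
labels = ["standard"] * 6 + ["poor"] * 3 + ["good"] * 2
rows = [[i] for i in range(len(labels))]

balanced_rows, balanced_labels = downsample(rows, labels)
counts = Counter(balanced_labels)
print(counts)
```

After balancing, each class contributes the same number of rows, so a classifier cannot score well simply by predicting the most common label. The trade-off is that rows from the larger classes are discarded.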
For our full analysis of the data, preprocessing, feature engineering, models, and conclusions, see credit_score.ipynb.
To run the code, open credit_score.ipynb in Jupyter and run all cells.
The dataset is available on Kaggle: https://www.kaggle.com/datasets/parisrohan/credit-score-classification/data