
Linear Regression for baseline, Random Forest for non-linearity, KNN for proximity-based prediction. Models trained on 80% data, 20% for testing, evaluated with MSE.


LIoccoUMD/supervised-learning


Supervised Learning Project

This repository contains code for a supervised learning project aimed at predicting student GPA using various machine learning algorithms.

NOTE: The error shown in the .ipynb file is intentional; it demonstrates that the dataset must be downloaded before the notebook is run. Once the dataset is downloaded, everything behaves as intended.

Project Overview

  • Objective: Predict student GPA based on multiple features using different regression models.
  • Algorithms Used:
    • Linear Regression: Used to establish a baseline performance by fitting a linear equation to the observed data.
    • Random Forest Regression: Applied to capture non-linear relationships. As an ensemble method, it averages the predictions of many decision trees to improve prediction accuracy.
    • K-Nearest Neighbors (KNN) Regression: Employed to predict values based on the average of the 'k' closest data points in the feature space.
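
To make the KNN description above concrete, here is a minimal standard-library sketch of the idea: the prediction for a new point is the mean target value of its k nearest training points. The feature names (study hours, absences), the toy values, and the `knn_predict` helper are hypothetical and are not taken from the project's notebook or dataset.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its target
    distances = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    # Sort by distance and keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Average the targets of those k neighbours
    return sum(y for _, y in nearest) / k

# Hypothetical toy data: (study hours per week, absences) -> GPA
train_X = [(2, 10), (4, 8), (6, 5), (8, 2), (10, 0)]
train_y = [2.0, 2.4, 3.0, 3.4, 3.8]

prediction = knn_predict(train_X, train_y, query=(7, 3), k=3)
print(round(prediction, 2))
```

For the query point (7, 3), the three closest training points have GPAs 3.4, 3.0 and 3.8, so the prediction is their average. A library implementation would typically add feature scaling and faster neighbour search, which this sketch omits.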

Methodology

  • Data Preparation: All columns of the dataset except 'id' and 'gpa' were used as input features; 'gpa' is the prediction target.
  • Model Training and Testing: Each model was trained on 80% of the data with the remaining 20% reserved for testing.
  • Model Evaluation: Performance assessed using Mean Squared Error (MSE).
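
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: it assumes scikit-learn (the README does not name a library), uses synthetic data in place of the real dataset, and picks hyperparameters such as `n_estimators=100` and `n_neighbors=5` arbitrarily.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the real dataset, which must be downloaded separately.
# With the real data, X would hold every column except 'id' and 'gpa',
# and y would be the 'gpa' column.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.uniform(0, 1, size=(n_samples, 3))  # three made-up features
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.1, n_samples)

# Hold out 20% of the rows for testing; train on the remaining 80%
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters here are illustrative assumptions
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

# Fit each model on the training split and score it on the test split
mse_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # MSE: mean of the squared differences between true and predicted values
    mse_scores[name] = mean_squared_error(y_test, predictions)
    print(f"{name}: MSE = {mse_scores[name]:.4f}")
```

A lower MSE means predictions are closer to the true values on average. Fixing `random_state` keeps the split and the forest reproducible, so the three models are compared on the same test rows.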

Usage

To run this project:

  • Download the dataset listed at the top of the .ipynb file
  • Make sure it is in a location the notebook can read
  • Clone the repository and open the notebook (requires Jupyter to be installed):

git clone https://github.com/LIoccoUMD/supervised-learning/
cd supervised-learning
jupyter notebook Supervised_Learning.ipynb

Or, for an easier setup:

  • Open in Google Colab
  • Download the linked dataset
  • Run directly in Colab
