The objective of this project is to build a machine learning model that predicts whether a patient has diabetes based on medical diagnostic measurements.
- Python
- Pandas, NumPy
- Scikit-learn
- Support Vector Machine (SVM)
- Loaded and analyzed the diabetes dataset
- Split the dataset into training and testing sets
- Applied feature scaling using StandardScaler
- Trained a linear Support Vector Machine classifier
- Evaluated the model on the test set using the accuracy metric
- Ensured no data leakage by fitting the scaler only on training data
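
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses a synthetic dataset from `make_classification` as a stand-in for the diabetes data, and the sample count, feature count, split ratio, and `random_state` values are all assumptions chosen for the example. The key point is that the scaler is fitted on the training split only and then reused to transform the test split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the diabetes dataset (assumption: numeric
# features and a binary label; sizes are illustrative only).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=42
)

# Split first, so the test rows never influence preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then apply the same
# mean and standard deviation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train a linear-kernel SVM classifier.
clf = SVC(kernel="linear")
clf.fit(X_train_scaled, y_train)

# Accuracy is the fraction of test samples classified correctly.
test_accuracy = accuracy_score(y_test, clf.predict(X_test_scaled))
print(f"Test accuracy: {test_accuracy:.3f}")
```

Calling `fit_transform` on the full dataset before splitting would let test-set statistics leak into the scaling parameters, which is what the split-then-fit order avoids.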
- dataset/ – Contains the diabetes dataset
- src/
  - data_preprocessing.py – Data loading, splitting, and scaling
  - model.py – SVM model definition
  - train.py – Model training and evaluation
- requirements.txt – Project dependencies
- README.md – Project documentation
- Achieved an accuracy of 77.2% on the test dataset
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
- Run the training script:
python src/train.py