todor02/SPAMorHAM

Spam or Ham Classifier

A simple Naive Bayes text classifier that detects whether a message is Spam or Ham (Not Spam).


📌 Features

  • Preprocesses text (lowercasing + tokenization with regex)

  • Builds a vocabulary from training data

  • Uses Multinomial Naive Bayes with Laplace smoothing

  • Reports accuracy, precision, recall, and F1-score

  • Includes an interactive demo to test custom messages
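
The first three features can be sketched roughly as follows. This is a minimal illustration and not necessarily how main.py is written: the function names (`tokenize`, `train`, `predict`), the token regex, and the `alpha` parameter are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regex (assumed pattern)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages, labels, alpha=1.0):
    """Fit Multinomial Naive Bayes; alpha=1.0 corresponds to Laplace smoothing."""
    word_counts = {}
    doc_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts.setdefault(label, Counter()).update(tokenize(text))

    # Vocabulary: every token seen in the training data.
    vocab = {word for counts in word_counts.values() for word in counts}

    log_prior = {c: math.log(n / len(labels)) for c, n in doc_counts.items()}
    log_likelihood = {}
    for c, counts in word_counts.items():
        # Smoothed denominator: class token count plus alpha per vocabulary word.
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {
            word: math.log((counts[word] + alpha) / total) for word in vocab
        }
    return vocab, log_prior, log_likelihood


def predict(text, model):
    """Return the class with the highest log-posterior; unseen words are skipped."""
    vocab, log_prior, log_likelihood = model
    scores = {}
    for c in log_prior:
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][word] for word in tokenize(text) if word in vocab
        )
    return max(scores, key=scores.get)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from driving that class's probability to zero.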

📂 Dataset

This project uses the SMS Spam Collection dataset:

https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset?resource=download

  • The CSV file has two columns:

    • v1 → Label (ham or spam)

    • v2 → Message text

Example:

  • v1
    • spam
  • v2
    • "Congratulations! You've won a $1000 Walmart gift card. Call now!"

  • v1
    • ham
  • v2
    • "Can we reschedule our meeting to 3 PM tomorrow?"
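
Reading the two columns might look like the sketch below. The function name `load_dataset` and the `latin-1` default encoding are assumptions for illustration; adjust the encoding if your copy of the file is stored differently.

```python
import csv


def load_dataset(path, encoding="latin-1"):
    """Read labels (column v1) and message texts (column v2) from the CSV."""
    labels, messages = [], []
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.DictReader(f):
            # Normalize labels so "SPAM" and "spam" are treated the same.
            labels.append(row["v1"].strip().lower())
            messages.append(row["v2"])
    return labels, messages
```

`csv.DictReader` looks fields up by header name, so any extra columns in the file are ignored.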

🚀 Usage

1. Clone this repository

2. Run the script:

  • python main.py

Example output:

  • Accuracy: 0.95
  • Precision: 0.94
  • Recall: 0.91
  • F1-Score: 0.92
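
For reference, the four metrics can be computed from true and predicted labels roughly as below. This is a sketch that treats spam as the positive class; the `evaluate` function and its `positive` parameter are hypothetical names, not taken from main.py.

```python
def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    # Guard each ratio against a zero denominator.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Precision is the share of messages flagged as spam that really are spam, and recall is the share of actual spam that was caught. F1 is the harmonic mean of the two.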

Try the interactive demo:

--- Spam Classifier Demo ---

Enter a message to classify:

"WIN A FREE iPhone! Click now!"

Prediction: SPAM
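
A demo loop like the one above could be sketched as follows. The `run_demo` function, its parameters, and the blank-line-to-quit behavior are assumptions for illustration; `classify` stands in for whatever function returns a label for a message.

```python
def run_demo(classify, read=input, write=print):
    """Prompt for messages and print a prediction until a blank line is entered."""
    write("--- Spam Classifier Demo ---")
    while True:
        message = read("Enter a message to classify (blank to quit): ").strip()
        if not message:
            break
        write(f"Prediction: {classify(message).upper()}")
```

Passing `read` and `write` as parameters keeps the loop easy to exercise without a terminal.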


🔮 Future Improvements

  • Add stopword removal and stemming/lemmatization

  • Try different classifiers (Logistic Regression, SVM, Neural Networks)

  • Deploy as a simple web app with Flask/Streamlit

