This project focuses on detecting deepfake images (AI-generated media designed to appear real) using various machine learning models. Deepfakes pose serious threats by spreading misinformation, damaging reputations, and facilitating identity-based fraud. Our goal was to evaluate and compare different detection models in terms of accuracy, training time, and usability, and to provide a simple user interface for hands-on experimentation.
- @Rigvedi10: Sharath Rigvedi
- @atbusch78: Alex Busch
- @ritvikvroy: Ritvik Roy
- Abdul Rehman (GitHub profile not yet available)
We trained and tested the following models for binary classification (real vs fake images):
| Model | Accuracy | Time to Fit |
|---|---|---|
| Neural Network (NN) | 88.55% | 99s |
| Support Vector Machine (SVM) | 72.71% | 492s |
| Deep Neural Network (Xception) | 57.76% | ~600s |
| Random Forest (RF) | 82% | 977s |
- Neural Network: Best balance of speed and performance.
- Random Forest: Second-best accuracy but longest training time.
- SVM: Simple but not scalable to large data or high dimensions.
- Xception (DNN): Advanced but computationally expensive and impractical in limited-resource settings like Google Colab.
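The accuracy and time-to-fit columns above can be measured with one small harness shared across models. The following is a minimal sketch, not the project's actual evaluation code: `evaluate` and `MajorityClassBaseline` are hypothetical names, the toy data is made up, and any model exposing scikit-learn-style `fit`/`predict` methods could be passed in instead of the baseline.

```python
import time


class MajorityClassBaseline:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return (accuracy, seconds spent in fit) for a fit/predict-style model."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    predictions = model.predict(X_test)
    correct = sum(1 for p, t in zip(predictions, y_test) if p == t)
    return correct / len(y_test), fit_seconds


# Toy data (assumption): label 0 = real, label 1 = fake
X_train = [[0.1], [0.2], [0.9], [0.8], [0.7]]
y_train = [0, 0, 1, 1, 1]
X_test = [[0.15], [0.85], [0.75], [0.05]]
y_test = [0, 1, 1, 0]

accuracy, fit_seconds = evaluate(
    MajorityClassBaseline(), X_train, y_train, X_test, y_test
)
print(f"accuracy={accuracy:.2%}, time to fit={fit_seconds:.4f}s")
```

Timing only the `fit` call keeps data loading and prediction out of the measurement, so the numbers stay comparable between models.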
We used the CIFAKE dataset from Kaggle:
- 120,000 total images (60K real + 60K deepfake)
- Synthetic and clean, requiring minimal preprocessing
- Resizing: All images resized to match model input requirements (e.g., 299x299 for Xception).
- Minimal Cleaning: Dataset was already curated.
- Feature Extraction for SVM: Extracted color histograms, texture, and edges manually.
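To illustrate the kind of hand-crafted features mentioned for the SVM, here is a dependency-free sketch that builds a per-channel color histogram and a crude edge-strength value from an image given as nested lists of RGB tuples. The function names, the bin count, and the horizontal-difference edge measure are assumptions for illustration; texture features are omitted, and the project's actual extraction code may differ (in practice NumPy or OpenCV would typically be used).

```python
def color_histogram(pixels, bins=4):
    """Per-channel histogram of 0-255 values; each channel's bins sum to 1."""
    bin_width = 256 // bins
    hist = [[0] * bins for _ in range(3)]
    count = 0
    for row in pixels:
        for pixel in row:
            for channel, value in enumerate(pixel):
                hist[channel][min(value // bin_width, bins - 1)] += 1
            count += 1
    # Flatten to [R bins..., G bins..., B bins...]
    return [c / count for channel_hist in hist for c in channel_hist]


def edge_strength(pixels):
    """Mean absolute horizontal difference in grayscale intensity (crude edge cue)."""
    total, n = 0.0, 0
    for row in pixels:
        gray = [sum(p) / 3 for p in row]
        for left, right in zip(gray, gray[1:]):
            total += abs(right - left)
            n += 1
    return total / n if n else 0.0


def extract_features(pixels, bins=4):
    """Concatenate the histogram and edge value into one feature vector."""
    return color_histogram(pixels, bins) + [edge_strength(pixels)]


# 2x2 toy image: top row black and white, bottom row red and red
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (255, 0, 0)],
]
features = extract_features(image)
print(len(features), features)
```

The resulting fixed-length vector is what a classical model such as an SVM would consume in place of raw pixels.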
We developed a lightweight UI in Google Colab for each model using Streamlit. Users can upload an image, classify it as real/fake, and view the model's confidence score.
- Run all cells in the Colab notebook.
- Use `!wget -O - ipv4.icanhazip.com` to get the IP address (used as the UI password).
- Click the final cell's link to open the UI.
- Input the IP address to access.
- Upload an image and press "Classify Image".
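The upload-and-classify flow in the steps above could look roughly like the Streamlit sketch below. This is an illustration under assumptions, not the project's actual UI code: `run_app`, `label_with_confidence`, and the `predict_fake_probability` callable (taking a PIL image and returning the probability that it is fake) are hypothetical names, and the 0.5 threshold is a placeholder.

```python
def label_with_confidence(prob_fake, threshold=0.5):
    """Map P(fake) to a (label, confidence) pair for display."""
    if prob_fake >= threshold:
        return "fake", prob_fake
    return "real", 1.0 - prob_fake


def run_app(predict_fake_probability):
    """Render the UI; predict_fake_probability is a hypothetical model wrapper."""
    # Imported here so the helper above can be used without Streamlit installed.
    import streamlit as st
    from PIL import Image

    st.title("Deepfake Image Detector")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        image = Image.open(uploaded).convert("RGB")
        st.image(image, caption="Uploaded image")
        if st.button("Classify Image"):
            prob_fake = predict_fake_probability(image)
            label, confidence = label_with_confidence(prob_fake)
            st.write(f"Prediction: {label} ({confidence:.1%} confidence)")
```

In an app file launched with `streamlit run`, the script would end with a call such as `run_app(my_model_predict)`, where `my_model_predict` stands in for whichever trained model is being served.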
| Model | Notebook | UI |
|---|---|---|
| Neural Network | Notebook | UI |
| Neural Network (Pretrained) | Notebook | UI |
| SVM | Notebook | UI |
| Random Forest | Notebook | UI |
| Xception (DNN) | Notebook | UI |
- Use ImageNet or larger real-world datasets for better generalization.
- Optimize the training process with TPUs/GPUs.
- Implement real-time video deepfake detection.
- Add ensemble methods or transformers for improved detection.
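As one possible starting point for the ensemble idea, the sketch below combines per-model probabilities by weighted averaging (soft voting). The `soft_vote` function, the model names, and the probability values are all hypothetical; nothing here reflects results from the trained models.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of per-model P(fake) values."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Hypothetical P(fake) outputs from three models for a single image
model_probs = {"nn": 0.9, "rf": 0.7, "svm": 0.4}

combined = soft_vote(list(model_probs.values()))
label = "fake" if combined >= 0.5 else "real"
print(f"combined P(fake)={combined:.3f} -> {label}")
```

Weights could be set from each model's validation accuracy so that stronger models contribute more to the final decision.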