ML Model for Detecting Fake Handwritten Signatures
This project implements a deep learning–based system that classifies handwritten signatures as genuine or forged. The model is trained using the CEDAR Signature Dataset, which contains both authentic signatures and skilled forgeries.
The goal is to detect forged signatures from image features learned by a Convolutional Neural Network (CNN). The system also reports summary statistics, such as how many signatures in a batch are predicted genuine (valid) or forged (invalid).
Project Overview
Handwritten signatures are widely used in financial, legal, and administrative domains as a form of verification. However, signature forgery is common and poses a serious threat to identity security.
This project builds an automated signature validation system using deep learning. The CNN learns stroke-level and shape-level features from signature images and predicts whether a given signature is genuine or forged.
The system supports:
Single-signature prediction (GENUINE or FORGED)
Batch evaluation of signatures
Counting how many signatures are valid vs invalid
Generating basic charts for visual analysis
Dataset: CEDAR Signature Dataset
The model uses the CEDAR Signature Dataset, a widely used dataset for handwritten signature verification research.
Dataset Information
55 writers
24 genuine signatures per writer
24 forged signatures per writer
Skilled forgeries included
2,640 total images (55 writers × 48 signatures each)
Dataset Folder Structure Used in Project
signature_data/
    train/
        genuine/
        forged/
    val/
        genuine/
        forged/
    test/
        genuine/
        forged/
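Before training, it can help to confirm that the folder layout above is complete and to see how many images each split holds. The following is a minimal sketch using only the standard library; the function name `count_images` and the set of accepted file extensions are assumptions for illustration, not part of the project code.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
CLASSES = ("genuine", "forged")
# Assumed image extensions; adjust to match the files actually on disk.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def count_images(root):
    """Return {(split, class): image_count} for the signature_data layout."""
    root = Path(root)
    counts = {}
    for split in SPLITS:
        for cls in CLASSES:
            folder = root / split / cls
            if not folder.is_dir():
                raise FileNotFoundError(f"missing folder: {folder}")
            # Count only regular files with a recognised image extension.
            counts[(split, cls)] = sum(
                1
                for path in folder.iterdir()
                if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images("signature_data")` and summing the values should give the total number of images placed into the three splits.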
Objectives of the Project
Build a machine learning model to classify handwritten signatures.
Detect whether a signature is valid (genuine) or invalid (forged).
Achieve high accuracy using a custom CNN architecture.
Provide counts of genuine and forged signatures across the dataset.
Provide visual output to support presentations and analysis.
Model Description
A custom Convolutional Neural Network (CNN) is used for binary image classification.
Model Features
Multiple convolutional layers with ReLU activation
MaxPooling layers for downsampling
Dense layers for classification
Dropout layer to reduce overfitting
Sigmoid output layer for binary prediction (genuine/forged)
Training Details
Input: 128 × 256 grayscale signature images
Loss: Binary Crossentropy
Optimizer: Adam
Metric: Accuracy
Epochs: 15–20
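The layer types and training settings listed above could be assembled in Keras roughly as follows. This is a sketch, not the project's actual architecture: the number of convolutional blocks, the filter counts (32/64/128), the 3 × 3 kernels, the 128-unit dense layer, the 0.5 dropout rate, the pixel rescaling step, and the reading of "128 × 256" as height × width are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_signature_cnn(input_shape=(128, 256, 1)):
    """Build and compile a small binary-classification CNN (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),        # grayscale image, one channel
        layers.Rescaling(1.0 / 255),              # assumed: scale pixels to [0, 1]
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                      # reduces overfitting
        layers.Dense(1, activation="sigmoid"),    # single probability output
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

If the images are loaded with `tf.keras.utils.image_dataset_from_directory` using `label_mode="binary"` and `color_mode="grayscale"`, class indices follow alphabetical folder order, so `forged` maps to 0 and `genuine` to 1, and the sigmoid output can be read as the probability of a genuine signature. Training would then be a call such as `model.fit(train_ds, validation_data=val_ds, epochs=15)`, where `train_ds` and `val_ds` are hypothetical dataset names.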
Model Performance
Typical accuracy results:
Training accuracy: ~96%
Validation accuracy: ~93–96%
Test accuracy: ~94–95%
These scores indicate that the model separates genuine signatures from skilled forgeries well on the CEDAR splits used in this project.
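For reference, accuracy here is the fraction of signatures whose predicted class matches the true class. A minimal sketch of that calculation is shown below; the function name, the 0.5 decision threshold, and the label convention (1 = genuine, 0 = forged) are assumptions.

```python
def accuracy(probs_genuine, labels, threshold=0.5):
    """Fraction of correct predictions.

    probs_genuine: model outputs, assumed to be P(genuine) per signature.
    labels: true classes, assumed 1 for genuine and 0 for forged.
    """
    if len(probs_genuine) != len(labels):
        raise ValueError("probs_genuine and labels must have the same length")
    if not labels:
        raise ValueError("at least one labelled signature is required")
    correct = sum(
        int(prob >= threshold) == label
        for prob, label in zip(probs_genuine, labels)
    )
    return correct / len(labels)
```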
System Outputs
- Single Signature Prediction
The model predicts one of two labels:
"GENUINE"
"FORGED"
It also outputs a probability score: the model's estimated probability that the signature is genuine.
Example:
Prediction: GENUINE
Probability genuine: 0.9823
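The label and probability in the example above could be derived from the model's sigmoid output along these lines. This is a sketch: the function name `describe_prediction`, the 0.5 threshold, and the convention that the sigmoid output is the probability of the genuine class are assumptions rather than confirmed details of the project.

```python
def describe_prediction(prob_genuine, threshold=0.5):
    """Turn a sigmoid output into the two-line report format.

    prob_genuine is assumed to be P(genuine); values at or above the
    threshold are labelled GENUINE, everything else FORGED.
    """
    label = "GENUINE" if prob_genuine >= threshold else "FORGED"
    return f"Prediction: {label}\nProbability genuine: {prob_genuine:.4f}"
```

A stricter threshold (for example 0.8) would flag more signatures as forged, which may suit settings where accepting a forgery is costlier than rejecting a genuine signature.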
- Valid vs Invalid Signature Count
The project includes a feature to count how many signatures are predicted as valid or invalid.
Example:
Total signatures checked : 198
Predicted VALID (genuine): 102
Predicted INVALID (forged): 96
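A count like the one above can be produced by thresholding each predicted probability and tallying the results. The sketch below assumes the model outputs P(genuine) per signature and uses a 0.5 threshold; the function names are hypothetical.

```python
def summarize_batch(probs_genuine, threshold=0.5):
    """Count how many signatures are predicted valid (genuine) vs invalid (forged)."""
    total = len(probs_genuine)
    valid = sum(1 for prob in probs_genuine if prob >= threshold)
    return {"total": total, "valid": valid, "invalid": total - valid}


def format_summary(summary):
    """Render the counts in the report format used by this README."""
    return "\n".join([
        f"Total signatures checked : {summary['total']}",
        f"Predicted VALID (genuine): {summary['valid']}",
        f"Predicted INVALID (forged): {summary['invalid']}",
    ])
```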
- Visual Output
A bar chart is generated showing:
Number of genuine signatures
Number of forged signatures
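The bar chart described above could be drawn with Matplotlib as in the following sketch. The function name, colors, labels, and figure size are illustrative assumptions; the `Agg` backend line is only there so the code runs without a display and can be dropped in a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove in Colab/Jupyter
import matplotlib.pyplot as plt


def plot_signature_counts(n_genuine, n_forged):
    """Draw a two-bar chart of predicted genuine vs forged signature counts."""
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.bar(
        ["Genuine", "Forged"],
        [n_genuine, n_forged],
        color=["tab:green", "tab:red"],
    )
    ax.set_ylabel("Number of signatures")
    ax.set_title("Predicted genuine vs forged signatures")
    return fig, ax
```

With the example counts from the previous section, `fig, ax = plot_signature_counts(102, 96)` followed by `fig.savefig("signature_counts.png")` would write the chart to a file (the filename is arbitrary).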
Technologies Used
Python
TensorFlow / Keras
OpenCV
NumPy
Matplotlib
Google Colab (GPU environment)
Future Enhancements
Use Siamese networks to verify a questioned signature against reference signatures from the claimed writer.
Build a Streamlit/Flask web application for real-time signature checking.
Add larger multi-writer datasets to improve generalization.
Use Generative Adversarial Networks (GANs) to create synthetic signatures.
Deploy the model as an API or cloud-based service.
Conclusion
This project demonstrates the use of deep learning for detecting forged handwritten signatures. The CNN reaches roughly 94–95% test accuracy on the CEDAR dataset and distinguishes genuine signatures from skilled forgeries. With additional enhancements, the system could be integrated into real-world verification workflows in banking, document processing, and identity checks.