DataArmor is a Python-based data anonymization tool that uses the Mondrian algorithm to achieve k-anonymity on structured datasets. It provides a simple GUI for protecting sensitive information: quasi-identifiers are anonymized through generalization and recursive partitioning.
- Mondrian algorithm for multi-dimensional k-anonymity
- Intuitive GUI using Tkinter
- Supports numeric and categorical quasi-identifiers
- Easy CSV input and anonymized output
- Suitable for research and educational purposes
- Load your dataset (CSV format)
- Select:
  - Explicit Identifier (EID)
  - Quasi-Identifiers (QIDs): numeric and non-numeric
  - Desired value of k
- Click "Anonymize" to apply k-anonymity
- View and save the anonymized dataset
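Once the output is saved, it can be useful to confirm that it really satisfies the chosen k. The following is a minimal sketch of such a check, not part of DataArmor itself: the function names `smallest_group` and `is_k_anonymous`, the column names, and the sample rows are all hypothetical. It assumes rows are dictionaries of the kind `csv.DictReader` yields.

```python
from collections import Counter


def smallest_group(rows, qids):
    """Size of the smallest group of rows that share the same QID values."""
    groups = Counter(tuple(row[q] for q in qids) for row in rows)
    return min(groups.values(), default=0)


def is_k_anonymous(rows, qids, k):
    """True if every combination of QID values occurs at least k times."""
    return smallest_group(rows, qids) >= k


# Hypothetical anonymized rows; real output columns depend on your dataset.
sample = [
    {"age": "20-30", "zip": "1001-1003", "disease": "flu"},
    {"age": "20-30", "zip": "1001-1003", "disease": "cold"},
    {"age": "31-50", "zip": "1008-1010", "disease": "flu"},
    {"age": "31-50", "zip": "1008-1010", "disease": "asthma"},
    {"age": "31-50", "zip": "1008-1010", "disease": "flu"},
]

print(smallest_group(sample, ["age", "zip"]))      # smallest equivalence class
print(is_k_anonymous(sample, ["age", "zip"], 2))   # check against k = 2
```

To run the same check on a saved file, load it with `csv.DictReader` and pass the resulting list of rows together with the QID column names you selected in the GUI.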
DataArmor uses the Mondrian multi-dimensional partitioning algorithm:
- Partitioning: Recursively splits the dataset based on QID values.
- Validation: Ensures each partition has at least k records.
- Generalization: Applies generalization or suppression to satisfy anonymity requirements.
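The three steps above can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions, not the code in k_Anonymity.py: it handles numeric QIDs only, cuts at the median, tries the QID with the widest range first, and generalizes values to a min-max range. The function names `mondrian` and `generalize` and the sample rows are hypothetical.

```python
from statistics import median


def mondrian(records, qids, k):
    """Recursively split records on numeric QIDs; return a list of partitions."""
    # Assumed heuristic: try the QID with the widest value range first.
    by_span = sorted(
        qids,
        key=lambda q: max(r[q] for r in records) - min(r[q] for r in records),
        reverse=True,
    )
    for qid in by_span:
        split = median(r[qid] for r in records)
        left = [r for r in records if r[qid] <= split]
        right = [r for r in records if r[qid] > split]
        # Validation: accept a cut only if both halves keep at least k records.
        if len(left) >= k and len(right) >= k:
            return mondrian(left, qids, k) + mondrian(right, qids, k)
    # No allowable cut on any QID, so this partition is final.
    return [records]


def generalize(partition, qids):
    """Replace each QID value with the partition's min-max range."""
    ranges = {}
    for q in qids:
        lo = min(r[q] for r in partition)
        hi = max(r[q] for r in partition)
        ranges[q] = str(lo) if lo == hi else f"{lo}-{hi}"
    return [{**r, **ranges} for r in partition]


# Hypothetical sample rows: two numeric QIDs and one sensitive attribute.
rows = [
    {"age": 23, "zip": 1001, "disease": "flu"},
    {"age": 25, "zip": 1003, "disease": "cold"},
    {"age": 31, "zip": 1002, "disease": "flu"},
    {"age": 35, "zip": 1008, "disease": "asthma"},
    {"age": 47, "zip": 1010, "disease": "flu"},
    {"age": 52, "zip": 1009, "disease": "cold"},
]

partitions = mondrian(rows, ["age", "zip"], k=2)
anonymized = [row for p in partitions for row in generalize(p, ["age", "zip"])]
for row in anonymized:
    print(row)
```

Because a cut is accepted only when both sides hold at least k records, every final partition has at least k rows, and after generalization all rows in a partition share identical QID values. Categorical QIDs would need a different split and generalization rule, which this sketch leaves out.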
| File | Description |
|---|---|
| DataArmor_GUI.py | Main script with Tkinter GUI |
| k_Anonymity.py | Mondrian algorithm implementation |
| data1.csv | Sample dataset |
| README.md | Project documentation |
- Python 3.x
- pandas
- tkinter (standard with Python)
- csv (built-in)
Install dependencies (if needed):
pip install pandas
- Clone the repository:
git clone https://github.com/Maha028/DataArmor.git
cd DataArmor
- Run the GUI:
python DataArmor_GUI.py
- Follow the GUI steps to load your dataset and apply k-anonymity.
The repository includes data1.csv, a sample dataset for testing. You can replace it with your own structured CSV files.