Early Detection of Backdoor Attacks in Federated Learning via Ecosystemic Symmetry Breaking

Implementation and reproducibility package for the experiments described in:

Carlos Mario Braga, Manuel A. Serrano, and Eduardo Fernández-Medina (2025)
Early Detection of Backdoor Attacks in Federated Learning via Ecosystemic Symmetry Breaking
Submitted to the First International Workshop on Security and Privacy in Federated and Distributed Architectures (FEDAS'25),
in conjunction with BDCAT 2025, ACM, Nantes, France.

This repository provides the full pipeline for reproducing the experiments, figures, and statistical analyses described in the paper.

Base Software

This implementation extends the canonical framework by Bagdasaryan et al. (2020):

How To Backdoor Federated Learning, AISTATS 2020, PMLR v108, pp. 2938-2948.

The original code includes:

- Federated Averaging (FedAvg) implementation in PyTorch.
- Datasets: CIFAR-10 and MNIST.
- Attack strategies: model replacement and semantic backdoor.
- Multi-client simulation with benign and adversarial participants.

This project preserves the training, aggregation, and attack logic from the canonical framework to ensure comparability with prior results.

Repository Structure

```
generate_files.py              # Generates baseline + delta combinations (e.g., t1_1.csv ... t1_5.csv)
calibrate_thresholds_proj.py   # Calibrates Energy/Wasserstein-1 thresholds using bootstrap
compare_distribution_proj.py   # Compares runs against the baseline and triggers anomaly alerts
plot_graphs.py                 # Plots Energy and W1 values with threshold overlays
data/                          # Input CSVs (T0.csv, T1_delta.csv, etc.)
results/                       # Output results (JSON + PNG files)
README.md
```

Dependencies:

Python 3.8
numpy, pandas, scipy, matplotlib

Execution Pipeline

Baseline calibration (benign reference)
Calibrate thresholds using the benign dataset T0.csv:
```
python calibrate_thresholds_proj.py data/T0.csv --B 500 --n0 50 --pctl 99
```
This produces a JSON file in results/ such as:
```
results/thresholds_T0_nadd12_pctl99.json
```
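
The idea behind bootstrap calibration can be illustrated with a minimal sketch. It assumes one-dimensional feature values (for example, a projection of each client update), repeatedly resamples a benign baseline and a benign "new" batch, measures the Energy and Wasserstein-1 distances with SciPy, and takes a percentile of those null distances as the threshold. The function name `calibrate`, its arguments, the default `n_add=12`, and the resampling scheme are assumptions for illustration; they are not the internals of `calibrate_thresholds_proj.py`.

```python
import numpy as np
from scipy.stats import energy_distance, wasserstein_distance


def calibrate(benign, B=500, n0=50, n_add=12, pctl=99, seed=0):
    """Bootstrap null thresholds for Energy and W1 distances (illustrative)."""
    rng = np.random.default_rng(seed)
    benign = np.asarray(benign, dtype=float)
    energy, w1 = [], []
    for _ in range(B):
        # Resample a benign baseline and a benign "new" batch with replacement.
        base = rng.choice(benign, size=n0, replace=True)
        new = rng.choice(benign, size=n_add, replace=True)
        # Distances between two benign samples form the null distribution.
        energy.append(energy_distance(base, new))
        w1.append(wasserstein_distance(base, new))
    # The chosen percentile of the null distances becomes the alert threshold.
    return {
        "energy": float(np.percentile(energy, pctl)),
        "w1": float(np.percentile(w1, pctl)),
    }
```

A higher percentile gives a more conservative threshold, which lowers the false-alarm rate on benign runs at the cost of sensitivity.
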
Generate per-run datasets (T1, T2, T3)
Using a delta file representing new updates (benign or adversarial):
```
python generate_files.py data/T0.csv data/T2_delta.csv data/t2
```
This creates files: data/t2_1.csv ... data/t2_5.csv

Compare distributions and detect deviations
Evaluate deviations using the calibrated thresholds:

```
python compare_distribution_proj.py T2 results/thresholds_T0_nadd12_pctl99.json --data-dir data
```
Output:
```
results/results_T2.json
```

Plot summary graphs
Generate the figures for Energy and W1 metrics:
```
python plot_graphs.py results/results_T2.json
```
Output files in results/:
- T2_energy.png
- T2_w1.png
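
For repeated runs, the generate, compare, and plot steps above could be chained from a small driver. The sketch below only assembles the command lines shown in this section and runs them in order; the helper names `build_commands` and `run_case` are hypothetical, and the lowercase output prefix (`data/t2` for case `T2`) is inferred from the example above rather than documented behavior.

```python
import subprocess
import sys


def build_commands(case, delta_csv, thresholds_json, data_dir="data"):
    """Return the argv lists for the generate, compare, and plot steps."""
    prefix = f"{data_dir}/{case.lower()}"  # assumed naming, e.g. data/t2
    return [
        [sys.executable, "generate_files.py", f"{data_dir}/T0.csv", delta_csv, prefix],
        [sys.executable, "compare_distribution_proj.py", case, thresholds_json,
         "--data-dir", data_dir],
        [sys.executable, "plot_graphs.py", f"results/results_{case}.json"],
    ]


def run_case(case, delta_csv, thresholds_json):
    """Run the three steps in order for one scenario."""
    for argv in build_commands(case, delta_csv, thresholds_json):
        # check=True stops the pipeline at the first failing step.
        subprocess.run(argv, check=True)
```

Calling `run_case("T2", "data/T2_delta.csv", "results/thresholds_T0_nadd12_pctl99.json")` from the repository root would then execute the same commands as the manual steps.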

Attack Scenarios

The experiments reproduce the three canonical cases described in the paper.
Each scenario corresponds to a different type or intensity of client update.

| Scenario | Description | Type of Update | Attack Strength | Expected Behavior |
|---|---|---|---|---|
| T1 | Benign control (no adversarial activity). | Regular client updates only. | (none) | Distances remain below threshold (no alerts). |
| T2 | Strong attack: semantic backdoor using model replacement. | Adversarial updates scaled aggressively. | 100 | Clear multi-projection deviations; early detection triggered. |
| T3 | Stealthy attack: reduced-scale backdoor. | Adversarial updates with minimal scaling. | 20 | Moderate deviations; detectable but weaker signals. |

The scaling factor controls the amplitude of the malicious update relative to the global model.
Higher values of the scaling factor lead to faster convergence of the backdoor but make detection easier.
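
To make the trade-off concrete, the sketch below applies the scaling used in model-replacement attacks, where the adversary submits the global model plus a scaled difference toward its backdoored model. The function name `scaled_update` and the toy weight vectors are illustrative assumptions; the actual attack code is the one inherited from the base framework.

```python
import numpy as np


def scaled_update(global_weights, backdoored_weights, scale):
    """Model-replacement style submission: G + scale * (X - G)."""
    g = np.asarray(global_weights, dtype=float)
    x = np.asarray(backdoored_weights, dtype=float)
    return g + scale * (x - g)


# Toy example: the deviation from the global model grows linearly with the scale.
g = np.zeros(4)
x = np.array([0.01, -0.02, 0.0, 0.03])
for scale in (1, 20, 100):
    delta = scaled_update(g, x, scale) - g
    print(scale, np.linalg.norm(delta))
```

Because the norm of the submitted deviation is proportional to the scale, a strength of 100 stands out from benign updates far more than a strength of 20.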

Command-Line Parameters

Each stage in the pipeline can be executed independently. Below are the main options.

`generate_files.py`

Creates CSVs combining the baseline (T0.csv) and delta updates.

Usage

```
python generate_files.py <T0.csv> <delta.csv> <output_prefix>
```

| Argument | Description |
|---|---|
| `<T0.csv>` | Baseline benign updates. |
| `<delta.csv>` | Delta file with new updates (benign or adversarial). |
| `<output_prefix>` | Output prefix for generated CSVs. |

`calibrate_thresholds_proj.py`

Performs bootstrap calibration of Energy and Wasserstein-1 thresholds.

Usage

```
python calibrate_thresholds_proj.py data/T0.csv [options]
```

| Option | Default | Description |
|---|---|---|
| `--B` | 500 | Bootstrap replications. |
| `--n0` | 50 | Baseline size per replication. |
| `--n-add` | - | Effective size of new (weighted) sample. |
| `--pstar` | - | Equivalent to `--n-add`, defines effective proportion p*. |
| `--pctl` | 95 | Percentile used for threshold calibration. |
| `--debias-proj` | - | Removes the dominant benign projection direction. |
| `--use-l2-perp` | - | Replaces l2 with the orthogonal norm (requires `--debias-proj`). |
| `--out-name` | auto | Output file name. |
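
The effect of `--debias-proj` and `--use-l2-perp` can be pictured with a small sketch. How the script defines the "dominant benign projection direction" is not stated here, so the code below assumes it is the leading right-singular vector of the benign update matrix; the function name `debias` and this choice of direction are hypothetical.

```python
import numpy as np


def debias(updates, benign):
    """Project out the dominant benign direction; return residuals and their norms."""
    benign = np.asarray(benign, dtype=float)
    updates = np.asarray(updates, dtype=float)
    # Assumed definition: leading right-singular vector of the benign updates.
    _, _, vt = np.linalg.svd(benign, full_matrices=False)
    u = vt[0]
    # Subtract each update's component along u, keeping only the orthogonal part.
    residual = updates - np.outer(updates @ u, u)
    # The norm of the residual plays the role of an "orthogonal" l2 norm.
    return residual, np.linalg.norm(residual, axis=1)
```

Under this reading, an update that moves only along the common benign direction has a residual norm near zero, while an update with a large off-direction component keeps most of its norm.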

`compare_distribution_proj.py`

Compares each generated run against the benign baseline and checks for deviations.

Usage

```
python compare_distribution_proj.py <CASE> <thresholds.json> [options]
```

| Option | Default | Description |
|---|---|---|
| `--data-dir` | `data` | Folder with generated CSVs. |
| `--outdir` | `results` | Output folder for results. |
| `--drift-prefix` | - | Optional drift correction prefix. |
| `--rule` | `any` | Trigger rule: `any` (OR) or `both` (AND). |
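
The two trigger rules can be sketched in a few lines. The function name `triggered` and the strict "greater than" comparison are assumptions for illustration; only the OR/AND semantics of `any` and `both` come from the option description.

```python
def triggered(energy, w1, thr_energy, thr_w1, rule="any"):
    """Return True when the chosen rule flags the run as anomalous."""
    flags = (energy > thr_energy, w1 > thr_w1)
    if rule == "any":
        # OR: one metric above its threshold is enough.
        return any(flags)
    if rule == "both":
        # AND: both metrics must exceed their thresholds.
        return all(flags)
    raise ValueError(f"unknown rule: {rule!r}")
```

The `any` rule favors early detection, while `both` trades sensitivity for fewer false alarms.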

`plot_graphs.py`

Generates the figures for Energy and W1 metrics.

Usage

```
python plot_graphs.py results/results_T2.json
```

| Option | Default | Description |
|---|---|---|
| `--outdir` | `results` | Output folder for figures. |

Reproducibility & Review Notice

This repository accompanies the submission:

Early Detection of Backdoor Attacks in Federated Learning via Ecosystemic Symmetry Breaking
Submitted to the First International Workshop on Security and Privacy in Federated and Distributed Architectures (FEDAS'25),
in conjunction with BDCAT 2025, ACM, Nantes, France.

The repository is provided solely for peer review and reproducibility evaluation.
It reproduces all experiments and figures reported in the paper (Sections 3-5).

At this stage, no public license is granted. Redistribution or reuse is not allowed until final publication and licensing.

If you are a reviewer:

- All scripts can be executed with Python 3.8 and standard dependencies.
- Default parameters reproduce the benign (T1), strong attack (T2), and stealthy attack (T3) scenarios.
- Output files (.json, .png) will appear in the results/ folder.

For questions related to this reproducibility package, please contact the corresponding author through the submission system.

Acknowledgments

This reproducibility package was developed as part of the research submitted to FEDAS'25.
The underlying research received support from the following projects:

- Di4SPDS (PCI2023145980-2) funded by MCIN/AEI/10.13039/501100011033 and by the European Union (Chist-ERA Program).
- KOSMOS-UCLM (PID2024-155363OB-C44) funded by MCIN/AEI/10.13039/501100011033/FEDER, EU.
- AURORA (SBPLY/24/180225/000074) funded by the Regional Government of Castilla-La Mancha and the European Regional Development Fund (FEDER).
- RADAR (2025-GRIN-38447) funded by FEDER.
- RED2024-154240-T funded by MICIU/AEI/10.13039/501100011033.
