Implementation and reproducibility package for the experiments described in:
Carlos Mario Braga Ortuño, Manuel A. Serrano, and Eduardo Fernández-Medina (2025)
Early Detection of Backdoor Attacks in Federated Learning via Ecosystemic Symmetry Breaking
Proceedings of the IEEE/ACM 12th International Conference on Big Data Computing, Applications and Technologies (BDCAT 2025), ACM.
Article No. 27, 6 pages.
DOI: https://doi.org/10.1145/3773276.3775243
This repository provides the full experimental pipeline required to reproduce the results, figures, and statistical analyses reported in the BDCAT 2025 paper cited above. It includes all components needed to replicate the proposed unsupervised, per-client detection method for early identification of backdoor and poisoning attacks in federated learning under secure aggregation.
In addition, this repository serves as the reproducibility and experimental baseline for a new extended article currently in preparation by the same authors, to be submitted to JNIC 2025 (Jornadas Nacionales de Investigación en Ciberseguridad).
The new article is entitled:
“De la detección al diagnóstico: interpretación estructural de rupturas ecosistémicas en ciberseguridad”
This forthcoming work builds upon the methodology introduced in the BDCAT 2025 paper and extends it towards:
- structural interpretation of detected symmetry-breaking events,
- diagnostic insights into ecosystemic deviations,
- and a broader cybersecurity-oriented analysis framework.
The repository is maintained to ensure full reproducibility, methodological transparency, and experimental continuity between the published international contribution and the ongoing national-level research work.
This implementation extends the canonical framework by Bagdasaryan et al. (2020):
How To Backdoor Federated Learning, AISTATS 2020, PMLR v108, pp. 2938-2948.
The original code includes:
- Federated Averaging (FedAvg) implementation in PyTorch.
- Datasets: CIFAR-10 and MNIST.
- Attack strategies: model replacement and semantic backdoor.
- Multi-client simulation with benign and adversarial participants.
This project preserves the training, aggregation, and attack logic from the canonical framework to ensure comparability with prior results.
generate_files.py # Generates baseline + delta combinations (e.g., t1_1.csv ... t1_5.csv)
calibrate_thresholds_proj.py # Calibrates Energy and Wasserstein-1 thresholds using bootstrap
compare_distribution_proj.py # Compares runs against the baseline and triggers anomaly alerts
diagnostic_validation.py # Runs the structural diagnostic validation module (JNIC 2025 extension)
plot_graphs.py # Plots Energy and W1 values with threshold overlays
data/ # Input CSVs (T0.csv, T1_delta.csv, etc.)
results/ # Output results
├── detection/ # Detection outputs (JSON + PNG files for threshold exceedances)
└── diagnostic/ # Diagnostic validation outputs (JNIC 2025 extension)
    ├── *.json # Structural diagnostic summaries
    └── *.png # Representative figures from diagnostic validation
README.md
Dependencies:
- Python 3.8+
- numpy, pandas, scipy, matplotlib
- Baseline calibration (benign reference)

  Calibrate thresholds using the benign dataset `T0.csv`:

  `python calibrate_thresholds_proj.py data/T0.csv --B 500 --n0 50 --pctl 99`

  This produces a JSON file in `results/`, such as `results/thresholds_T0_nadd12_pctl99.json`.

- Generate per-run datasets (T1, T2, T3)

  Using a delta file representing new updates (benign or adversarial):

  `python generate_files.py data/T0.csv data/T2_delta.csv data/t2`

  This creates the files `data/t2_1.csv` ... `data/t2_5.csv`.

- Compare distributions and detect deviations

  Evaluate deviations using the calibrated thresholds:

  `python compare_distribution_proj.py T2 results/thresholds_T0_nadd12_pctl99.json --data-dir data`

  Output: `results/results_T2.json`

- Plot summary graphs

  Generate the figures for Energy and W1 metrics:

  `python plot_graphs.py results/results_T2.json`

  Output files in `results/`: `T2_energy.png`, `T2_w1.png`

- Structural diagnostic validation (post-detection analysis)

  Perform structural diagnosis over the detected threshold exceedances to characterize ecosystemic symmetry-breaking patterns across runs:

  `python diagnostic_validation.py`

  This script operates on the detection outputs generated in the previous steps (`results/results_T1.json`, `results/results_T2.json`, `results/results_T3.json`) and produces:

  - Diagnostic summary: `results/diagnostic/diagnostic_summary.json`
  - Diagnostic thresholds (calibrated from benign reference behavior, percentile 95): `results/diagnostic/diagnostic_thresholds.json`
  - Two-dimensional diagnostic plots: `results/diagnostic/density_plot.pdf`, `results/diagnostic/persistence_plot.pdf`, and `results/diagnostic/stability_plot.pdf`
  - Three-dimensional structural space visualization: `results/diagnostic/structural_space_3d.pdf`

The diagnostic stage introduces three structural indicators: rupture density δ(T), rupture persistence π(T), and structural stability S(T).
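
The per-scenario steps above (generate, compare, plot) can be chained for T1, T2, and T3 with a small driver. The following is a minimal sketch and not part of the repository: the helper names `pipeline_commands` and `run_pipeline` are hypothetical, and the delta file names for T1 and T3 are assumed to follow the `data/T2_delta.csv` pattern shown for T2. Calibration and `diagnostic_validation.py` are expected to be run once, separately, before and after this loop.

```python
import subprocess
from typing import List, Sequence


def pipeline_commands(case: str, thresholds: str, data_dir: str = "data",
                      results_dir: str = "results") -> List[List[str]]:
    """Build the generate/compare/plot commands for one scenario, e.g. "T2".

    Assumption: the delta file is <data_dir>/<CASE>_delta.csv and the output
    prefix is <data_dir>/<case in lowercase>, mirroring the T2 example.
    """
    prefix = f"{data_dir}/{case.lower()}"
    return [
        ["python", "generate_files.py", f"{data_dir}/T0.csv",
         f"{data_dir}/{case}_delta.csv", prefix],
        ["python", "compare_distribution_proj.py", case, thresholds,
         "--data-dir", data_dir],
        ["python", "plot_graphs.py", f"{results_dir}/results_{case}.json"],
    ]


def run_pipeline(cases: Sequence[str], thresholds: str,
                 dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute the commands for every scenario in order."""
    executed = []
    for case in cases:
        for cmd in pipeline_commands(case, thresholds):
            executed.append(cmd)
            if dry_run:
                print(" ".join(cmd))
            else:
                # check=True stops the loop at the first failing step.
                subprocess.run(cmd, check=True)
    return executed


if __name__ == "__main__":
    run_pipeline(["T1", "T2", "T3"],
                 "results/thresholds_T0_nadd12_pctl99.json")
```

With `dry_run=True` (the default here) the driver only prints the commands, which makes it easy to confirm the file names before anything is executed.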
The experiments reproduce the three canonical cases described in the paper.
Each scenario corresponds to a different type or intensity of client update.
| Scenario | Description | Type of Update | Attack Strength | Expected Behavior |
|---|---|---|---|---|
| T1 | Benign control (no adversarial activity). | Regular client updates only. | (none) | Distances remain below threshold (no alerts). |
| T2 | Strong attack: semantic backdoor using model replacement. | Adversarial updates scaled aggressively. | 100 | Clear multi-projection deviations; early detection triggered. |
| T3 | Stealthy attack: reduced-scale backdoor. | Adversarial updates with minimal scaling. | 20 | Moderate deviations; detectable but weaker signals. |
The scaling factor controls the amplitude of the malicious update relative to the global model.
Higher values of the scaling factor lead to faster convergence of the backdoor but make detection easier.
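
To illustrate the role of the scaling factor, the sketch below follows the model-replacement formulation of Bagdasaryan et al. (2020), where the attacker submits G + γ(X − G) for a backdoored model X and global model G. It is a toy example on plain Python lists, not the repository's training code; the function names and the numeric values are illustrative assumptions.

```python
from typing import List, Sequence


def scaled_update(global_w: Sequence[float], backdoored_w: Sequence[float],
                  gamma: float) -> List[float]:
    """Attacker's submission: G + gamma * (X - G), element-wise."""
    return [g + gamma * (x - g) for g, x in zip(global_w, backdoored_w)]


def aggregate(global_w: Sequence[float],
              client_models: Sequence[Sequence[float]],
              eta: float) -> List[float]:
    """FedAvg-style step: G' = G + (eta / n) * sum_i (L_i - G)."""
    n = len(client_models)
    new_global = []
    for j, g in enumerate(global_w):
        total_delta = sum(model[j] - g for model in client_models)
        new_global.append(g + (eta / n) * total_delta)
    return new_global


# Toy setting (illustrative values): 10 clients, global learning rate 1.0.
G = [0.5, -1.0]           # current global model
X = [0.7, -0.4]           # attacker's backdoored model
benign = [list(G)] * 9    # assumption: benign clients have converged near G

# gamma = n / eta replaces the global model with X in one round.
full = aggregate(G, benign + [scaled_update(G, X, gamma=10.0)], eta=1.0)

# A smaller gamma moves the global model only part of the way towards X,
# and the submitted update lies closer to the benign ones.
partial = aggregate(G, benign + [scaled_update(G, X, gamma=2.0)], eta=1.0)

print(full, partial)
```

The larger γ is, the further the malicious submission lies from the benign updates, which is consistent with the stronger deviations expected in T2 compared with T3.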
Each stage in the pipeline can be executed independently. Below are the main options.
`generate_files.py` creates CSVs combining the baseline (`T0.csv`) and delta updates.
Usage
`python generate_files.py <T0.csv> <delta.csv> <output_prefix>`

| Argument | Description |
|---|---|
| `<T0.csv>` | Baseline benign updates. |
| `<delta.csv>` | Delta file with new updates (benign or adversarial). |
| `<output_prefix>` | Output prefix for generated CSVs. |

`calibrate_thresholds_proj.py` performs bootstrap calibration of Energy and Wasserstein-1 thresholds.
Usage
`python calibrate_thresholds_proj.py data/T0.csv [options]`

| Option | Default | Description |
|---|---|---|
| `--B` | 500 | Bootstrap replications. |
| `--n0` | 50 | Baseline size per replication. |
| `--n-add` | - | Effective size of new (weighted) sample. |
| `--pstar` | - | Equivalent to `--n-add`; defines the effective proportion p*. |
| `--pctl` | 95 | Percentile used for threshold calibration. |
| `--debias-proj` | - | Removes the dominant benign projection direction. |
| `--use-l2-perp` | - | Replaces l2 with the orthogonal norm (requires `--debias-proj`). |
| `--out-name` | auto | Output file name. |

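
The idea behind bootstrap calibration can be sketched as follows: resample the benign baseline many times, measure the distance between two benign resamples, and take a high percentile of those distances as the alert threshold. This is a minimal standard-library sketch under stated assumptions, not the script's actual logic: it works on a single one-dimensional feature, treats `n_add` simply as the size of the second resample, and covers only the Wasserstein-1 distance. With SciPy available, `scipy.stats.wasserstein_distance` computes the same one-dimensional quantity.

```python
import random
from bisect import bisect_right
from typing import List, Sequence


def wasserstein_1d(x: Sequence[float], y: Sequence[float]) -> float:
    """W1 between two empirical 1-D samples: integral of |F_x - F_y|."""
    xs, ys = sorted(x), sorted(y)
    grid = sorted(xs + ys)
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        fx = bisect_right(xs, left) / len(xs)   # empirical CDF of x at left
        fy = bisect_right(ys, left) / len(ys)   # empirical CDF of y at left
        total += abs(fx - fy) * (right - left)
    return total


def percentile(values: Sequence[float], pctl: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * pctl / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def calibrate_threshold(baseline: Sequence[float], B: int = 500, n0: int = 50,
                        n_add: int = 12, pctl: float = 95.0,
                        seed: int = 0) -> float:
    """Threshold = pctl-th percentile of benign-vs-benign W1 distances."""
    rng = random.Random(seed)
    null_distances: List[float] = []
    for _ in range(B):
        reference = rng.choices(baseline, k=n0)     # resampled baseline
        candidate = rng.choices(baseline, k=n_add)  # resampled "new" updates
        null_distances.append(wasserstein_1d(reference, candidate))
    return percentile(null_distances, pctl)
```

A higher `pctl` yields a more conservative threshold (fewer false alerts on benign runs) at the cost of lower sensitivity to weak deviations.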
`compare_distribution_proj.py` compares each generated run against the benign baseline and checks for deviations.
Usage
`python compare_distribution_proj.py <CASE> <thresholds.json> [options]`

| Option | Default | Description |
|---|---|---|
| `--data-dir` | `data` | Folder with generated CSVs. |
| `--outdir` | `results` | Output folder for results. |
| `--drift-prefix` | - | Optional drift correction prefix. |
| `--rule` | `any` | Trigger rule: `any` (OR) or `both` (AND). |

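
The `any`/`both` trigger rule can be illustrated with a short sketch. The code below computes a one-dimensional energy statistic, 2E|X−Y| − E|X−X'| − E|Y−Y'|, and combines two threshold checks. It is an assumption-laden illustration rather than the script's code: the function names are hypothetical, the strict `>` comparison is assumed, and the script may use a different normalization of the energy distance (for example, `scipy.stats.energy_distance` returns the square root of this quantity).

```python
from itertools import product
from typing import Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average |u - v| over all pairs (u in a, v in b)."""
    return sum(abs(u - v) for u, v in product(a, b)) / (len(a) * len(b))


def energy_distance_sq(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared energy distance between two 1-D samples (V-statistic form)."""
    return (2.0 * mean_abs_diff(x, y)
            - mean_abs_diff(x, x)
            - mean_abs_diff(y, y))


def alert(energy: float, w1: float, thr_energy: float, thr_w1: float,
          rule: str = "any") -> bool:
    """Combine the two threshold checks with OR ("any") or AND ("both")."""
    exceeds_energy = energy > thr_energy
    exceeds_w1 = w1 > thr_w1
    if rule == "any":
        return exceeds_energy or exceeds_w1
    if rule == "both":
        return exceeds_energy and exceeds_w1
    raise ValueError(f"unknown rule: {rule!r}")
```

The `any` rule favors early detection, since a single metric crossing its threshold is enough, whereas `both` requires agreement between the two metrics and is therefore less prone to false alerts.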
`plot_graphs.py` generates the figures for Energy and W1 metrics.
Usage
`python plot_graphs.py results/results_T2.json`

| Option | Default | Description |
|---|---|---|
| `--outdir` | `results` | Output folder for figures. |

`diagnostic_validation.py` performs structural diagnostic validation on the outputs of the detection pipeline, transforming threshold exceedances into interpretable ecosystemic indicators.
This script implements the diagnostic layer introduced in the JNIC 2025 extension, enabling post-detection analysis without modifying the original detection methodology.
Usage
`python diagnostic_validation.py`

This repository accompanies the following published work:
Early Detection of Backdoor Attacks in Federated Learning via Ecosystemic Symmetry Breaking
Proceedings of the IEEE/ACM 12th International Conference on Big Data Computing, Applications and Technologies (BDCAT 2025), ACM.
In addition, the repository serves as the reproducibility and validation package for the following work currently under submission:
De la detección al diagnóstico: interpretación estructural de rupturas ecosistémicas en ciberseguridad
Submitted to JNIC 2025 (Jornadas Nacionales de Investigación en Ciberseguridad).
- The detection pipeline (baseline calibration, per-run comparison, and threshold-based alerts) fully reproduces the experiments and figures reported in the BDCAT 2025 paper.
- The diagnostic validation module extends the original pipeline with post-detection structural analysis and corresponds to the experimental contribution evaluated in the JNIC 2025 submission.
All experiments are reproducible using the scripts provided in this repository.
- This repository is made available for scientific reproducibility and peer review purposes.
- The diagnostic components (`diagnostic_validation.py` and `results/diagnostic/`) are provided specifically for the evaluation of the JNIC 2025 submission.
- At this stage, no open-source license is granted. Redistribution or reuse of the code or experimental outputs is not permitted without explicit author permission.
- All scripts are compatible with Python 3.8+ and rely only on standard scientific Python libraries.
- Default parameters reproduce:
  - Benign behavior (`T1`)
  - Strong backdoor attack (`T2`)
  - Stealthy backdoor attack (`T3`)
- Output files (`.json`, `.png`, `.pdf`) are generated under the `results/` directory.
For questions related to this reproducibility package, please contact the authors through the corresponding conference submission system.
This reproducibility package was developed as part of the research submitted to FEDAS ’25.
The underlying research received support from the following projects:
- Di4SPDS (PCI2023145980-2) funded by MCIN/AEI/10.13039/501100011033 and by the European Union (Chist-ERA Program).
- KOSMOS-UCLM (PID2024-155363OB-C44) funded by MCIN/AEI/10.13039/501100011033/FEDER, EU.
- AURORA (SBPLY/24/180225/000074) funded by the Regional Government of Castilla-La Mancha and the European Regional Development Fund (FEDER).
- RADAR (2025-GRIN-38447) funded by FEDER.
- RED2024-154240-T funded by MICIU/AEI/10.13039/501100011033.