Benchmarking of Automated Root Cause Analysis and Causal Discovery in Manufacturing Using the causRCA Dataset (CIRPe 2025)
A comprehensive repository for the causRCA dataset and benchmarking framework for causal discovery and automated root cause analysis (RCA) methods in manufacturing systems.
This repository contains the implementation, evaluation scripts, and benchmarking tools for the causRCA dataset - a collection of real-world time series data from an industrial vertical lathe. The dataset includes both normal operation data and labeled fault scenarios from hardware-in-the-loop simulations, providing ground truth for causal discovery and root cause analysis research.
- Real-world industrial data from CNC-controlled vertical lathe
- Labeled fault scenarios with known ground truth causes
- Expert-derived causal graph for benchmarking causal discovery methods
- Multiple fault types: Coolant, hydraulics, and probe system failures
- Extensive metadata enabling comprehensive method evaluation
├── data/ # causRCA dataset (see data/README_DATA_DESCRIPTION.md)
│ ├── real_op/ # Normal operation time series
│ ├── dig_twin/ # Fault simulation data
│ │ ├── exp_coolant/ # Coolant system faults
│ │ ├── exp_hydraulics/ # Hydraulic system faults
│ │ └── exp_probe/ # Probe system faults
│ ├── expert_graph/ # Expert-derived causal graph
│ └── categorical_encoding.json # Categorical variable encoding
├── src/ # Source code for analysis and benchmarking
├── eval/ # Evaluation scripts
│ ├── cd/ # Causal discovery evaluation
│ └── rca/ # Root cause analysis evaluation
The causRCA dataset is published on Zenodo:
Dataset: Link: https://doi.org/10.5281/zenodo.15876410
Dataset DOI: 10.5281/zenodo.15876410
# Install system dependencies
sudo apt-get install gcc g++ make graphviz graphviz-dev# Create conda environment
conda env create -f environment.yml
conda activate causrca
# Install package in development mode
pip install -e .Comprehensive dataset documentation is available in data/README_DATASET.md.
- Causal Discovery: Benchmark causal discovery algorithms against expert knowledge
- Root Cause Analysis: Evaluate supervised and unsupervised RCA methods
- Anomaly Detection: Test fault detection capabilities
- Causal Discovery: Benchmark learned causal graphs against expert-derived ground truth
- Supervised Root Cause Analysis: Train and test models on labeled fault diagnoses
- Unsupervised Root Cause Analysis: Identify manipulated variables with known ground truth
- Anomaly Detection: Evaluate fault detection in industrial time series
This section provides direct links between the evaluation scripts and the results tables / diagrams presented in our paper.
# Generate Causal Discovery Performance Table for majority approach (dataset ensemble learning)
python eval/cd/cd_eval.py# Generate Supervised and Unsupervised RCA Performance Tables
python eval/rca/rca_eval.py# Generate Causal RCA Performance Table using causal graphs for unsupervised root cause analysis
# Evaluates AVG@3 performance scores across three datasets (Coolant, Hydraulic, Probe)
# using causal graphs from four causal discovery algorithms compared to expert-graphs
python eval/rca/causal_rca_unsupervised_eval.py# Generate Boxplot for RCA performance using different F1 variants of expert graphs
python eval/rca/causal_rca_f1_vars_eval.pyThis project is licensed under the Apache License 2.0 - see the LICENSE file for details.
CIRPe Paper:
tba. (after review and publication)GitHub Repository:
@software{mehling_Benchmarking_of_Automated_2025,
author = {Mehling, Carl Willy and Pieper, Sven and Lüke, Tobias},
license = {Apache-2.0},
month = jul,
title = {{Benchmarking of Automated Root Cause Analysis and Causal Discovery
in Manufacturing Using the causRCA Dataset (CIRPe 2025)
}},
url = {https://github.com/causalgraph/causRCA},
year = {2025}
}Dataset:
@dataset{mehling_2025_15876410,
author = {Mehling, Carl Willy and
Pieper, Sven and
Lüke, Tobias},
title = {causRCA: Real-World Dataset for Causal Discovery
and Root Cause Analysis in Machinery
},
month = jul,
year = 2025,
publisher = {Zenodo},
version = {1.0.0},
doi = {10.5281/zenodo.15876410},
url = {https://doi.org/10.5281/zenodo.15876410},
}Principal Investigator:
- Carl Willy Mehling (Fraunhofer IWU) - ORCID
Co-Investigators:
We gratefully acknowledge:
- KAMAX Holding GmbH & Co. KG - Real production data provision
- Schuster Maschinenbau GmbH - Digital twin development support
- ISG Industrielle Steuerungstechnik GmbH - Digital twin implementation
- SEITEC GmbH - Hardware-in-the-loop setup and OPC UA data recording
This work was developed within the research project KausaLAssist, funded by the German Federal Ministry of Education and Research (BMBF) under grant 02P20A150.
Paper submission pending review - DOI will be added upon acceptance
We welcome the use of the causRCA dataset for evaluating causal discovery and root cause analysis methods. We also encourage repository improvements through issues and pull requests.