GitHub - mlrsrch/hierarchy_ensembles: Hierarchy-Exploiting Ensembles for Improved Multi-Class Classification

HiGEC
Hierarchy Generation and Extended Classification Framework

HiGEC is a Python framework for enhancing multi-class classification through automated hierarchy generation (HG) and flexible hierarchy exploitation (HE) strategies. It supports hybrid approaches that integrate hierarchical and flat classifier outputs.

🔧 Installation

git clone https://github.com/alagoz/higec.git
cd higec
pip install -r requirements.txt

Dependencies:
numpy scipy matplotlib scikit-learn scikit-learn-extra proglearn xgboost lightgbm

⚡ Key Features

Automatic hierarchy generation from flat class labels

🧩 Hybrid HE+F classification strategies

🖇️ Support for any scikit-learn compatible classifier

📊 Benchmark-ready with OpenML integration

🌳 Visualization tools for hierarchy inspection
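
To make the hybrid HE+F idea concrete, here is a minimal sketch of one possible way to combine hierarchical and flat outputs for a single sample. It is an assumption for illustration, not HiGEC's actual combination rule: leaf probabilities are taken as the product of branch probabilities along each path, then averaged with the flat classifier's probabilities. The helpers `leaf_probabilities` and `blend`, the `weight` parameter, and all numbers are hypothetical.

```python
def leaf_probabilities(node, path_prob=1.0):
    """Multiply branch probabilities along each root-to-leaf path.

    A node is either a class label (leaf) or a list of
    (branch_probability, child_node) pairs.
    """
    if not isinstance(node, list):
        return {node: path_prob}
    probs = {}
    for branch_prob, child in node:
        probs.update(leaf_probabilities(child, path_prob * branch_prob))
    return probs


def blend(hier_probs, flat_probs, weight=0.5):
    """Weighted average of hierarchical and flat class probabilities.

    Assumption: a simple convex combination; the real framework may
    combine the two outputs differently.
    """
    return {c: weight * hier_probs[c] + (1 - weight) * flat_probs[c]
            for c in flat_probs}


# Toy outputs for one sample (illustrative numbers only).
tree = [(0.5, "a"), (0.5, [(0.75, "b"), (0.25, "c")])]
hier = leaf_probabilities(tree)
flat = {"a": 0.25, "b": 0.5, "c": 0.25}
hybrid = blend(hier, flat)
prediction = max(hybrid, key=hybrid.get)
print(hybrid, prediction)
```

In this toy case the hierarchy alone favours class `a`, while the blended scores favour `b`, which shows how the flat output can override a doubtful top-level split.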

🚀 Quick Start

Run the example:

python run_higec_example.py

Pipeline:

Downloads OpenML dataset
Trains flat classifier baseline
Generates class hierarchy
Evaluates hierarchical approach
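
The hierarchy-generation step can be pictured with a small, self-contained sketch. It assumes that classes are grouped by complete-linkage agglomerative clustering over a class-to-class dissimilarity matrix, which is what the `HAC|COMPLETE` part of the example scheme suggests; how `HG.py` actually computes dissimilarities and builds the tree may differ. The function name and the toy matrix are hypothetical.

```python
def complete_linkage_merges(dissim):
    """Agglomerate classes with complete linkage; return the merge order.

    dissim is a symmetric matrix (list of lists) of class dissimilarities.
    Each merge is recorded as (cluster_a, cluster_b, distance), where a
    cluster is a sorted tuple of class indices.
    """
    clusters = [(i,) for i in range(len(dissim))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance of the farthest pair of members.
                dist = max(dissim[i][j]
                           for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merged = tuple(sorted(clusters[a] + clusters[b]))
        merges.append((clusters[a], clusters[b], dist))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Toy dissimilarities between 4 classes (illustrative values only).
toy = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
merges = complete_linkage_merges(toy)
for left, right, dist in merges:
    print(left, right, dist)
```

Reading the merges from last to first gives a top-down hierarchy: the final merge is the root split, and each earlier merge is a split further down the tree.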

🛠 Core Components

File	Purpose
`HG.py`	Hierarchy generation
`HE.py`	Hierarchy exploitation
`hdc.py`	Divisive clustering
`utils.py`	Data handling & visualization

🧪 Customization

Adjust the parameters in `run_higec_example.py`:

DID = 46264                       # OpenML dataset ID
HiGEC = 'CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB]'  # HG + HE scheme
CLF_NAME_FC = 'RF'                # Flat classifier

Available classifiers: RF, XGB, ETC, LGB.
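
The scheme string looks like it packs three parts: a hierarchy-generation stage with options, a hierarchy-exploitation stage with its classifier, and an optional flat classifier after `+F`. The sketch below splits such a string under that assumed grammar, which is inferred from the example value only. `parse_scheme` and the field names are hypothetical and not part of HiGEC.

```python
import re

# Assumed grammar, inferred from the example string only:
#   <HG name>[<HG options separated by '|'>]-<HE name>[<clf>]+F[<clf>]
# The trailing "+F[...]" part is treated as optional.
SCHEME_RE = re.compile(
    r"^(?P<hg>\w+)\[(?P<hg_opts>[^\]]*)\]"
    r"-(?P<he>\w+)\[(?P<he_clf>\w+)\]"
    r"(?:\+F\[(?P<flat_clf>\w+)\])?$"
)


def parse_scheme(scheme):
    """Split a HiGEC-style scheme string into its parts (hypothetical helper)."""
    match = SCHEME_RE.match(scheme)
    if match is None:
        raise ValueError(f"unrecognized scheme: {scheme!r}")
    parts = match.groupdict()
    parts["hg_opts"] = parts["hg_opts"].split("|") if parts["hg_opts"] else []
    return parts


parsed = parse_scheme("CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB]")
print(parsed)
```

A parser like this is mainly useful for checking a scheme string for typos before starting a long benchmark run.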

📈 Example Output

Extended Linkage Table:

node_id:0, node_type:parent, subsets:[[0], [1,2,3,4]], branch_ids:[0,7], parent_id:None
node_id:1, node_type:parent, subsets:[[3,4],[1,2]], branch_ids:[5,6], parent_id:0
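
The linkage rows above can be turned into Python dictionaries for inspection. This is a minimal sketch that assumes the printed `key:value` layout shown here and that `branch_ids` and `subsets` are paired by position; `parse_linkage_row` is a hypothetical helper, not a HiGEC function.

```python
import ast
import re


def parse_linkage_row(line):
    """Parse one 'key:value, key:value' row into a dict (hypothetical helper)."""
    row = {}
    # Split only on commas that are followed by a new "key:" token,
    # so commas inside bracketed lists are left alone.
    for field in re.split(r",\s*(?=[A-Za-z_]\w*:)", line.strip()):
        key, _, raw = field.partition(":")
        try:
            row[key] = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError):
            row[key] = raw.strip()  # plain words such as "parent"
    return row


rows = [
    parse_linkage_row(
        "node_id:0, node_type:parent, subsets:[[0], [1,2,3,4]], "
        "branch_ids:[0,7], parent_id:None"
    ),
    parse_linkage_row(
        "node_id:1, node_type:parent, subsets:[[3,4],[1,2]], "
        "branch_ids:[5,6], parent_id:0"
    ),
]

# Assumption: the i-th branch id corresponds to the i-th class subset.
for row in rows:
    for branch_id, classes in zip(row["branch_ids"], row["subsets"]):
        print(f"node {row['node_id']} -> branch {branch_id}: classes {classes}")
```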

Performance Comparison:

- Flat Classification (RF) (f1): 0.3517 in 0.4309 seconds
- HiGEC: CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB] (f1): 0.3700 in 1.1853 seconds

Generated Hierarchy: (figure; the repository includes generated_hier.png)

📊 Benchmark Results

HiGEC was evaluated on 100 multi-class tabular datasets, showing consistent F1-score gains over flat classification (FC), particularly with hybrid HE+F configurations.

Mean F1 Comparison (HiGEC vs FC)

Mean F1 Scores & Standard Deviations

Download raw results (F1 scores per dataset):

f1_scores_fc_vs_higec.csv – Contains per-dataset F1 scores for FC and nine selected HiGEC algorithms.
Columns: index, short, RF, XGB, ETC, LGB, LCN[XGB]+, LCPN[ETC]+F[XGB], LCPN[RF]+F[XGB], LCPN[XGB]+F[RF], LCL[XGB]+F[RF], LCPN[RF]+F[RF], LCL[RF]+F[XGB], LCPN[LGB]+F[XGB], LCPN[XGB]+F[XGB]

Download mean performance metrics for all FC algorithms:

fc_mean_performance.csv – Contains mean scores across datasets for each FC algorithm.
Columns: index, short, mean_f1_xgb, mean_f1_catb, ... , mean_acc_xgb, mean_acc_catb, ... , mean_auc_xgb, mean_auc_catb, ... , total_dur_xgb, total_dur_catb, ...

These CSV files allow full reproducibility and further statistical analysis of HiGEC’s performance compared to FC.
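
As a starting point for such an analysis, the sketch below compares one HiGEC column against one FC column per dataset. It assumes the CSV header matches the column names listed above; the file location is not stated here, and the rows in `sample` are made-up stand-in values, not real results. `compare_columns` is a hypothetical helper.

```python
import csv
import io
from statistics import mean


def compare_columns(csv_file, baseline, candidate):
    """Return (mean difference, wins) of candidate over baseline per dataset."""
    diffs = [
        float(row[candidate]) - float(row[baseline])
        for row in csv.DictReader(csv_file)
    ]
    return mean(diffs), sum(d > 0 for d in diffs)


# Stand-in rows with made-up values; for the real file, pass
# open("f1_scores_fc_vs_higec.csv", newline="") instead.
sample = io.StringIO(
    "index,short,RF,LCPN[ETC]+F[XGB]\n"
    "0,toy_a,0.50,0.75\n"
    "1,toy_b,0.50,0.25\n"
    "2,toy_c,0.25,0.50\n"
)
mean_gain, wins = compare_columns(sample, "RF", "LCPN[ETC]+F[XGB]")
print(f"mean F1 gain: {mean_gain:.4f}, datasets improved: {wins}")
```

Because the scores are paired by dataset, a paired test such as the Wilcoxon signed-rank test is a natural next step on the per-dataset differences.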

📖 References

For more details on methodology, datasets, and evaluations, see the HiGEC GitHub repository.
