HiGEC
Hierarchy Generation and Extended Classification Framework
HiGEC is a Python framework for enhancing multi-class classification through automated hierarchy generation (HG) and flexible hierarchy exploitation (HE) strategies. It supports hybrid approaches that integrate hierarchical and flat classifier outputs.
🔧 Installation
git clone https://github.com/alagoz/higec.git
cd higec
pip install -r requirements.txt

Dependencies:
numpy scipy matplotlib scikit-learn scikit-learn-extra proglearn xgboost lightgbm
⚡ Key Features
Automatic hierarchy generation from flat class labels
🧩 Hybrid HE+F classification strategies
🖇️ Support for any scikit-learn compatible classifier
📊 Benchmark-ready with OpenML integration
🌳 Visualization tools for hierarchy inspection
🚀 Quick Start
Run the example:
python run_higec_example.py

Pipeline:
- Downloads an OpenML dataset
- Trains a flat classifier baseline
- Generates a class hierarchy
- Evaluates the hierarchical approach
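
To make the "generates a class hierarchy" step more concrete, here is a minimal, standard-library-only sketch of one common way to derive a hierarchy from flat labels: complete-linkage agglomerative clustering over class centroids. It is an illustration under stated assumptions, not HiGEC's actual implementation; the helper names `class_centroids` and `build_hierarchy` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def class_centroids(X, y):
    """Return {label: mean feature vector} for each class label in y."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def build_hierarchy(centroids):
    """Complete-linkage agglomerative clustering over class centroids.

    Returns a nested tuple whose leaves are the original class labels.
    """
    # Each cluster is a pair: (subtree, list of member labels).
    clusters = [(label, [label]) for label in sorted(centroids)]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(centroids[p], centroids[q]) for p in a[1] for q in b[1])

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Toy data (hypothetical): classes 0 and 1 lie close together, class 2 is far away.
X = [[0.0, 0.0], [0.2, 0.0], [1.0, 0.0], [1.2, 0.0], [9.0, 9.0], [9.2, 9.0]]
y = [0, 0, 1, 1, 2, 2]

hierarchy = build_hierarchy(class_centroids(X, y))
print(hierarchy)
```

In the resulting tree, classes 0 and 1 share a parent node while class 2 branches off at the root, which is the kind of structure a hierarchical classifier can then exploit.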
🛠 Core Components
| File | Purpose |
|---|---|
| HG.py | Hierarchy generation |
| HE.py | Hierarchy exploitation |
| hdc.py | Divisive clustering |
| utils.py | Data handling & visualization |
🧪 Customization
Adjust parameters in 'run_higec_example.py':
DID = 46264 # OpenML dataset ID
HiGEC = 'CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB]' # HG + HE scheme
CLF_NAME_FC = 'RF' # Flat classifier

Available classifiers: RF, XGB, ETC, LGB.
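
The scheme string packs several choices into one value. The sketch below shows one way to read it, assuming the layout is `<HG>-<HE>+<F>` with each part written as `NAME[ARG|ARG]`. That layout is inferred from the single example above; `parse_scheme` and `parse_component` are hypothetical helpers, and HiGEC's own parsing may differ.

```python
import re

# Matches one "NAME[ARG|ARG...]" component, e.g. "CCM[HAC|COMPLETE]".
COMPONENT = re.compile(r"^(?P<name>[A-Za-z0-9]+)\[(?P<args>[^\]]*)\]$")


def parse_component(text):
    """Split 'NAME[A|B]' into its name and a list of arguments."""
    match = COMPONENT.match(text)
    if match is None:
        raise ValueError(f"unrecognized component: {text!r}")
    return {"name": match["name"], "args": match["args"].split("|")}


def parse_scheme(scheme):
    """Split '<HG>-<HE>[+<F>]' into its parts (assumed layout)."""
    hg_text, _, rest = scheme.partition("-")
    he_text, _, flat_text = rest.partition("+")
    return {
        "hg": parse_component(hg_text),      # hierarchy generation
        "he": parse_component(he_text),      # hierarchy exploitation
        "flat": parse_component(flat_text) if flat_text else None,  # hybrid flat part
    }


parsed = parse_scheme("CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB]")
print(parsed)
```

Read this way, the example configuration names a hierarchy generation part (`CCM` with arguments `HAC` and `COMPLETE`), a hierarchy exploitation part (`LCPN` with `ETC`), and a flat part (`F` with `XGB`).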
📈 Example Output
Extended Linkage Table:
node_id:0, node_type:parent, subsets:[[0], [1,2,3,4]], branch_ids:[0,7], parent_id:None
node_id:1, node_type:parent, subsets:[[3,4],[1,2]], branch_ids:[5,6], parent_id:0

Performance Comparison:
- Flat Classification (RF) (f1): 0.3517 in 0.4309 seconds
- HiGEC: CCM[HAC|COMPLETE]-LCPN[ETC]+F[XGB] (f1): 0.3700 in 1.1853 seconds

📊 Benchmark Results
HiGEC was evaluated on 100 multi-class tabular datasets, showing consistent F1-score gains over flat classification (FC), particularly with hybrid HE+F configurations.
Download raw results (F1 scores per dataset):
- f1_scores_fc_vs_higec.csv – Contains per-dataset F1 scores for the FC baselines and nine selected HiGEC algorithms.
  - Columns: index,short,RF,XGB,ETC,LGB,LCN[XGB]+,LCPN[ETC]+F[XGB],LCPN[RF]+F[XGB],LCPN[XGB]+F[RF],LCL[XGB]+F[RF],LCPN[RF]+F[RF],LCL[RF]+F[XGB],LCPN[LGB]+F[XGB],LCPN[XGB]+F[XGB]
Download mean performance metrics for all FC algorithms:
- fc_mean_performance.csv – Contains mean scores across datasets for each FC algorithm.
  - Columns: index,short,mean_f1_xgb,mean_f1_catb, ... ,mean_acc_xgb,mean_acc_catb, ... ,mean_auc_xgb,mean_auc_catb, ... ,total_dur_xgb,total_dur_catb, ...
These CSV files allow full reproducibility and further statistical analysis of HiGEC’s performance compared to FC.
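
As a starting point for that kind of analysis, the sketch below compares two F1 columns row by row using only the standard library. The column names come from the list above; the `summarize` helper, the local file path, and the inline sample rows are assumptions, and the sample numbers are made-up placeholders rather than benchmark results.

```python
import csv
import io
from statistics import mean


def summarize(csv_file, baseline, candidate):
    """Compare two F1 columns row by row (one row per dataset)."""
    rows = list(csv.DictReader(csv_file))
    diffs = [float(r[candidate]) - float(r[baseline]) for r in rows]
    return {
        "datasets": len(diffs),
        "mean_gain": mean(diffs),          # average F1 difference
        "wins": sum(d > 0 for d in diffs),  # datasets where candidate is better
    }


# Placeholder rows with made-up numbers, NOT real benchmark results.
sample = io.StringIO(
    "index,short,RF,LCPN[ETC]+F[XGB]\n"
    "0,toy_a,0.50,0.75\n"
    "1,toy_b,0.75,0.50\n"
    "2,toy_c,0.25,0.75\n"
)
summary = summarize(sample, baseline="RF", candidate="LCPN[ETC]+F[XGB]")
print(summary)

# With the downloaded file (the local path is an assumption):
# with open("f1_scores_fc_vs_higec.csv", newline="") as f:
#     print(summarize(f, "RF", "LCPN[ETC]+F[XGB]"))
```

A paired comparison like this is only a first look; a paired significance test over the per-dataset differences would be a natural next step.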
📖 References
For more details on methodology, datasets, and evaluations, see the HiGEC GitHub repository.

