Releases: cognitivefactory/features-maximization-metric
1.0.0
Features Maximization Metric
Implementation of the Features Maximization Metric, an unbiased metric for estimating the quality of an unsupervised classification.
- GitHub repository: https://github.com/cognitivefactory/features-maximization-metric/tree/1.0.0
- Main documentation: https://cognitivefactory.github.io/features-maximization-metric/
- PyPI distribution: https://pypi.org/project/cognitivefactory-features-maximization-metric/1.0.0/
Quick description
Features Maximization (FMC) is a feature selection method described in Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7.
This metric is computed by applying the following steps:
- Compute the Features F-Measure metric (based on the Features Recall and Features Predominance metrics).
  (a) The Features Recall FR[f][c] for a given class c and a given feature f is the ratio between the sum of the vector weights of feature f for data in class c and the sum of the vector weights of feature f for all data. It answers the question: "Can the feature f distinguish the class c from the other classes c'?"
  (b) The Features Predominance FP[f][c] for a given class c and a given feature f is the ratio between the sum of the vector weights of feature f for data in class c and the sum of the vector weights of all features f' for data in class c. It answers the question: "Can the feature f identify the class c better than the other features f'?"
  (c) The Features F-Measure FM[f][c] for a given class c and a given feature f is the harmonic mean of the Features Recall (a) and the Features Predominance (b). It answers the question: "How much information does the feature f contain about the class c?"
- Compute the Features Selection (based on a comparison with the F-Measure Overall Average).
  (d) The F-Measure Overall Average is the average of the Features F-Measure (c) over all classes c and all features f. It answers the question: "What is the mean amount of information contained by features across all classes?"
  (e) A feature f is Selected if and only if there exists at least one class c for which the Features F-Measure (c) FM[f][c] is greater than the F-Measure Overall Average (d). It answers the question: "Which features contain more information than the mean information in the dataset?"
  (f) A feature f is Deleted if and only if the Features F-Measure (c) FM[f][c] never exceeds the F-Measure Overall Average (d), whatever the class c. It answers the question: "Which features do not contain more information than the mean information in the dataset?"
- Compute the Features Contrast and Features Activation (based on a comparison with the F-Measure Marginal Averages).
  (g) The F-Measure Marginal Average for a given feature f is the average of the Features F-Measure (c) over all classes c for that feature f. It answers the question: "What is the mean amount of information contained by the feature f across all classes?"
  (h) The Features Contrast FC[f][c] for a given class c and a given selected feature f is the ratio between the Features F-Measure (c) FM[f][c] and the F-Measure Marginal Average (g) of the selected feature f, raised to the power of an Amplification Factor. It answers the question: "How relevant is the feature f for distinguishing the class c?"
  (i) A selected feature f is Active for a given class c if and only if the Features Contrast (h) FC[f][c] is greater than 1.0. It answers the question: "For which classes is a selected feature f relevant?"
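For readers who prefer formulas, the definitions (a) to (i) can be written compactly as below. The notation is introduced here for illustration only and is not taken from the package: W_{d,f} is the weight of feature f in the vector of datum d, F is the set of features, C is the set of classes, and k is the Amplification Factor. The contrast is read here as the whole ratio raised to the power k, which is one interpretation of the wording in (h).

```latex
\begin{align*}
FR[f][c] &= \frac{\sum_{d \in c} W_{d,f}}{\sum_{c' \in C} \sum_{d \in c'} W_{d,f}}
  && \text{(a) Features Recall} \\
FP[f][c] &= \frac{\sum_{d \in c} W_{d,f}}{\sum_{f' \in F} \sum_{d \in c} W_{d,f'}}
  && \text{(b) Features Predominance} \\
FM[f][c] &= \frac{2 \, FR[f][c] \, FP[f][c]}{FR[f][c] + FP[f][c]}
  && \text{(c) Features F-Measure} \\
\overline{FM} &= \frac{1}{|F| \, |C|} \sum_{f \in F} \sum_{c \in C} FM[f][c]
  && \text{(d) Overall Average} \\
f \text{ is Selected} &\iff \exists \, c \in C : FM[f][c] > \overline{FM}
  && \text{(e), otherwise Deleted (f)} \\
\overline{FM}[f] &= \frac{1}{|C|} \sum_{c \in C} FM[f][c]
  && \text{(g) Marginal Average} \\
FC[f][c] &= \left( \frac{FM[f][c]}{\overline{FM}[f]} \right)^{k}
  && \text{(h) Features Contrast} \\
f \text{ is Active for } c &\iff FC[f][c] > 1
  && \text{(i) Features Activation}
\end{align*}
```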
This metric is an efficient way to:
- identify the relevant features of a dataset model;
- describe the association between vector features and data classes;
- increase the contrast between data classes.
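As an illustration of steps (a) to (i), here is a minimal NumPy sketch. It is not the package's API: the function names `features_maximization` and `safe_divide`, the argument names, and the returned dictionary are hypothetical and chosen for this example. It assumes a dense matrix of non-negative weights with one row per datum and one column per feature, treats any division by zero as 0.0, and raises the whole contrast ratio to the power of the amplification factor.

```python
import numpy as np


def safe_divide(numerator, denominator):
    """Element-wise division that yields 0.0 wherever the denominator is 0."""
    numerator, denominator = np.broadcast_arrays(numerator, denominator)
    result = np.zeros(numerator.shape, dtype=float)
    np.divide(numerator, denominator, out=result, where=denominator != 0)
    return result


def features_maximization(weights, labels, amplification_factor=1.0):
    """Hypothetical helper: every returned matrix is indexed as [feature][class]."""
    weights = np.asarray(weights, dtype=float)  # shape (n_data, n_features)
    labels = np.asarray(labels)                 # shape (n_data,)
    classes = np.unique(labels)

    # class_sums[f, c]: sum of the weights of feature f over the data in class c.
    class_sums = np.stack(
        [weights[labels == c].sum(axis=0) for c in classes], axis=1
    )

    # (a) Features Recall: share of feature f's total weight that falls in class c.
    recall = safe_divide(class_sums, class_sums.sum(axis=1, keepdims=True))
    # (b) Features Predominance: share of class c's total weight carried by f.
    predominance = safe_divide(class_sums, class_sums.sum(axis=0, keepdims=True))
    # (c) Features F-Measure: harmonic mean of (a) and (b).
    f_measure = safe_divide(2 * recall * predominance, recall + predominance)

    # (d) F-Measure Overall Average over all features and all classes.
    overall_average = f_measure.mean()
    # (e)/(f) A feature is selected if it beats the overall average in some class.
    selected = (f_measure > overall_average).any(axis=1)

    # (g) F-Measure Marginal Average of each feature over all classes.
    marginal_average = f_measure.mean(axis=1, keepdims=True)
    # (h) Features Contrast, only meaningful for selected features.
    contrast = safe_divide(f_measure, marginal_average) ** amplification_factor
    contrast[~selected] = 0.0
    # (i) A selected feature is active for the classes where its contrast > 1.0.
    active = contrast > 1.0

    return {
        "classes": classes,
        "f_measure": f_measure,
        "selected": selected,
        "contrast": contrast,
        "active": active,
    }


# Toy data: 4 data vectors, 4 features, 2 classes.
toy_weights = [
    [1.0, 0.0, 1.0, 0.1],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.1],
]
toy_labels = ["A", "A", "B", "B"]
result = features_maximization(toy_weights, toy_labels)
print(result["selected"])
print(result["active"])
```

In this toy dataset, the first two features each occur in a single class, the third is spread evenly over both classes, and the fourth carries very little weight, so the sketch lets you see how selection and activation react to each situation.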
References
Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7
How to cite
Schild, E. (2023). cognitivefactory/features-maximization-metric. Zenodo. https://doi.org/10.5281/zenodo.7646382.
0.1.1
Release 0.1.1 of the cognitivefactory/features-maximization-metric package.
- GitHub repository: https://github.com/cognitivefactory/features-maximization-metric/tree/0.1.1
- Main documentation: https://cognitivefactory.github.io/features-maximization-metric/
- PyPI distribution: https://pypi.org/project/cognitivefactory-features-maximization-metric/0.1.1/
0.1.0
Release 0.1.0 of the cognitivefactory/features-maximization-metric package.
- GitHub repository: https://github.com/cognitivefactory/features-maximization-metric/tree/0.1.0
- Main documentation: https://cognitivefactory.github.io/features-maximization-metric/
- PyPI distribution: not distributed