Releases: cognitivefactory/features-maximization-metric

1.0.0

14 Nov 13:05

Features Maximization Metric

Implementation of the Features Maximization Metric, an unbiased metric aimed at estimating the quality of an unsupervised classification.

Quick description

Features Maximization (FMC) is a feature selection method described in Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7.

This metric is computed by applying the following steps:

  1. Compute the Features F-Measure metric (based on Features Recall and Features Predominance metrics).

    (a) The Features Recall FR[f][c] for a given class c and a given feature f is the ratio between
    the sum of the vector weights of feature f for data in class c
    and the sum of the vector weights of feature f over all data.
    It answers the question: "Can the feature f distinguish the class c from the other classes c'?"

    (b) The Features Predominance FP[f][c] for a given class c and a given feature f is the ratio between
    the sum of the vector weights of feature f for data in class c
    and the sum of the vector weights of all features f' for data in class c.
    It answers the question: "Can the feature f identify the class c better than the other features f'?"

    (c) The Features F-Measure FM[f][c] for a given class c and a given feature f is
    the harmonic mean of the Features Recall (a) and the Features Predominance (b).
    It answers the question: "How much information does the feature f contain about the class c?"

  2. Compute the Features Selection (based on F-Measure Overall Average comparison).

    (d) The F-Measure Overall Average is the average of Features F-Measure (c) for all classes c and for all features f.
    It answers the question: "What is the mean amount of information contained by the features across all classes?"

    (e) A feature f is Selected if and only if there exists at least one class c for which the Features F-Measure (c) FM[f][c] is greater than the F-Measure Overall Average (d).
    It answers the question: "Which features contain more information than the mean information in the dataset?"

    (f) A feature f is Deleted if and only if the Features F-Measure (c) FM[f][c] is lower than the F-Measure Overall Average (d) for every class c.
    It answers the question: "Which features do not contain more information than the mean information in the dataset?"

  3. Compute the Features Contrast and Features Activation (based on F-Measure Marginal Averages comparison).

    (g) The F-Measure Marginal Averages for a given feature f is the average of Features F-Measure (c) for all classes c and for the given feature f.
    It answers the question: "What is the mean amount of information contained by the feature f across all classes?"

    (h) The Features Contrast FC[f][c] for a given class c and a given selected feature f is the ratio between
    the Features F-Measure (c) FM[f][c]
    and the F-Measure Marginal Averages (g) for the selected feature f,
    raised to the power of an Amplification Factor.
    It answers the question: "How relevant is the feature f for distinguishing the class c?"

    (i) A selected feature f is Active for a given class c if and only if the Features Contrast (h) FC[f][c] is greater than 1.0.
    It answers the question: "For which classes is a selected feature f relevant?"

This metric is an efficient method to:

  • identify the relevant features of a dataset model;
  • describe the association between vector features and data classes;
  • increase contrast between data classes.

References

Lamirel, J.-C., Cuxac, P., & Hajlaoui, K. (2016). A Novel Approach to Feature Selection Based on Quality Estimation Metrics. In Advances in Knowledge Discovery and Management (pp. 121–140). Springer International Publishing. https://doi.org/10.1007/978-3-319-45763-5_7

How to cite

Schild, E. (2023). cognitivefactory/features-maximization-metric. Zenodo. https://doi.org/10.5281/zenodo.7646382.

0.1.1

16 Feb 12:36

Pre-release

0.1.0

16 Feb 12:16

Pre-release

Release 0.1.0 of cognitivefactory/features-maximization-metric package.