Is the definition of selection frequency correct? #15

@BigBroKuang

  1. I used LogisticRegression to test the code, with C (the inverse of lambda) set to np.logspace(-2, 2, 30). The highest selection frequency always occurs at the largest C (the smallest lambda). Does this mean the Lasso penalty is not working?
  2. According to the definition of fj, suppose feature j enters the Lasso path at a specific lambda, say L0 (scanning from high to low values). For any L1 < L0, the coefficient beta of that feature is non-zero. So in theory, if L2 < L1 < L0, then fj(L2) > fj(L1) > fj(L0), which means that a smaller lambda gives a higher selection frequency? If so, there seems to be no need to test multiple lambda values: setting lambda to an extremely small value would always give the best selection frequency?
  3. I also tried rerunning the code with different random seeds. The results of each run are completely different, because the knockoffs generated in each run are different.
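
To make points 1 and 2 concrete, here is a minimal sketch of the kind of check I mean. It is not the repository's code: the helper `selection_frequency`, the synthetic data from `make_classification`, and the half-subsampling scheme are all my own assumptions, and knockoff generation is left out entirely. I take "selection frequency" to mean the fraction of subsample fits in which a feature's L1-penalized coefficient is non-zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def selection_frequency(X, y, Cs, n_subsamples=20, seed=0):
    """Hypothetical helper: for each C, the fraction of random
    half-subsamples in which each coefficient is non-zero."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((len(Cs), p))
    for _ in range(n_subsamples):
        # Draw half of the rows without replacement.
        idx = rng.choice(n, size=n // 2, replace=False)
        for k, C in enumerate(Cs):
            # L1-penalized logistic regression; C is the inverse penalty strength.
            model = LogisticRegression(penalty="l1", C=C, solver="liblinear")
            model.fit(X[idx], y[idx])
            counts[k] += model.coef_.ravel() != 0
    return counts / n_subsamples


# Synthetic data (an assumption, only for illustration).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=4,
    n_redundant=0, random_state=0,
)
Cs = np.logspace(-2, 2, 30)  # same grid as in point 1
freq = selection_frequency(X, y, Cs)

# Average selection frequency over all features, one value per C.
print(freq.mean(axis=1))
```

In a setup like this, the weak penalty at large C lets most features in, including the uninformative ones, so the raw frequency alone does not seem to separate signal from noise. Changing `seed` changes the subsamples, which is a smaller version of the run-to-run variation in point 3.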
