ENH test isotonic calibration on simulation data & FIX correct joblib dependency error #4
jzheng17 wants to merge 10 commits into neurodata:main
Accommodate running HF in Jupyter Notebook Environment
Added calibrated HF to overlapping gaussian
Tests done for Isotonic Calibrated HF
It seems the failures come from the original test_tree and test_forest files, in their module imports (possibly caused by dependency issues?). I have not touched those files.
try to pass pytest checks for committing (these files exist in the original repo already, I did not touch them)
try to pass pytest checks for committing (these files exist in the original repo already, I did not touch them)
try to pass pytest checks for committing (these files exist in the original repo already, I did not touch them)
fix dependency issues with the new version of joblib (no longer uses **_joblib_parallel_args)
The HonestForest class implemented by Ronan used joblib's backend for parallelization through a helper imported from sklearn's util files. Between the time Ronan wrote the HF class and the time of this commit, sklearn's util file dropped that helper along with its support for the older joblib version (_joblib_parallel_args was deprecated and is no longer available). Ronan's implementation still relied on the old helper, which caused the dependency error. I fixed this issue.
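For readers unfamiliar with the pattern, here is a minimal sketch of the kind of change described above, assuming joblib is installed. Instead of unpacking the removed private helper, the require="sharedmem" constraint is passed to joblib.Parallel directly. The `_accumulate` helper and the toy data are hypothetical and are not taken from this PR's code.

```python
import threading

from joblib import Parallel, delayed


def _accumulate(value, out, lock):
    """Hypothetical worker: append a result to a list shared by all workers."""
    with lock:
        out.append(value * value)


out = []
lock = threading.Lock()

# Old style (fails once sklearn no longer ships the helper):
#   Parallel(n_jobs=2, **_joblib_parallel_args(require="sharedmem"))
# Sketch of the replacement: pass the constraint straight to joblib.
Parallel(n_jobs=2, require="sharedmem")(
    delayed(_accumulate)(i, out, lock) for i in range(5)
)

# Completion order is not guaranteed, so sort before inspecting.
result = sorted(out)
```

Because require="sharedmem" restricts joblib to a backend whose workers share memory, every worker appends to the same list, which a process-based backend would not allow.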
rflperry left a comment
Overall looks good. It seems like there is one extra file or something incorporated? Normally it would be better to make 2 PRs. One for the joblib change and one for the notebook. But this is fine.
    (
        "Iso-HonestRF",
        CalibratedClassifierCV(
            base_estimator=HonestForestClassifier(
                n_estimators=n_estimators // clf_cv,
                max_features=max_features,
                n_jobs=n_jobs,
            ),
            method="isotonic",
            cv=clf_cv,
        ),
    ),
I would say don't change this file. Leave the honest + IRF for just the notebook. This way the main figure in the repo reflects the paper and isn't as confusing to first-time viewers.
Okay, will revert the changes.
    """Module for forest-based estimators"""
    """I'm just using this version to facilitate future changes -Audrey"""
What is this file? Is it the same as the estimators/forest.py file? It seems like this was maybe accidentally left in?
I created this file because importing HF directly from forest.py in a Jupyter Notebook seems to cause dependency errors. I experimented with a bunch of approaches, and the only fix that worked (a pretty dumb one, I have to admit) was to combine your forest.py and tree.py into one file. I personally suspect the Jupyter Notebook environment does weird things when the file you are importing itself imports from another file you wrote.
@jzheng17 you should install the package & then use the honest_forests library.
I might be able to do that too. I went for a quick fix at that time.
Oh wait, I remember that installing dependencies for Jupyter Notebooks and for .py scripts can be a little different. I guess I will still try once I recover from bronchitis.
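One common way to sidestep the notebook-versus-script difference mentioned above is to run pip through the same interpreter the notebook kernel uses. The sketch below only builds that command. The helper name `editable_install_command` is hypothetical, and it assumes the repository root is installable as a package (for example via a setup.py or pyproject.toml), which this thread does not confirm.

```python
import sys


def editable_install_command(repo_path="."):
    """Build a pip command tied to the interpreter running this code."""
    # sys.executable is the Python binary behind the current kernel or
    # script, so the package lands in that same environment.
    return [sys.executable, "-m", "pip", "install", "-e", repo_path]


cmd = editable_install_command(".")
print(" ".join(cmd))

# To actually install from a notebook cell (not run in this sketch):
#   import subprocess
#   subprocess.check_call(cmd)
# Then restart the kernel before importing the installed package.
```

With the package installed this way, the notebook can import from the installed library rather than from a merged copy of forest.py and tree.py.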
    Parallel(
        n_jobs=n_jobs,
        verbose=self.verbose,
        **_joblib_parallel_args(require="sharedmem")
Amazing, thanks for this fix. An alternative fix would be to require a lower version of sklearn, but this is better. I assume I had a lower version installed. Does this require sklearn > 1.0? If so, can you add that to the requirements.txt file?
I believe it does require sklearn > 1.0. Note that this is because the sklearn developers changed their util.py to bump the joblib version dependency to 1.0.0.
Okay. We should make sure the requirements.txt file states the proper requirements. We don't want errors because this code isn't compatible with old versions of joblib or sklearn that our requirements.txt file permits.
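As a rough illustration of the concern above, the sketch below checks installed package versions against minimums using only the standard library. The minimums (scikit-learn 1.0, joblib 1.0.0) are placeholders taken from this discussion and are not confirmed pins, and all function names are hypothetical. The comparison looks only at leading numeric release parts, so a pre-release such as 1.0rc1 would count as 1.0.

```python
from importlib.metadata import PackageNotFoundError, version

# Placeholder minimums based on the discussion above; not confirmed pins.
MINIMUMS = {"scikit-learn": (1, 0), "joblib": (1, 0, 0)}


def parse_release(text):
    """Turn '1.2.3' or '1.1.dev0' into a tuple of leading integer parts."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            break  # stop at suffixes such as 'rc1'
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare release tuples, padding the shorter one with zeros."""
    width = max(len(installed), len(minimum))
    padded_installed = installed + (0,) * (width - len(installed))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_installed >= padded_minimum


def check_environment(minimums=MINIMUMS):
    """Map each package to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_release(version(name))
        except PackageNotFoundError:
            report[name] = None
            continue
        report[name] = meets_minimum(installed, minimum)
    return report
```

Calling check_environment() in a fresh environment built from requirements.txt would flag a package that is missing or older than the assumed minimum.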
Simulation test on #2
I did the overlapping gaussian tests for the Iso-Honest Forest in a Jupyter Notebook and had to adjust the original honest forest classes a bit for importing them into a Jupyter Notebook.