Conversation
Any notebooks that require datasets to run should place the code for downloading/extracting that data into download_data.sh. This allows systematic downloading and caching of datasets for testing.
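A minimal sketch of what such a download_data.sh could look like (archive name and URL are illustrative, not the real dataset); the check-before-download pattern is what makes caching possible:

```shell
#!/usr/bin/env bash
# Hypothetical download_data.sh sketch; archive name and URL are illustrative.
# Checking for an existing file before downloading lets a test harness
# (or a shared cache on a cluster) reuse previously fetched data.
set -euo pipefail

mkdir -p data
ARCHIVE="data/example-dataset.tgz"

if [ -f "$ARCHIVE" ]; then
    echo "cached: $ARCHIVE"
else
    echo "downloading: $ARCHIVE"
    # wget -O "$ARCHIVE" "https://example.org/example-dataset.tgz"  # real fetch goes here
fi
```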
The path construction for the dataset prepended the user's $HOME directory. I removed this to make the notebook more portable and consistent with the other notebooks.
This makes it easier to track whether the dataset has already been downloaded, so we can cache things on della.
The htfa notebook expects the dataset to be in /data, which is where it resides in the Docker image. I extracted the download commands for the data files into download_data.sh and modified them to extract the data into a local data/ folder. I then modified the notebook to look for the data in both of these locations and raise an exception otherwise.
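The two-location lookup described above could be sketched as follows (the function name and the fallback behavior are illustrative, not the notebook's actual code):

```python
import os

def find_data_dir(candidates=("/data", "data")):
    """Return the first existing dataset directory; raise if none is found.

    Hypothetical sketch: "/data" is the Docker image location, "data" the
    local folder populated by download_data.sh.
    """
    for d in candidates:
        if os.path.isdir(d):
            return d
    raise FileNotFoundError(
        f"Dataset not found in {candidates}; run download_data.sh first")

# Demo with a directory that is guaranteed to exist:
print(find_data_dir(candidates=(os.getcwd(),)))
```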
The rt-cloud repo is a dependency of the notebook. It is not pip-installable, so I added it as a git submodule. I have also extracted a dependency list from rt-cloud/environment.yml to include with the other notebook dependencies.
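The submodule mechanics can be sketched as follows. To keep the example self-contained (no network), a throwaway stand-in repository plays the role of rt-cloud; the real setup would use the rt-cloud GitHub URL instead of the local path:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Stand-in repository playing the role of rt-cloud (illustrative).
git init -q dep
(cd dep && git config user.email ci@example.com && git config user.name ci \
    && echo "deps" > environment.yml && git add . && git commit -qm init)

# The consuming repository adds the dependency as a git submodule.
git init -q project && cd project
git config user.email ci@example.com
git config user.name ci
# protocol.file.allow is only needed because this demo uses a local path.
git -c protocol.file.allow=always submodule add ../dep rt-cloud
git submodule status
```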
…into tests
Conflicts: notebooks/real-time/rtcloud_notebook.ipynb
One final issue to fix: the last cell in the notebook executes a Python script on the command line. Even when this script fails, the cell in the notebook does not fail.
Importing the main function and running it directly ensures that tests fail if there are errors in sample.py. If a cell executes a command-line program that fails, testbook does not consider that a cell failure (is this a bug?).
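The behavioral difference can be demonstrated outside the notebook; here an inline snippet stands in for sample.py, and the names are illustrative:

```python
import subprocess
import sys

# Shell-style execution, as in a notebook "!python sample.py" cell:
# a nonzero exit status only lands in the return code, nothing raises,
# so the cell appears to have succeeded.
result = subprocess.run([sys.executable, "-c", "raise SystemExit(1)"])
print("return code:", result.returncode)  # 1, but no exception

# Importing and calling main() directly: an error propagates as an
# exception, which does register as a cell failure under testbook.
def main():  # stand-in for sample.main
    raise RuntimeError("failure inside sample.py")

try:
    main()
except RuntimeError as exc:
    print("exception raised:", exc)
```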
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
notebooks/isc/ISC.ipynb (Outdated)

```diff
 "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
-"!tar -xzf brainiak-aperture-isc-data.tgz\n",
-"!rm brainiak-aperture-isc-data.tgz"
+"!tar -xzf brainiak-aperture-isc-data.tgz\n"
```
How about adding --skip-old-files to minimize I/O? Same goes for any other extraction code in notebooks.
notebooks/isc/ISC.ipynb (Outdated)

```diff
@@ -92,8 +92,7 @@
 "source": [
 "# Download and extract example data from Zenodo\n",
 "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
```
Should we have --no-clobber so the file is not downloaded again? Same goes for any other download code in notebooks.
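Both flags can be combined. The snippet below demonstrates tar's --skip-old-files on a throwaway archive so no network is needed, and notes the matching wget flag in a comment:

```shell
set -e
cd "$(mktemp -d)"

# Build a small archive, then simulate a cached, already-extracted file.
echo "original" > file.txt
tar -czf data.tgz file.txt
echo "already extracted" > file.txt

# --skip-old-files leaves existing files untouched, minimizing I/O on
# repeated runs. The download-side counterpart is wget --no-clobber (-nc),
# which skips the fetch when the file already exists locally.
tar --skip-old-files -xzf data.tgz
cat file.txt   # still "already extracted"
```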
requirements.txt (Outdated)

```diff
@@ -0,0 +1,32 @@
+testbook
```
I think it is easier for notebook authors in the future if each notebook has its own requirements file.
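One way to arrange that might look like this (paths illustrative; each notebook directory carries its own file):

```
notebooks/isc/requirements.txt        # deps for ISC.ipynb only
notebooks/htfa/requirements.txt
notebooks/real-time/requirements.txt  # includes deps extracted from rt-cloud
```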
```diff
 "*1.2 Load participant data*<a id=\"load_ppt\"></a>\n",
 "\n",
-"Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed in the home directory with the name Corr_MVPA "
+"Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed same directory as this notebook with the name Corr_MVPA "
```
@CameronTEllis FYI: please note a minor update to the data directory to make it compatible for automated testing.
Ok, I think I have addressed your comments @mihaic. Can you take a look?
```
"Clear any pre-existing plot for this run using 'clearRunPlot(runNum)'\n",
"###################################################################################\n",
"/tmp/notebook-simdata/labels.npy\n",
"Collected training data for TR 0\n",
```
Hi David, can you clear the "outputs" created from running the notebook before checking in? I think it's in the Jupyter menu: Cell -> All Output -> Clear.
Set up pytest tests using testbook for all notebooks.