This project is mainly presented in a Jupyter notebook (i.e. an instance of Jupyter Lab, found in the main folder). The notebook functions as a "main" script, calling functions that live in the modules folder.
This set-up installs all Python packages in a virtual environment inside a container. Packages are stored on a persistent volume. The first run may take some time while all packages are installed.
Requirements:
- docker
- docker-compose
- make
Run the project:

```
cp .env.sample .env
make dev
```

Note that at the moment it is also necessary to check/update the project name in the following places:
- Dockerfile
This spins up a Jupyter server, which can be accessed at localhost:<PORT>, where PORT is specified in the environment.
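For reference, the `.env` file might look something like the fragment below. Only the `PORT` variable is mentioned in this README; the value and any other variables are assumptions for illustration:

```
# hypothetical .env contents; only PORT is referenced in this README
PORT=8888
```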
For opening the command line in the container:

```
make devcli
```

Note: requires the container to be running.
For a standalone CLI (without the container generated by compose), use `make runcli` (requires an image to exist). This currently does not bind to the source code folder.
For tearing down the project:

```
make alldown
```

Note: `alldown` also removes the named volume that stores the Python packages, so be prepared to reinstall them after this command. For general Docker tidiness, simply use `docker-compose down`.
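The make targets above presumably wrap docker-compose; a rough sketch of the underlying commands (the service name `dev` and the exact flags are assumptions, not taken from the project's actual Makefile):

```
docker-compose up             # roughly: make dev
docker-compose exec dev bash  # roughly: make devcli (service name "dev" is a guess)
docker-compose down -v        # roughly: make alldown (-v also removes named volumes)
docker-compose down           # plain teardown; image and volume remain
```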
More info:
When editing the Dockerfile, not all changes seem to invalidate the build cache. Add the `--no-cache` flag to compose or build commands if necessary.
Finally, if you change the directory structure, make sure docker-compose only binds to directories that exist on the host; otherwise you will run into permission issues with the binds inside the container.
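As an illustration, the binds in question might look like the following compose fragment. The service name, paths, and volume name are all assumptions, not taken from the project's actual docker-compose file:

```yaml
# hypothetical docker-compose.yml fragment; names and paths are assumptions
services:
  dev:
    volumes:
      - ./modules:/home/user/project/modules   # bind mount: must exist on the host
      - packages:/home/user/venv               # named volume for Python packages

volumes:
  packages:
```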
Python packages used for this project are isolated from system packages via a virtual environment. To activate this environment, use the shortcut `. jupdev` from anywhere in the container. The most common and important use case for this is installing new libraries and updating the requirements file.
While working, you will often need to install new libraries. The script update.sh (simply run `. update` from anywhere in the container) updates the requirements file in the container after installing new libraries, and the updates propagate to the local copy through the bind volume defined by docker-compose. When pulling new code from version control, the container installs changes to the requirements file on start-up. Thanks to the persistent storage volume, this should not take long after the first install.
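As a sketch of what `. update` might amount to, freezing the active environment into requirements.txt could look like the following. This is an assumption for illustration; the real update.sh is not shown in this README:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of update.sh; the real script may differ.
# Assumes the virtual environment is already active (e.g. via `. jupdev`).
set -euo pipefail

cd "$(mktemp -d)"                          # demo only: use a scratch folder
python3 -m pip freeze > requirements.txt   # pin currently installed packages
echo "requirements.txt updated"
```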
The project folder has been added to the user's PATH, so any tools/scripts created in this folder are easily accessible. For persistence, however, such tools should either be built into the image (i.e. update the Dockerfile), or a new local volume can be specified to facilitate an updated workflow.
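To illustrate the PATH idea, a script dropped into a folder on PATH becomes callable as a command from anywhere. The folder and script names below are made up for demonstration:

```shell
#!/usr/bin/env bash
# Demo: a script in a folder on PATH becomes a command.
# "tooldir" and "mytool" are made-up names for illustration.
set -euo pipefail
tooldir="$(mktemp -d)"
export PATH="$tooldir:$PATH"
printf '#!/usr/bin/env bash\necho hello from mytool\n' > "$tooldir/mytool"
chmod +x "$tooldir/mytool"
mytool    # now callable by name from anywhere
```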
An example workflow:

```
make dev            # starts the Jupyter server
### open another terminal:
make devcli         # opens a terminal in the container
. jupdev            # activates the Python virtual environment
pip install pandas && . update   # installs a dependency (for example 'pandas')
                                 # to the volume and updates requirements.txt
                                 # in the container and on the host
### after closing the processes in both terminals (or just stop the server):
docker-compose down   # removes the container and network (image and volume remain)
```