Tutorial to run MolMIM NIM on HiPerGator
MolMIM is a state-of-the-art generative model for small molecule drug development that learns an informative and clustered latent space. It is a probabilistic auto-encoder that provides a fixed-length representation of variable-length SMILES strings. MolMIM is trained with Mutual Information Machine (MIM) learning and can sample valid SMILES strings from perturbations of its clustered latent space.
- Latent Space Learning: Learn an informative and meaningfully clustered latent space.
- Molecule Sampling: Generate valid molecules from the latent space using an initial seed molecule.
- Novel Molecule Generation: Generate small molecules with desired properties under specific constraints.
- Hardware: Single NVIDIA GPU with at least 3GB of memory and compute capability >7.0.
- Storage: At least 50GB of free hard drive space.
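The hardware and storage requirements above can be checked before launching the service. The following is a minimal sketch, not an official tool. It assumes `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field (older drivers may not). The helper names (`parse_gpu_line`, `gpu_meets_requirements`, `enough_disk`, `query_gpus`) are illustrative and not part of MolMIM.

```python
import shutil
import subprocess

MIN_GPU_MEM_MIB = 3 * 1024   # "at least 3GB of memory"
MIN_COMPUTE_CAP = 7.0        # "compute capability >7.0" (strictly greater, as stated above)
MIN_FREE_DISK_GB = 50        # "at least 50GB of free hard drive space"


def parse_gpu_line(line):
    """Parse one CSV line such as 'NVIDIA A100-SXM4-80GB, 81920, 8.0'."""
    # rsplit keeps any commas inside the GPU name intact.
    name, mem_mib, cap = [field.strip() for field in line.rsplit(",", 2)]
    return name, int(mem_mib), float(cap)


def gpu_meets_requirements(mem_mib, compute_cap):
    """True if a GPU satisfies the memory and compute capability limits."""
    return mem_mib >= MIN_GPU_MEM_MIB and compute_cap > MIN_COMPUTE_CAP


def enough_disk(path, min_free_gb=MIN_FREE_DISK_GB):
    """True if the filesystem holding `path` has at least `min_free_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= min_free_gb


def query_gpus():
    """Ask nvidia-smi for name, total memory (MiB) and compute capability."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total,compute_cap",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

On a GPU node, `query_gpus()` returns one tuple per GPU that can be passed to `gpu_meets_requirements`, and `enough_disk("/blue/groupname/gatorlink")` checks the cache location used later in this tutorial.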
- Go to Open OnDemand (OOD) and launch the HiPerGator Desktop.
Note: Remember to update the SLURM Account and QoS to match your group, and adjust the job time accordingly.
- Start a terminal and run the following commands:
    mkdir -p /blue/groupname/gatorlink/.cache/nim/molmim  # Run only the first time
    export LOCAL_NIM_CACHE=/blue/groupname/gatorlink/.cache/nim/molmim
    ml molmim-nim
    molmim start_server
- Open a New Terminal
  Keep the original terminal running with the launched service.
- Check Service Status
  Wait for the health check endpoint to return {"status":"ready"}:
    curl -X 'GET' 'http://localhost:8000/v1/health/ready' -H 'accept: application/json'
- Navigate to the directory where you want to run the job
    cd /blue/groupname/gatorlink/...
- Run Inference
  Obtain embeddings for a molecule using its SMILES string representation:
    curl -X 'POST' \
      'http://localhost:8000/embedding' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}' > output.json
- View the Outputs
  Print the output to the terminal:
    cat output.json
  For better readability, use jq:
    jq . output.json
  Alternatively, pipe the output directly to jq:
    curl -X 'POST' \
      'http://localhost:8000/embedding' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}' | jq .
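The health check step can also be scripted so that inference only starts once the service reports ready. The sketch below polls the same `/v1/health/ready` endpoint used above with Python's standard library. The function names, the number of attempts, and the delay between attempts are illustrative assumptions, not values recommended by NVIDIA or HiPerGator.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8000/v1/health/ready"


def fetch_status(url=HEALTH_URL, timeout=5):
    """Return the 'status' field of the health response, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8")).get("status")
    except (urllib.error.URLError, OSError, ValueError, AttributeError):
        # Server not up yet, connection refused, or an unexpected response body.
        return None


def wait_until_ready(fetch=fetch_status, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll until fetch() returns 'ready'; return False after `attempts` tries."""
    for attempt in range(attempts):
        if fetch() == "ready":
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_ready()` from the second terminal or a notebook blocks until the service answers `{"status":"ready"}` or the attempts run out.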
MolMIM provides the following endpoints and associated functions:
- /embedding: Retrieve the embeddings from MolMIM for a given input molecule.
- /hidden: Retrieve the hidden state from MolMIM for a given input molecule (shown as the “latent code” in Figure 1 of the MolMIM manuscript).
- /decode: Decode a hidden state representation into a SMILES string sequence.
- /sampling: Sample the latent space within a given scaled radius from a seed molecule. This method generates new molecule samples from the given input in an unguided fashion.
- /generate: Generate novel molecules (optionally while optimizing against a certain property). This method generates new optimized molecules if CMA-ES-guided sampling is enabled.
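For use from a notebook, the endpoints listed above can be wrapped in a small client. This is a sketch under assumptions: it takes the base URL `http://localhost:8000` from the curl examples in this tutorial, and the helper names (`endpoint_url`, `build_request`, `post_json`) are hypothetical. Only the `/embedding` request body is described in this tutorial; payloads for the other endpoints should be taken from the NVIDIA MolMIM documentation rather than guessed.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"
ENDPOINTS = ("embedding", "hidden", "decode", "sampling", "generate")


def endpoint_url(name, base_url=BASE_URL):
    """Build the URL for one of the five endpoints, rejecting unknown names."""
    name = name.strip("/")
    if name not in ENDPOINTS:
        raise ValueError(f"unknown MolMIM endpoint: {name!r}")
    return f"{base_url.rstrip('/')}/{name}"


def build_request(name, payload, base_url=BASE_URL):
    """Prepare a JSON POST request mirroring the headers of the curl examples."""
    return urllib.request.Request(
        endpoint_url(name, base_url),
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def post_json(name, payload, base_url=BASE_URL, timeout=60):
    """Send the request and decode the JSON response (requires a running service)."""
    request = build_request(name, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `post_json("embedding", {"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]})` sends the same request as the curl command in the inference step.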
Request Body:
sequences: Array of strings (SMILES strings).
Response:
embeddings: Array of arrays of floating point numbers (embeddings).
Example command:
    curl -X 'POST' \
      -i \
      "http://localhost:8000/embedding" \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}'

For more examples, visit the NVIDIA MolMIM Documentation.
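The `/embedding` request and response formats described above can be handled in Python as follows. This is an illustrative sketch: the function names are hypothetical, and the check that the service returns exactly one embedding vector per input SMILES string is an assumption that the tutorial does not state explicitly.

```python
import json


def embedding_request_body(smiles_list):
    """Serialize the request body: {"sequences": [<SMILES>, ...]}."""
    if not smiles_list or not all(isinstance(s, str) and s for s in smiles_list):
        raise ValueError("sequences must be a non-empty list of SMILES strings")
    return json.dumps({"sequences": list(smiles_list)})


def parse_embeddings(response_text, expected_count):
    """Return the 'embeddings' field as lists of floats.

    Assumes one vector per input molecule and raises if the count differs.
    """
    embeddings = json.loads(response_text)["embeddings"]
    if len(embeddings) != expected_count:
        raise ValueError(f"expected {expected_count} embeddings, got {len(embeddings)}")
    return [[float(x) for x in vector] for vector in embeddings]
```

The text passed to `parse_embeddings` can be the contents of `output.json` from the inference step, for example `parse_embeddings(open("output.json").read(), 1)`.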
- Set up a Conda environment:
    ml conda
    conda create -n molmim-nim python=3.12
    conda activate molmim-nim
    conda install requests pandas numpy seaborn matplotlib rdkit -c conda-forge
    conda install jupyterlab
    conda install ipykernel
- Configure the Jupyter kernel:
    cp -r /apps/jupyterhub/template_kernel/ ~/.local/share/jupyter/kernels/molmim-nim
    nano ~/.local/share/jupyter/kernels/molmim-nim/kernel.json
  Add the following to kernel.json:
    {
      "language": "python",
      "display_name": "molmim-nim",
      "argv": [
        "~/.local/share/jupyter/kernels/molmim-nim/run.sh",
        "-f",
        "{connection_file}"
      ]
    }
- Create the run.sh script:
    nano ~/.local/share/jupyter/kernels/molmim-nim/run.sh
  Add the following content (remember to change the path to your own directory):
    #!/usr/bin/bash
    exec /blue/groupname/gatorlink/conda/envs/molmim-nim/bin/python -m ipykernel "$@"
- Start the JupyterLab server on the Desktop
  Open a new terminal and run:
    conda activate molmim-nim
    jupyter lab
After launching JupyterLab, ensure you select the 'molmim-nim' kernel before running the notebooks.
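As an alternative to editing `kernel.json` by hand in nano, the same content can be generated from Python, which guarantees the file is valid JSON. This is only a sketch: `kernel_spec` and `write_kernel_spec` are hypothetical helper names, and the code writes an absolute path to `run.sh` so the kernel does not depend on `~` being expanded when it is launched.

```python
import json
from pathlib import Path


def kernel_spec(kernel_dir, display_name="molmim-nim"):
    """Build the kernel.json content shown above, with an absolute run.sh path."""
    run_sh = Path(kernel_dir).expanduser() / "run.sh"
    return {
        "language": "python",
        "display_name": display_name,
        "argv": [str(run_sh), "-f", "{connection_file}"],
    }


def write_kernel_spec(kernel_dir, display_name="molmim-nim"):
    """Write kernel.json into kernel_dir and return the path of the file."""
    kernel_dir = Path(kernel_dir).expanduser()
    kernel_dir.mkdir(parents=True, exist_ok=True)
    target = kernel_dir / "kernel.json"
    spec = kernel_spec(kernel_dir, display_name)
    target.write_text(json.dumps(spec, indent=2) + "\n")
    return target
```

For the layout used in this tutorial, the call would be `write_kernel_spec("~/.local/share/jupyter/kernels/molmim-nim")`. The `run.sh` script still has to be created as described above.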
To stop the NIM service, simply close the terminal window.
It is recommended to clean your cache files every time you stop the server to ensure it won't affect your next run. You can do this by removing the cache directory:
    rm -rf /blue/groupname/gatorlink/.cache/nim/molmim/*

- Submit a SLURM batch job
  Use sbatch to start the NIM service with GPU resources, and record the name of the node where the service is running.
- Open a terminal or Jupyter session
  Start an SSH terminal or a Jupyter session using any preferred method (e.g., Open OnDemand, srun, etc.), with minimal resource allocation (no GPU required) to run inference.
- Run on the same node
  Ensure that the SSH terminal or Jupyter session for inference runs on the same node as the service.
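The same-node requirement can be verified from the inference session before sending requests. The sketch below compares the current hostname with the node name recorded when the batch job started. The helper names are hypothetical, and comparing only the short hostname (the part before the first dot) is an assumption about how node names are reported.

```python
import socket


def short_hostname(name):
    """Reduce a hostname to its lowercase short form (text before the first dot)."""
    return name.strip().split(".")[0].lower()


def on_service_node(service_node, current=None):
    """True if this session runs on the node recorded for the NIM service."""
    current = socket.gethostname() if current is None else current
    return short_hostname(current) == short_hostname(service_node)
```

If `on_service_node("<recorded node name>")` returns True, the `http://localhost:8000` URLs used throughout this tutorial reach the service started by the batch job.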

