
MolMIM-NIM

Tutorial to run MolMIM NIM on HiPerGator

MolMIM is a state-of-the-art generative model for small molecule drug development that learns an informative and clustered latent space. It is a probabilistic auto-encoder that provides a fixed-length representation of variable-length SMILES strings. MolMIM is trained with Mutual Information Machine (MIM) learning and can sample valid SMILES strings from perturbations of its clustered latent space.

Features

  • Latent Space Learning: Learn an informative and meaningfully clustered latent space.
  • Molecule Sampling: Generate valid molecules from the latent space using an initial seed molecule.
  • Novel Molecule Generation: Generate small molecules with desired properties under specific constraints.

Prerequisites

  • Hardware: Single NVIDIA GPU with at least 3GB of memory and compute capability >7.0.
  • Storage: At least 50GB of free hard drive space.
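
A quick way to confirm these prerequisites from a GPU session is sketched below. The compute_cap query field requires a reasonably recent NVIDIA driver, and the /blue path is a placeholder for your own group:

nvidia-smi --query-gpu=name,memory.total,compute_cap --format=csv  # GPU model, memory, and compute capability
df -h /blue/groupname  # Free space in your group's /blue storage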

Launch MolMIM NIM on HPG

  1. Go to Open OnDemand (OOD) and launch the HiPerGator Desktop.

[Screenshots: launching the HiPerGator Desktop app from Open OnDemand]

Note: Remember to update the SLURM Account and QoS to match your group, and adjust the job time accordingly.

  2. Start a terminal and run the following commands:
    mkdir -p /blue/groupname/gatorlink/.cache/nim/molmim  # Run only the first time
    export LOCAL_NIM_CACHE=/blue/groupname/gatorlink/.cache/nim/molmim  # Point the NIM cache at your /blue space
    ml molmim-nim  # Load the MolMIM NIM module
    molmim
    start_server

Running Inference

  1. Open a New Terminal
    Keep the original terminal running with the launched service.

  2. Check Service Status
    Wait for the health check endpoint to return {"status":"ready"} (a small wait loop that polls this endpoint is sketched after this list):

    curl -X 'GET' 'http://localhost:8000/v1/health/ready' -H 'accept: application/json'
  3. Navigate to your desired working directory

    cd /blue/groupname/gatorlink/...
  4. Run Inference
    Obtain embeddings for a molecule using its SMILES string representation:

    curl -X 'POST' \
    'http://localhost:8000/embedding' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}' > output.json
  5. View the Outputs
    Print the output to the terminal:

    cat output.json

    For better readability, use jq:

    jq . output.json

    Or pipe the output of curl directly to jq instead of writing it to a file:

    curl -X 'POST' \
     'http://localhost:8000/embedding' \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}' | jq .
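
If you script these steps, note that the service can take a while to become ready after start_server (the model weights are downloaded to the cache on first launch). A minimal wait loop against the same health endpoint is sketched below:

# Poll the health endpoint until the NIM reports {"status":"ready"}
until curl -s 'http://localhost:8000/v1/health/ready' | grep -q '"status":"ready"'; do
    echo "MolMIM NIM not ready yet; waiting 30 seconds..."
    sleep 30
done
echo "MolMIM NIM is ready."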

Endpoints Usage

MolMIM provides the following endpoints and associated functions:

  • /embedding - Retrieve the embeddings from MolMIM for a given input molecule.

  • /hidden - Retrieve the hidden state from MolMIM for a given input molecule (shown as the “latent code” in Figure 1 of the MolMIM manuscript).

  • /decode - Decode a hidden state representation into a SMILES string sequence.

  • /sampling - Sample the latent space within a given scaled radius from a seed molecule. This method generates new molecule samples from the given input in an unguided fashion.

  • /generate - Generate novel molecules (optionally while optimizing against a certain property). This method generates new optimized molecules if CMA-ES-guided sampling is enabled.

Bash

Embedding Endpoint

Request Body:

  • sequences: Array of strings (SMILES strings).

Response:

  • embeddings: Array of arrays of floating point numbers (embeddings).

Example command:

curl -X 'POST' \
-i  \
"http://localhost:8000/embedding" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"sequences": ["CC(Cc1ccc(cc1)C(C(=O)O)C)C"]}'

For more examples, visit the NVIDIA MolMIM Documentation.
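
As a further illustration, the sketch below calls the /generate endpoint to propose QED-optimized molecules around a seed SMILES using CMA-ES-guided sampling. The request fields shown (smi, algorithm, num_molecules, property_name, minimize, min_similarity, particles, iterations) follow common MolMIM NIM examples but may differ between NIM versions, so treat this as a sketch and confirm the exact schema in the NVIDIA MolMIM Documentation:

curl -X 'POST' \
 'http://localhost:8000/generate' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
       "smi": "CC(Cc1ccc(cc1)C(C(=O)O)C)C",
       "algorithm": "CMA-ES",
       "num_molecules": 5,
       "property_name": "QED",
       "minimize": false,
       "min_similarity": 0.3,
       "particles": 30,
       "iterations": 10
     }' | jq .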


Jupyter Notebooks

  1. Set up a Conda environment:

    ml conda
    conda create -n molmim-nim python=3.12
    conda activate molmim-nim
    conda install requests pandas numpy seaborn matplotlib rdkit -c conda-forge
    conda install jupyterlab
    conda install ipykernel
  2. Configure the Jupyter kernel:

    cp -r /apps/jupyterhub/template_kernel/ ~/.local/share/jupyter/kernels/molmim-nim
    nano ~/.local/share/jupyter/kernels/molmim-nim/kernel.json

    Add the following to kernel.json:

    {
        "language": "python",
        "display_name": "molmim-nim",
        "argv": [
            "~/.local/share/jupyter/kernels/molmim-nim/run.sh",
            "-f",
            "{connection_file}"
        ]
    }
  3. Create the run.sh script:

    nano ~/.local/share/jupyter/kernels/molmim-nim/run.sh

    Add the following content, changing the path to match your own environment location (a quick check of the finished kernel setup is sketched after this list):

    #!/usr/bin/bash
    exec /blue/groupname/gatorlink/conda/envs/molmim-nim/bin/python -m ipykernel "$@" 
  4. Start the JupyterLab server on the Desktop

    Open a new terminal and run the command:

    conda activate molmim-nim
    jupyter lab

    After launching JupyterLab, ensure you select the 'molmim-nim' kernel before running the notebooks.
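
If the molmim-nim kernel does not show up in JupyterLab, two quick checks are sketched below; the chmod is only needed if run.sh did not keep its executable bit from the template:

jupyter kernelspec list  # The molmim-nim kernel should be listed here
chmod +x ~/.local/share/jupyter/kernels/molmim-nim/run.sh  # Make the launcher executable if it is not already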


Stopping the NIM Service

To stop the NIM service, simply close the terminal window in which the server is running.

Important Note

It is recommended to clear your cache files every time you stop the server so that leftover files do not affect your next run. You can do this by emptying the cache directory:

rm -rf /blue/groupname/gatorlink/.cache/nim/molmim/*

Another way to run MolMIM NIM on HPG

  1. Submit a SLURM batch job
    Use sbatch to start the NIM service with GPU resources, and record the name of the node where the service is running (an example batch script is sketched after this list).

  2. Open a terminal or Jupyter session
    Start an SSH terminal or a Jupyter session using any preferred method (e.g., Open OnDemand, srun, etc.), with minimal resource allocation (no GPU required) to run inference.

  3. Run on the same node
    Ensure that the SSH terminal or Jupyter session for inference runs on the same node as the service.
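
A minimal batch script for step 1 is sketched below. The module name, cache path, and launch commands simply repeat the interactive sequence above; the SBATCH directives (account, QoS, partition, GPU and memory request, time limit) are placeholders to adapt to your group and workload, and you may need to adjust the launch commands if the molmim wrapper behaves differently in a non-interactive job:

#!/bin/bash
#SBATCH --job-name=molmim-nim
#SBATCH --account=groupname        # your SLURM account
#SBATCH --qos=groupname            # your QoS
#SBATCH --partition=gpu
#SBATCH --gpus=1
#SBATCH --mem=16gb
#SBATCH --time=08:00:00
#SBATCH --output=molmim-nim-%j.log

# Record the node name so inference sessions can be started on the same node
echo "MolMIM NIM running on node: $(hostname)"

export LOCAL_NIM_CACHE=/blue/groupname/gatorlink/.cache/nim/molmim

ml molmim-nim
molmim
start_server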

