agentD is an open-source Python package designed to accelerate drug discovery workflows using Large Language Models (LLMs) and AI-driven tools. It provides modular agents and utilities for tasks such as literature extraction, molecular property prediction, molecule generation, and more. agentD integrates with external APIs (e.g., OpenAI, Serper) and cheminformatics libraries, enabling both automated and interactive research pipelines.
- Clone the repository:
    git clone https://github.com/hoon-ock/llm-dd.git
    cd llm-dd
- Create and activate a conda environment (recommended):
    conda create -n agentd python=3.10 -y
    conda activate agentd
- Install dependencies in editable mode:
    pip install -e .
  Or, to install all dependencies directly:
    pip install -r requirements.txt
- Install REINVENT4 (required for some tools):
    git clone https://github.com/MolecularAI/REINVENT4.git
    cd REINVENT4
    python install.py --help
    python install.py cu124  # or rocm6.2.4, cpu, mac, etc.
- API Keys:
  After installation, copy the template file and fill in your API keys:
    cp configs/secret_keys.py.example configs/secret_keys.py
  Then edit configs/secret_keys.py with your Serper API key and OpenAI API key:
    # configs/secret_keys.py
    serper_api_key = "YOUR_SERPER_API_KEY"
    openai_api_key = "YOUR_OPENAI_API_KEY"
- Global Variables:
The file configs/tool_globals.py contains global variables used by the tools. You can edit this file to adjust default behaviors and settings.
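As a quick check of the API key step above, the following minimal sketch reports keys that still hold the template placeholders. It is not part of agentD: the helper names `find_unset_keys` and `load_settings` are hypothetical, and the sketch assumes only the two variable names and the `YOUR_` placeholder prefix shown in the template.

```python
import runpy

# Hypothetical helper, not part of agentD. It assumes the two variable names
# shown in configs/secret_keys.py and the "YOUR_" placeholder prefix.
REQUIRED_KEYS = ("serper_api_key", "openai_api_key")
PLACEHOLDER_PREFIX = "YOUR_"


def find_unset_keys(settings):
    """Return names of required keys that are missing, empty, or placeholders."""
    unset = []
    for name in REQUIRED_KEYS:
        value = settings.get(name)
        if (
            not isinstance(value, str)
            or not value.strip()
            or value.startswith(PLACEHOLDER_PREFIX)
        ):
            unset.append(name)
    return unset


def load_settings(path):
    """Execute a secret_keys.py-style file and return its top-level names."""
    return runpy.run_path(path)
```

Calling `find_unset_keys(load_settings("configs/secret_keys.py"))` from the repository root should return an empty list once both keys are filled in.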
Example Jupyter notebooks demonstrating the main workflows are provided in the test_case directory:
1. extraction.ipynb – Data extraction and retrieval
2. qna.ipynb – Domain-specific question answering
3. pooling.ipynb – Molecule pooling
4. prediction.ipynb – Molecular property prediction
5. refinement.ipynb – SMILES refinement
6. generation.ipynb – Protein-ligand 3D structure generation
You can run these notebooks step-by-step to see how to use the agent for various drug discovery tasks.
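If you prefer to run the notebooks non-interactively, one possible approach is to build a `jupyter nbconvert` command for each of them. The sketch below only constructs the commands; it assumes Jupyter and nbconvert are installed in the active environment, and the function names are hypothetical rather than part of agentD.

```python
from pathlib import Path

# Notebook names as listed in this README, in the order given.
NOTEBOOKS = [
    "extraction.ipynb",
    "qna.ipynb",
    "pooling.ipynb",
    "prediction.ipynb",
    "refinement.ipynb",
    "generation.ipynb",
]


def nbconvert_command(notebook):
    """Build the argument list that executes one notebook with nbconvert."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", str(notebook)]


def plan_runs(directory="test_case"):
    """Return one nbconvert command per notebook, without running anything."""
    root = Path(directory)
    return [nbconvert_command(root / name) for name in NOTEBOOKS]
```

Each command can then be passed to `subprocess.run(command, check=True)`. Notebooks that call an LLM need the API keys configured first.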
This project is licensed under the MIT License.
- Make sure to set up your API keys before running any LLM agent notebooks.
- For any additional dependencies (e.g., REINVENT4), follow the instructions above.
- If you encounter missing package errors, check that all dependencies in requirements.txt are installed.
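To narrow down a missing package error, the check in the last note can be sketched with the standard library. This is an illustrative helper, not part of agentD; it assumes requirements.txt contains plain `name` or `name==version` style lines and does not handle URL or VCS requirements.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def requirement_names(text):
    """Extract distribution names from simple requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_packages(requirement_names(open("requirements.txt").read()))` lists the requirements that the active environment lacks.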
If you use agentD in your research or project, please cite:
(soon to be updated)
@misc{ock2025agentD,
title={Large Language Model Agent for Modular Task Execution in Drug Discovery},
author={Janghoon Ock and Radheesh Sharma Meda and Srivathsan Badrinarayanan and Neha S. Aluru and Achuth Chandrasekhar and Amir Barati Farimani},
year={2025},
eprint={2507.02925},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2507.02925},
}
For questions, suggestions, or support, please contact:
Email: jock@andrew.cmu.edu
