Forked from the official code for the paper, "Sparse Training of Discrete Diffusion Models for Graph Generation," available here. Checkpoints to reproduce the results can be found at this link. Please refer to the updated version here.
Environment installation (Modified from README.md of SparseDiff)
This code was tested with PyTorch 2.4.1, CUDA 12.1 and torch_geometric 2.4.0.
- Download anaconda/miniconda if needed
- Conda environment building:
  conda create -c conda-forge -n digress rdkit=2023.03.2 python=3.9
- Activate the environment:
  conda activate digress
- Install graph-tool:
  conda install -c conda-forge graph-tool=2.45
- Verify the installation:
  python3 -c 'from rdkit import Chem'
  python3 -c 'import graph_tool as gt'
- Install the nvcc drivers:
  conda install -c "nvidia/label/cuda-12.1.0" cuda
- Install PyTorch:
  (python -m) pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
- Install PyG related packages:
  (python -m) pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
- Install DGL (for SparseDiff):
  conda install -c dglteam/label/th24_cu121 dgl
- Please make sure that the versions of the nvcc drivers, PyTorch, PyG, and DGL match each other!
- Install the remaining packages:
  pip install -r requirements.txt
- Install mini-moses (optional):
  pip install git+https://github.com/igor-krawczuk/mini-moses
- Navigate to the directory ./sparse_diffusion/analysis/orca and compile orca.cpp:
  g++ -O2 -std=c++11 -o orca orca.cpp
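Since the steps above depend on matching package versions, a quick check after installation can save a failed run. The following is a minimal sketch, not part of the repository: `check_versions` and the `TESTED` table are hypothetical helpers, and the distribution names (in particular `torch_geometric`) are assumptions about how the packages are registered in your environment. It only reads installed package metadata and does not import the packages themselves.

```python
from importlib import metadata

# Versions taken from the installation steps above.
# The distribution names are assumptions and may differ in your environment.
TESTED = {
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "torchaudio": "2.4.1",
    "torch_geometric": "2.4.0",
}


def check_versions(expected):
    """Map each distribution name to (installed_version_or_None, matches)."""
    report = {}
    for dist, want in expected.items():
        try:
            found = metadata.version(dist)
        except metadata.PackageNotFoundError:
            # Package is not installed (or is registered under another name).
            report[dist] = (None, False)
            continue
        # Ignore local build tags such as "+cu121" when comparing.
        report[dist] = (found, found.split("+")[0] == want)
    return report


if __name__ == "__main__":
    for dist, (found, ok) in check_versions(TESTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{dist}: installed={found} tested={TESTED[dist]} [{status}]")
```

A mismatch is not necessarily an error; it only means the installed version differs from the one this code was tested with.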
- Use config files in folder config/experiments.
- Example command for execution:
  CUDA_VISIBLE_DEVICES=0 python main.py +experiment=ego.yaml
- All code is currently launched through python3 main.py. Check the hydra documentation (https://hydra.cc/) for overriding default parameters.
- To run the debugging code:
  python3 main.py +experiment=debug.yaml
  We advise running the debug mode first before launching full experiments.
- To run the code on only a few batches:
  python3 main.py general.name=test
- You can specify the dataset with python3 main.py dataset=guacamol. Look at configs/dataset for the list of datasets that are currently available.
- You can specify the edge fraction (denoted as $\lambda$ in the paper) with python3 main.py model.edge_fraction=0.2 to control the GPU usage.
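When launching several runs with different overrides, it can help to assemble the command line programmatically. The sketch below is an illustration only: `build_command` is a hypothetical helper, not part of this repository. It uses only the override keys that appear in the commands above (`+experiment`, `dataset`, `model.edge_fraction`, `general.name`) and does not validate them against the config files.

```python
import shlex


def build_command(experiment=None, overrides=None, gpu=None):
    """Return (argv, env) for a main.py launch with hydra-style overrides."""
    cmd = ["python3", "main.py"]
    if experiment is not None:
        # "+experiment=..." selects an experiment config, as in the commands above.
        cmd.append(f"+experiment={experiment}")
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    # Restrict the visible GPUs only when a device id is given.
    env = {} if gpu is None else {"CUDA_VISIBLE_DEVICES": str(gpu)}
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_command("debug.yaml", {"model.edge_fraction": 0.2}, gpu=0)
    prefix = " ".join(f"{k}={v}" for k, v in env.items())
    # Print the command for inspection instead of running it.
    print(prefix, shlex.join(cmd))
```

To actually launch the run, the returned list can be passed to `subprocess.run(cmd, env={**os.environ, **env}, check=True)` from the repository root.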
@misc{qin2023sparse,
title={Sparse Training of Discrete Diffusion Models for Graph Generation},
author={Yiming Qin and Clement Vignac and Pascal Frossard},
year={2023},
eprint={2311.02142},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
PermissionError: [Errno 13] Permission denied: 'SparseDiff/sparse_diffusion/analysis/orca/orca': You probably did not compile orca. Compile orca.cpp in ./sparse_diffusion/analysis/orca as described in the installation steps.
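To catch this error before a long run, the state of the orca binary can be checked up front. This is a minimal sketch under the assumption that the compiled binary is named orca and sits next to orca.cpp, as the compile command in the installation steps produces. `orca_status` and `compile_orca` are hypothetical helpers, not functions from this repository.

```python
import os
import subprocess
from pathlib import Path


def orca_status(orca_dir):
    """Return 'missing', 'not executable', or 'ok' for the orca binary."""
    binary = Path(orca_dir) / "orca"
    if not binary.is_file():
        return "missing"
    if not os.access(binary, os.X_OK):
        return "not executable"
    return "ok"


def compile_orca(orca_dir):
    """Run the compile command from the installation steps inside orca_dir."""
    subprocess.run(
        ["g++", "-O2", "-std=c++11", "-o", "orca", "orca.cpp"],
        cwd=orca_dir,
        check=True,  # raise CalledProcessError if compilation fails
    )


if __name__ == "__main__":
    orca_dir = "./sparse_diffusion/analysis/orca"
    print("orca binary:", orca_status(orca_dir))
```

If the status is anything other than ok, calling `compile_orca(orca_dir)` (which requires g++ to be installed) rebuilds the binary.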