Implementation of LocalMapper, developed by Prof. Yousung Jung's group at Seoul National University (contact: yousung@gmail.com).
The license has been updated to CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International). This means:
- Academic use: Free to use, share, and adapt with attribution
- Commercial use: Not permitted without prior written approval from the copyright holder
- Derivative works: Must be shared under the same license
We encourage academic use of this model, but it must not be used for any commercial purpose without our permission. For commercial licensing inquiries, please contact the developer.
- Developer
- OS Requirements
- Python Dependencies
- Installation Guide
- Usage
- Data
- Reproduce the results
- Publication
- License
Shuan Chen (shuan.micc@gmail.com)
This repository has been tested on both Linux and Windows operating systems.
- Python (version >= 3.6)
- Numpy (version >= 1.16.4)
- Matplotlib (version >=3.3.4)
- PyTorch (version >= 1.0.0)
- RDKit (version >= 2019)
- DGL (version >= 0.5.2)
- DGLLife (version >= 0.2.6)
conda create -n localmapper python=3.6 -y
conda activate localmapper
pip install localmapper
git clone https://github.com/snu-micc/LocalMapper.git
cd LocalMapper
conda create -n localmapper python=3.6 -y
conda activate localmapper
pip install -e .
from localmapper import localmapper
mapper = localmapper()
rxn = 'CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F'
result = mapper.get_atom_map(rxn)
The expected output of result should be
'[CH3:1][CH:2]([CH3:3])[SH:4].CN(C)C=O.[F:11][c:10]1[cH:9][cH:8][cH:7][n:6][c:5]1F.O=C([O-])[O-].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]'
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'CCOCC.C[Mg+].O=Cc1ccc(F)cc1Cl.[Br-]>>CC(O)c1ccc(F)cc1Cl']
results = mapper.get_atom_map(rxns)
The expected output of results should be
['[CH3:1][CH:2]([CH3:3])[SH:4].CN(C)C=O.[F:11][c:10]1[cH:9][cH:8][cH:7][n:6][c:5]1F.O=C([O-])[O-].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
'CCOCC.[CH3:1][Mg+].[O:3]=[CH:2][c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11].[Br-]>>[CH3:1][CH:2]([OH:3])[c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11]']
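A mapped reaction string can be sanity-checked without any chemistry toolkit by reading the atom-map numbers directly from the SMILES text. The sketch below is not part of LocalMapper; the helper names map_numbers and check_mapping are hypothetical, and the check assumes the reaction uses the plain reactants>>products form shown above.

```python
import re

# Matches the map number in bracketed atoms such as [CH3:1] or [c:10].
MAP_NUMBER = re.compile(r":(\d+)\]")


def map_numbers(smiles_side):
    """Return the atom-map numbers found in one side of a reaction, in order."""
    return [int(n) for n in MAP_NUMBER.findall(smiles_side)]


def check_mapping(mapped_rxn):
    """Check that product map numbers are unique and all occur in the reactants."""
    reactants, products = mapped_rxn.split(">>")
    reactant_maps = map_numbers(reactants)
    product_maps = map_numbers(products)
    unique = len(set(product_maps)) == len(product_maps)
    covered = set(product_maps) <= set(reactant_maps)
    return unique and covered


# Second mapped reaction from the expected output above.
mapped = (
    "CCOCC.[CH3:1][Mg+].[O:3]=[CH:2][c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11].[Br-]"
    ">>[CH3:1][CH:2]([OH:3])[c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11]"
)
print(check_mapping(mapped))
```

This only tests bookkeeping of the map numbers; it does not judge whether the mapping is chemically correct.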
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'CCOCC.C[Mg+].O=Cc1ccc(F)cc1Cl.[Br-]>>CC(O)c1ccc(F)cc1Cl']
results = mapper.get_atom_map(rxns, return_dict=True)
The expected output of results should be
[{'rxn': 'CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F',
'mapped_rxn': '[CH3:1][CH:2]([CH3:3])[SH:4].CN(C)C=O.[F:11][c:10]1[cH:9][cH:8][cH:7][n:6][c:5]1F.O=C([O-])[O-].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
'template': '[S:1].F-[c:2]>>[S:1]-[c:2]',
'confident': True},
{'rxn': 'CCOCC.C[Mg+].O=Cc1ccc(F)cc1Cl.[Br-]>>CC(O)c1ccc(F)cc1Cl',
'mapped_rxn': 'CCOCC.[CH3:1][Mg+].[O:3]=[CH:2][c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11].[Br-]>>[CH3:1][CH:2]([OH:3])[c:4]1[cH:5][cH:6][c:7]([F:8])[cH:9][c:10]1[Cl:11]',
'template': '[C:1]-[Mg+].[C:2]=[O:3]>>[C:1]-[C:2]-[O:3]',
'confident': True}]
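The dictionaries returned with return_dict=True make it easy to separate predictions by their 'confident' flag, for example to set aside low-confidence mappings for manual review. The sketch below relies only on the keys shown above; the function name split_by_confidence is hypothetical, and the second entry's flag is set to False purely for illustration (the actual output above reports True for both reactions).

```python
def split_by_confidence(results):
    """Split result dictionaries into (confident, uncertain) lists."""
    confident = [r for r in results if r["confident"]]
    uncertain = [r for r in results if not r["confident"]]
    return confident, uncertain


# Abbreviated stand-in for the output of mapper.get_atom_map(rxns, return_dict=True).
# The False flag on the second entry is an assumption made for this example only.
example_results = [
    {"rxn": "CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F",
     "template": "[S:1].F-[c:2]>>[S:1]-[c:2]",
     "confident": True},
    {"rxn": "CCOCC.C[Mg+].O=Cc1ccc(F)cc1Cl.[Br-]>>CC(O)c1ccc(F)cc1Cl",
     "template": "[C:1]-[Mg+].[C:2]=[O:3]>>[C:1]-[C:2]-[O:3]",
     "confident": False},
]

confident, uncertain = split_by_confidence(example_results)
for entry in uncertain:
    print("Needs review:", entry["rxn"])
```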
See Demo.ipynb for further usage instructions and for plotting the results.
The raw reactions of USPTO 50K and USPTO FULL were downloaded from the GitHub repository of RXNMapper.
The mapped reactions of USPTO 50K and USPTO FULL are available at Figshare.
AAM predictions on reactions sampled from USPTO 50K, Golden dataset, and Jaworski et al. generated by LocalMapper, RXNMapper, and GraphormerMapper are provided here.
Go to the LocalMapper/manual/ folder and rename the file User.user to [your-name].user.
Download raw data of USPTO_50K from
Go to the LocalMapper/scripts/ folder and run Sample.py with -i (iteration) set to 1:
python Sample.py -i 1
Return to the LocalMapper/manual/ folder and use Check_atom_mapping.ipynb to correct the sampled reactions (0: reject and remap, 1: accept, 2: reject and skip).
Make sure the templates you generate are chemically correct. The model is very sensitive to these templates.
Go to the LocalMapper/scripts/ folder and run the following training command:
python Train.py -i 1
Training usually takes 3 to 6 hours on a CUDA-capable GPU, depending on the number of training reactions.
To use the model to predict the atom mapping of raw reactions, run
python Test.py -i 1
To sample more data for training, repeat the sampling, training, and testing steps with an incremented -i argument.
To start the second iteration, run
python Sample.py -i 2
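The per-iteration workflow described above (sample, correct manually, train, test) can be summarized by a small helper that lists the three script invocations for a given iteration. This is only a sketch: the function name iteration_commands is hypothetical, it just builds the command lines already shown in this section, and it does not run them, because the manual correction in Check_atom_mapping.ipynb has to happen between sampling and training.

```python
def iteration_commands(iteration):
    """Return the Sample/Train/Test command lines for one iteration, in order."""
    flag = ["-i", str(iteration)]
    scripts = ("Sample.py", "Train.py", "Test.py")
    return [["python", script] + flag for script in scripts]


# Print the commands for the second iteration; run them from LocalMapper/scripts/.
# Remember to correct the sampled reactions manually before running Train.py.
for command in iteration_commands(2):
    print(" ".join(command))
```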
@article{chen2024precise,
title={Precise atom-to-atom mapping for organic reactions via human-in-the-loop machine learning},
author={Chen, Shuan and An, Sunggi and Babazade, Ramil and Jung, Yousung},
journal={Nature Communications},
volume={15},
number={1},
pages={2250},
year={2024},
publisher={Nature Publishing Group UK London}
}
This project is covered under the CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International) license. See the LICENSE file for details.