1. I understand that the script `paper_experiments/run_inference_gdb_conditional_x4.py` corresponds directly to conditional generation, P(x1|x4), i.e. the two box plots in the upper right of Figure 3. Is this correct?
By running this script, I generate 20 molecular analogs for each of the 100 template molecules, keep the valid molecules after filtering, and then calculate the 3D pharmacophore similarity score following the demonstration in shepherd-score.
Is this process correct? If so, are the parameters in `run_inference_gdb_conditional_x4.py` the same as those used in your experiments, or do I need to change any of them? The box plot I obtain is worse than the one in your paper: the overall distribution is lower and the median is about 0.1 lower. I wonder whether I have set something up incorrectly.
Code supplement:
First, I run `paper_experiments/run_inference_gdb_conditional_x4.py`.
Second, I extract the molecules stored in the pickle files and write the valid ones to SDF:
```python
import pickle
from copy import deepcopy

import rdkit
import rdkit.Chem
from rdkit.Chem import SDWriter, rdDetermineBonds
from tqdm import trange

for i in trange(100):
    with open(f'samples/GDB_conditional/x4/samples_{i}.pickle', 'rb') as f:
        molblocks_and_charges = pickle.load(f)

    sdf_writer = SDWriter(f'samples/GDB_conditional/x4/sdfs/sample{i}.sdf')
    for b, sample_dict in enumerate(molblocks_and_charges[7]):
        # Build an XYZ block from the generated atom types and positions (x1)
        xyz = ''
        x_ = sample_dict['x1']['atoms']
        pos_ = sample_dict['x1']['positions']
        xyz += f'{len(x_)}\n{b+1}\n'
        for a in range(len(x_)):
            atomic_number_ = int(x_[a])
            position_ = pos_[a]
            xyz += f'{rdkit.Chem.Atom(atomic_number_).GetSymbol()} {str(position_[0].round(3))} {str(position_[1].round(3))} {str(position_[2].round(3))}\n'
        xyz += '\n'

        try:
            mol_ = rdkit.Chem.MolFromXYZBlock(xyz)
        except Exception as e:
            mol_ = None
            print(f'invalid molecule: {e}')
            continue

        # Perceive bonds, trying several total charges until one succeeds
        try:
            for c in [0, 1, -1, 2, -2]:
                mol__ = deepcopy(mol_)
                try:
                    rdkit.Chem.rdDetermineBonds.DetermineBonds(mol__, charge=c, embedChiral=True)
                except Exception:
                    continue
                if mol__ is not None:
                    print(c)
                    break
        except Exception as e:
            mol_ = None
            print(f'invalid molecule: {e}')
            continue
        mol_ = mol__

        # Reject radicals and molecules that fail basic RDKit checks
        try:
            assert sum([a.GetNumRadicalElectrons() for a in mol_.GetAtoms()]) == 0, 'has radical electrons'
            mol_.UpdatePropertyCache()
            rdkit.Chem.GetSymmSSSR(mol_)
        except Exception as e:
            mol_ = None
            print(f'invalid molecule: {e}')
            continue

        try:
            sdf_writer.write(mol_)
        except Exception as e:
            print(f'Failed to write molecule to SDF: {e}')
            continue

    # Flush and close the SDF file for this template
    sdf_writer.close()
```
Finally, I calculate the 3D pharmacophore similarity score:
```python
import pickle

import rdkit
from rdkit import Chem
from tqdm import trange
# Import paths as given in the shepherd-score README
from shepherd_score.score.constants import ALPHA
from shepherd_score.conformer_generation import optimize_conformer_with_xtb
from shepherd_score.container import Molecule, MoleculePair

with open('conformers/gdb/molblock_charges_9_test100.pkl', 'rb') as f:
    molblocks_and_charges = pickle.load(f)

# name_list is defined elsewhere; it must contain the keys 'template70' ... 'template99'
record = {f'{i}': [] for i in name_list}
for idx in trange(70, 100):
    ref_mol_rdkit = rdkit.Chem.MolFromMolBlock(molblocks_and_charges[idx][0], removeHs=False)
    # Local relaxation of the reference with xTB
    ref_mol, _, ref_charges = optimize_conformer_with_xtb(ref_mol_rdkit)
    fit_mol_rdkits = Chem.SDMolSupplier(f"/data/lbh/Code/shepherd/samples/GDB_conditional/x4/sdfs/sample{idx}.sdf", removeHs=False)
    for i, fit_mol_rdkit in enumerate(fit_mol_rdkits):
        try:
            # Local relaxation of the generated molecule with xTB
            fit_mol, _, fit_charges = optimize_conformer_with_xtb(fit_mol_rdkit)
            # Extract interaction profiles
            ref_molec = Molecule(ref_mol,
                                 num_surf_points=200,
                                 partial_charges=ref_charges,
                                 pharm_multi_vector=False)
            fit_molec = Molecule(fit_mol,
                                 num_surf_points=200,
                                 partial_charges=fit_charges,
                                 pharm_multi_vector=False)
            # Centers the two molecules' COM's to the origin
            mp = MoleculePair(ref_molec, fit_molec, num_surf_points=200, do_center=True)
            # Compute the similarity score for each interaction profile
            shape_score = mp.score_with_surf(ALPHA(mp.num_surf_points))
            esp_score = mp.score_with_esp(ALPHA(mp.num_surf_points), lam=0.3)
            pharm_score = mp.score_with_pharm()
            record[f'template{idx}'].append(pharm_score)
        except Exception as e:
            # Report which sample failed and why, then move on
            print(i, e)
            continue
```
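To compare against the box plots, the scores collected in `record` could be summarized roughly as follows. This is only a minimal sketch: `summarize_scores` is a hypothetical helper (not part of shepherd or shepherd-score), it assumes `record` maps template names to lists of scalar scores, and pooling all samples across templates is an assumption that may not match how the paper's figure was built.

```python
import statistics


def summarize_scores(record):
    """Summarize a dict of {template_name: [scores]}.

    Hypothetical helper: returns the median per template plus the
    quartiles of all scores pooled together.
    """
    per_template_median = {}
    pooled = []
    for name, scores in record.items():
        # float() also accepts 0-d NumPy values or tensors
        values = [float(s) for s in scores]
        if not values:
            continue  # skip templates with no valid scored molecules
        per_template_median[name] = statistics.median(values)
        pooled.extend(values)

    pooled_quartiles = None
    if len(pooled) >= 2:
        # (Q1, median, Q3) of the pooled scores
        pooled_quartiles = tuple(
            statistics.quantiles(pooled, n=4, method='inclusive')
        )

    return {
        'per_template_median': per_template_median,
        'pooled_quartiles': pooled_quartiles,
        'n_scores': len(pooled),
    }
```

Calling `summarize_scores(record)` after the scoring loop gives the number of scored molecules alongside the medians, which makes it easier to tell whether a lower median comes from the scores themselves or from many samples being dropped.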