
About the results reproduction #5

@MoonLBH


1. I understand that the script `paper_experiments/run_inference_gdb_conditional_x4.py` corresponds directly to conditional generation (P(x1|x4)), i.e. the two box plots in the upper right of Figure 3. Is this correct?
By running this script, I generate 20 molecular analogs for each of 100 template molecules, keep the valid molecules after filtering, and then calculate the 3D pharmacophore similarity score following the demonstration in shepherd-score.

Is the process I described correct? If so, are the parameters in `run_inference_gdb_conditional_x4.py` the same as those used in your test, or do I need to change any of them? The overall distribution and median of the box plot I obtained are worse than those in your paper; the median is about 0.1 lower. I wonder whether I have set something incorrectly.

Code supplement:

The first step is to run `paper_experiments/run_inference_gdb_conditional_x4.py`.

In the second step, I extract the molecules stored in the pickle files and write the valid ones to SDF:

```python
import pickle
from copy import deepcopy

from rdkit import Chem
from rdkit.Chem import SDWriter, rdDetermineBonds
from tqdm import trange

for i in trange(100):
    with open(f'samples/GDB_conditional/x4/samples_{i}.pickle', 'rb') as f:
        molblocks_and_charges = pickle.load(f)

    sdf_writer = SDWriter(f'samples/GDB_conditional/x4/sdfs/sample{i}.sdf')

    for b, sample_dict in enumerate(molblocks_and_charges[7]):
        x_ = sample_dict['x1']['atoms']
        pos_ = sample_dict['x1']['positions']

        # Build an XYZ block: atom count, comment line, then one line per atom.
        xyz = f'{len(x_)}\n{b+1}\n'
        for a in range(len(x_)):
            symbol_ = Chem.Atom(int(x_[a])).GetSymbol()
            position_ = pos_[a]
            xyz += f'{symbol_} {position_[0].round(3)} {position_[1].round(3)} {position_[2].round(3)}\n'
        xyz += '\n'

        # MolFromXYZBlock returns None (rather than raising) when parsing fails.
        try:
            mol_ = Chem.MolFromXYZBlock(xyz)
        except Exception as e:
            print(f'invalid molecule: {e}')
            continue
        if mol_ is None:
            print('invalid molecule: could not parse XYZ block')
            continue

        # Try several total charges until bond perception succeeds.
        mol__ = None
        for c in [0, 1, -1, 2, -2]:
            candidate_ = deepcopy(mol_)
            try:
                rdDetermineBonds.DetermineBonds(candidate_, charge=c, embedChiral=True)
            except Exception:
                continue
            mol__ = candidate_
            print(c)
            break

        # Skip the sample if no charge worked, instead of keeping a bond-less copy.
        if mol__ is None:
            print('invalid molecule: bond perception failed for all charges')
            continue
        mol_ = mol__

        # Reject molecules with radical electrons or an inconsistent valence state.
        try:
            assert sum([a.GetNumRadicalElectrons() for a in mol_.GetAtoms()]) == 0, 'has radical electrons'
            mol_.UpdatePropertyCache()
            Chem.GetSymmSSSR(mol_)
        except Exception as e:
            print(f'invalid molecule: {e}')
            continue

        try:
            sdf_writer.write(mol_)
        except Exception as e:
            print(f'Failed to write molecule to SDF: {e}')
            continue

    # Flush and close the SDF file for this template.
    sdf_writer.close()
```

Finally, I calculate the 3D pharmacophore similarity score:

```python
import pickle

from rdkit import Chem
from tqdm import trange

# Import paths assumed to follow the shepherd-score README.
from shepherd_score.conformer_generation import optimize_conformer_with_xtb
from shepherd_score.container import Molecule, MoleculePair
from shepherd_score.score.constants import ALPHA

with open('conformers/gdb/molblock_charges_9_test100.pkl', 'rb') as f:
    molblocks_and_charges = pickle.load(f)

# name_list was not defined in the original snippet; it is assumed here to
# hold one key per evaluated template, matching the keys used below.
name_list = [f'template{idx}' for idx in range(70, 100)]
record = {f'{i}': [] for i in name_list}

for idx in trange(70, 100):
    ref_mol_rdkit = Chem.MolFromMolBlock(molblocks_and_charges[idx][0], removeHs=False)
    # Local relaxation of the reference (template) with xTB
    ref_mol, _, ref_charges = optimize_conformer_with_xtb(ref_mol_rdkit)
    fit_mol_rdkits = Chem.SDMolSupplier(
        f"/data/lbh/Code/shepherd/samples/GDB_conditional/x4/sdfs/sample{idx}.sdf",
        removeHs=False)

    for i, fit_mol_rdkit in enumerate(fit_mol_rdkits):
        try:
            # Local relaxation of the generated molecule with xTB
            fit_mol, _, fit_charges = optimize_conformer_with_xtb(fit_mol_rdkit)

            # Extract interaction profiles
            ref_molec = Molecule(ref_mol,
                                 num_surf_points=200,
                                 partial_charges=ref_charges,
                                 pharm_multi_vector=False)
            fit_molec = Molecule(fit_mol,
                                 num_surf_points=200,
                                 partial_charges=fit_charges,
                                 pharm_multi_vector=False)

            # Centers the two molecules' COM's to the origin
            mp = MoleculePair(ref_molec, fit_molec, num_surf_points=200, do_center=True)

            # Compute the similarity score for each interaction profile
            shape_score = mp.score_with_surf(ALPHA(mp.num_surf_points))
            esp_score = mp.score_with_esp(ALPHA(mp.num_surf_points), lam=0.3)
            pharm_score = mp.score_with_pharm()
            record[f'template{idx}'].append(pharm_score)
        except Exception as e:
            # Report which generated molecule failed and why, then move on.
            print(i, e)
            continue
```
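For comparing against the box plot, the collected scores can be reduced to a median and quartiles. The following is only a minimal sketch, assuming `record` maps template names to lists of pharmacophore scores as in the snippet above. The helper name `summarize_scores` is hypothetical, the example values are placeholders rather than real results, and pooling all templates into one distribution is an assumption about how the figure was built.

```python
import statistics


def summarize_scores(record):
    """Pool per-template score lists and return (n, q1, median, q3)."""
    # float() also accepts NumPy scalars and 0-d tensors, if the scores are not plain floats.
    pooled = [float(s) for scores in record.values() for s in scores]
    if len(pooled) < 2:
        raise ValueError("need at least two scores to compute quartiles")
    q1, med, q3 = statistics.quantiles(pooled, n=4, method="inclusive")
    return len(pooled), q1, med, q3


# Placeholder values for illustration only, not real results.
example_record = {
    "template70": [0.50, 0.60, 0.70],
    "template71": [0.40, 0.80],
}
n, q1, med, q3 = summarize_scores(example_record)
print(f"n={n}  Q1={q1:.3f}  median={med:.3f}  Q3={q3:.3f}")
```

Printing `n` alongside the quartiles makes it easy to see how many generated molecules actually survived filtering and scoring, since failed samples are silently skipped in the loops above.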
