About the results reproduction #5

1. I understand that the script **paper_experiments/run_inference_gdb_conditional_x4.py** corresponds directly to the conditional generation P(x1|x4), that is, to the two box plots in the upper right of Figure 3. Is this correct?
Running this script generates 20 molecular analogs for each of the 100 template molecules. I then keep the molecules that pass the validity filter and calculate the 3D pharmacophore similarity score following the example calculation in **shepherd-score**.

Is the process I described correct? If so, are the parameters in **run_inference_gdb_conditional_x4.py** the same as those used in your experiments, or do I need to change any of them? The box plot I obtain has a worse overall distribution than the one in your paper, and its median is about 0.1 lower, so I wonder whether I have configured something incorrectly.

Code supplement:

> The first step is to run **paper_experiments/run_inference_gdb_conditional_x4.py**.

> In the second step, I extract the molecules stored in the pickle files and write the valid ones to SDF files.

```python
import pickle
from copy import deepcopy

from rdkit import Chem
from rdkit.Chem import SDWriter, rdDetermineBonds
from tqdm import trange

for i in trange(100):
    with open(f'samples/GDB_conditional/x4/samples_{i}.pickle', 'rb') as f:
        molblocks_and_charges = pickle.load(f)

    sdf_writer = SDWriter(f'samples/GDB_conditional/x4/sdfs/sample{i}.sdf')

    for b, sample_dict in enumerate(molblocks_and_charges[7]):
        x_ = sample_dict['x1']['atoms']
        pos_ = sample_dict['x1']['positions']

        # Build an XYZ block from the generated atom types and coordinates
        xyz = f'{len(x_)}\n{b+1}\n'
        for a in range(len(x_)):
            atomic_number_ = int(x_[a])
            position_ = pos_[a]
            xyz += f'{Chem.Atom(atomic_number_).GetSymbol()} {str(position_[0].round(3))} {str(position_[1].round(3))} {str(position_[2].round(3))}\n'
        xyz += '\n'

        try:
            mol_ = Chem.MolFromXYZBlock(xyz)
        except Exception as e:
            print(f'invalid molecule: {e}')
            continue
        if mol_ is None:
            # MolFromXYZBlock can return None instead of raising
            print('invalid molecule: could not parse XYZ block')
            continue

        # Try several total charges until bond perception succeeds
        mol__ = None
        for c in [0, 1, -1, 2, -2]:
            candidate = deepcopy(mol_)
            try:
                rdDetermineBonds.DetermineBonds(candidate, charge=c, embedChiral=True)
            except Exception:
                continue
            mol__ = candidate
            print(c)
            break
        if mol__ is None:
            # Without this check, a bond-less copy would be written to the SDF
            print('invalid molecule: bond determination failed for all charges')
            continue

        mol_ = mol__
        try:
            assert sum([a.GetNumRadicalElectrons() for a in mol_.GetAtoms()]) == 0, 'has radical electrons'
            mol_.UpdatePropertyCache()
            Chem.GetSymmSSSR(mol_)
        except Exception as e:
            print(f'invalid molecule: {e}')
            continue

        try:
            sdf_writer.write(mol_)
        except Exception as e:
            print(f'Failed to write molecule to SDF: {e}')
            continue

    # Flush and close the SDF for this template
    sdf_writer.close()
```
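Because only molecules that pass the filter are scored, the number of survivors per template may affect the resulting distribution. The following is a minimal sketch, not part of the repository, of how the retention rate could be tallied. The helper `retention_summary` and the example counts are hypothetical; the default of 20 generated samples per template is taken from the description above.

```python
def retention_summary(valid_counts, n_generated=20):
    """Return (overall_fraction, per_template_fraction) of samples kept after filtering.

    valid_counts: dict mapping a template name to the number of valid molecules kept.
    n_generated: number of samples generated per template (20 in the description above).
    """
    per_template = {name: count / n_generated for name, count in valid_counts.items()}
    total_generated = n_generated * len(valid_counts)
    overall = sum(valid_counts.values()) / total_generated if total_generated else 0.0
    return overall, per_template


# Hypothetical counts, for illustration only
valid_counts = {'template0': 18, 'template1': 5, 'template2': 20}
overall, per_template = retention_summary(valid_counts)
print(f'overall retention: {overall:.2f}')
for name, fraction in per_template.items():
    print(f'{name}: {fraction:.2f}')
```

In practice the counts could be obtained by counting the molecules in each written SDF file.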

> Finally, I calculate the 3D pharmacophore similarity score.

```python
import pickle

from rdkit import Chem
from tqdm import trange

# Import paths assumed to follow the shepherd-score README
from shepherd_score.conformer_generation import optimize_conformer_with_xtb
from shepherd_score.container import Molecule, MoleculePair
from shepherd_score.score.constants import ALPHA

with open('conformers/gdb/molblock_charges_9_test100.pkl', 'rb') as f:
    molblocks_and_charges = pickle.load(f)

# name_list is defined elsewhere; it must contain the 'template{idx}' keys used below
record = {f'{i}': [] for i in name_list}
for idx in trange(70, 100):
    ref_mol_rdkit = Chem.MolFromMolBlock(molblocks_and_charges[idx][0], removeHs=False)
    # Local relaxation of the reference with xTB (once per template)
    ref_mol, _, ref_charges = optimize_conformer_with_xtb(ref_mol_rdkit)
    fit_mol_rdkits = Chem.SDMolSupplier(f"/data/lbh/Code/shepherd/samples/GDB_conditional/x4/sdfs/sample{idx}.sdf", removeHs=False)
    for i, fit_mol_rdkit in enumerate(fit_mol_rdkits):
        try:
            # Local relaxation of the generated molecule with xTB
            fit_mol, _, fit_charges = optimize_conformer_with_xtb(fit_mol_rdkit)

            # Extract interaction profiles
            ref_molec = Molecule(ref_mol,
                                 num_surf_points=200,
                                 partial_charges=ref_charges,
                                 pharm_multi_vector=False)
            fit_molec = Molecule(fit_mol,
                                 num_surf_points=200,
                                 partial_charges=fit_charges,
                                 pharm_multi_vector=False)

            # Centers the two molecules' COMs at the origin
            mp = MoleculePair(ref_molec, fit_molec, num_surf_points=200, do_center=True)

            # Compute the similarity score for each interaction profile
            shape_score = mp.score_with_surf(ALPHA(mp.num_surf_points))
            esp_score = mp.score_with_esp(ALPHA(mp.num_surf_points), lam=0.3)
            pharm_score = mp.score_with_pharm()
            record[f'template{idx}'].append(pharm_score)
        except Exception as e:
            # Report which sample failed and why, then move on
            print(f'sample {i} of template {idx} failed: {e}')
            continue
```
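To make the comparison with the box plot in the paper concrete, the collected scores can be reduced to a median and quartiles. The following is a minimal sketch that assumes the scores are stored as plain Python floats in a dict shaped like `record` above (template name mapped to a list of scores). The helper `summarize_scores` is hypothetical, the example values are made up for illustration, and pooling all samples across templates is an assumption about how the paper's box plot was built.

```python
import statistics


def summarize_scores(record):
    """Pool all scores across templates and return the count, quartiles, and median."""
    pooled = [float(score) for scores in record.values() for score in scores]
    if len(pooled) < 2:
        raise ValueError('need at least two scores to compute quartiles')
    # 'inclusive' treats the data as the full population and interpolates linearly
    q1, median, q3 = statistics.quantiles(pooled, n=4, method='inclusive')
    return {'n': len(pooled), 'q1': q1, 'median': median, 'q3': q3}


# Made-up scores, for illustration only
example_record = {
    'template70': [0.2, 0.4],
    'template71': [0.6, 0.8, 1.0],
}
summary = summarize_scores(example_record)
print(summary)
```

Reporting `n` alongside the median shows how many molecules survived filtering and scoring, which helps when comparing against a distribution computed over a different number of samples.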
