Python argument types in rdkit.Chem.rdmolfiles.MolToMolBlock(NoneType)


I am trying to convert InChI strings to SDF format using the RDKit Python library. I am running the following Python code.

#convert inchi to sdf

def MolFromInchi(id,inchi):
    mol = Chem.MolFromInchi(inchi)
    mol_block = Chem.MolToMolBlock(mol)
    print (id, mol_block)
    print ('$$$$')
    
with open (r'C:/Users/inchi_canonize') as f:
    for line in f:
        lst=line.split(' ')
        elements = [x for x in lst if x]   #remove empty elements and get id (elements[0]) and inchis (elements[1])
        elements[1] = ('\''+elements[1].strip()+'\'')
        id = elements[0]
        inchi = elements[1].rstrip("\n")
        print (inchi)
        MolFromInchi(id,inchi)


The input file (inchi_canonize) has the following fields.

D08520   InChI=1S/C10H18O2/c1-7-4-5-8(6-9(7)11)10(2,3)12/h4,8-9,11-12H,5-6H2,1-3H3
D07548   InChI=1S/C17H25NO4.ClH/c1-20-13-11-15(21-2)17(16(12-13)22-3)14(19)7-6-10-18-8-4-5-9-18;/h11-12H,4-10H2,1-3H3;1H
D10000   (null)

Below is the error:

ArgumentError: Python argument types in
    rdkit.Chem.rdmolfiles.MolToMolBlock(NoneType)
did not match C++ signature:
    MolToMolBlock(class RDKit::ROMol mol, bool includeStereo=True, int confId=-1, bool kekulize=True, bool forceV3000=False)

Any help is highly appreciated


Solution

  • The problem is elements[1] = ('\''+elements[1].strip()+'\'').

    The InChI is already a string, and this line wraps it in an extra pair of literal single quotes.

    Your InChI is now "'InChI=1S/C10H18O2/c1-7-4-5-8(6-9(7)11)10(2,3)12/h4,8-9,11-12H,5-6H2,1-3H3'"

    and not InChI=1S/C10H18O2/c1-7-4-5-8(6-9(7)11)10(2,3)12/h4,8-9,11-12H,5-6H2,1-3H3.

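    Chem.MolFromInchi cannot parse the quoted string and returns None, which is why MolToMolBlock complains about a NoneType argument.

    Additionally, you should check the result for None, because otherwise you also try to convert (null) to a mol block.

    By the way, you can use Chem.SDWriter to write an SDF file.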

    from rdkit import Chem

    mols = []
    ids = []
    inchis = []

    with open(r'D:\Z\inchi_canonize.txt') as f:
        for line in f:
            elements = line.split()  # split on any whitespace; drops empty fields and the newline
            if len(elements) < 2:
                continue  # skip blank or incomplete lines
            inchi = elements[1]
            mol = Chem.MolFromInchi(inchi)  # returns None if the InChI cannot be parsed, e.g. (null)
            if mol is not None:
                mols.append(mol)
                ids.append(elements[0])
                inchis.append(inchi)

    w = Chem.SDWriter('foo.sdf')

    for mol, mol_id, inchi in zip(mols, ids, inchis):
        mol.SetProp('_Name', inchi)  # set the title line
        mol.SetProp('ID', mol_id)    # set an associated data field
        w.write(mol)

    w.close()
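
The effect of the extra quotes and of the (null) entry can be checked without RDKit. The following is a minimal sketch, assuming whitespace-separated lines as in the question; parse_line is a hypothetical helper name used only for illustration, not an RDKit function.

```python
def parse_line(line):
    """Return (compound_id, inchi) for a data line, or None if it has no usable InChI."""
    fields = line.split()  # split on any run of whitespace
    if len(fields) < 2 or not fields[1].startswith("InChI="):
        return None  # blank line, missing field, or a placeholder such as "(null)"
    return fields[0], fields[1]


good = "D08520   InChI=1S/C10H18O2/c1-7-4-5-8(6-9(7)11)10(2,3)12/h4,8-9,11-12H,5-6H2,1-3H3   \n"
missing = "D10000   (null)   \n"

record = parse_line(good)
quoted = "'" + record[1] + "'"  # what the question's code handed to Chem.MolFromInchi

print(record[0])                    # D08520
print(quoted.startswith("InChI="))  # False: the string now begins with a quote character
print(parse_line(missing))          # None: the line is skipped
```

Filtering on the "InChI=" prefix before calling RDKit is a design choice of this sketch; the check on the return value of Chem.MolFromInchi is still needed, since a string with the right prefix can be malformed in other ways.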