Search code examples
pythonpython-3.xrdkit

Sanitization error applying reaction to molecule RDKit


Sanitization error while applying a reaction to an molecule with wedged bond. I am getting this error while applying a proton removal reaction to a molecule but I do not see any error in MolBlock information.

This is for a reaction problem in which I am trying to apply a simple reaction (proton removal) to a molecule given its isomeric SMILES.

I create a function to apply reaction using SMARTS and SMILES but I am getting the following error which I could not fixed.

I am using the following code to load my inputs.

smile = rdkit.Chem.rdmolfiles.MolToSmiles(mol,isomericSmiles=True)

which leads to:

C/C1=C\\C[C@@H]([C+](C)C)CC/C(C)=C/CC1

I create the following dictionary to use my SMILES and SMARTS:

reaction_smarts = {}

# proton removal reaction

reaction_smarts["proton_removal"] = "[Ch:1]-[C+1:2]>>[C:1]=[C+0:2].[H+]" 

reactions = {name: AllChem.ReactionFromSmarts(reaction_smarts[name])      for name in reaction_smarts}

# function to run reactions

def run_reaction(molecule, reaction):
    products = []
    for product in reaction.RunReactant(molecule, 0):
        Chem.SanitizeMol(product[0])
        products.append(product[0])
    return products

# apply reaction


products = run_reaction(cation_to_rdkit_mol["mol_name"],  reactions["proton_removal"])

At this step I am getting this error but I cannot fix it. RDKit ERROR: [10:43:23] Explicit valence for atom # 0 C, 5, is greater than permitted

Expected results should be the the molecule with the double bond and its stereoisomers:

First product: CC(C)=C1C/C=C(\\C)CC/C=C(\\C)CC1

Second product: C=C(C)[C@@H]1C/C=C(\\C)CC/C=C(\\C)CC1

Third product: C=C(C)[C@H]1C/C=C(\\C)CC/C=C(\\C)CC1

I am using Chem.EnumerateStereoisomers.EnumerateStereoisomers() to get all stereoisomers but I am just getting the first and second product. I also added your initial proposal product[0].GetAtomWithIdx(0).SetNumExplicitHs(0) which actually fix the Explicit valence error. But now I am trying to figure it out how to get all that three stereoisomers.

Any hint why this is happening?, cause if I check the mol block with all the info about valence it seems to be fine.


Solution

  • The Error is stating that the Explicit valence for atom 0 (Carbon) is 5, this would suggest that the explicit hydrogen hasn't been removed although the bond is now a double bond, hence a valence of 5. I am not too familiar with reaction SMARTS although an easy way to fix this manually would be to set the number of explicit hydrogens on atom 0 to 0 before you sanitize:

    product.GetAtomWithIdx(0).SetNumExplicitHs(0)
    Chem.SanitizeMol(product)
    

    Edit 1: Scratch that, I did some experimentation, try this reaction:

    rxn = AllChem.ReactionFromSmarts('[#6@@H:1]-[#6+:2] >> [#6H0:1]=[#6+0:2]')
    

    This way in the reaction definition we explicitly state that a hydrogen is lost and the resultant molecule will sanitize. Does this work for you?

    Edit 2: When I run this reaction the product does not seem to contain a cation:

    mol = Chem.MolFromSmiles('C/C1=C\\C[C@@H]([C+](C)C)CC/C(C)=C/CC1')
    rxn  = AllChem.ReactionFromSmarts('[#6@@H:1]-[#6+:2] >> [#6H0:1]=[#6+0:2]')
    products = list()
    for product in rxn.RunReactant(mol, 0):
        Chem.SanitizeMol(product[0])
        products.append(product[0])
    print(Chem.MolToSmiles(products[0]))
    
    Output:
    'CC(C)=C1C/C=C(\\C)CC/C=C(\\C)CC1'
    

    Edit 3: I think I now understand what you are looking for:

    mol = Chem.MolFromSmiles('C/C1=C\\C[C@@H]([C+](C)C)CC/C(C)=C/CC1')
    
    # Reactant SMARTS
    reactant_smarts = '[CH3:1][C+:2][C@@H:3]'
    
    # Product SMARTS
    product_smarts = [
    '[CH2:1]=[CH0+0:2][CH:3]',
    '[CH2:1]=[CH0+0:2][C@H:3]',
    '[CH2:1]=[CH0+0:2][C@@H:3]',
    ]
    
    # Reaction SMARTS
    reaction_smarts = str(reactant_smarts + '>>' + '.'.join(product_smarts))
    
    # RDKit Reaction
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    
    # Get Products
    results = list()
    for products in rxn.RunReactant(mol, 0):
        for product in products:
            Chem.SanitizeMol(product)
            results.append(product)
            print(Chem.MolToSmiles(product))
    
    Output:
    'C=C(C)C1C/C=C(\\C)CC/C=C(\\C)CC1'
    'C=C(C)[C@H]1C/C=C(\\C)CC/C=C(\\C)CC1'
    'C=C(C)[C@@H]1C/C=C(\\C)CC/C=C(\\C)CC1'
    'C=C(C)C1C/C=C(\\C)CC/C=C(\\C)CC1'
    'C=C(C)[C@H]1C/C=C(\\C)CC/C=C(\\C)CC1'
    'C=C(C)[C@@H]1C/C=C(\\C)CC/C=C(\\C)CC1'
    

    Note that we get the same products twice, I think this is because the reactant SMARTS matches both CH3 groups, hence the reaction is applied to both. I hope this is what you are looking for.