I am new to RDKit, so excuse me if the question sounds very basic. I have an SDF file with several molecules, and I would like to add certain properties to each entry. How can I achieve this? My sample data looks like this:
D00AAN
-OEChem-10101305022D
100108 0 1 0 0 0 0 0999 V2000
2.0000 5.1929 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
5.2896 2.9173 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0
6.3905 -0.2731 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.8629 -5.1929 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
1 53 1 0 0 0 0
2 5 1 0 0 0 0
2 6 2 0 0 0 0
M END
$$$$
D00AAU
-OEChem-10101305022D
42 43 0 1 0 0 0 0 0999 V2000
6.3301 3.2500 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.0000 -3.2500 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
4.5981 0.2500 0.0000 C 0 0 3 0 0 0 0 0 0 0 0 0
1 15 1 0 0 0 0
1 41 1 0 0 0 0
2 16 1 0 0 0 0
2 42 1 0 0 0 0
3 4 1 0 0 0 0
3 5 1 0 0 0 0
3 8 1 0 0 0 0
M END
$$$$
I would like to add a line after each molecule entry.
> <ID> id
The expected output is:
D00AAN
-OEChem-10101305022D
100108 0 1 0 0 0 0 0999 V2000
2.0000 5.1929 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
5.2896 2.9173 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0
6.3905 -0.2731 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.8629 -5.1929 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
1 53 1 0 0 0 0
2 5 1 0 0 0 0
2 6 2 0 0 0 0
M END
> <ID> D00AAN
$$$$
D00AAU
-OEChem-10101305022D
42 43 0 1 0 0 0 0 0999 V2000
6.3301 3.2500 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.0000 -3.2500 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
4.5981 0.2500 0.0000 C 0 0 3 0 0 0 0 0 0 0 0 0
1 15 1 0 0 0 0
1 41 1 0 0 0 0
2 16 1 0 0 0 0
2 42 1 0 0 0 0
3 4 1 0 0 0 0
3 5 1 0 0 0 0
3 8 1 0 0 0 0
M END
> <ID> D00AAU
$$$$
To get the title and turn it into a property:
read the .sdf with Chem.SDMolSupplier('old.sdf')
write the result to a new .sdf with Chem.SDWriter('new.sdf')
get the title of each molecule with m.GetProp('_Name')
set that title as a property with m.SetProp('ID', title)
This code should work:
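from rdkit import Chem

suppl = Chem.SDMolSupplier('old.sdf')
# Use a new file: the supplier reads old.sdf lazily, so opening a
# writer on the same path can truncate it before it has been read.
w = Chem.SDWriter('new.sdf')
for m in suppl:
    if m is None:  # record that RDKit could not parse
        continue
    n = m.GetProp('_Name')  # title line of the record
    m.SetProp('ID', n)  # written as the SD data item <ID>
    w.write(m)
w.close()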
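If some records cannot be parsed by RDKit (the supplier then yields None and the record is skipped), a plain-text fallback may be enough for this particular task. The sketch below does not use RDKit at all: it treats the first line of each record as the title and inserts an SD data item before each $$$$ terminator. The function name add_title_as_id and the tag parameter are hypothetical names chosen for illustration, and the sketch assumes every record ends with a $$$$ line.

```python
def add_title_as_id(sdf_text: str, tag: str = "ID") -> str:
    """Return sdf_text with each record's title line copied into a data item.

    Hypothetical helper, not part of RDKit. Assumes every record is
    terminated by a line containing only '$$$$'.
    """
    out = []
    record = []  # lines of the record currently being read
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            if record:
                title = record[0].strip()  # first line of a record is its title
                out.extend(record)
                # A data item is a header line, the value, then a blank line.
                out.extend([f">  <{tag}>", title, ""])
            out.append("$$$$")
            record = []
        else:
            record.append(line)
    out.extend(record)  # keep trailing text that lacks a terminator
    return "\n".join(out) + "\n"
```

To use it, read old.sdf into a string, pass it to add_title_as_id, and write the returned string to new.sdf. Because the molecule blocks are copied through unchanged, this works even on records RDKit rejects, but it also performs no validation of them.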