Okay. Let me explain the things first. I have used a specific module named Biopython
in this code. I am explaining the necessary details to solve the problem if you are not accustomed with the module.
The code is:
#!/usr/bin/python
from Bio.PDB.PDBParser import PDBParser
import numpy as np
parser=PDBParser(PERMISSIVE=1)
structure_id="mode_7"
filename="mode_7.pdb"
structure=parser.get_structure(structure_id, filename)
model1=structure[0]
s=(124,3)
newc=np.zeros(s,dtype=np.float32)
coord=[]
#for chain1 in model1.get_list():
# for residue1 in chain1.get_list():
# ca1=residue1["CA"]
# coord1=ca1.get_coord()
# newc.append(coord1)
for i in range(0,29):
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc=np.add(newc,coord)
print newc
print "END"
PDB file is the protein data bank file. The file I'm working with can be downloaded from https://drive.google.com/open?id=0B8oUhqYoEX6YVFJBTGlNZGNBdlk
If you remove the hashes from the first for loop, you'll find that get_coord()
returns a (124,3)
array with dtype float32
. Likewise, the next for loop is supposed to return the same.
It gives out a strange error:
Traceback (most recent call last):
File "./average.py", line 27, in <module>
newc=np.add(newc,coord)
ValueError: operands could not be broadcast together with shapes (124,3) (248,3)
I am absolutely clueless how it manages to make a 248,3
array. I just want to add the array coord over itself. I tried with another modification of the code:
#!/usr/bin/python
from Bio.PDB.PDBParser import PDBParser
import numpy as np
parser=PDBParser(PERMISSIVE=1)
structure_id="mode_7"
filename="mode_7.pdb"
structure=parser.get_structure(structure_id, filename)
model1=structure[0]
s=(124,3)
newc=np.zeros(s,dtype=np.float32)
coord=[]
newc2=[]
#for chain1 in model1.get_list():
# for residue1 in chain1.get_list():
# ca1=residue1["CA"]
# coord1=ca1.get_coord()
# newc.append(coord1)
for i in range(0,29):
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc2=np.add(newc,coord)
print newc
print "END"
It gives out the same error. Can you help???
I'm not sure I fully understand what you're doing, but it looks like you need to reset the coords
list at the start of every iteration:
for i in range(0,29):
coords = []
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc=np.add(newc,coord)
If you keep appending without clearing the list you add 124 items to coords
at every iteration of the outer loop. The exception you see is likely raised during the second iteration.