I want to rename the chains of a PDB file '6gch' - https://www.rcsb.org/structure/6GCH.
I have checked the Biopython manual and can't seem to find anything. Any input would be of great help!
You can indeed just change the id attribute of the chain elements. After that, you can use PDBIO to save the modified structure.
Note, however, that this process modifies the PDB file quite a bit. PDBIO does not write records such as REMARK, SHEET and SSBOND, so be careful if you know that you need those. It also moves the HETATMs to the end of their corresponding chain, whereas the original PDB file had them at the end of the file.
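If you need to keep the records that PDBIO drops, one possible alternative is to edit the chain identifier directly in the file text. The following is only a minimal sketch, not part of Biopython: the function names rename_chains_in_text and rename_chains_file are made up for illustration, and it assumes a standard fixed-width PDB file in which the chain ID of ATOM, HETATM, ANISOU and TER records sits in column 22.

```python
# Hypothetical helpers, not part of Biopython.
# Assumption: standard fixed-width PDB format, where the chain ID of
# coordinate records is the single character in column 22 (index 21).
CHAIN_COLUMN_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def rename_chains_in_text(lines, renames):
    """Return new PDB lines with chain IDs renamed in coordinate records only."""
    out = []
    for line in lines:
        # Short lines (e.g. a bare "TER") carry no chain ID and are kept as-is.
        if line.startswith(CHAIN_COLUMN_RECORDS) and len(line) > 21:
            old = line[21]
            # Chains missing from the mapping keep their original ID.
            line = line[:21] + renames.get(old, old) + line[22:]
        out.append(line)
    return out


def rename_chains_file(src, dst, renames):
    """Read src, rename chains in coordinate records, write the result to dst."""
    with open(src) as handle:
        lines = handle.readlines()
    with open(dst, "w") as handle:
        handle.writelines(rename_chains_in_text(lines, renames))
```

Keep in mind that this leaves every other record untouched, including ones such as SHEET and SSBOND that carry chain IDs in other columns, so those would still refer to the old chain names unless you handle them separately.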
As 6GCH has 3 chains, I am using the dictionary renames to configure the mapping from old to new chain names. If a chain name is not included in this dict, that chain is not renamed.
from Bio.PDB import PDBList, PDBIO, PDBParser

pdbl = PDBList()
io = PDBIO()
parser = PDBParser()

pdbl.retrieve_pdb_file('6gch', pdir='.', file_format="pdb")

# pdb6gch.ent is the filename when retrieved by PDBList
structure = parser.get_structure('6gch', 'pdb6gch.ent')

renames = {
    "E": "A",
    "F": "B",
    "G": "C"
}

for model in structure:
    for chain in model:
        old_name = chain.get_id()
        new_name = renames.get(old_name)
        if new_name:
            print(f"renaming chain {old_name} to {new_name}")
            chain.id = new_name
        else:
            print(f"keeping chain name {old_name}")

io.set_structure(structure)
io.save('6gch_renamed.pdb')
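The mapping above is safe for 6GCH because none of the new names (A, B, C) is already in use. For other structures it may be worth checking the mapping first: as far as I know, recent Biopython versions raise a ValueError when a chain id is set to one that another chain in the same model already uses. A small sketch of such a check, where find_rename_collisions is a hypothetical helper and not a Biopython function:

```python
def find_rename_collisions(chain_ids, renames):
    """Map each new chain name to the old chains that would end up sharing it."""
    targets = {}
    for old in chain_ids:
        # Chains missing from the mapping keep their original name.
        new = renames.get(old, old)
        targets.setdefault(new, []).append(old)
    # Only names claimed by more than one chain are a problem.
    return {new: olds for new, olds in targets.items() if len(olds) > 1}


# An empty result means the final chain names are all distinct.
collisions = find_rename_collisions(["E", "F", "G"], {"E": "A", "F": "B", "G": "C"})
```

With a structure loaded, the first argument could be built with [chain.id for chain in model]. Note that this only checks the final names; a swap such as A to B and B to A would probably still need a temporary intermediate name.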