Tags: python, biopython, pdb-files

How do I change the chain name of a pdb file?


I want to rename the chains of a PDB file '6gch' - https://www.rcsb.org/structure/6GCH.

I have checked the Biopython manual and can't seem to find anything. Any input would be of great help!


Solution

  • You can indeed just change the id attribute of each Chain object. After that, use PDBIO to save the modified structure.

    Note, however, that this process modifies the PDB file quite a bit. PDBIO does not write records such as REMARK, SHEET and SSBOND, so take care if you need them. It also moves the HETATM records to the end of their corresponding chain, whereas the original PDB file had them at the end of the file.

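    As 6GCH has 3 chains, I am using the dictionary renames to map old chain names to new ones. A chain whose name is not in this dict is left unchanged. Be aware that recent Biopython versions raise a ValueError if the new id is already used by another chain in the same model, so swapping two names (for example E to F and F to E) needs a temporary id as an intermediate step.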

    from Bio.PDB import PDBList, PDBIO, PDBParser
    
    pdbl = PDBList()
    
    io = PDBIO()
    parser = PDBParser()
    pdbl.retrieve_pdb_file('6gch', pdir='.', file_format="pdb")
    
    # pdb6gch.ent is the filename when retrieved by PDBList
    structure = parser.get_structure('6gch', 'pdb6gch.ent')
    
    renames = {
        "E": "A",
        "F": "B",
        "G": "C"
    }
    
    for model in structure:
        for chain in model:
            old_name = chain.get_id()
            new_name = renames.get(old_name)
            if new_name:
                print(f"renaming chain {old_name} to {new_name}")
                chain.id = new_name
            else:
                print(f"keeping chain name {old_name}")
    
    io.set_structure(structure)
    io.save('6gch_renamed.pdb')
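
If the records that PDBIO drops matter to you, one alternative is to skip Biopython for the write step and edit the chain identifier directly in the file text. The following is only a minimal sketch, not part of the answer above: the function names rename_chains_in_text and rename_chains_in_file are hypothetical, and it relies on the PDB format convention that the chain ID of ATOM, HETATM, ANISOU and TER records sits in column 22. It reuses the same renames mapping. Other records that also carry chain IDs (for example SEQRES, HELIX, SHEET and SSBOND) are copied through unchanged here and would need their own column handling.

```python
# Coordinate-type records whose chain ID is in column 22 (string index 21).
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def rename_chains_in_text(lines, renames):
    """Return a new list of PDB lines with chain IDs renamed in coordinate records.

    `renames` maps old one-letter chain IDs to new ones; IDs that are not in
    the mapping are kept. All other records are passed through untouched.
    """
    for new_name in renames.values():
        # The fixed-width PDB format only has room for a single character.
        if len(new_name) != 1:
            raise ValueError(f"chain ID must be one character: {new_name!r}")

    out = []
    for line in lines:
        # Short lines (e.g. a bare "TER") have no chain column to rewrite.
        if line.startswith(COORD_RECORDS) and len(line) > 21:
            old_name = line[21]
            line = line[:21] + renames.get(old_name, old_name) + line[22:]
        out.append(line)
    return out


def rename_chains_in_file(src_path, dst_path, renames):
    """Read a PDB file, rename chains in coordinate records, write the result."""
    with open(src_path) as src:
        lines = src.readlines()
    with open(dst_path, "w") as dst:
        dst.writelines(rename_chains_in_text(lines, renames))
```

Assuming the file was downloaded as in the answer, a call such as rename_chains_in_file('pdb6gch.ent', '6gch_renamed_text.pdb', renames) would keep the REMARK lines and the original record order, at the cost of leaving chain references in the non-coordinate records pointing at the old names.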