python biopython distance-matrix

How to calculate the distance between all atoms in a PDB file and create a distance matrix from that


I would like to calculate the distances between all atoms in a PDB file and then build a distance matrix from the results.

I currently have all the x, y and z coordinates, but I am struggling to do the distance calculation for all atoms.

distance = sqrt((x1-x2)^2+(y1-y2)^2+(z1-z2)^2)

For example:

Distance between Atom 1 and Atom 2, 3, 4...

Distance between Atom 2 and Atom 3, 4, 5...

And so forth for every Atom in the PDB file. I'm new to coding so any method to achieve the end result would be great.

pdb file in question - https://files.rcsb.org/download/6GCH.pdb


Solution

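  • Building on your code, you can collect the coordinates of every atom in a list, convert the list to a NumPy array, and pass it to pairwise_distances: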

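    import numpy as np
    from sklearn.metrics import pairwise_distances

    x_y_z_ = list()
    ...
             for atom in residue:
                # atom.coord holds the atom's [x, y, z] position
                x = atom.coord[0]
                y = atom.coord[1]
                z = atom.coord[2]
                x_y_z_.append([x, y, z])
    ...
    # one row per atom, three columns (x, y, z)
    x_y_z_ = np.array(x_y_z_)
    # entry [i, j] is the distance between atom i and atom j
    print( pairwise_distances(x_y_z_,x_y_z_) )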
    

    pairwise_distances comes from sklearn.metrics. A small self-contained example:

    from sklearn.metrics import pairwise_distances
    import numpy as np
    
    x_y_z_ = np.array([[120,130,123],
        [655,123,666],
        [111,444,333],
        [520,876,222]])
    
    print( pairwise_distances(x_y_z_,x_y_z_) )
    
    Output:
    
    [[  0.         762.31423967 377.8584391  852.24233643]
     [762.31423967   0.         714.04901793 884.51681725]
     [377.8584391  714.04901793   0.         605.1660929 ]
     [852.24233643 884.51681725 605.1660929    0.        ]]
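
If you would rather not depend on scikit-learn, the same matrix can be computed with plain NumPy broadcasting, which applies the formula from the question directly. This is only a minimal sketch: the function name `distance_matrix` is made up for illustration, and the coordinates are the same toy values as above rather than atoms from the PDB file. The last few lines list each pair only once (Atom 1 with 2, 3, 4, then Atom 2 with 3, 4, and so on), which is the pattern described in the question.

```python
import numpy as np


def distance_matrix(coords):
    """Return the N x N matrix of Euclidean distances between N points.

    Hypothetical helper for illustration; coords is an (N, 3) array-like.
    """
    coords = np.asarray(coords, dtype=float)
    # diff[i, j] is the difference vector between point i and point j,
    # giving an array of shape (N, N, 3)
    diff = coords[:, None, :] - coords[None, :, :]
    # sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2) for every pair at once
    return np.sqrt((diff ** 2).sum(axis=-1))


# Same toy coordinates as in the example above
x_y_z_ = np.array([[120, 130, 123],
                   [655, 123, 666],
                   [111, 444, 333],
                   [520, 876, 222]])

dist = distance_matrix(x_y_z_)
print(dist)

# Indices of the upper triangle (excluding the diagonal),
# so every pair of atoms appears exactly once
i, j = np.triu_indices(len(x_y_z_), k=1)
for a, b, d in zip(i, j, dist[i, j]):
    print(f"Atom {a + 1} - Atom {b + 1}: {d:.3f}")
```

The array of coordinates collected from `atom.coord` in the loop above can be passed to this function in the same way. Note that the intermediate `diff` array holds N × N × 3 values, so memory use grows quadratically with the number of atoms.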