Search code examples
pythonmdanalysis

How to instantiate Atoms using MDAnalysis?


After looking over the documentation a few times and spending a handful of hours trying various ideas, I've decided I need a hand. I'm writing Python software that steps through NAMD simulations of RADA proteins, computing different values we're interested in.

Currently, my code steps through each time step, then steps through each atom in the system, performing various analytical steps.

What I need to be able to do is consolidate each amino acid into its own "atom" (single-site representation of the amino acid, located at the residue's center of mass).

Am I able to instantiate new Atoms, and add them to an MDAnalysis Universe? Can I populate those new Atoms with data such as my amino acid's center of mass, a human-readable name, etc?

Something like:

u = MDAnalysis.Universe(psf, coordDcd)
ag = u.selectAtoms(" the atoms in my amino acid ")
amino_acid = MDAnalysis.Atom
amino_acid.pos = ag.centerOfMass()

I know how to read the NAMD simulation (.dcd files) and all the atoms are represented fine, but ultimately, I need to turn ~20 atoms into one "averaged" atom (for computational simplification).


Solution

  • You can create a new Universe with empty atoms with the Universe.empty() method. In the following example I am using example files from the tests (data.PSF and data.DCD but you can use your own files).

    import MDAnalysis as mda
    from MDAnalysis.tests import datafiles as data   # only needed for the example files
    
    u = mda.Universe(data.PSF, data.DCD)
    
    # create the "coarse grained" universe with one Atom per Residue
    cg = mda.Universe.empty(n_atoms=u.residues.n_residues, n_residues=u.residues.n_residues, 
                            atom_resindex=u.residues.ix, trajectory=True)
    
    # add total mass for each residue
    cg.add_TopologyAttr("masses", u.residues.masses)
    
    # add total charge per residue
    cg.add_TopologyAttr("charges", u.residues.charges)
    
    # add names etc... so that cg.select_atoms("protein") works
    cg.add_TopologyAttr("resnames", u.residues.resnames)
    cg.add_TopologyAttr("resnums", u.residues.resnums)
    
    # use the residue center of mass for each residue as the position of the CG particle
    centers = u.residues.center_of_mass(compound="residues")
    cg.atoms.positions = centers
    

    The empty universe is bare bones; Using Universe.add_TopologyAttr() we can add topology attributes such as mass and charge, which might make some sense for the coarse grained particles. We initialize these attributes with aggregated per-residue values (u.residues.masses contains the mass of each residue whereas u.atoms.masses contains the masses of each atom). Adding residue names enables selections such as

    protein = cg.select_atoms("protein")
    

    to also work on the cg universe. Other topology attributes can be added in a similar fashion. (The documentation for the available attributes is not great at the moment so perhaps ask on the user mailing list or check out the User Guide: Topology System (work in progress as of Nov 2019)).

    We can also assign positions to the CG "atoms" (because we added an empty single-frame trajectory with trajectory=True) so that typical analysis works, e.g., the radius of gyration of the protein residues that we selected:

    rgyr = protein.radius_of_gyration()
    

    The above should answer the question as originally posted.


    However, note that creating a Universe for each timestep in a trajectory might be inefficient. It would be convenient to create a coarse grained Universe with a in-memory trajectory attached but the current Universe.empty() method (MDAnalysis 0.20.1) does not support this functionality — mainly because no-one asked for it. It would be trivial to implement so if you need it, please raise an issue in our issue tracker https://github.com/mdanalysis/mdanalysis/issues and suggest the feature.