Clustering Results Postprocessing

MDANCE provides a set of tools to postprocess the clustering results. This snippet demonstrates how to write out trajectories for each cluster.

The working directory for this script is $PATH/MDANCE/examples.

Imports

numpy for manipulating arrays.
MDAnalysis for reading and writing trajectory files.

import MDAnalysis as mda
import numpy as np

from mdance import data

Read the original trajectory file with MDAnalysis.

input_top is the path to the topology file. Check here for all accepted formats.
input_traj is the path to the trajectory file. Check here for all accepted formats.

input_top = data.top
input_traj = data.traj

u = mda.Universe(input_top, input_traj)
print(f'Number of atoms in the trajectory: {u.atoms.n_atoms}')

/home/docs/checkouts/readthedocs.org/user_builds/mdance/envs/latest/lib/python3.10/site-packages/MDAnalysis/topology/PDBParser.py:350: UserWarning: Element information is missing, elements attribute will not be populated. If needed these can be guessed using universe.guess_TopologyAttrs(context='default', to_guess=['elements']).
  warnings.warn("Element information is missing, elements attribute "
/home/docs/checkouts/readthedocs.org/user_builds/mdance/envs/latest/lib/python3.10/site-packages/MDAnalysis/coordinates/DCD.py:165: DeprecationWarning: DCDReader currently makes independent timesteps by copying self.ts while other readers update self.ts inplace. This behavior will be changed in 3.0 to be the same as other readers. Read more at https://github.com/MDAnalysis/mdanalysis/issues/3889 to learn if this change in behavior might affect you.
  warnings.warn("DCDReader currently makes independent timesteps"
Number of atoms in the trajectory: 217

Extract frames for each cluster using the cluster assignments from the previous step.

cluster_assignment is the path to the cluster assignment file.
The script takes the list of frames in this file and writes a separate trajectory for each unique cluster.
This also works with ../scripts/nani/outputs/labels_6.csv.

cluster_assignment = '../scripts/nani/outputs/best_frames_indices_6.csv'
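The frame numbers read from this file are later used to index the trajectory, so it can help to confirm they fall within the trajectory length before extracting anything. The following is a minimal sketch, not part of MDANCE: `check_frame_indices`, `n_frames`, and `example_indices` are hypothetical names introduced for illustration. In a real run, `n_frames` would come from `len(u.trajectory)`.

```python
import numpy as np


def check_frame_indices(frame_indices, n_frames):
    """Return the frame indices that fall outside the range [0, n_frames)."""
    frame_indices = np.asarray(frame_indices, dtype=int)
    out_of_range = (frame_indices < 0) | (frame_indices >= n_frames)
    return frame_indices[out_of_range]


# Hypothetical values for illustration; use len(u.trajectory) for real data.
n_frames = 10
example_indices = [0, 3, 9, 10, 42]

bad = check_frame_indices(example_indices, n_frames)
print(f'Out-of-range frame indices: {bad.tolist()}')
```

This sketch flags negative indices as well, although Python-style indexing would accept them, on the assumption that a cluster assignment file should only contain non-negative frame numbers.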

Define the frames to extract

x is the frame number.
y is the cluster number.
Output will be written to a DCD file for each cluster. Check here for all accepted formats.

x, y = np.loadtxt(cluster_assignment, delimiter=',', skiprows=2, dtype=int, unpack=True)

# get x value in a list for every unique y value
frames = [x[y == i] for i in np.unique(y)]

for i, frame in enumerate(frames):
    # write trajectory with only the selected frames in frames[i]
    with mda.Writer(f'best_frames_{i}.dcd', u.atoms.n_atoms) as W:
        for ts in u.trajectory[frame]:
            W.write(u.atoms)

/home/docs/checkouts/readthedocs.org/user_builds/mdance/envs/latest/lib/python3.10/site-packages/MDAnalysis/coordinates/DCD.py:463: UserWarning: No dimensions set for current frame, zeroed unitcell will be written
  warnings.warn(wmsg)
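The grouping step can be tried without a trajectory file. The sketch below parses a small in-memory table with the same `np.loadtxt` arguments and groups frame numbers by cluster label. The two header lines and the data rows are hypothetical stand-ins, and the real file layout is an assumption. Unlike the loop above, it keys each group by the actual cluster label rather than by the enumeration index, which matters only if labels are not consecutive integers starting at 0.

```python
import io

import numpy as np

# Hypothetical stand-in for a cluster assignment file: two header lines
# followed by "frame,cluster" rows. The real file layout is an assumption.
csv_text = """hypothetical header line 1
hypothetical header line 2
0,0
5,1
7,0
9,2
12,1
"""

# x holds the frame numbers, y holds the cluster labels.
x, y = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=2,
                  dtype=int, unpack=True)

# Map each unique cluster label to the frame numbers assigned to it.
frames_by_cluster = {int(label): x[y == label] for label in np.unique(y)}

for label, cluster_frames in frames_by_cluster.items():
    print(f'cluster {label}: frames {cluster_frames.tolist()}')
```

Each array in `frames_by_cluster` can be passed to `u.trajectory[...]` in the same way as `frame` in the loop above.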
