Clustering Results Postprocessing
MDANCE provides a set of tools to postprocess the clustering results. This snippet demonstrates how to write out trajectories for each cluster.
The working directory for this script is $PATH/MDANCE/examples.
- Imports
numpy for manipulating arrays.
MDAnalysis for reading and writing trajectory files.
import MDAnalysis as mda
import numpy as np
from mdance import data
- Read the original trajectory file with MDAnalysis.
input_top = data.top
input_traj = data.traj
u = mda.Universe(input_top, input_traj)
print(f'Number of atoms in the trajectory: {u.atoms.n_atoms}')
Number of atoms in the trajectory: 217
- Extract frames for each cluster using the cluster assignments from the previous step.
cluster_assignment is the path to the cluster assignment file. The script takes the list of frames in this file and writes one trajectory for each unique cluster.
This also works for ../scripts/nani/outputs/labels_6.csv.
cluster_assignment = '../scripts/nani/outputs/best_frames_indices_6.csv'
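Because the path above is relative to the working directory, it can help to confirm that the file exists and to look at its first few lines before loading it. The sketch below is only an illustration: `peek_lines` is a hypothetical helper, not part of MDANCE, and the demo file contents are made up so the code runs without any MDANCE outputs.

```python
import tempfile
from itertools import islice
from pathlib import Path


def peek_lines(path, n=3):
    """Return the first n lines of a text file (hypothetical helper)."""
    path = Path(path)
    if not path.is_file():
        # Report the current directory, since relative paths depend on it.
        raise FileNotFoundError(
            f'{path} not found; current directory is {Path.cwd()}'
        )
    with path.open() as handle:
        return [line.rstrip('\n') for line in islice(handle, n)]


# Demo on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / 'demo_assignments.csv'
    demo.write_text('header line 1\nheader line 2\n10,0\n42,1\n')
    first_lines = peek_lines(demo)
    for line in first_lines:
        print(line)
```

Calling `peek_lines(cluster_assignment)` on the real file shows how many header lines precede the data rows.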
- Define the frames to extract
x is the frame number. y is the cluster number. Output will be written to a DCD file for each cluster. See the MDAnalysis documentation for all accepted formats.
x, y = np.loadtxt(cluster_assignment, delimiter=',', skiprows=2, dtype=int, unpack=True)
# get x value in a list for every unique y value
frames = [x[y == i] for i in np.unique(y)]
for i, frame in enumerate(frames):
    # write trajectory with only the selected frames in frames[i]
    with mda.Writer(f'best_frames_{i}.dcd', u.atoms.n_atoms) as W:
        for ts in u.trajectory[frame]:
            W.write(u.atoms)
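The grouping step above can be tried on its own, without a trajectory. The loop above names each file by the position of the cluster in np.unique(y), which matches the cluster label only when labels run consecutively from 0. The sketch below is a variation, not the MDANCE implementation: it keys the groups by the actual label and checks that every frame index is in range. The CSV contents, the two header lines, and `n_frames` are hypothetical stand-ins.

```python
import io

import numpy as np

# Made-up assignment file: two header lines, then frame,cluster rows.
sample_csv = io.StringIO(
    'hypothetical header line\n'
    'frame,cluster\n'
    '10,0\n'
    '42,1\n'
    '11,0\n'
    '77,2\n'
    '43,1\n'
)

# Same call as in the example: skip the headers, one array per column.
frame_ids, cluster_ids = np.loadtxt(
    sample_csv, delimiter=',', skiprows=2, dtype=int, unpack=True
)

# Hypothetical trajectory length; with MDAnalysis this would be len(u.trajectory).
n_frames = 100
out_of_range = frame_ids[(frame_ids < 0) | (frame_ids >= n_frames)]
if out_of_range.size:
    raise IndexError(f'Frame indices outside the trajectory: {out_of_range.tolist()}')

# Boolean masking: keep the frames whose cluster equals each unique label.
frames_by_cluster = {
    int(label): frame_ids[cluster_ids == label]
    for label in np.unique(cluster_ids)
}

for label, frame_idx in frames_by_cluster.items():
    print(f'cluster {label}: frames {frame_idx.tolist()}')
```

Each array in `frames_by_cluster` can be used to index `u.trajectory` in the same way as `frame` in the loop above, with the dictionary key used in the output file name.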