Preprocessing of Molecular Dynamics Data
MDANCE provides a set of tools to preprocess molecular dynamics trajectories before clustering. This includes reading the trajectories, normalizing them, and aligning them. This snippet demonstrates how to read a trajectory and save it as a numpy array.
- Imports
numpy for manipulating and saving arrays.
gen_traj_numpy for reading the trajectories with the MDAnalysis library and converting them to numpy arrays.
import numpy as np
from mdance import data
from mdance.inputs.preprocess import gen_traj_numpy
- Inputs
input_top is the path to the topology file. See the MDAnalysis documentation for all accepted topology formats.
input_traj is the path to the trajectory file. See the MDAnalysis documentation for all accepted trajectory formats. The trajectory file should be aligned and centered beforehand if needed!
output_base_name is the base name of the output file. The output file will be saved as {output_base_name}.npy for faster loading in the future.
atomSelection is the atom selection used for clustering. It must be compatible with the MDAnalysis Atom Selections Language.
gen_traj_numpy will convert the trajectory to a numpy array with the shape (n_frames, n_atoms * 3) for comparison purposes.
input_top = data.top
input_traj = data.traj
output_base_name = 'backbone'
atomSelection = 'resid 3 to 12 and name N CA C O H'
traj_numpy = gen_traj_numpy(input_top, input_traj, atomSelection)
/home/docs/checkouts/readthedocs.org/user_builds/mdance/envs/latest/lib/python3.10/site-packages/MDAnalysis/topology/PDBParser.py:350: UserWarning: Element information is missing, elements attribute will not be populated. If needed these can be guessed using universe.guess_TopologyAttrs(context='default', to_guess=['elements']).
warnings.warn("Element information is missing, elements attribute "
/home/docs/checkouts/readthedocs.org/user_builds/mdance/envs/latest/lib/python3.10/site-packages/MDAnalysis/coordinates/DCD.py:165: DeprecationWarning: DCDReader currently makes independent timesteps by copying self.ts while other readers update self.ts inplace. This behavior will be changed in 3.0 to be the same as other readers. Read more at https://github.com/MDAnalysis/mdanalysis/issues/3889 to learn if this change in behavior might affect you.
warnings.warn("DCDReader currently makes independent timesteps"
Number of atoms in trajectory: 217
Number of frames in trajectory: 6001
Number of atoms in selection: 50
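To make the (n_frames, n_atoms * 3) layout concrete, here is a minimal sketch using synthetic coordinates rather than a real trajectory. It is not the implementation of gen_traj_numpy. It assumes that each frame is flattened in row-major order, so the x, y, z values of one atom sit next to each other in a row. Under that assumption, the run above (6001 frames, 50 selected atoms) would give an array of shape (6001, 150).

```python
import numpy as np

# Synthetic stand-in for trajectory coordinates: 4 frames, 5 atoms, xyz each.
# These values are made up for illustration and do not come from MDANCE.
n_frames, n_atoms = 4, 5
rng = np.random.default_rng(0)
coords = rng.normal(size=(n_frames, n_atoms, 3))

# Flatten each frame into one row: [x0, y0, z0, x1, y1, z1, ...].
# Assumption: row-major order, which is the numpy default for reshape.
flat = coords.reshape(n_frames, n_atoms * 3)

print(flat.shape)  # (4, 15)

# The first three entries of a row are the xyz of the first atom in that frame.
first_atom_matches = np.array_equal(flat[0, :3], coords[0, 0])
print(first_atom_matches)  # True
```

Storing one frame per row means each frame can be treated as a single feature vector when frames are compared with each other.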
- Outputs
The output is a numpy array of shape (n_frames, n_atoms * 3).
output_name = output_base_name + '.npy'
np.save(output_name, traj_numpy)
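Loading the saved file later can be sketched as follows. The array here is a small synthetic stand-in for traj_numpy, and the file is written to a temporary directory so that nothing is left behind. Reshaping back to (n_frames, n_atoms, 3) assumes the row-major flattening described earlier.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for traj_numpy: 4 frames, 5 atoms flattened to 15 columns.
traj_numpy = np.arange(4 * 15, dtype=np.float64).reshape(4, 15)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical location; the example above saves to the working directory.
    output_name = os.path.join(tmp_dir, 'backbone.npy')
    np.save(output_name, traj_numpy)
    # np.load reads the whole array into memory, so it stays valid
    # after the temporary directory is removed.
    loaded = np.load(output_name)

# Recover per-atom coordinates; -1 lets numpy infer the number of atoms.
per_atom = loaded.reshape(loaded.shape[0], -1, 3)

print(loaded.shape, per_atom.shape)  # (4, 15) (4, 5, 3)
```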