Molecular Dynamics Simulations Gaucher Disease
In this task, we ran molecular dynamics (MD) simulations on the LRZ supercomputer; the results were analysed in the subsequent task. Simulating the motions of a protein under different conditions can give insight into its function and is used to study diseases caused by misfolded proteins. Here, we used MD simulations to investigate the impact of two point mutations in glycosylceramidase.
Input structures
We ran molecular dynamics simulations for three structures of glycosylceramidase: the wildtype 2nt0_A, which served as the reference, and the two mutants W209R and L470P, which were generated with FoldX. W209R is a disease-causing mutation that might severely change the protein structure, since the site is part of an alpha-helix and the positively charged arginine is likely to interfere with the hydrophobic environment. L470P is not listed in the HGMD and is therefore not known to be disease-causing. However, both the sequence-based and the structure-based mutation analysis suggested that this mutation might be deleterious. By simulating the structural changes caused by these mutations, we hope to better understand their impact on the phenotype.
Preparation
The input structures and output files are stored in a repository which can be checked out by:
git clone /mnt/home/student/angermue/mp/tasks/task08
We followed this description for preparing the molecular dynamics simulation:
- We checked out AGroS from https://github.com/offmarc/AGroS.
- We downloaded Scwrl4 from http://dunbrack.fccc.edu/scwrl4 and installed it into the same directory as AGroS.
- We prepared the job scripts.
- We submitted the scripts via sbatch SCRIPT-FILE on the host lxa1, which runs the SLURM scheduler.
Each job performs a molecular dynamics simulation via AGroS.
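The submission step can be sketched as a small Python helper that builds one sbatch call per structure. This is only an illustration: the job-script file names below are hypothetical placeholders (the actual scripts live in the repository), and the helper defaults to a dry run that prints the commands instead of submitting them.

```python
import subprocess

# Hypothetical job-script names, one per simulated structure.
JOB_SCRIPTS = ["2nt0_A.sh", "W209R.sh", "L470P.sh"]


def build_sbatch_commands(scripts):
    """Return one `sbatch SCRIPT-FILE` argument list per job script."""
    return [["sbatch", script] for script in scripts]


def submit_all(scripts, dry_run=True):
    """Print each command; only call sbatch when dry_run is False."""
    for cmd in build_sbatch_commands(scripts):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if sbatch exits non-zero.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    submit_all(JOB_SCRIPTS)  # dry run: prints the three sbatch commands
```

Keeping the command construction separate from the submission makes it possible to review the commands on a machine without SLURM before running them on lxa1.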
The AGroS molecular dynamics simulation pipeline
AGroS is a program that controls the three major steps of a molecular dynamics simulation via GROMACS:
- Preparation
  - Cleaning the PDB
  - Adding missing side-chains
  - Adding the solvent: H2O + NaCl
- Equilibration
  - Minimizing the potential energy by varying
    - the solvent
    - the solvent and side-chains
  - Short NVT run (50 ps)
  - Short NPT run (50 ps)
- Production MD simulation
  - The main MD simulation (10 ns)
In the first steps, the structure is modified so that GROMACS accepts it. In the equilibration stage, the potential energy is minimized by slightly changing the positions of solvent molecules and side-chains. The short NVT and NPT runs bring the system into a state that meets certain input conditions, e.g. the starting temperature. In essence, equilibrating the system should avoid strong forces acting on particles at the beginning of the simulation. The equilibrated system is then used to carry out the main simulation.
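To make the ordering and the simulated time per phase explicit, the pipeline described above can be written down as plain data. This is a descriptive sketch, not AGroS code: the `Stage` class and the stage labels are our own names, and only the durations (50 ps, 50 ps, 10 ns) are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    phase: str
    name: str
    simulated_ps: float = 0.0  # 0 for steps that do not advance simulation time


# Stage order as described in the text; labels are illustrative.
PIPELINE = [
    Stage("preparation", "clean PDB"),
    Stage("preparation", "add missing side-chains"),
    Stage("preparation", "add solvent (H2O + NaCl)"),
    Stage("equilibration", "minimize solvent"),
    Stage("equilibration", "minimize solvent and side-chains"),
    Stage("equilibration", "NVT run", 50.0),
    Stage("equilibration", "NPT run", 50.0),
    Stage("production", "main MD simulation", 10_000.0),
]


def simulated_time_ps(stages, phase=None):
    """Sum the simulated time in ps, optionally restricted to one phase."""
    return sum(s.simulated_ps for s in stages if phase is None or s.phase == phase)


if __name__ == "__main__":
    for phase in ("preparation", "equilibration", "production"):
        print(phase, simulated_time_ps(PIPELINE, phase), "ps")
```

The equilibration runs account for 100 ps in total, which is small compared to the 10 ns production run.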
Intermediate PDB files
AGroS creates several intermediate PDB files in the following order:
1. | NAME.pdb | The input structure |
2. | NAME_repair.pdb | NAME.pdb with common PDB problems repaired by repairPDB |
3. | NAME_br.pdb | Just the protein in NAME.pdb |
4. | NAME_water.pdb | Just the water in NAME.pdb |
5. | NAME_dna.pdb | Just the DNA in NAME.pdb |
6. | NAME_sc.pdb | NAME.pdb with side-chains optimized by SCWRL |
7. | NAME_nh.pdb | NAME_sc.pdb with hydrogen atoms removed and the water and DNA molecules of NAME.pdb added back |
8. | NAME_solv.pdb | NAME_nh.pdb with the solvent H2O and 0.1 NaCl included |
9. | NAME_solv_min.pdb | NAME_solv.pdb with the potential energy of the solvent minimized |
10. | NAME_solv_min2.pdb | NAME_solv_min.pdb with the potential energy of the solvent and side-chains minimized |
11. | NAME_solv_min3.pdb | NAME_solv_min2.pdb with the potential energy of the solvent and side-chains minimized again |
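The naming scheme in the table can be used to check which intermediate files a run has produced. The following is a minimal sketch that only assumes the suffixes listed above and that all files sit in one directory; the function names are our own and are not part of AGroS.

```python
from pathlib import Path

# Suffixes in the order in which the table lists the intermediate files.
SUFFIXES = [
    "", "_repair", "_br", "_water", "_dna", "_sc", "_nh",
    "_solv", "_solv_min", "_solv_min2", "_solv_min3",
]


def intermediate_files(name, directory="."):
    """Return the expected intermediate PDB paths for NAME, in pipeline order."""
    return [Path(directory) / f"{name}{suffix}.pdb" for suffix in SUFFIXES]


def missing_files(name, directory="."):
    """Return the expected files that do not exist in the directory."""
    return [p for p in intermediate_files(name, directory) if not p.is_file()]


if __name__ == "__main__":
    for path in missing_files("2nt0"):
        print("missing:", path)
```

The first missing file in the list indicates the pipeline step at which a run stopped.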
Our input structure 2nt0 was not modified by repairPDB, so 2nt0.pdb and 2nt0_repair.pdb were identical. Since 2nt0 contained neither water nor DNA, 2nt0_water.pdb and 2nt0_dna.pdb were empty. The side-chain optimization by SCWRL led to slightly different secondary structure assignments (cf. <xr id="fig:pdb_repair_sc"/>). <xr id="fig:pdb_solv"/> depicts 2nt0_solv.pdb and the added Na+ and Cl- ions without water. The minimization of the solvent changed the positions of the Na+ and Cl- ions but not the protein (cf. <xr id="fig:pdb_solv_min"/>). The subsequent second minimization of the solvent and the side-chains did not alter the positions of the Na+ and Cl- ions any further. However, minimal deviations between the side-chains of 2nt0_solv_min.pdb and 2nt0_solv_min2.pdb could be observed (cf. <xr id="fig:pdb_min_min2"/>). <xr id="fig:pdb_min2_min3"/> shows that the third minimization run marginally changed some side-chains.
<figure id="fig:pdb_repair_sc"> </figure>
<figure id="fig:pdb_solv"> </figure>
<figure id="fig:pdb_solv_min"> </figure>
<figure id="fig:pdb_min_min2"> </figure>
<figure id="fig:pdb_min2_min3"> </figure>
Runtime
The jobs were queued for about 48h. The actual runtime is listed in <xr id="tab:runtime"/>. <figtable id="tab:runtime">
Wildtype | 16h 51min |
L470P | 9h 36min |
W209R | 9h 35min |
Runtime of the MD simulations. </figtable>
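The runtimes can be converted into a rough throughput in simulated nanoseconds per day. The sketch below assumes that each job simulated the 10 ns production run and that the listed runtime covers the whole job, including preparation and equilibration, so the values are lower bounds for the production phase alone. The function names are our own.

```python
import re

# Runtimes as listed in the table, keyed by structure.
RUNTIMES = {"wildtype": "16h 51min", "L470P": "9h 36min", "W209R": "9h 35min"}
SIMULATED_NS = 10.0  # length of the production run


def runtime_hours(text):
    """Parse a string such as '16h 51min' into hours as a float."""
    match = re.fullmatch(r"\s*(\d+)h\s+(\d+)min\s*", text)
    if match is None:
        raise ValueError(f"unrecognised runtime format: {text!r}")
    hours, minutes = map(int, match.groups())
    return hours + minutes / 60.0


def ns_per_day(text, simulated_ns=SIMULATED_NS):
    """Return simulated nanoseconds per 24 h of wall-clock time."""
    return simulated_ns * 24.0 / runtime_hours(text)


if __name__ == "__main__":
    for name, runtime in RUNTIMES.items():
        print(f"{name}: {ns_per_day(runtime):.1f} ns/day")
```

Under these assumptions, the two mutant runs reach roughly 25 ns/day, while the wildtype run reaches roughly 14 ns/day.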
Validation
We did a coarse validation by inspecting the output files and running gmxcheck -f 2nt0_A_md.xtc. According to the output, all simulations terminated successfully, and each produced 2001 frames.
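The frame count can be cross-checked against the simulation length. The sketch below assumes that frames are written at evenly spaced intervals and that the starting frame at t = 0 is included; under these assumptions, 2001 frames over 10 ns correspond to one frame every 5 ps. The output interval itself is inferred here and was not read from the simulation parameters.

```python
def frame_interval_ps(total_ps, n_frames, include_start=True):
    """Return the spacing between frames for an evenly sampled trajectory."""
    intervals = n_frames - 1 if include_start else n_frames
    if intervals <= 0:
        raise ValueError("need at least one interval between frames")
    return total_ps / intervals


def expected_frames(total_ps, interval_ps, include_start=True):
    """Return the number of frames a complete trajectory should contain."""
    steps, remainder = divmod(total_ps, interval_ps)
    if remainder:
        raise ValueError("total time is not a multiple of the output interval")
    return int(steps) + (1 if include_start else 0)


if __name__ == "__main__":
    total_ps = 10_000  # 10 ns production run
    print(frame_interval_ps(total_ps, 2001), "ps between frames")
    print(expected_frames(total_ps, 5), "frames expected")
```

A trajectory with fewer frames than expected would indicate that the run stopped before reaching 10 ns.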